Loading the player...

Technical Tips For Secure Apache Hadoop Cluster

  • It is well known that Apache Hadoop is not secure by default, and it is easy to impersonate a privileged user if Kerberos is not enabled. In this talk, we will introduce further technical details about security. We will talk about the current “open-source” software capability for some security features in a vendor-neutral manner. We mainly introduce SSL/TLS for Hadoop and its related products (Apache Spark, Apache Hive, Apache ZooKeeper, etc.) and introduce HDFS data encryption. Regarding HDFS data encryption, we implemented Hadoop Credential Provider API to integrate Hadoop KMS with our internal credential store to store the secrets for encrypting/decrypting data encryption key. We also introduce the internals of HDFS data encryption and how to create your custom credential providers.
    In addition, we will introduce the automatic keystore reloading, which is the latest feature of Apache Hadoop, and it enables updating SSL certificates without stopping the server. The feature is useful for operation.
    Akira Ajisaka: Akira Ajisaka is a senior software engineer at Yahoo Japan Corporation. He develops and validates some new features of Apache Hadoop for our use. Also, he troubleshoots and improves management/operations in our Hadoop clusters. He maintenances Apache Hadoop to improve its quality as an Apache Hadoop/Yetus committer and PMC member.

    Category : Apache Hadoop


    0 Comments and 0 replies
Privacy policyAccept and close