Securing HDFS for Intelligence

HDFS (Apache Hadoop Distributed File System) is a distributed file system that is deployed on the worker nodes of the CDF cluster by default. The Intelligence analytics platform uses HDFS as temporary storage when it pushes analytics data to the ArcSight database. HDFS holds analytics data only while the write process is active.

Intelligence now allows you to secure access to HDFS with SASL (Simple Authentication and Security Layer). To secure HDFS, enable and configure Kerberos authentication. When HDFS runs in secure mode, Kerberos authenticates each HDFS service and user. The Kerberos protocol relies on a trusted third-party authentication server, known as the Key Distribution Center (KDC), which issues the tickets that services and users present to prove their identity.
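For orientation, the following sketch shows the kind of settings that secure mode typically involves in a stock Apache Hadoop deployment. It is illustrative only: Intelligence manages its HDFS configuration through its own deployment settings, and the values chosen here are assumptions, not the product's actual configuration. The property names are standard Hadoop properties; the helper build_site_xml is a hypothetical name introduced for this example.

```python
import xml.etree.ElementTree as ET


def build_site_xml(properties):
    """Render a dict of Hadoop properties as a *-site.xml document string.

    Hypothetical helper for illustration; not part of Hadoop or Intelligence.
    """
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent is available from Python 3.9; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Assumed example values for core-site.xml in a Kerberos-secured cluster.
CORE_SITE = {
    # Switch authentication from the default "simple" to Kerberos.
    "hadoop.security.authentication": "kerberos",
    # Turn on service-level authorization checks.
    "hadoop.security.authorization": "true",
    # SASL quality of protection for RPC:
    # "authentication", "integrity", or "privacy".
    "hadoop.rpc.protection": "authentication",
}

# Assumed example values for hdfs-site.xml.
HDFS_SITE = {
    # Block access tokens are required when HDFS runs in secure mode.
    "dfs.block.access.token.enable": "true",
    # SASL quality of protection for the DataNode data transfer protocol.
    "dfs.data.transfer.protection": "authentication",
}

if __name__ == "__main__":
    print(build_site_xml(CORE_SITE))
    print(build_site_xml(HDFS_SITE))
```

Stricter protection levels such as "integrity" or "privacy" add per-message checks or encryption, which is one reason secured transfers generally cost more run time than unsecured ones.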

Secure data transfer between HDFS and the database is disabled by default. Enabling it increases the run time of the analytics jobs.