Using encryption with Hadoop

The Cloudera documentation says that Hadoop does not support disk encryption. Can I use hardware-encrypted hard drives with Hadoop?

+4
4 answers

If you have a file system mounted on the disk, Hadoop can use it. HDFS stores its data in a normal OS file system. Hadoop does not know whether the drive is encrypted, and it does not care.
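To check which local directories a DataNode actually writes to (so you can confirm they sit on the encrypted drive), you can read them out of `hdfs-site.xml`. Below is a minimal Python sketch, not an official tool. It assumes the directories are set through `dfs.datanode.data.dir` (or the older `dfs.data.dir`) as a comma-separated list. The helper name `datanode_data_dirs` and the example paths are made up for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

# Property names that hold the DataNode block directories
# (the second one is the older, deprecated name).
DATA_DIR_KEYS = ("dfs.datanode.data.dir", "dfs.data.dir")


def datanode_data_dirs(hdfs_site_xml: str) -> List[str]:
    """Return the local directories listed in an hdfs-site.xml document.

    Hypothetical helper: it only handles plain paths and file:// URIs,
    not storage-type prefixes such as [SSD].
    """
    root = ET.fromstring(hdfs_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name not in DATA_DIR_KEYS:
            continue
        dirs = []
        for entry in (prop.findtext("value") or "").split(","):
            entry = entry.strip()
            if entry.startswith("file://"):
                entry = entry[len("file://"):]  # keep only the local path
            if entry:
                dirs.append(entry)
        return dirs
    return []


if __name__ == "__main__":
    # Example configuration with made-up mount points.
    sample = """
    <configuration>
      <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///mnt/enc1/dfs, /mnt/enc2/dfs</value>
      </property>
    </configuration>
    """
    for path in datanode_data_dirs(sample):
        print(path)
```

Whatever paths come back are ordinary directories as far as Hadoop is concerned. Whether the block device underneath them is encrypted is handled entirely by the drive or the OS.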

+4

eCryptfs can be used to do per-file encryption on each individual Hadoop node. It is rather tedious to set up, but it can certainly be done.

Gazzang offers a turnkey commercial solution built on top of eCryptfs for securing big data through encryption, and partners with several Hadoop and NoSQL vendors.

Gazzang's cloud encryption platform for big data helps organizations transparently encrypt data stored in the cloud or on premises, using advanced key management and process-based access control lists, and helps them meet security and regulatory compliance requirements.

Full disclosure: I am one of the authors and current maintainers of eCryptfs. I am also Gazzang's chief architect and a lead developer.

+7

Hadoop does not directly support encryption, although a compression codec can be used for encryption/decryption. Here's more information on encryption and HDFS.

Regarding hardware-based encryption, I think Hadoop should work with it. As Spike noted, HDFS is like any other Java application and stores its data on regular OS file systems. FYI, MapR uses Direct I/O to improve HDFS performance.

+3