I am new to Hadoop and want to learn more about backup and restore. I have already worked with backup and restore in Oracle; will that help with Hadoop? Where should I start?
There are several backup and restore options. As s.singh points out, data replication is not disaster recovery (DR).
HDFS supports snapshots. They can be used to protect against user error, recover files, and so on. That said, snapshots are not DR in the event of a complete failure of the Hadoop cluster, because they are stored on the same cluster. ( http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html )
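As a rough illustration of the snapshot workflow, here is a minimal Python sketch that only builds the `hdfs` command lines (it does not run them). The directory, snapshot name, and file name are hypothetical; `-allowSnapshot`, `-createSnapshot`, and the read-only `.snapshot` path are taken from the HDFS snapshots documentation linked above.

```python
import shlex


def snapshot_commands(directory, snapshot_name, file_name):
    """Build HDFS CLI calls to snapshot a directory and copy one file back out."""
    directory = directory.rstrip("/")
    # Snapshots are exposed under the read-only ".snapshot" path of the directory.
    snap_file = f"{directory}/.snapshot/{snapshot_name}/{file_name}"
    return [
        # An administrator must first mark the directory as snapshottable.
        ["hdfs", "dfsadmin", "-allowSnapshot", directory],
        # Take a named snapshot of the directory.
        ["hdfs", "dfs", "-createSnapshot", directory, snapshot_name],
        # Restore a deleted file by copying it out of the snapshot.
        ["hdfs", "dfs", "-cp", snap_file, f"{directory}/{file_name}"],
    ]


# Hypothetical paths, for illustration only.
for cmd in snapshot_commands("/data/sales", "before-cleanup", "2016.csv"):
    print(shlex.join(cmd))
```

To actually execute the commands you could pass each list to `subprocess.run(cmd, check=True)` on a machine with a configured Hadoop client.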
Your best bet is to keep backups off-site. The target could be another Hadoop cluster, S3, etc., and the copy can be done with distcp. ( http://hadoop.apache.org/docs/stable1/distcp2.html ), ( https://wiki.apache.org/hadoop/AmazonS3 )
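For example, an off-site copy with distcp might be assembled like this. This is a sketch under assumptions: the NameNode hosts, port, bucket name, and paths are made up, and the S3 target assumes the `s3a` connector and its credentials are already configured on the cluster.

```python
import shlex


def distcp_command(source, target, update=True):
    """Build a `hadoop distcp` call that copies source to target."""
    cmd = ["hadoop", "distcp"]
    if update:
        # -update copies only files that are missing or differ at the target.
        cmd.append("-update")
    cmd += [source, target]
    return cmd


# Hypothetical endpoints: a second Hadoop cluster and an S3 bucket.
to_cluster = distcp_command("hdfs://prod-nn:8020/data/sales",
                            "hdfs://dr-nn:8020/backup/sales")
to_s3 = distcp_command("hdfs://prod-nn:8020/data/sales",
                       "s3a://example-backup-bucket/sales")

print(shlex.join(to_cluster))
print(shlex.join(to_s3))
```

Running such a command on a schedule (cron, Oozie, or similar) gives you an incremental off-site backup; combining it with a snapshot as the source keeps the copy consistent while the data is still being written.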
Here is a slideshow from Cloudera discussing DR ( http://www.slideshare.net/cloudera/hadoop-backup-and-disaster-recovery )
Hadoop 1000 . , , . .
Namenode, namenode Hadoop
Namenode
namenode namnode. namenode , ( ) namenode.
- . namenode , - . . namenode , , .
. @brandon.bell.
For HDFS, DataTorrent offers an HDFS sync application that can be used for DR by copying data to another HDFS:
https://www.datatorrent.com/apphub/hdfs-sync/
It is built on Apache Apex.
: HdfsUserGuide
SE:
Hadoop 2.0
Hadoop: HDFS
Hadoop 2.0 Node, Node Node
Hadoop Namenode?
Recovery_Mode:
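Typically, you will configure multiple metadata storage locations. Then, if one storage location is corrupt, you can read the metadata from one of the other storage locations.
However, what can you do if the only storage locations available are corrupt? In this case, there is a special NameNode startup mode called Recovery mode that may allow you to recover most of your data.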
You can run the NameNode in recovery mode as follows: hdfs namenode -recover (on older releases: hadoop namenode -recover)
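Because recovery mode can lose data, the HDFS User Guide advises backing up the fsimage and edit logs before using it. A minimal sketch of that precaution follows; the directory paths are hypothetical (the real metadata location is whatever `dfs.namenode.name.dir` points to), and the NameNode should be stopped while the copy is made.

```python
import shutil
from pathlib import Path

# Command to start the NameNode in recovery mode (Hadoop 2 and later).
RECOVER_CMD = ["hdfs", "namenode", "-recover"]


def backup_metadata(name_dir, backup_root, label):
    """Copy the NameNode metadata directory (fsimage and edit logs) aside."""
    dest = Path(backup_root) / label
    # copytree refuses to overwrite an existing backup with the same label.
    shutil.copytree(name_dir, dest)
    return dest


# Intended use, with hypothetical paths:
#   backup_metadata("/hadoop/dfs/name", "/var/backups/namenode", "pre-recover")
#   subprocess.run(RECOVER_CMD, check=True)
```

Keeping the copy means that if recovery discards edits you still needed, you can put the original metadata back and try again.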