Java - HDFS with High Availability

Some platforms are configured with High Availability (HA): they run two NameNodes, one active and one passive, so that HDFS stays reachable if the active NameNode fails. Learn more here.

If you want to write a Java client that fully benefits from this feature, you need to specify the following lines in the Hadoop configuration of your application:


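import org.apache.hadoop.conf.Configuration;

Configuration conf = new Configuration();
// Explicitly register the HDFS and local file system implementations
conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
// Point the client at the logical nameservice rather than at a single NameNode
conf.set("fs.defaultFS", "hdfs://cluster");
conf.set("fs.default.name", conf.get("fs.defaultFS")); // deprecated name for fs.defaultFS, set to the same value
// Declare the nameservice and the IDs of its two NameNodes
conf.set("dfs.nameservices", "cluster");
conf.set("dfs.ha.namenodes.cluster", "nn1,nn2");
// RPC address of each NameNode
conf.set("dfs.namenode.rpc-address.cluster.nn1", "<url_of_your_namenode_1>:8020");
conf.set("dfs.namenode.rpc-address.cluster.nn2", "<url_of_your_namenode_2>:8020");
// Class the client uses to locate the active NameNode and fail over to the other one
conf.set("dfs.client.failover.proxy.provider.cluster", "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");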

The above configuration can be used as is. Simply replace the <url_of_your_namenode_1> and <url_of_your_namenode_2> placeholders in the values of dfs.namenode.rpc-address.cluster.nn1 and dfs.namenode.rpc-address.cluster.nn2 with the hostnames of your two NameNodes.
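
Several property names in the configuration above embed the nameservice name ("cluster") and the NameNode IDs ("nn1", "nn2"), so they have to stay consistent with each other. The sketch below shows one way to derive them from a single nameservice name and a list of hostnames. It uses only the JDK; the class HaClientProperties, its build method and the example.com hostnames are hypothetical names introduced for illustration and are not part of Hadoop.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Hadoop): derives the HA client properties listed above. */
final class HaClientProperties {

    /**
     * Builds the HA-related properties for one nameservice.
     *
     * @param nameservice   logical cluster name, e.g. "cluster"
     * @param namenodeHosts NameNode hostnames, in order (they become nn1, nn2, ...)
     * @param rpcPort       NameNode RPC port (8020 in the configuration above)
     */
    static Map<String, String> build(String nameservice, List<String> namenodeHosts, int rpcPort) {
        if (namenodeHosts.size() < 2) {
            throw new IllegalArgumentException("HA needs at least two NameNodes");
        }
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.defaultFS", "hdfs://" + nameservice);
        props.put("dfs.nameservices", nameservice);

        StringBuilder ids = new StringBuilder();
        for (int i = 0; i < namenodeHosts.size(); i++) {
            String id = "nn" + (i + 1);
            if (i > 0) {
                ids.append(',');
            }
            ids.append(id);
            // One RPC address entry per NameNode ID
            props.put("dfs.namenode.rpc-address." + nameservice + "." + id,
                    namenodeHosts.get(i) + ":" + rpcPort);
        }
        props.put("dfs.ha.namenodes." + nameservice, ids.toString());
        props.put("dfs.client.failover.proxy.provider." + nameservice,
                "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder hostnames: replace them with those of your own NameNodes.
        Map<String, String> props = build("cluster",
                Arrays.asList("namenode1.example.com", "namenode2.example.com"), 8020);
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Each entry of the returned map can then be copied into the Hadoop Configuration with conf.set(key, value) before the client opens the file system, for example with FileSystem.get(conf).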
