R - HDFS with high availability


Preamble

This article covers using R with an HDFS cluster that has the high availability option enabled. The particularity of high availability is that one HDFS has two namenodes, an active one and a standby one, so that the second can take over if the first fails.

Library dependencies

library(httr)


Namenode search

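# WebHDFS endpoints of the two namenodes
hdfsUriNn1 <- 'http://nn1:50070/webhdfs/v1'
hdfsUriNn2 <- 'http://nn2:50070/webhdfs/v1'
# tryCatch returns the value of whichever branch ran, so its result is assigned
# directly; an assignment made inside a handler function would stay local to it.
hdfsUri <- tryCatch(
  {
    # GET raises an error only when the connection itself fails
    GET(hdfsUriNn1)
    hdfsUriNn1
  },
  error=function(cond) {
    # First namenode unreachable: try the second one
    tryCatch(
      {
        GET(hdfsUriNn2)
        hdfsUriNn2
      },
      error=function(cond) {
        message("No namenode connection available. ", conditionMessage(cond))
        NA
      }
    )
  }
)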
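
Note that GET only raises an error when the connection fails, so the search above finds a reachable namenode, not necessarily the active one. The following is a minimal sketch of a stricter search, not a definitive implementation. It assumes that a standby namenode answers a WebHDFS operation with a status other than 200, and that the cluster accepts unauthenticated requests. The helper names probeNamenode and findNamenode, and the timeoutSec parameter, are hypothetical and not part of httr or Hadoop.

```r
# Sketch only: probeNamenode and findNamenode are hypothetical helper names.

# Returns TRUE when the namenode answers a simple WebHDFS call with HTTP 200.
# Assumption: a standby or unreachable namenode either raises a connection
# error or returns a status other than 200 for this request.
probeNamenode <- function(uri, timeoutSec = 5) {
  tryCatch(
    {
      # Ask for the status of the HDFS root directory
      resp <- httr::GET(paste0(uri, "/?op=GETFILESTATUS"),
                        httr::timeout(timeoutSec))
      httr::status_code(resp) == 200
    },
    error = function(cond) FALSE
  )
}

# Returns the first URI accepted by `probe`, or NA_character_ if none is.
# `probe` is a parameter so the search can be exercised without a cluster.
findNamenode <- function(uris, probe = probeNamenode) {
  for (uri in uris) {
    if (isTRUE(probe(uri))) {
      return(uri)
    }
  }
  NA_character_
}
```

With the two URIs from this article, the call would be hdfsUri <- findNamenode(c('http://nn1:50070/webhdfs/v1', 'http://nn2:50070/webhdfs/v1')). Looping over a vector rather than nesting tryCatch calls keeps the code the same size if more namenodes are added.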