R - Upload H2O library to HDFS

Upload

If you want a faster, more reliable installation and want to make sure everyone works with the same version of H2O on your platform, you can upload the library to HDFS.

Note: This code must be launched from a capsule and not from a notebook in order to work.

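# Download and unzip the H2O release archive
# mode = 'wb' writes the zip as a binary file on every OS
download.file('http://h2o-release.s3.amazonaws.com/h2o/rel-wright/2/h2o-3.20.0.2.zip', destfile = 'h2o-3.20.0.2.zip', mode = 'wb')
system('unzip h2o-3.20.0.2.zip', intern = TRUE)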

# Create the folder and set its owner and group
system('hdfs dfs -mkdir -p /user/h2o/', intern = TRUE)
system('hdfs dfs -chown -R root:hadoop /user/h2o/', intern = TRUE)

# R library: upload the source package shipped in the archive
system('hdfs dfs -mkdir -p /user/h2o/install_R', intern = TRUE)
system('hdfs dfs -put h2o-3.20.0.2/R/h2o_3.20.0.2.tar.gz /user/h2o/install_R/', intern = TRUE)

# Python library
# Uncomment if you also want to upload the Python library to HDFS
# system('hdfs dfs -mkdir -p /user/h2o/install_python', intern = TRUE)
# system('hdfs dfs -put h2o-3.20.0.2/python/h2o-3.20.0.2-py2.py3-none-any.whl /user/h2o/install_python/', intern = TRUE)
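The version number is repeated in every path above, so an upgrade means editing several lines. Here is a minimal sketch of one way to derive the paths from a single version string and check that the upload landed. It assumes the `hdfs` command is on the PATH. `h2o_paths` and `hdfs_exists` are hypothetical helper names used only for illustration; they are not part of H2O or Hadoop.

```r
# Hypothetical helper: derive the local and HDFS paths used above
# from a single version string.
h2o_paths <- function(version, hdfs_root = '/user/h2o') {
  list(
    r_local  = sprintf('h2o-%s/R/h2o_%s.tar.gz', version, version),
    r_hdfs   = sprintf('%s/install_R/h2o_%s.tar.gz', hdfs_root, version),
    py_local = sprintf('h2o-%s/python/h2o-%s-py2.py3-none-any.whl', version, version),
    py_hdfs  = sprintf('%s/install_python/h2o-%s-py2.py3-none-any.whl', hdfs_root, version)
  )
}

# Hypothetical helper: TRUE when the path exists on HDFS.
# `hdfs dfs -test -e` exits with status 0 if the path exists.
hdfs_exists <- function(path) {
  system(paste('hdfs dfs -test -e', shQuote(path))) == 0
}

paths <- h2o_paths('3.20.0.2')

# Only run the check where the hdfs client is available.
if (nzchar(Sys.which('hdfs'))) {
  if (!hdfs_exists(paths$r_hdfs)) {
    warning('R package not found on HDFS: ', paths$r_hdfs)
  }
}
```

With this in place, the `-put` commands could be built with `paste('hdfs dfs -put', paths$r_local, dirname(paths$r_hdfs))` rather than hard-coded strings.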

Install

Once the library is on HDFS, you can install the R library over WebHDFS with this line of code (nn1:50070 is the NameNode address used here; adjust it if yours differs):

install.packages('http://nn1:50070/webhdfs/v1/user/h2o/install_R/h2o_3.20.0.2.tar.gz?op=OPEN', repos = NULL, type = 'source')
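To avoid reinstalling on every run, the install can be skipped when the expected version is already present. This is a minimal sketch, not a definitive implementation. `h2o_webhdfs_url` and `install_h2o_from_hdfs` are hypothetical names, and the default NameNode address `nn1:50070` is simply the one used in the command above.

```r
# Hypothetical helper: build the WebHDFS download URL for a given H2O version.
h2o_webhdfs_url <- function(version, namenode = 'nn1:50070') {
  sprintf('http://%s/webhdfs/v1/user/h2o/install_R/h2o_%s.tar.gz?op=OPEN',
          namenode, version)
}

# Hypothetical helper: install h2o from HDFS unless that exact version
# is already installed. Returns TRUE if an install was attempted.
install_h2o_from_hdfs <- function(version, namenode = 'nn1:50070') {
  # packageVersion() raises an error when the package is missing,
  # so map that case to NA.
  installed <- tryCatch(as.character(packageVersion('h2o')),
                        error = function(e) NA_character_)
  if (!is.na(installed) && installed == version) {
    return(invisible(FALSE))
  }
  install.packages(h2o_webhdfs_url(version, namenode), repos = NULL, type = 'source')
  invisible(TRUE)
}

url <- h2o_webhdfs_url('3.20.0.2')
# Usage: install_h2o_from_hdfs('3.20.0.2')
```

Keeping the version as a function argument means the same call works after a newer archive is uploaded to `/user/h2o/install_R/`.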