GitHub project: https://github.com/saagie/Create_Table_Hive_R


An R script that automatically creates raw Hive tables from an HDFS directory.

The script creates the database if it does not exist. It then goes through every subdirectory of the given directory and creates one raw Hive table per subdirectory, based on the gz file that subdirectory contains.
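Because the script takes a WebHDFS URL as its first argument, the directory layout it will walk can be inspected beforehand through the WebHDFS REST API. The sketch below only builds the `LISTSTATUS` request URL; it is not taken from the script itself. The helper name `liststatus_url`, the address, the port, the path and the user are all hypothetical example values.

```shell
# Build a WebHDFS LISTSTATUS URL for a directory.
# $1: WebHDFS base URL, $2: absolute HDFS path, $3: HDFS user name
# (liststatus_url is a hypothetical helper, not part of Create_Table.R)
liststatus_url() {
  printf '%s%s?op=LISTSTATUS&user.name=%s\n' "$1" "$2" "$3"
}

# Hypothetical example values; replace with those of your platform.
URL=$(liststatus_url "http://192.0.2.10:50070/webhdfs/v1" "/user/data/raw" "hdfs_user")
echo "$URL"

# To query a live cluster (the response is JSON, one entry per child):
# curl -s "$URL"
```

Each entry of type DIRECTORY in the response corresponds to a subdirectory, which should therefore yield one table.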

To run the script:

- Upload 'Create_Table_Hive.tar' directly to the platform.
- Add this command:

Rscript Create_Table.R "http://IP_HDFS:PORT_HDFS/webhdfs/v1" "jdbc:hive2://IP_HIVE:PORT_HIVE/;ssl=false" "USER_HDFS" "PWD_HDFS" "NAME_BDD" "PATH_DIRECTORY" "SEPARATOR_FILE" "QUOTE_FILE"


Parameters

  • IP_HDFS: IP address of HDFS
  • PORT_HDFS: Port of HDFS
  • IP_HIVE: IP address of Hive
  • PORT_HIVE: Port of Hive
  • USER_HDFS: HDFS user name
  • PWD_HDFS: HDFS password
  • NAME_BDD: Name of the database (created if it does not exist)
  • PATH_DIRECTORY: Path of the HDFS directory whose subdirectories contain the files
  • SEPARATOR_FILE: Field separator used in the files
  • QUOTE_FILE: Quote character used in the files
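As an illustration of how the parameters fit into the command, here is a minimal sketch of a filled-in invocation. Every value is a hypothetical example (addresses, ports, credentials, database name, path, separator and quote character), so substitute your own. The final `Rscript` call is left commented out so that the arguments can be reviewed first.

```shell
# Hypothetical example values; replace with those of your platform.
IP_HDFS="192.0.2.10"
PORT_HDFS="50070"
IP_HIVE="192.0.2.20"
PORT_HIVE="10000"

# The two URLs follow the patterns shown in the command above.
WEBHDFS_URL="http://${IP_HDFS}:${PORT_HDFS}/webhdfs/v1"
HIVE_URL="jdbc:hive2://${IP_HIVE}:${PORT_HIVE}/;ssl=false"

# Store the eight arguments in order: USER_HDFS, PWD_HDFS, NAME_BDD,
# PATH_DIRECTORY, SEPARATOR_FILE and QUOTE_FILE follow the two URLs.
set -- "$WEBHDFS_URL" "$HIVE_URL" "hdfs_user" "hdfs_password" \
       "my_database" "/user/data/raw" ";" '"'

# Print one argument per line for review.
printf 'arg: %s\n' "$@"

# Uncomment to run the script with these arguments:
# Rscript Create_Table.R "$@"
```

Quoting each argument matters here: a separator such as `;` or a quote character such as `"` would otherwise be interpreted by the shell.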


