GitHub project: example-R-Query-Insert-From-Hive
Dependencies
R package
RJDBC : https://cran.r-project.org/web/packages/RJDBC/index.html
This allows R to connect to any DBMS that has a JDBC driver.
Hive JDBC drivers
Download the Hive JDBC drivers
Parameters
- IP : IP address of the Hive server
- Port
- User_HDFS
- Password_HDFS
- Hive_Jdbc_Folder_Path : Path to the folder that contains the Hive JDBC driver JAR files.
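As a minimal sketch, the parameters above could be kept in R variables and assembled into the connection URL used in the next section. The variable names and all values below (the address `192.0.2.10`, the port `10000`, the folder path) are placeholders assumed for illustration; they do not come from the project.

```r
# Hypothetical placeholder values; replace them with those of your cluster.
ip            <- "192.0.2.10"   # IP address of the Hive server
port          <- 10000L         # 10000 is the usual HiveServer2 default port
user_hdfs     <- "User_HDFS"
password_hdfs <- "Password_HDFS"
hive_jdbc_folder_path <- "Hive_Jdbc_Folder_Path"

# Build the JDBC URL in the form jdbc:hive2://IP:Port/;ssl=false
hive_connection_url <- sprintf("jdbc:hive2://%s:%d/;ssl=false", ip, port)
print(hive_connection_url)
```

Keeping the values in one place makes it easier to switch between clusters without editing the connection code itself.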
Code explanation for Query and Insert from Hive
Loading JDBC driver and connection
```r
library(RJDBC)

# Load the Hive JDBC driver from every JAR file in the driver folder
drv <- JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
            classPath = list.files("Hive_Jdbc_Folder_Path", pattern = "jar$", full.names = TRUE),
            identifier.quote = "`")

# Open the connection to Hive
hiveConnectionUrl <- "jdbc:hive2://IP:Port/;ssl=false"
conn <- dbConnect(drv, hiveConnectionUrl, "User_HDFS", "Password_HDFS")
```
Query and Insert : Examples
Show Tables
```r
# All databases
dbListTables(conn)

# The database "default"
dbGetQuery(conn, "show tables")
```
Get all elements of a table
```r
# In the database "default"
d <- dbReadTable(conn, "table_name")
# OR
d <- dbGetQuery(conn, "select * from table_name")

# In a database other than "default" (nameBDD is the database name)
d <- dbReadTable(conn, "nameBDD.table_name")
```
Create a table in parquet format
```r
dbSendUpdate(conn, "CREATE TABLE table_name (attribute1 string, attribute2 int) STORED AS PARQUET")
```
Insert data into a table
```r
# INSERT returns no result set, so use dbSendUpdate rather than dbGetQuery
dbSendUpdate(conn, "INSERT INTO table_name VALUES ('test', 1)")
```
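To insert several rows at once, one option is to build a single multi-row INSERT statement from a data frame. The sketch below is an illustration only: `build_insert` is a hypothetical helper, not part of RJDBC or of the project, and it assumes a Hive version that accepts `INSERT ... VALUES` with several rows. It deliberately refuses missing values and strings containing quotes or backslashes instead of trying to escape them.

```r
# Hypothetical helper: turn a data frame into one multi-row INSERT statement.
build_insert <- function(table_name, df) {
  stopifnot(is.data.frame(df), nrow(df) > 0, !anyNA(df))

  format_column <- function(col) {
    if (is.numeric(col)) {
      as.character(col)                      # numbers are written as-is
    } else {
      col <- as.character(col)
      if (any(grepl("['\\\\]", col))) {
        stop("values containing quotes or backslashes are not handled in this sketch")
      }
      paste0("'", col, "'")                  # strings are wrapped in single quotes
    }
  }

  cols <- lapply(df, format_column)
  # Paste the formatted columns together row by row: "'test', 1"
  rows <- do.call(paste, c(unname(cols), sep = ", "))
  sprintf("INSERT INTO %s VALUES %s", table_name,
          paste0("(", rows, ")", collapse = ", "))
}

new_rows <- data.frame(attribute1 = c("test", "other"),
                       attribute2 = c(1L, 2L),
                       stringsAsFactors = FALSE)
sql <- build_insert("table_name", new_rows)
print(sql)
# With an open connection, the statement could then be run with:
# dbSendUpdate(conn, sql)
```

The column order of the data frame has to match the column order of the Hive table, since the statement does not name the columns.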