Dependencies

R package

RJDBC : https://cran.r-project.org/web/packages/RJDBC/index.html

This allows R to connect to any DBMS that has a JDBC driver.

Hive JDBC drivers

Download the Hive JDBC drivers

Parameters

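  • IP : IP address (or hostname) of the HiveServer2 host
  • Port : port on which HiveServer2 listens
  • User_HDFS : user name used to connect
  • Password_HDFS : password of that user
  • Hive_Jdbc_Folder_Path : path to the folder containing the Hive JDBC driver JAR files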
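
The IP and Port parameters end up inside the JDBC connection URL. As a minimal sketch, the URL can be assembled from them as shown here. `hive_jdbc_url` is a hypothetical helper (it is not part of RJDBC), and the host and port values are placeholders.

```r
# Hypothetical helper (not part of RJDBC): builds the HiveServer2 JDBC URL
# from the IP and Port parameters described above.
hive_jdbc_url <- function(ip, port, ssl = FALSE) {
  stopifnot(is.character(ip), length(ip) == 1, nzchar(ip))
  port <- as.integer(port)
  stopifnot(!is.na(port), port > 0, port <= 65535)
  # ssl is rendered as "true" / "false", as in the URL used on this page
  sprintf("jdbc:hive2://%s:%d/;ssl=%s", ip, port, tolower(as.character(ssl)))
}

# Placeholder values; replace them with the real host and port.
hiveConnectionUrl <- hive_jdbc_url("192.0.2.10", 10000)
```

Validating the port before building the string makes a mistyped parameter fail early instead of producing a connection error later.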

Code explanation: querying and inserting into Hive

Loading JDBC driver and connection

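library(RJDBC)
# Load the Hive JDBC driver from every JAR file in the driver folder
drv <- JDBC(driverClass = "org.apache.hive.jdbc.HiveDriver",
            classPath = list.files("Hive_Jdbc_Folder_Path", pattern = "\\.jar$", full.names = TRUE),
            identifier.quote = "`")
# Replace IP and Port with the HiveServer2 host and port
hiveConnectionUrl <- "jdbc:hive2://IP:Port/;ssl=false"
conn <- dbConnect(drv, hiveConnectionUrl, "User_HDFS", "Password_HDFS")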
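
Once the work is done, the connection should be closed. A minimal sketch, assuming `conn` is the connection created above: `with_connection` is a hypothetical wrapper (it is not part of RJDBC) that runs a function and then always calls `dbDisconnect()`, the standard DBI call for closing a connection.

```r
# Hypothetical wrapper (not part of RJDBC): runs action(conn) and always
# closes the connection afterwards, even if action() raises an error.
with_connection <- function(conn, action, disconnect = DBI::dbDisconnect) {
  on.exit(disconnect(conn), add = TRUE)
  action(conn)
}

# Usage with a live connection (commented out because it needs a Hive server):
# tables <- with_connection(conn, function(c) dbGetQuery(c, "show tables"))
```

The `disconnect` argument exists only so the wrapper can be exercised without a server; in normal use the default is enough.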

Query and Insert : Examples

Show Tables

# Tables in all databases
dbListTables(conn)
# Tables in the database "default"
dbGetQuery(conn, "show tables")

Get all elements of a table

# In database "default"
d <- dbReadTable(conn, "table_name")
# OR
d <- dbGetQuery(conn, "select * from table_name")
# In a database other than "default" (nameBDD is the database name)
d <- dbReadTable(conn, "nameBDD.table_name")
# OR
d <- dbGetQuery(conn, "select * from nameBDD.table_name")

Create a table in Parquet format

dbSendUpdate(conn, "CREATE TABLE table_name (attribute1 string, attribute2 int) STORED AS PARQUET")

Insert data into a table

# INSERT returns no result set, so use dbSendUpdate rather than dbGetQuery
dbSendUpdate(conn, "INSERT INTO table_name VALUES ('test', 1)")
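
The single-row insert above can be extended to several rows at once, since Hive accepts a list of value tuples in one INSERT ... VALUES statement. The sketch below is one possible way to build such a statement from a data frame. `build_insert` is a hypothetical helper (it is not part of RJDBC), and the backslash escaping of single quotes is an assumption about the Hive string-literal syntax in use. Because the SQL is built by string concatenation, it should only be used with trusted data.

```r
# Hypothetical helper (not part of RJDBC): turns a data frame into a single
# multi-row "INSERT INTO ... VALUES (...), (...)" statement.
build_insert <- function(table_name, df) {
  quote_value <- function(x) {
    if (is.na(x)) return("NULL")
    # Numbers are written without quotes and without scientific notation.
    if (is.numeric(x)) return(format(x, scientific = FALSE, digits = 15, trim = TRUE))
    # Strings: wrap in single quotes and escape embedded single quotes
    # with a backslash (assumed escaping rule).
    paste0("'", gsub("'", "\\\\'", as.character(x)), "'")
  }
  rows <- vapply(seq_len(nrow(df)), function(i) {
    values <- vapply(as.list(df[i, , drop = FALSE]), quote_value, character(1))
    paste0("(", paste(values, collapse = ", "), ")")
  }, character(1))
  paste0("INSERT INTO ", table_name, " VALUES ", paste(rows, collapse = ", "))
}

# Example data matching the table created earlier (string, int).
new_rows <- data.frame(attribute1 = c("test", "other"),
                       attribute2 = c(1L, 2L),
                       stringsAsFactors = FALSE)
sql <- build_insert("table_name", new_rows)

# With a live connection (commented out because it needs a Hive server):
# dbSendUpdate(conn, sql)
```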