SQOOP - Import data from SQL Server

Preamble

This article describes how to import data from a Microsoft SQL Server database into HDFS with Sqoop.

Make sure you have sufficient rights to access the database.
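
Before running a full import, you can check connectivity and read rights with Sqoop's eval tool, which runs a SQL statement against the source database and prints the result. This is a minimal sketch, assuming the same connection values used in the import example below:

# Quick connectivity and permission check (sketch; connection values match the example below)
sqoop eval --connect "jdbc:sqlserver://127.0.0.1:1433;DatabaseName=mydb" \
--username myuser --password mypwd \
--query "SELECT TOP 1 * FROM mytable"

If this prints a row, the account can reach the server and read the table.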

Download data from a table

Sqoop import
# Driver
driver=sqlserver
# IP or DNS
ip=127.0.0.1
# Port
port=1433
# User
username=myuser
# Password
password=mypwd
# Database
database=mydb
# Table
table=mytable
# Destination folder in HDFS
hdfsdest=/user/hdfs/$table

# Import the table
sqoop import --connect "jdbc:$driver://$ip:$port;DatabaseName=$database" \
--username $username --password $password \
--target-dir $hdfsdest \
--num-mappers 1 \
--table $table
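
Once the command finishes, the data lands as files in the HDFS target directory. A quick way to verify the result (a sketch, reusing the hdfsdest variable defined above; the part file name may differ in your environment):

# List the imported files and preview the first one
hdfs dfs -ls $hdfsdest
hdfs dfs -cat $hdfsdest/part-m-00000 | head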

Options

Option | Operation
--verbose | Show more logs
--delete-target-dir | Delete the HDFS target directory before the import
--query "SELECT * FROM $table WHERE \$CONDITIONS" | Import the result of a query instead of a whole table. The --table option must be removed from the command line
--num-mappers n (replace n with a number) | Split the import into n parallel data flows, n typically being the number of data nodes. Must be used with --split-by 'column', which specifies the column to split on
--as-textfile | Import data as plain text files
--null-non-string '\\N' | Replace null values of non-string columns with '\\N'. Used for Impala and Hive
--null-string '\\N' | Replace null values of string columns with '\\N'. Used for Impala and Hive
--hive-import | Create a Hive/Impala table with the imported data
--hive-table $table | Specify the name of the Hive/Impala table
--hive-overwrite | Overwrite the data of the table. If this option is not present, data are appended
--hive-database db (replace db with your database) | Specify the name of the Hive/Impala database
--hive-drop-import-delims | Drop \n, \r, and \01 characters from string fields when importing to Hive
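
These options can be combined. The following is a sketch, not a tested command, that imports the result of a query straight into a Hive table with null handling suitable for Hive/Impala; the split column 'id' and the mapper count are assumptions to adapt to your own table:

# Sketch: query import into Hive (reuses the variables from the example above; 'id' is a placeholder split column)
sqoop import --connect "jdbc:$driver://$ip:$port;DatabaseName=$database" \
--username $username --password $password \
--query "SELECT * FROM $table WHERE \$CONDITIONS" \
--split-by id \
--num-mappers 4 \
--target-dir $hdfsdest \
--delete-target-dir \
--null-string '\\N' \
--null-non-string '\\N' \
--hive-import \
--hive-database mydb \
--hive-table $table \
--hive-overwrite \
--hive-drop-import-delims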