SQOOP - Import data from SQL Server
Preamble
This article describes how to import data from a Microsoft SQL Server database.
Make sure you have sufficient rights to access the database.
Download data from a table
Sqoop import
```shell
# Driver
driver=sqlserver
# IP or DNS
ip=127.0.0.1
# Port
port=1433
# User
username=myuser
# Password
password=mypwd
# Database
database=mydb
# Table
table=mytable
# Destination folder in HDFS
hdfsdest=/user/hdfs/$table

# Import the table
sqoop import --connect "jdbc:$driver://$ip:$port;DatabaseName=$database" \
  --username $username --password $password \
  --target-dir $hdfsdest \
  --num-mappers 1 \
  --table $table
```
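Before launching the full import, it can help to sanity-check the expanded JDBC connection string, since a malformed URL is a common cause of connection failures. This sketch reuses the placeholder values from the example above:

```shell
# Placeholder values from the example above
driver=sqlserver
ip=127.0.0.1
port=1433
database=mydb

# Expand the JDBC URL exactly as Sqoop will receive it
connect="jdbc:$driver://$ip:$port;DatabaseName=$database"
echo "$connect"
# prints: jdbc:sqlserver://127.0.0.1:1433;DatabaseName=mydb
```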
Options
Option | Operation |
---|---|
--verbose | Show more logs |
--delete-target-dir | Delete HDFS directory before import |
--query "SELECT * FROM $table WHERE \$CONDITIONS" | Import the result of a query instead of a whole table. The --table option must be removed from the command line |
--num-mappers n (replace n with a number) | Split the import into n parallel tasks, where n is typically at most the number of datanodes. This option must be used with --split-by 'column', which specifies the column used to split the data |
--as-textfile | Import data in text mode |
--null-non-string '\\N' | Replace NULL values of non-string columns with '\\N'. Used for Impala and Hive |
--null-string '\\N' | Replace NULL values of string columns with '\\N'. Used for Impala and Hive |
--hive-import | Create a Hive/Impala table with imported data |
--hive-table $table | Specify the name of the Hive/Impala table |
--hive-overwrite | Overwrite the data of the table. If this option is absent, the data are appended |
--hive-database db (replace db with your database) | Specify the name of the Hive/Impala database |
--hive-drop-import-delims | Drops \n, \r, and \01 from string fields when importing to Hive |
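Several of these options are commonly combined. The sketch below assembles a hypothetical query-based import as a string (the table name mytable and the split column id are placeholders), so the quoting of \$CONDITIONS can be inspected before running the echoed command on a host where Sqoop is installed:

```shell
# Placeholders: adjust table, split column, and connection details
table=mytable
# \$CONDITIONS must reach Sqoop unexpanded; hence the escaped dollar sign
query="SELECT * FROM $table WHERE \\\$CONDITIONS"

cmd="sqoop import --connect 'jdbc:sqlserver://127.0.0.1:1433;DatabaseName=mydb'"
cmd="$cmd --username myuser --password mypwd"
cmd="$cmd --query \"$query\""
cmd="$cmd --split-by id --num-mappers 4"
cmd="$cmd --target-dir /user/hdfs/$table --delete-target-dir"

# Inspect the final command before running it on the cluster
echo "$cmd"
```

Note that --table is absent (it is incompatible with --query), and --split-by is mandatory here because more than one mapper is requested.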