Manual upload of big files to HDFS
Your browser might prevent you from manually uploading files over a few gigabytes to HDFS via HUE.
A workaround is:
- Split the file into a multipart zip archive on your computer using 7-Zip (Windows) or the `split` utility (Linux), with parts under 1 gigabyte (resulting in files named `myfile.zip.001`, `myfile.zip.002`, `myfile.zip.003`, ...)
- Upload the parts to HDFS via HUE
- Create and run a SQOOP job as follows (replace the values of the `mypath` and `myzip` variables accordingly):

```shell
mypath="/hdfspath/to/data/"
myzip="name of myfile"   # without the .zip extension

# Make the target directory writable, then list the uploaded parts.
hadoop fs -chmod 777 "$mypath"
hadoop fs -ls "$mypath$myzip.zip".*

# Concatenate the parts back into a single local zip archive.
hadoop fs -cat "$mypath$myzip.zip".* > file.zip
ls -la

# Extract the archive locally, then push the extracted directory to HDFS
# (spaces in the directory name are percent-encoded for the HDFS path).
unzip file.zip -d "$myzip"
ls -la "$myzip/"
hadoop fs -put -f $( echo "$myzip/" | sed s/\ /\%20/g ) "$mypath"
```
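The `sed` expression in the final `hadoop fs -put` rewrites each space in the local directory name as `%20` before it becomes an HDFS path. The substitution can be checked in isolation (the directory name below is illustrative):

```shell
# Percent-encode spaces the same way the upload command does:
# "name of myfile/" becomes "name%20of%20myfile/".
encoded=$( echo "name of myfile/" | sed 's/ /%20/g' )
echo "$encoded"   # → name%20of%20myfile/
```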