
  1. Split the file as a zip archive on your computer, using 7-Zip (Windows) or the split utility (Linux), into multiple parts under 1 gigabyte each (resulting in files named myfile.zip.001, myfile.zip.002, myfile.zip.003, ...)
  2. Upload the parts to HDFS via HUE
  3. Create and run a Sqoop job as follows (replace the values of the mypath and myzip variables accordingly):

    mypath="/hdfspath/to/data/"
    myzip="name of myfile" # without the .zip extension
    
    # make the target directory writable
    hadoop fs -chmod 777 "$mypath"
    
    # confirm that all uploaded parts are present
    hadoop fs -ls "$mypath$myzip.zip".*
    
    # concatenate the parts into a single local archive and unpack it
    hadoop fs -cat "$mypath$myzip.zip".* > file.zip
    ls -la
    unzip file.zip -d "$myzip"
    ls -la "$myzip/"
    
    # upload the unpacked directory back to HDFS; spaces in the name are
    # percent-encoded because Hadoop parses paths as URIs
    hadoop fs -put -f "$( echo "$myzip/" | sed 's/ /%20/g' )" "$mypath"
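The split-and-reassemble logic in steps 1 and 3 can be verified locally without a Hadoop cluster. The sketch below (all file names are placeholders, and it assumes GNU coreutils `split`, so the `--numeric-suffixes` option may not exist on macOS/BSD) creates a dummy archive, splits it the way step 1 describes, then concatenates the parts back together the same way `hadoop fs -cat ... > file.zip` does and checks that the result is byte-identical:

```shell
# create a ~3 MB dummy archive just to demonstrate the round trip
dd if=/dev/urandom of=myfile.zip bs=1M count=3 2>/dev/null

# step 1: split into 1 MB parts named myfile.zip.001, myfile.zip.002, ...
split -b 1m -a 3 --numeric-suffixes=1 myfile.zip myfile.zip.

# step 3 equivalent: the shell glob expands the parts in lexical order,
# which matches their numeric order, so plain concatenation rebuilds the file
cat myfile.zip.* > reassembled.zip

# verify the reassembled archive is byte-identical to the original
cmp -s myfile.zip reassembled.zip && echo "parts reassemble cleanly"
```

This also explains why the part names matter: `hadoop fs -cat` with a glob relies on the same lexical ordering of the `.001`, `.002`, ... suffixes to stitch the archive back together correctly.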