To build the jobs described in this article, you first need to create a repository.
Download a file from Open Data and write it to HDFS (works with all versions of Data Fabric)
Create a new job
Add the component "tHDFSConnection": creates a connection to HDFS.
Add the component "tREST": sends an HTTP request to a RESTful web service and retrieves the response.
Add the component "tHDFSOutput": writes data to HDFS.
Create links:
"HDFSConnection" is connected with "tREST" (through "OnSubjobOk")
"tREST" is connected with "tHDFSOutput" (through "Main")
Double-click "tHDFSConnection" and set its properties:
Select the "Cloudera" distribution and choose its latest version
Enter the Name Node URL. The URL must follow this format: "hdfs://ip_hdfs:port_hdfs/". Use context variables where possible: "hdfs://"+context.IP_HDFS+":"+context.Port_HDFS+"/"
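The concatenation above can be checked in isolation. A small sketch, assuming placeholder context values (the IP and port below are illustrative, not a real cluster address):

```java
public class NameNodeUrl {
    // Builds the Name Node URL "hdfs://ip:port/" the same way the
    // context-variable expression does in the component settings.
    static String build(String ip, String port) {
        return "hdfs://" + ip + ":" + port + "/";
    }
}
```

Keeping the IP and port in context variables lets the same job run against different clusters by switching context groups, without editing the component.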