To create the different jobs described in this article, you first have to create a repository context group holding the values used below (for example IP_HDFS, Port_HDFS and Folder_HDFS).
List files in HDFS (with all versions of Data Fabric)
Create a new job
Add the component "tHDFSConnection" : Allows the creation of a HDFS connection.
Add the component "tHDFSList": List the different files contents in the hdfs folder.
Add the component "tHDFSProperties": Display the properties of the different files (Example : mode, time, directory name...)
Add the component "tLogRow': Display the result.
Create links:
"tHDFSConnection" is connected with "tHDFSList" (through "OnSubjobOk")
"tHDFSList" is connected with "tHDFSProperties" (through "Iterate")
"tHDFSProperties" is connected with "tLogRun" (through "Main")
Double click on "tHDFSConnection" and set its properties:
Add a "Cloudera" distribution and select the latest version of Cloudera
Enter the Name Node URL. The URL has to respect this format : "hdfs://ip_hdfs:port_hdfs/" Use context variables if possible : "hdfs://"+context.IP_HDFS+":"+context.Port_HDFS+"/"
Add the user
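For reference, here is a minimal Java sketch of the connection this component opens, assuming the standard Hadoop client libraries are on the classpath; the class, method and parameter names (HdfsConnectionSketch, connect, ip, port, user) are illustrative only and not part of the Talend job:

```java
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HdfsConnectionSketch {
    // Opens an HDFS connection the same way tHDFSConnection does:
    // a Name Node URI in the form "hdfs://ip:port/" plus a user name.
    public static FileSystem connect(String ip, String port, String user) throws Exception {
        String nameNodeUri = "hdfs://" + ip + ":" + port + "/";
        Configuration conf = new Configuration();
        // Connect as the given HDFS user (equivalent to the component's user field)
        return FileSystem.get(new URI(nameNodeUri), conf, user);
    }
}
```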
Double click on "tHDFSList" and set its properties:
Check "Use an existing connection" and select the connection made by the component "tHDFSConnection"
Add a hdfs folder: context.Folder_HDFS
Add a Filemask. In the example, the filemask is "*" because this job is looking up every file. If you want to only search for files ending with the extension ".csv", you can enter "*.csv". The star means "whatever" before ".csv".
In "Sort", select "Name of file"
Double click on "tHDFSProperties" :
Tick "Use an existing connection"
Add a file: ((String)globalMap.get("tHDFSList_1_CURRENT_FILEPATH")) This command use the current file of the component tHDFS_List.
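As an illustration of what tHDFSProperties retrieves for each iterated file, here is a hedged Java sketch based on the Hadoop FileStatus API; the fs and currentFilePath parameters stand in for the existing connection and the path returned by globalMap:

```java
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsPropertiesSketch {
    // Prints the kind of properties tHDFSProperties exposes for one file:
    // mode (permissions), modification time and parent directory.
    public static void printProperties(FileSystem fs, String currentFilePath) throws Exception {
        // currentFilePath plays the role of
        // ((String) globalMap.get("tHDFSList_1_CURRENT_FILEPATH")) inside the job
        FileStatus status = fs.getFileStatus(new Path(currentFilePath));
        System.out.println("mode: " + status.getPermission());
        System.out.println("time: " + status.getModificationTime());
        System.out.println("directory: " + status.getPath().getParent());
    }
}
```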