R - Install the "Youtube" package on your platform
The "Youtube" package retrieves data from a YouTube channel.
The data is written directly to HDFS.
Step 1 : Download "youtube.tar"
Step 2 : Upload "youtube.tar" to a Saagie platform
Example "Command to launch the job" :
Rscript install.R $KEY_API_YOUTUBE $IDCHANNEL_API_YOUTUBE $WEBHDFS_API_YOUTUBE
$KEY_API_YOUTUBE : Your YouTube Data API key. Create one by following https://developers.google.com/youtube/v3/getting-started
$IDCHANNEL_API_YOUTUBE : The ID of the channel, found in the channel URL (the part after "/channel/"). Example with the Saagie channel : UCvCLUuHrHVovgWL6DgRdiAQ
$WEBHDFS_API_YOUTUBE : The WebHDFS base URL of your cluster, in the form http://IP_HDFS:PORT_HDFS/webhdfs/v1
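To make the order of the three arguments concrete, here is a minimal R sketch of how a script could read them from the command line. The actual content of install.R is not shown in this document, so the helper name `parse_job_args` and the list field names are hypothetical.

```r
# Hypothetical helper: the real install.R may read its arguments differently.
parse_job_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  # The job expects exactly three positional arguments, in this order.
  if (length(args) != 3) {
    stop("Usage: Rscript install.R KEY_API_YOUTUBE IDCHANNEL_API_YOUTUBE WEBHDFS_API_YOUTUBE")
  }
  list(
    api_key     = args[1],  # $KEY_API_YOUTUBE
    channel_id  = args[2],  # $IDCHANNEL_API_YOUTUBE
    webhdfs_url = args[3]   # $WEBHDFS_API_YOUTUBE
  )
}

# Example with placeholder values instead of real command-line arguments.
job <- parse_job_args(c(
  "YOUR_API_KEY",
  "UCvCLUuHrHVovgWL6DgRdiAQ",
  "http://IP_HDFS:PORT_HDFS/webhdfs/v1"
))
print(job$channel_id)
```

Checking the argument count up front gives a clear usage message when a variable is missing from the job command.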
Step 3 : Run the job
The job creates two files in HDFS (Hadoop Distributed File System). To see them, explore the datalake in the directory "user/hdfs/youtube/".
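One way to check that the files were created is to list the output directory through the WebHDFS REST API, using its LISTSTATUS operation. The sketch below only builds the request URL from the same base URL passed as $WEBHDFS_API_YOUTUBE. The helper name `build_webhdfs_url` is hypothetical, and IP_HDFS and PORT_HDFS are placeholders to replace with your own values.

```r
# Hypothetical helper: builds a WebHDFS request URL of the form
#   <base_url>/<hdfs_path>?op=<OPERATION>
build_webhdfs_url <- function(base_url, hdfs_path, op = "LISTSTATUS") {
  base <- sub("/+$", "", base_url)          # drop trailing slashes
  path <- gsub("^/+|/+$", "", hdfs_path)    # drop leading and trailing slashes
  paste0(base, "/", path, "?op=", op)
}

# Placeholder host and port, output directory from this README.
list_url <- build_webhdfs_url(
  "http://IP_HDFS:PORT_HDFS/webhdfs/v1",
  "/user/hdfs/youtube/"
)
print(list_url)
```

Sending an HTTP GET to that URL (for example from a browser or an HTTP client) returns a JSON listing of the directory, assuming your cluster allows unauthenticated WebHDFS access.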
The first file contains activity data :
- kind, idActivity, publishedAt, channelId, title, description, channelTitle, type, upload_videoId, subscription_resourceId_kind, subscription_resourceId_channelId
- https://developers.google.com/youtube/v3/docs/activities
The second file contains video data :
- kind, etag, id, publishedAt, channelId, title, description, tags, duration, privacyStatus, viewCount, likeCount, dislikeCount, favoriteCount, commentCount, timeRecoverData
- https://developers.google.com/youtube/v3/docs/videos/
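The fields listed for both files correspond to the "activities" and "videos" resources of the YouTube Data API v3. As an illustration only, the R sketch below builds request URLs for those two endpoints. The exact requests made by the package are not shown in this document, so the helper name `youtube_api_url`, the chosen "part" values, and the placeholders YOUR_API_KEY and VIDEO_ID are assumptions.

```r
# Hypothetical helper: builds a YouTube Data API v3 request URL
# from a resource name and a named list of query parameters.
youtube_api_url <- function(resource, params) {
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0("https://www.googleapis.com/youtube/v3/", resource, "?", query)
}

# Activities of a channel (first file).
activities_url <- youtube_api_url("activities", list(
  part      = "snippet,contentDetails",
  channelId = "UCvCLUuHrHVovgWL6DgRdiAQ",
  key       = "YOUR_API_KEY"
))

# Details of one video (second file); VIDEO_ID is a placeholder.
videos_url <- youtube_api_url("videos", list(
  part = "snippet,contentDetails,statistics,status",
  id   = "VIDEO_ID",
  key  = "YOUR_API_KEY"
))

print(activities_url)
print(videos_url)
```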
Warning : The field separator in both files is "\001" (the Ctrl-A control character), not a comma or a tab.
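Because of this separator, the files need an explicit "sep" argument when read back into R. The sketch below writes one made-up activity record to a temporary file and reads it with read.table. It assumes the files have no header line and uses the activity column names listed in this document. The helper name `read_youtube_file` and the sample values are hypothetical.

```r
# Column names of the activity file, as listed in this README.
activity_cols <- c(
  "kind", "idActivity", "publishedAt", "channelId", "title", "description",
  "channelTitle", "type", "upload_videoId",
  "subscription_resourceId_kind", "subscription_resourceId_channelId"
)

# Hypothetical helper: reads a "\001"-separated file into a data frame.
# Assumption: the file has no header line.
read_youtube_file <- function(path, col_names) {
  read.table(
    path,
    sep = "\001",             # Ctrl-A field separator
    header = FALSE,
    col.names = col_names,
    colClasses = "character", # keep every field as text
    quote = "",               # titles may contain quote characters
    comment.char = "",        # titles may contain "#"
    stringsAsFactors = FALSE
  )
}

# Made-up sample record, written locally for the demonstration.
sample_row <- c(
  "youtube#activity", "act-001", "2017-01-01T00:00:00.000Z",
  "UCvCLUuHrHVovgWL6DgRdiAQ", "Sample title", "Sample description",
  "Saagie", "upload", "vid-001", "", ""
)
tmp <- tempfile(fileext = ".txt")
writeLines(paste(sample_row, collapse = "\001"), tmp)

activities <- read_youtube_file(tmp, activity_cols)
print(activities$title)
```

Disabling quote and comment handling avoids rows being split or truncated when a title or description contains quotes or a "#" character.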