...
WebHDFS URIs look like this: http://namenodedns:port/user/hdfs/folder/file.csv
The default port is 50070.
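As a minimal sketch, a URI of the form shown above can be assembled from its parts in Python. The helper name `webhdfs_uri` and its parameters are hypothetical and only illustrate the pattern; the host `namenodedns` is the placeholder used above, not a real machine.

```python
from urllib.parse import quote


def webhdfs_uri(namenode: str, path: str, port: int = 50070) -> str:
    """Build a URI of the form http://namenode:port/path (hypothetical helper)."""
    # Make sure the path starts with exactly one slash.
    clean_path = "/" + path.lstrip("/")
    # Percent-encode unsafe characters (spaces, etc.) but keep the slashes.
    return f"http://{namenode}:{port}{quote(clean_path)}"


uri = webhdfs_uri("namenodedns", "/user/hdfs/folder/file.csv")
print(uri)
```

The port is left as a parameter because a cluster may be configured with a port other than the default.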
Impala Connection
...
Code Block | ||
---|---|---|
| ||
# ====== Reading table ======
# Selecting data with a SQL query
# limit=None returns the whole table; otherwise only the first 10000 rows are returned
requete = client.sql('select * from helloworld')
df = requete.execute(limit=None) |
How to write an Impala table from Impala source tables in Python?
Code example
Code Block | ||
---|---|---|
| ||
# Write the join of tables a and b into table c
client.raw_sql('CREATE TABLE c STORED AS PARQUET AS SELECT a.col1, b.col2 FROM a INNER JOIN b ON (a.id=b.id)')
# No data is transferred to Python: the query runs entirely in Impala |
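When several tables have to be created this way, the statement above can be generated from its parts. The following is only a sketch: `build_ctas` is a hypothetical helper, not part of any library, and it checks just the target table name. The SELECT text is passed through unchanged, so it must come from a trusted source.

```python
import re

# Accepts "table" or "database.table" made of letters, digits and underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def build_ctas(target: str, select_sql: str, file_format: str = "PARQUET") -> str:
    """Return a CREATE TABLE ... AS SELECT statement (hypothetical helper)."""
    if not _IDENTIFIER.fullmatch(target):
        raise ValueError(f"invalid table name: {target!r}")
    return f"CREATE TABLE {target} STORED AS {file_format} AS {select_sql}"


select_sql = "SELECT a.col1, b.col2 FROM a INNER JOIN b ON (a.id=b.id)"
statement = build_ctas("c", select_sql)
print(statement)

# With an open connection `client`, as in the example above, it would be run with:
# client.raw_sql(statement)
```

As in the original example, only the SQL text travels from Python to Impala; the joined rows are written to table c without passing through the Python process.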