...

WebHDFS URIs look like this: http://namenodedns:port/user/hdfs/folder/file.csv

The default port is 50070 (Hadoop 2.x; Hadoop 3.x defaults to 9870).
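
As a minimal sketch of the URI layout described above, the helper below assembles such a URI from its parts. The function name `build_webhdfs_uri` and its parameters are hypothetical (they are not part of any library); only the URI shape and the default port come from the text.

```python
from urllib.parse import urlunsplit


def build_webhdfs_uri(namenode, path, port=50070):
    """Assemble a URI of the form http://namenode:port/path.

    Hypothetical helper for illustration; `port` defaults to the
    value given in the text above.
    """
    # Ensure the HDFS path is absolute so it joins cleanly with the host part
    if not path.startswith("/"):
        path = "/" + path
    return urlunsplit(("http", f"{namenode}:{port}", path, "", ""))


uri = build_webhdfs_uri("namenodedns", "/user/hdfs/folder/file.csv")
print(uri)
```

Passing a different `port` value covers clusters that do not use the default.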

Impala Connection

...

Code Block
languagepy
# ====== Reading a table ======
# Select data with a SQL query.
# limit=None fetches the whole table; otherwise only the first 10,000 rows are returned.
requete = client.sql('select * from helloworld')
df = requete.execute(limit=None)

How to write an Impala table from other Impala tables in Python?

Code example

Code Block
languagepy
# Write the join of tables a and b into table c
client.raw_sql('CREATE TABLE c STORED AS PARQUET AS SELECT a.col1, b.col2 FROM a INNER JOIN b ON (a.id=b.id)')
# No data is transferred to Python: the query runs entirely in Impala
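
When the same pattern is needed for several table pairs, the statement can be generated rather than typed by hand. The following is only a sketch: `build_ctas_join` is a hypothetical helper, not part of the Impala client, and it assumes both tables share a single join key with the same name.

```python
def build_ctas_join(target, left, right, key, columns, file_format="PARQUET"):
    """Return a CREATE TABLE ... AS SELECT statement joining two tables.

    Hypothetical helper for illustration. Identifiers are interpolated
    as-is, so only pass trusted table and column names.
    """
    select_list = ", ".join(columns)
    return (
        f"CREATE TABLE {target} STORED AS {file_format} AS "
        f"SELECT {select_list} FROM {left} "
        f"INNER JOIN {right} ON ({left}.{key}={right}.{key})"
    )


sql = build_ctas_join("c", "a", "b", "id", ["a.col1", "b.col2"])
print(sql)
# The statement would then be passed to the connection, e.g. client.raw_sql(sql),
# where `client` is assumed to be the Impala connection created earlier.
```

The generated string is identical to the hand-written statement above, so the work is still done entirely inside Impala.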