Gist page: example-python-read-and-write-from-impala-with-security

Common part

Thrift-sasl

The scripts below do not work with thrift-sasl 0.3.0; they only work with thrift-sasl 0.2.1.

Add thrift-sasl==0.2.1 to your requirements.txt file.
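
To catch a wrong version before connecting, a script can compare the installed thrift-sasl version with the pinned one. This is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper names and the REQUIRED_THRIFT_SASL constant are illustrative and not part of the original gist.

```python
from importlib import metadata

# Version this page reports as working (assumption: exact match is required)
REQUIRED_THRIFT_SASL = "0.2.1"


def installed_thrift_sasl_version():
    """Return the installed thrift-sasl version string, or None if it is not installed."""
    try:
        return metadata.version("thrift-sasl")
    except metadata.PackageNotFoundError:
        return None


def is_supported(version):
    """Return True only when the given version string equals the pinned version."""
    return version == REQUIRED_THRIFT_SASL


if __name__ == "__main__":
    found = installed_thrift_sasl_version()
    if not is_supported(found):
        print("thrift-sasl {} expected, found {}".format(REQUIRED_THRIFT_SASL, found))
```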

Library dependencies

import ibis
import pandas as pd
import os

WebHDFS URI

WebHDFS URIs look like this: http://namenodedns:port/user/hdfs/folder/file.csv

The default port is 50070.
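
The URI pattern above can be assembled with a small helper. This is only a sketch of the format given on this page; the function name webhdfs_uri and its parameters are hypothetical, and the host and path are the placeholder values from the example.

```python
def webhdfs_uri(namenode, path, port=50070):
    """Build a URI of the form http://namenode:port/path (50070 is the default port)."""
    # Strip leading slashes from the path so the result has exactly one separator
    return "http://{}:{}/{}".format(namenode, port, path.lstrip("/"))


print(webhdfs_uri("namenodedns", "/user/hdfs/folder/file.csv"))
# prints http://namenodedns:50070/user/hdfs/folder/file.csv
```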

Impala Connection

Default port is 21050.

Connection

# Connect to Impala with the Impala host and port (21050 by default), the LDAP credentials and a WebHDFS client
hdfs = ibis.hdfs_connect(host=os.environ['IP_HDFS'], port=50070)
client = ibis.impala.connect(host=os.environ['IP_IMPALA'], port=21050, hdfs_client=hdfs,
                             user=os.environ['LDAP_USER'], password=os.environ['LDAP_PASSWORD'],
                             auth_mechanism='PLAIN')
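
The connection code reads four environment variables and raises a bare KeyError if one is missing. A possible pre-flight check is sketched below; the variable names come from the snippet above, while the missing_settings helper is a hypothetical addition, not part of ibis.

```python
import os

# Environment variables used by the connection snippet on this page
REQUIRED_VARS = ("IP_HDFS", "IP_IMPALA", "LDAP_USER", "LDAP_PASSWORD")


def missing_settings(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        print("Set these variables before connecting: " + ", ".join(missing))
```

Running the check before calling ibis gives a single readable message listing every missing variable instead of failing on the first one.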

How to write an Impala table with Python?

Code example

# Creating a simple pandas DataFrame with two columns
liste_hello = ['hello1','hello2']
liste_world = ['world1','world2']
df = pd.DataFrame(data = {'hello' : liste_hello, 'world': liste_world})

# Write the DataFrame to Impala only if the table does not already exist
db = client.database('default')
if not client.exists_table('helloworld'):
    db.create_table('helloworld', df)
    # Read the new table back to check that the write succeeded
    t = db['helloworld']
    t.execute()

How to query an Impala table with Python?

Code example

# ====== Reading a table ======
# Select data with a SQL query
# limit=None returns the whole table; otherwise only the first 10000 rows are returned
requete = client.sql('select * from helloworld')
df = requete.execute(limit=None)

How to write an Impala table from other Impala tables in Python?

Code example

# Write the join of tables a and b into table c
client.raw_sql('CREATE TABLE c STORED AS PARQUET AS SELECT a.col1, b.col2 FROM a INNER JOIN b ON (a.id=b.id)')
# No data is transferred to Python: the query runs entirely inside Impala
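
If the script may run more than once, the plain CREATE TABLE statement fails on the second run because the table already exists. One option is to build the statement with IF NOT EXISTS, which Impala accepts in CREATE TABLE ... AS SELECT. The ctas_statement helper below is a hypothetical sketch; it does no validation of identifiers, so it should only be used with trusted table names and queries.

```python
def ctas_statement(target, select_sql, file_format="PARQUET", if_not_exists=True):
    """Build a CREATE TABLE ... AS SELECT statement as a string."""
    # IF NOT EXISTS makes the statement a no-op when the target table is already there
    guard = "IF NOT EXISTS " if if_not_exists else ""
    return "CREATE TABLE {}{} STORED AS {} AS {}".format(guard, target, file_format, select_sql)


sql = ctas_statement(
    "c",
    "SELECT a.col1, b.col2 FROM a INNER JOIN b ON (a.id=b.id)",
)
print(sql)
```

The resulting string can be passed to client.raw_sql in the same way as the statement above.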

