...
Global Schema
Network Architecture
Available Technologies ("Capsules")
Extraction & Processing
- Sqoop
- Java, Scala, Kotlin
- Apache Spark (versions 1.5, 1.6, 2.0, 2.1)
- R
- Python (2.x and 3.x branches)
- Docker
- Notebooks:
  - Jupyter: Python
  - Jupyter: Python with PySpark 2.1
  - Jupyter: R
  - Jupyter: Julia
  - Jupyter: Haskell
  - Spark Notebook: 1.5
  - Spark Notebook: 1.6
Datalake
- HDFS: 2.6
- Hive
- Impala: 2.5
- Drill
- Kafka: 0.10
Datamart
- MongoDB
- MySQL
Dataviz
- Docker
Resource Management
Zoom on the hardware architecture
How jobs impact the available servers
Schema Full Saagie
Schema Saagie on Top of a Datalake
Rules
...
HDFS
YARN/MapReduce (also Hive)
Impala
Drill
...
Docker
Spark
R
Python
Sqoop
Talend
Java-Scala
Data Science Notebook (depends on your settings)
...
MongoDB
MySQL
PostgreSQL (1.5)
Elasticsearch (1.5)
...
Docker
Data Science Notebook (depends on your settings)
...