Global Architecture

Global Schema

Network Architecture

Available Technologies ("Capsules")

Extraction & Processing

  • Sqoop
  • Java, Scala, Kotlin
  • Apache Spark (versions 1.5, 1.6, 2.0, 2.1)
  • R
  • Python (2.x and 3.x branches)
  • Docker
  • Notebooks:
    • Jupyter: Python
    • Jupyter: Python with PySpark 2.1
    • Jupyter: R
    • Jupyter: Julia
    • Jupyter: Haskell
    • Spark Notebook: 1.5
    • Spark Notebook: 1.6

Datalake

  • HDFS: 2.6
  • Hive
  • Impala: 2.5
  • Drill
  • Kafka: 0.10

Datamart

  • MongoDB
  • MySQL

Dataviz

  • Docker

Resource Management

Zoom on the hardware architecture

How jobs impact the available servers

Schema Full Saagie

Schema Saagie on Top of a Datalake

Rules

Node types | Resident Services | Scheduled or Streaming Jobs | Comments
Data Node | HDFS, YARN/MapReduce (also Hive), Impala, Drill | Docker, Spark, R, Python, Sqoop, Talend, Java-Scala, Datascience Notebook (depends on your settings) |
Datamart | MongoDB, MySQL, PostgreSQL (1.5), Elasticsearch (1.5) | |
Dataviz | Docker | Datascience Notebook (depends on your settings) |
Kafka Node | Kafka | |
Compute Edge Node | | Datascience Notebook (depends on your settings) |
GPU Edge Node | | |
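
The placement rules in the table above can be sketched as a small lookup structure. This is only an illustration: the dictionary layout, the key names `resident_services` and `jobs`, and the helper `nodes_for` are hypothetical and not part of any Saagie API, and the assignment of each entry to a column is our reading of the table. Notebook placement also depends on your settings, which this sketch does not model.

```python
# Hypothetical encoding of the rules table. Key names and the helper function
# are illustrative only; they are not part of any Saagie API.
PLACEMENT_RULES = {
    "Data Node": {
        "resident_services": ["HDFS", "YARN/MapReduce (also Hive)", "Impala", "Drill"],
        "jobs": ["Docker", "Spark", "R", "Python", "Sqoop", "Talend",
                 "Java-Scala", "Datascience Notebook"],
    },
    "Datamart": {
        "resident_services": ["MongoDB", "MySQL", "PostgreSQL", "Elasticsearch"],
        "jobs": [],
    },
    "Dataviz": {"resident_services": ["Docker"], "jobs": ["Datascience Notebook"]},
    "Kafka Node": {"resident_services": ["Kafka"], "jobs": []},
    "Compute Edge Node": {"resident_services": [], "jobs": ["Datascience Notebook"]},
    "GPU Edge Node": {"resident_services": [], "jobs": []},
}


def nodes_for(technology, kind="jobs"):
    """Return the node types whose `kind` column lists `technology`, in table order."""
    return [node for node, columns in PLACEMENT_RULES.items()
            if technology in columns[kind]]


if __name__ == "__main__":
    print(nodes_for("Spark"))
    print(nodes_for("Datascience Notebook"))
    print(nodes_for("Kafka", kind="resident_services"))
```

Under these assumptions, a Spark job resolves to the Data Node only, while a Datascience Notebook may land on a Data Node, a Dataviz node, or a Compute Edge Node.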

