Lexicon
Anonymization
"Anonymization" of data means processing it with the aim of irreversibly preventing the identification of the individual to whom it relates. Data can be considered anonymised when it does not allow identification of the individuals to whom it relates, and it is not possible that any individual could be identified from the data by any further processing of that data or by processing it together with other information which is available or likely to be available.
Attribute name
In Saagie Data Governance, the attribute name is the name given to a field.
Category
In Saagie Data Governance, a category means that the field has a limited set of values.
Consent
Any freely given, specific, informed and unambiguous indication of his or her wishes by which the data subject, either by a statement or by a clear affirmative action, signifies agreement to personal data relating to them being processed.
Data status
In Saagie Data Governance, the data status indicates the status of a dataset:
- Raw data
- Intermediate data
- Final data
- Not specified
Database
A database is a collection of information that is organized so that it can be easily accessed, managed and updated.
Data is organized into rows, columns and tables, and it is indexed to make it easier to find relevant information.
Dataset
A dataset is a collection of related, discrete items of data that may be accessed individually or in combination or managed as a whole entity. A dataset is organized into some type of data structure.
A dataset can have one of three types:
- TABLE
- DIRECTORY
- FILE
Domain
In Saagie Data Governance, domains are used to group a set of datasets by theme. They can correspond to the departments in the company for example. They will facilitate the exploration of the data lake.
Entity name(s)
In Saagie Data Governance, entity names are the names given to a dataset.
Entry Date
In Saagie Data Governance, the entry date of personal data is its registration date, that is, the date the data was created.
Field
In a database table, a field is a data structure for a single piece of data.
GDPR
The General Data Protection Regulation (GDPR) is a legal framework that sets guidelines for the collection and processing of personal information of individuals within the European Union (EU). The GDPR sets out the principles for data management and the rights of the individual, while also imposing fines that can be revenue based. The GDPR covers all companies that deal with the data of EU citizens, so it is a critical regulation for corporate compliance officers at banks, insurers, and other financial companies. The GDPR came into effect across the EU on May 25, 2018.
Official Journal of the European Union
Journal officiel de l'Union Européenne
Global ranking
In Saagie Data Governance, the global ranking is the quality rank of a dataset. It is the product of three factors: Global ranking = Trust Tag x Status Tag x Named Entity
Trust Tag:
- Verified Good: 2
- Verified Bad: 0.4
- In verification: 1
- N/A: 0.8
Status Tag:
- Final: 1.2
- Intermediate: 1.0
- Raw: 0.8
- N/A: 0.7
Named Entity:
- Input by user: 1.1
- No name, empty or null: 1.0
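The product above can be worked through with a minimal Python sketch. The multipliers are copied from this entry; the dictionary and function names are hypothetical and are not part of any Saagie API.

```python
# Multipliers taken from the "Global ranking" entry of this lexicon.
# All names below are illustrative assumptions, not a Saagie API.
TRUST_TAG = {
    "Verified Good": 2.0,
    "Verified Bad": 0.4,
    "In verification": 1.0,
    "N/A": 0.8,
}
STATUS_TAG = {
    "Final": 1.2,
    "Intermediate": 1.0,
    "Raw": 0.8,
    "N/A": 0.7,
}
# True: an entity name was input by the user; False: none, empty or null.
NAMED_ENTITY = {True: 1.1, False: 1.0}


def global_ranking(trust, status, has_entity_name):
    """Return Trust Tag x Status Tag x Named Entity for one dataset."""
    return TRUST_TAG[trust] * STATUS_TAG[status] * NAMED_ENTITY[bool(has_entity_name)]


# Highest possible rank: 2 x 1.2 x 1.1 = 2.64
best = global_ranking("Verified Good", "Final", True)
# Lowest possible rank: 0.4 x 0.7 x 1.0 = 0.28
worst = global_ranking("Verified Bad", "N/A", False)
print(round(best, 2), round(worst, 2))
```

With these values, the ranking of any dataset falls between 0.28 and 2.64.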
Master data
Master data means that, for a given table, the field (attribute name) is the master and therefore holds the reference value.
Personal data
According to the law, personal data means any information relating to an identified or identifiable individual. An identifiable person is one who can be identified, directly or indirectly, in particular by reference to an identification number (e.g. social security number) or to one or more factors specific to their physical, physiological, mental, economic, cultural or social identity (e.g. last name and first name, date of birth, biometric data, fingerprints, DNA…).
Primary key
A primary key is a special relational database table column (or combination of columns) designated to uniquely identify all table records.
Provenance
Provenance is the source from which a dataset comes.
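Pseudonymization
Under the GDPR, pseudonymization means processing personal data in such a manner that it can no longer be attributed to a specific data subject without the use of additional information, provided that this additional information is kept separately and is protected by technical and organisational measures.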
Secondary key
Secondary keys are the fields in a table that are candidate keys but have not been selected as the primary key.
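As an illustration of the primary key and secondary key entries, here is a minimal sketch using Python's built-in sqlite3 module. The customer table, its columns and the try_insert helper are hypothetical examples and do not come from Saagie. The email column is assumed to be a candidate key that was not chosen as the primary key, so it is declared UNIQUE.

```python
import sqlite3

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- primary key: identifies each record
        email       TEXT NOT NULL UNIQUE,  -- secondary key: candidate key not chosen as primary
        full_name   TEXT
    )
    """
)
conn.execute("INSERT INTO customer VALUES (1, 'ada@example.com', 'Ada')")


def try_insert(row):
    """Insert a row and report whether the key constraints accepted it."""
    try:
        conn.execute("INSERT INTO customer VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# Rejected: the primary key value 1 already exists.
duplicate_id = try_insert((1, "bob@example.com", "Bob"))
# Rejected: the email value already exists in the secondary key column.
duplicate_email = try_insert((2, "ada@example.com", "Ada B."))
# Accepted: both key values are new.
new_row = try_insert((2, "bob@example.com", "Bob"))
print(duplicate_id, duplicate_email, new_row)
```

Either key could identify a record uniquely; the sketch simply shows that the database enforces uniqueness on both.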
Sensitive Data
Personal data that reveals, directly or indirectly, a person's racial or ethnic origin, political, philosophical or religious opinions, or trade union affiliation, or that relates to their health or sexual life.
Trust level
In Saagie Data Governance, the trust level indicates the level of trust placed in a dataset:
- Verified good
- Verified bad
- In verification
- Not verified