Azure Synapse data lakehouse
Customer presentation
OQuila helps organisations transform into
data-driven organisations.
About OQuila
Data & Analytics, Internet of Things and Application
Innovation solutions
Joining forces with established IT company
Innovation & transformation with trusted technologies
Evolution of data platforms
Data Lake vs Data Warehouse
Data Lake
Schema on read; also answers the
questions of tomorrow
Scales without limits
Can hold any type of data
Data Warehouse
Schema on write; answers the
questions of today
Mainly for relational data (tables
and rows)
Can be part of an Enterprise data
lake or lakehouse
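The contrast between schema on read and schema on write can be sketched in a few lines of Python. This is only an illustration: the sample events, the field names and both helper functions are hypothetical and are not part of any Synapse API.

```python
import json

# Raw events as they might land in a data lake: the fields vary per record.
RAW_EVENTS = [
    '{"device": "pump-1", "temp_c": 21.5}',
    '{"device": "pump-2", "temp_c": 19.0, "vibration": 0.3}',
]


def read_with_schema(lines, fields):
    """Schema on read: project the requested fields at query time."""
    rows = []
    for line in lines:
        record = json.loads(line)
        # Missing fields become None instead of blocking the load.
        rows.append({f: record.get(f) for f in fields})
    return rows


def write_with_schema(record, fields):
    """Schema on write: reject records that do not match the fixed columns."""
    if set(record) != set(fields):
        raise ValueError(f"record does not match schema {sorted(fields)}")
    return tuple(record[f] for f in fields)


# Today's question uses two fields; tomorrow's question uses another one,
# and the raw data does not have to be reloaded to answer it.
today = read_with_schema(RAW_EVENTS, ["device", "temp_c"])
tomorrow = read_with_schema(RAW_EVENTS, ["device", "vibration"])
```

With schema on write, the second event would be rejected by `write_with_schema` for a two-column table, which is why a new question usually means changing the model first.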
General principles OQuila Architecture
Use of standard components
100% Cloud Services: PaaS or
SaaS. No installations or Virtual Machines
No custom development
Use of components within the
same ecosystem: e.g. Microsoft
Azure Synapse
Minimize maintenance by using
Services (maintained by
Dynamic and scalable
Agile Data Model
No traditional schema or fixed model
No rework when adding additional sources
RAW and CURATED store data separately
Preparations/calculations are done in the STAGED environment and are reusable
Supports changes to business rules with ease
Schema on read; also answers the questions of tomorrow
Data Sources
Azure Synapse Analytics
Data Lake Gen 2
Cleansing and Transformations via Spark clusters
Synapse Pipelines
On-demand SQL pool
Power BI
Synapse Data Flow: Monitoring Quality of Data
Power Apps
Azure Machine Learning
Synapse components
Data pipelines:
A lot of standard connectors (SQL, Oracle, CSV, API, …)
Data extraction from online and on-prem systems
Add new systems easily
Data Lake:
RAW, STAGE and CURATED folders (reflecting the maturity and correctness of the data)
Parquet files to be able to work efficiently with large amounts of data
Spark Cluster:
Performant transformation and cleansing actions via notebooks
Transfers “edited” data to the next stage (RAW, STAGE, CURATED)
Synapse Data flows:
Define business rules via a graphical designer (missing values, inconsistencies, …)
Puts anomalies in a separate STAGE environment
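The flow through the RAW, STAGE and CURATED zones, with anomalies set aside, can be sketched in plain Python. This is a minimal illustration under stated assumptions: the business rule, the row layout and the function names are hypothetical, and a real implementation would run in Spark notebooks or Synapse Data flows on Parquet files rather than on in-memory lists.

```python
def is_valid(row):
    """Hypothetical business rule: a customer and a non-negative amount."""
    amount = row.get("amount")
    return bool(row.get("customer")) and isinstance(amount, (int, float)) and amount >= 0


def stage(raw_rows):
    """Split RAW rows into clean rows and anomalies; RAW itself is not modified."""
    clean = [r for r in raw_rows if is_valid(r)]
    anomalies = [r for r in raw_rows if not is_valid(r)]
    return clean, anomalies


def curate(staged_rows):
    """Aggregate clean rows into a report-ready shape: total amount per customer."""
    totals = {}
    for r in staged_rows:
        totals[r["customer"]] = totals.get(r["customer"], 0) + r["amount"]
    return totals


raw = [
    {"customer": "A", "amount": 10},
    {"customer": "", "amount": 5},    # missing value
    {"customer": "B", "amount": -1},  # inconsistency
    {"customer": "A", "amount": 2.5},
]
staged, anomalies = stage(raw)
curated = curate(staged)
```

Because the RAW rows are never changed, a revised business rule only requires rerunning `stage` and `curate`, not a new extraction from the source systems.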
Synapse components
On-demand SQL pool:
Built into Azure Synapse
Links directly to Parquet files in CURATED zone (without having to copy data to tables).
Row level security
Allows data access via:
Power BI
Automation tools
Synapse Data Flow
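Reading Parquet files in place from the on-demand SQL pool is typically done with an `OPENROWSET` query. The sketch below only builds such a query as text; the storage URL, the folder name and the `curated_query` helper are hypothetical, and the `curated/<folder>` layout is an assumption about how the CURATED zone might be organised.

```python
def curated_query(storage_url, folder, top=10):
    """Build a serverless SQL query that reads Parquet files without copying them."""
    if not isinstance(top, int) or top <= 0:
        raise ValueError("top must be a positive integer")
    # Assumed layout: <storage_url>/curated/<folder>/*.parquet
    bulk_path = f"{storage_url.rstrip('/')}/curated/{folder.strip('/')}/*.parquet"
    return (
        f"SELECT TOP {top} *\n"
        f"FROM OPENROWSET(\n"
        f"    BULK '{bulk_path}',\n"
        f"    FORMAT = 'PARQUET'\n"
        f") AS [result];"
    )


# Placeholder storage account and container, for illustration only.
sql = curated_query("https://example.dfs.core.windows.net/lake", "sales")
print(sql)
```

The resulting statement can be wrapped in a view so that Power BI and other tools see the CURATED data as an ordinary table.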
Our PoV/PoC approach
Dream Big, Start Small, Grow Fast
Synapse based Data
Proof of Value
Rollout 2
Rollout 3
Rollout 4
Proof of Concept Project approach
Make smart choices about the scope
Define the ‘low hanging fruit’ data sources eligible for the PoC
Define a quick-win report
Define a lean & mean project team
After kick-off OQuila will
Set up the Azure environment
Set up OQuila’s Synapse data lakehouse framework
Set up and deploy the selected data pipeline(s)
Build the report
Document the solution
Present the solution
Ready for use and grow!
Thank you!