Azure Synapse data lakehouse

Customer presentation

Intro

OQuila helps organisations transform into data-driven organisations.

About OQuila

• Data & Analytics, Internet of Things and Application Innovation solutions

• Joining forces with an established IT company

• Innovation & transformation with trusted technologies

Evolution of data platforms

Data Lake vs Data Warehouse

Data Lake

Schema on read; also answers the questions of tomorrow

Scales without limits

Can hold any type of data

Data Warehouse

Schema on write; answers the questions of today

Mainly for relational data (tables and rows)

Can be part of an enterprise data lake or lakehouse

≠
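
The difference between schema on write and schema on read can be illustrated with a minimal Python sketch. It is not part of the platform; the column names and the sample record are assumptions chosen for illustration only.

```python
import json

# Hypothetical fixed schema for the warehouse-style table (assumption).
WAREHOUSE_COLUMNS = ("order_id", "amount")


def write_with_schema(table, record):
    """Schema on write: the record must fit the fixed columns at load time."""
    missing = [c for c in WAREHOUSE_COLUMNS if c not in record]
    if missing:
        raise ValueError(f"record lacks required columns: {missing}")
    # Fields outside the schema are dropped and cannot be queried later.
    table.append({c: record[c] for c in WAREHOUSE_COLUMNS})


def write_raw(lake, record):
    """Schema on read: store the record as-is, without enforcing a structure."""
    lake.append(json.dumps(record))


def read_with_schema(lake, columns):
    """Apply a schema only when reading; absent fields come back as None."""
    return [{c: json.loads(line).get(c) for c in columns} for line in lake]


record = {"order_id": 1, "amount": 25.0, "channel": "web"}

warehouse, lake = [], []
write_with_schema(warehouse, record)
write_raw(lake, record)

# A new question ("which channel?") can only be answered from the lake.
channels = read_with_schema(lake, ("order_id", "channel"))
```

In this sketch the warehouse table has already lost the channel field, while the lake can still serve it under a schema chosen at read time.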

Overview

General principles of the OQuila architecture

© OQuila 2021

Use of standard components

100% Cloud Services: PaaS or SaaS. No installations or Virtual Machines

No custom development

Use of components within the same ecosystem: e.g. Microsoft Azure Synapse

Minimize maintenance by using services (maintained by Microsoft)

Dynamic and scalable

Agile Data Model

• No traditional schema or fixed model

• RAW, STAGED, CURATED:

• No rework when adding additional sources

• RAW and CURATED store data separately

• Preparations/calculations are done in the STAGED environment and are reusable

• Supports changes to business rules with ease

• Schema on read; also answers the questions of tomorrow
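
The RAW, STAGED, CURATED flow can be sketched in plain Python. This is a simplified illustration under stated assumptions, not the OQuila framework: the folder layout, file names, source names ("erp", "webshop") and the cleaning steps are hypothetical, and a local temporary folder stands in for the data lake.

```python
import csv
import json
import tempfile
from pathlib import Path


def land_raw(lake, source, rows):
    """RAW: store the source extract unchanged, one folder per source."""
    path = lake / "raw" / source / "extract.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(rows))
    return path


def stage(lake, source):
    """STAGED: reusable preparation step (here: trim names, cast amounts)."""
    rows = json.loads((lake / "raw" / source / "extract.json").read_text())
    cleaned = [
        {"customer": r["customer"].strip(), "amount": float(r["amount"])}
        for r in rows
    ]
    path = lake / "staged" / source / "cleaned.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(cleaned))
    return path


def curate(lake, sources):
    """CURATED: combine staged sources into a report-ready total per customer."""
    totals = {}
    for source in sources:
        staged_file = lake / "staged" / source / "cleaned.json"
        for r in json.loads(staged_file.read_text()):
            totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    path = lake / "curated" / "revenue_per_customer.csv"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["customer", "revenue"])
        writer.writerows(sorted(totals.items()))
    return totals


with tempfile.TemporaryDirectory() as tmp:
    lake = Path(tmp)
    land_raw(lake, "erp", [{"customer": " Acme ", "amount": "100.5"}])
    stage(lake, "erp")
    # Adding a second source later needs no change to the existing "erp" steps.
    land_raw(lake, "webshop", [{"customer": "Acme", "amount": "20"}])
    stage(lake, "webshop")
    totals = curate(lake, ["erp", "webshop"])
```

Because the raw extract is kept untouched, a changed business rule only requires rerunning the staging and curation steps, not a new extraction.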

[Architecture diagram. Labels: Data Sources; Synapse Pipelines; Azure Synapse Analytics; Data Lake Gen 2 with RAW, STAGE and CURATED zones; Cleansing and Transformations via Spark clusters; Synapse Data Flow: Monitoring Quality of Data (Validated, Anomaly); On demand SQL pool; Power BI; Excel; Power Apps; Automation Flows; Azure Machine Learning]

Synapse components

• Data pipelines:

• Many standard connectors (SQL, Oracle, CSV, API, …)

• Data extraction from online and on-prem systems

• Add new systems easily

• Data Lake:

• RAW, STAGE and CURATED folders (reflecting the level of maturity and correctness of the data)

• Parquet files for working efficiently with large amounts of data

• Spark Cluster:

• Performant transformation and cleansing actions via notebooks

• Transfers “edited” data to the next stage (RAW, STAGE, CURATED)

• Synapse Data flows:

• Definition of business rules via a graphical designer (missing values, inconsistencies, …)

• Puts anomalies in a separate STAGE environment
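
The idea behind the data quality rules can be sketched in Python. In the platform these rules are defined in the graphical designer of Synapse Data flows; the code below only illustrates the principle of routing rows to a validated set or an anomaly set. The rule functions and field names are hypothetical.

```python
# Hypothetical business rules: each returns an error message or None.
def rule_no_missing_customer(row):
    return None if row.get("customer") else "missing customer"


def rule_amount_not_negative(row):
    amount = row.get("amount")
    if amount is None:
        return "missing amount"
    return "negative amount" if amount < 0 else None


RULES = (rule_no_missing_customer, rule_amount_not_negative)


def split_by_quality(rows, rules=RULES):
    """Route each row to 'validated' or 'anomaly' and record the failed rules."""
    validated, anomalies = [], []
    for row in rows:
        errors = [msg for msg in (rule(row) for rule in rules) if msg]
        if errors:
            # Anomalies keep the original values plus the reasons for rejection.
            anomalies.append({**row, "errors": errors})
        else:
            validated.append(row)
    return validated, anomalies


rows = [
    {"customer": "Acme", "amount": 120.5},
    {"customer": "", "amount": 10.0},
    {"customer": "Globex", "amount": -5.0},
]
validated, anomalies = split_by_quality(rows)
```

Keeping the failed rule names next to each anomaly row makes it possible to report on data quality per rule and per source.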

Synapse components

• On demand SQL Pool:

• Built into Azure Synapse

• Links directly to Parquet files in CURATED zone (without having to copy data to tables).

• Row level security

• Allows access to data via:

• Queries

• Power BI

• Excel

• Automation tools

• …
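
Querying Parquet files in place can be sketched as follows. The Python helper below only builds a T-SQL query that uses `OPENROWSET` with `FORMAT = 'PARQUET'`, the mechanism the on demand (serverless) SQL pool offers for reading files without loading them into tables. The storage account, container and folder names are hypothetical placeholders, and executing the query (for example through an ODBC connection) is left out.

```python
def curated_parquet_query(storage_account, container, folder, top=10):
    """Build a T-SQL query that reads Parquet files in place via OPENROWSET."""
    for part in (storage_account, container, folder):
        # Reject quotes so a value cannot break out of the SQL string literal.
        if "'" in part:
            raise ValueError(f"invalid path component: {part!r}")
    url = (
        f"https://{storage_account}.dfs.core.windows.net/"
        f"{container}/{folder}/*.parquet"
    )
    return (
        f"SELECT TOP {int(top)} *\n"
        f"FROM OPENROWSET(BULK '{url}', FORMAT = 'PARQUET') AS [result]"
    )


# Hypothetical names; replace with the real storage account, container and folder.
query = curated_parquet_query("examplelake", "datalake", "curated/sales")
print(query)
```

The same query text can be used from any of the access routes listed above, since they all connect to the SQL endpoint of the pool.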

Synapse Data Flow

Our PoV/PoC approach

Dream Big, Start Small, Grow Fast

Synapse-based Data Platform

Proof of Value

Rollout 2

Rollout 3

Rollout 4

...

Proof of Concept Project approach

• Make smart choices about the scope

• Define the ‘low-hanging fruit’ data sources eligible for the PoC

• Define a quick-win report

• Define a lean & mean project team

• After kick-off, OQuila will:

• Set up the Azure environment

• Set up OQuila’s Synapse data lakehouse framework

• Set up and deploy the selected data pipeline(s)

• Build the report

• Document the solution

• Present the solution

• Ready to use and grow!

Thank you!