Implementing a Lakehouse with Microsoft Fabric
Course code: DP-601 / ENG DL 1d
This course explores the capabilities of Apache Spark for distributed data processing and the techniques for efficient data management, versioning, and reliability when working with Delta Lake tables. It also covers data ingestion and orchestration using Dataflows Gen2 and Data Factory pipelines.
This course is designed to build your foundational skills in data engineering on Microsoft Fabric, focusing on the Lakehouse concept.
You should be familiar with basic data concepts and terminology. We suggest reviewing the AZ-900 and DP-900 training materials in advance, or attending those courses.
For greater comfort and a more effective training experience, we suggest using an additional monitor. You can still take part in the training without one, but it significantly affects comfort during classes.
- Training: English
- Materials: English
- Manual in electronic form available on the platform:
- Access to Altkom Akademia's student portal
This course includes a combination of lectures and hands-on exercises that will prepare you to work with lakehouses in Microsoft Fabric.
Module 1: Introduction to end-to-end analytics using Microsoft Fabric
- Introduction to Microsoft Fabric
- Data teams and Fabric
- Enable and use Microsoft Fabric
Module 2: Get started with lakehouses in Microsoft Fabric
- What is a Lakehouse
- Work with a Fabric Lakehouse
- Explore, transform and visualize data in the Lakehouse
Module 3: Use Apache Spark in Microsoft Fabric
- Prepare to use Apache Spark
- Run Spark in Fabric
- Load data in a Spark DataFrame
- Transform data in a Spark DataFrame
- Partition the output file
- Work with data using Spark SQL
- Query Data using Spark SQL API
- Visualize Data
Module 4: Work with Delta Lake tables in Microsoft Fabric
- Understand Delta Lake
- Create delta tables using code in Spark
- Managed vs External Tables
- Work with delta tables in Spark
- Data versioning and Time Travel
- Use delta tables with Streaming data
Module 5: Ingest Data with Dataflows Gen2 in Microsoft Fabric
- Understand Dataflows (Gen2)
- Dataflow (Gen2) benefits and limitations
- Explore Dataflows (Gen2) in Microsoft Fabric
- Integrate Dataflows (Gen2) and Pipelines in Microsoft Fabric
Module 6: Use Data Factory pipelines in Microsoft Fabric
- Pipelines in Microsoft Fabric
- Common Activities – Copy Data
- Common Activities – pipeline templates
- Run and monitor pipelines