Databricks Data Transformation
Training code: DBX-PFE / ENG DL 1d / EN
The Databricks Data Transformation training is the final step in the structured training path Fundamental → Explorer → Lakehouse → Transformation. Participants learn how to design modular pipelines, combine batch and streaming processing, apply advanced transformations, automate pipelines with Delta Live Tables, and integrate workflows with Git and CI/CD.
The training is designed for data engineers and DataOps teams responsible for implementing and maintaining production data processing workflows in the Lakehouse architecture.
After completing the training, participants:
– can design modular Silver → Gold pipelines
– understand how to combine batch and streaming processing in a single workflow
– can apply PySpark window functions for data transformations
– can use Delta Live Tables to automate pipelines
– know best practices for orchestration and CI/CD in Databricks
– can ensure data quality with expectations and monitor lineage
– are prepared to maintain production workflows in Databricks
Training program:
1. Data processing architecture
• Recap of Bronze–Silver–Gold in the context of transformation pipelines
• Designing data flows in Silver and Gold layers
• Modularity and separation of processing logic (load → transform → save), as illustrated in the sketch below
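A minimal sketch of the load → transform → save separation in PySpark; the table names silver.orders and gold.daily_revenue and the columns used are assumptions, not part of the course materials.

```python
# Hedged sketch: one modular Silver -> Gold step split into load/transform/save.
# Table and column names (silver.orders, gold.daily_revenue, order_ts, amount)
# are assumptions; adapt them to your own catalog.
from pyspark.sql import DataFrame, SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

def load(table: str) -> DataFrame:
    """Read a Silver Delta table."""
    return spark.read.table(table)

def transform(df: DataFrame) -> DataFrame:
    """Example Gold-level logic: daily revenue per order date."""
    return (df
            .groupBy(F.to_date("order_ts").alias("order_date"))
            .agg(F.sum("amount").alias("daily_revenue")))

def save(df: DataFrame, table: str) -> None:
    """Overwrite the Gold Delta table with the result."""
    df.write.format("delta").mode("overwrite").saveAsTable(table)

save(transform(load("silver.orders")), "gold.daily_revenue")
```

Keeping each stage in its own function makes the notebook easier to test and reuse across pipelines.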
2. Batch and stream load in practice
• Differences between batch and streaming processing
• Batch ingest using COPY INTO and writing to Delta tables
• Streaming ingest with Auto Loader (cloudFiles)
• Structured Streaming: readStream, writeStream, checkpointing, and fault tolerance
• Integrating batch and stream (see the sketch after this list)
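A minimal sketch combining batch ingest (COPY INTO) with streaming ingest (Auto Loader) into the same Delta table; the paths, table names, and JSON format are assumptions.

```python
# Hedged sketch: batch load with COPY INTO plus streaming load with Auto Loader
# (cloudFiles) into one Delta table. Paths and table names are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Batch ingest: idempotent load of new files into an existing Delta table.
spark.sql("""
    COPY INTO bronze.orders
    FROM '/mnt/raw/orders_batch'
    FILEFORMAT = JSON
""")

# Streaming ingest: Auto Loader with schema tracking and checkpointing
# for fault tolerance and exactly-once delivery.
stream = (spark.readStream
          .format("cloudFiles")
          .option("cloudFiles.format", "json")
          .option("cloudFiles.schemaLocation", "/mnt/checkpoints/orders_schema")
          .load("/mnt/raw/orders_stream"))

(stream.writeStream
       .option("checkpointLocation", "/mnt/checkpoints/orders")
       .outputMode("append")
       .toTable("bronze.orders"))
```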
3. Advanced data transformations
• Creating numerical, text, and binary features
• Logical transformations (CASE WHEN in SQL; when/otherwise in PySpark)
• Window functions (lag, lead, row_number, rolling average), as shown in the sketch after this list
• Creating time-based and session features
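A minimal sketch of the listed window functions and a when/otherwise transformation; the silver.sales table and its columns (customer_id, order_ts, amount) are assumptions.

```python
# Hedged sketch: lag, lead, row_number and a rolling average with PySpark
# window functions. The input table and its columns are assumptions.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.getOrCreate()
sales = spark.read.table("silver.sales")  # assumed columns: customer_id, order_ts, amount

w = Window.partitionBy("customer_id").orderBy("order_ts")
rolling = w.rowsBetween(-2, 0)  # current row plus the two preceding rows

features = (sales
    .withColumn("prev_amount", F.lag("amount").over(w))
    .withColumn("next_amount", F.lead("amount").over(w))
    .withColumn("order_rank", F.row_number().over(w))
    .withColumn("rolling_avg_3", F.avg("amount").over(rolling))
    .withColumn("is_large_order",                     # logical when/otherwise transformation
                F.when(F.col("amount") > 1000, True).otherwise(False)))
```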
4. Delta Live Tables – pipeline automation
• Declarative processing approach: CREATE LIVE TABLE
• Creating DAGs and scheduling in DLT
• Integrating DLT with Auto Loader and Structured Streaming
• Expectations – real-time data quality control (see the sketch after this list)
• Monitoring and lineage in the DLT interface
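A minimal sketch of a DLT step written with the Python API (the Python counterpart of the declarative CREATE LIVE TABLE syntax), including an expectation; the source path and column names are assumptions.

```python
# Hedged sketch: two Delta Live Tables steps with a data quality expectation.
# Runs only inside a DLT pipeline, where `spark` is provided implicitly.
# The source path and column names are assumptions.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw orders ingested with Auto Loader")
def orders_bronze():
    return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/mnt/raw/orders"))

@dlt.table(comment="Cleaned orders")
@dlt.expect_or_drop("valid_amount", "amount > 0")   # expectation: drop rows that fail
def orders_silver():
    return (dlt.read_stream("orders_bronze")
            .withColumn("order_date", F.to_date("order_ts")))
```

DLT builds the DAG between orders_bronze and orders_silver automatically, and expectation results appear in the pipeline's monitoring and lineage views.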
5. Orchestration and automation
• Databricks Workflows – multi-task jobs, dependencies, retries
• Pipeline parameterization (dbutils.widgets, dbutils.notebook.run) – see the sketch after this list
• Best practices for CI/CD and code maintenance (Repos, versioning notebooks)
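A minimal sketch of pipeline parameterization with dbutils; the notebook path ./transform_orders and the processing_date parameter are assumptions.

```python
# Hedged sketch: notebook parameterization with widgets and a child-notebook call.
# `dbutils` is available implicitly in Databricks notebooks; the notebook path
# and parameter name below are assumptions.

# In the child notebook: declare a widget and read its value.
dbutils.widgets.text("processing_date", "2024-01-01")
processing_date = dbutils.widgets.get("processing_date")

# In the orchestrating notebook: run the child notebook with a parameter and a
# 10-minute timeout; the returned string comes from dbutils.notebook.exit().
result = dbutils.notebook.run("./transform_orders", 600, {"processing_date": "2024-01-01"})
```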
6. CI/CD – practical Git (Repos) demo
• Cloning a repository in Databricks Repos
• Committing and pushing notebooks to Git
• Running a pipeline from Workflows based on a repo (see the sketch after this list)
• DevOps best practices for Databricks
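A minimal sketch of triggering a Workflows job (for example one whose tasks run notebooks from a Repos checkout) through the Jobs REST API; the workspace URL, token, and job ID are placeholders.

```python
# Hedged sketch: start an existing Databricks Workflows job via the Jobs 2.1
# REST API. Host, token, and job_id are placeholders, not real values.
import requests

DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                                  # placeholder
JOB_ID = 12345                                                     # placeholder

response = requests.post(
    f"{DATABRICKS_HOST}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": JOB_ID},
)
response.raise_for_status()
print("Started run:", response.json()["run_id"])
```

In a CI/CD setup, a call like this (or the Databricks CLI equivalent) is typically issued from the Git provider's pipeline after the repo has been updated.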
7. Final project
• Design and run a Silver → Gold pipeline using batch and stream load, Delta Live Tables, quality control rules, and Git integration
Prerequisites:
– Completion of the Databricks Lakehouse training or equivalent knowledge
– Knowledge of SQL and PySpark
– Basic experience implementing data pipelines
– Access to the Altkom Akademia student portal
Training method:
The training is conducted in the Databricks cloud environment. Each participant receives their own workspace with access to Unity Catalog, SQL Editor, Notebooks, and a catalog with test data.
Training: English
Materials: English