
Databricks Data Transformation

Training code: DBX-PFE / ENG DL 1d

The Databricks Data Transformation training is the final step in the structured training path Fundamental → Explorer → Lakehouse → Transformation. Participants will learn how to design modular pipelines, combine batch and streaming, apply advanced transformations, use Delta Live Tables, and integrate workflows with Git and CI/CD.

For more information, please contact the sales department.
2,500.00 PLN net / 3,075.00 PLN gross

The training is designed for data engineers and DataOps teams responsible for implementing and maintaining production data processing workflows in the Lakehouse architecture.

After completing the training, participants:

– can design modular Silver → Gold pipelines

– understand how to combine batch and stream in a single workflow

– can apply PySpark window functions for data transformations

– can use Delta Live Tables to automate pipelines

– know best practices for orchestration and CI/CD in Databricks

– can ensure data quality with expectations and monitor lineage

– are prepared to maintain production workflows in Databricks

Training program:

1. Data processing architecture

  • Recap of Bronze–Silver–Gold in the context of transformation pipelines

  • Designing data flows in Silver and Gold layers

  • Modularity and separation of processing logic (load → transform → save) – see the sketch after this list

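A minimal PySpark sketch of the load → transform → save separation described above. The table names (bronze.events, silver.events_clean) and column names are illustrative assumptions, not part of any fixed template.

    from pyspark.sql import DataFrame, SparkSession
    from pyspark.sql import functions as F

    def load(spark: SparkSession) -> DataFrame:
        # Read the Bronze source table (name is an illustrative assumption)
        return spark.read.table("bronze.events")

    def transform(df: DataFrame) -> DataFrame:
        # Keep valid rows only and derive a date column for the Silver layer
        return (df.filter(F.col("event_id").isNotNull())
                  .withColumn("event_date", F.to_date("event_ts")))

    def save(df: DataFrame) -> None:
        # Persist the result as a managed Delta table in the Silver layer
        df.write.format("delta").mode("overwrite").saveAsTable("silver.events_clean")

    spark = SparkSession.builder.getOrCreate()
    save(transform(load(spark)))
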
2. Batch and stream load in practice

  • Differences between batch and streaming processing

  • Batch ingest using COPY INTO and writing to Delta tables

  • Streaming ingest with Auto Loader (cloudFiles)

  • Structured Streaming: readStream, writeStream, checkpointing, and fault tolerance

  • Integrating batch and stream (see the sketch after this list)

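A hedged sketch of the two ingestion modes listed above: an idempotent batch load with COPY INTO followed by a streaming load with Auto Loader and a checkpointed writeStream. All paths and table names (/Volumes/demo/..., bronze.orders_*) are placeholders, and the target table of COPY INTO is assumed to already exist.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Batch ingest: COPY INTO loads only files it has not ingested before
    spark.sql("""
        COPY INTO bronze.orders_batch
        FROM '/Volumes/demo/landing/orders_batch/'
        FILEFORMAT = CSV
        FORMAT_OPTIONS ('header' = 'true', 'inferSchema' = 'true')
    """)

    # Streaming ingest: Auto Loader (cloudFiles) discovers new files incrementally
    stream_df = (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .option("cloudFiles.schemaLocation", "/Volumes/demo/checkpoints/orders_schema")
        .load("/Volumes/demo/landing/orders_stream/"))

    # A checkpoint location makes the stream fault tolerant; trigger(availableNow=True)
    # processes whatever is available and stops, which is one way to run batch and
    # stream within the same scheduled workflow
    (stream_df.writeStream
        .option("checkpointLocation", "/Volumes/demo/checkpoints/orders_stream")
        .trigger(availableNow=True)
        .toTable("bronze.orders_stream"))
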
3. Advanced data transformations

  • Creating numerical, text, and binary features

  • Logical transformations (CASE WHEN in SQL; when/otherwise in PySpark)

  • Window functions (lag, lead, row_number, rolling average) – see the sketch after this list

  • Creating time-based and session features

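A short PySpark sketch of the window functions and when/otherwise logic listed above; the silver.transactions table and its columns (user_id, ts, amount) are assumptions made up for the example.

    from pyspark.sql import SparkSession, Window
    from pyspark.sql import functions as F

    spark = SparkSession.builder.getOrCreate()
    df = spark.table("silver.transactions")      # illustrative source table

    w = Window.partitionBy("user_id").orderBy("ts")
    rolling = w.rowsBetween(-6, 0)               # current row plus the 6 preceding rows

    features = (df
        .withColumn("prev_amount", F.lag("amount", 1).over(w))
        .withColumn("next_amount", F.lead("amount", 1).over(w))
        .withColumn("txn_number", F.row_number().over(w))
        .withColumn("rolling_avg_7", F.avg("amount").over(rolling))
        # Logical transformation: PySpark equivalent of SQL CASE WHEN
        .withColumn("is_large", F.when(F.col("amount") > 1000, True).otherwise(False)))
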
4. Delta Live Tables – pipeline automation

  • Declarative processing approach: CREATE LIVE TABLE

  • Creating DAGs and scheduling in DLT

  • Integrating DLT with Auto Loader and Structured Streaming

  • Expectations – real-time data quality control (see the sketch after this list)

  • Monitoring and lineage in the DLT interface

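A hedged Python sketch of the declarative DLT approach (the Python counterpart of CREATE LIVE TABLE), combining Auto Loader ingestion with expectations. The landing path, table names, and quality rules are illustrative, and the code only runs inside a Delta Live Tables pipeline, where the spark object is provided by the runtime.

    import dlt
    from pyspark.sql import functions as F

    @dlt.table(comment="Raw orders ingested with Auto Loader")
    def orders_raw():
        return (spark.readStream
            .format("cloudFiles")
            .option("cloudFiles.format", "json")
            .load("/Volumes/demo/landing/orders/"))    # illustrative path

    @dlt.table(comment="Cleaned orders for the Silver layer")
    @dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")   # drop rows that fail
    @dlt.expect("positive_amount", "amount > 0")                    # log violations only
    def orders_clean():
        return (dlt.read_stream("orders_raw")
            .withColumn("order_date", F.to_date("order_ts")))

DLT derives the DAG from these table-to-table references and surfaces expectation metrics and lineage in the pipeline interface.
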
5. Orchestration and automation

  • Databricks Workflows – multi-task jobs, dependencies, retries

  • Pipeline parameterization (dbutils.widgets, dbutils.notebook.run) – see the sketch after this list

  • Best practices for CI/CD and code maintenance (Repos, versioning notebooks)

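A minimal sketch of the parameterization pattern named above, split into the two sides of the call; dbutils is available only in Databricks notebooks, and the notebook path, widget names, and values are hypothetical.

    # Child notebook: declare widgets with defaults and read the passed-in values
    dbutils.widgets.text("run_date", "2024-01-01")
    dbutils.widgets.text("target_table", "gold.daily_kpis")
    run_date = dbutils.widgets.get("run_date")
    target_table = dbutils.widgets.get("target_table")

    # Orchestrating notebook: run the child with concrete parameter values
    result = dbutils.notebook.run(
        "/Repos/demo/pipelines/gold_daily_kpis",   # hypothetical repo path
        600,                                       # timeout in seconds
        {"run_date": "2024-05-31", "target_table": "gold.daily_kpis"},
    )

In Databricks Workflows the same parameters can be supplied per task, so the notebook itself stays identical across environments.
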
6. CI/CD – practical Git (Repos) demo

  • Cloning a repository in Databricks Repos

  • Committing and pushing notebooks to Git

  • Running a pipeline from Workflows based on a repo (see the sketch after this list)

  • DevOps best practices for Databricks

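For the "running a pipeline from Workflows" step, a hedged sketch of how a CI/CD pipeline could trigger an existing Workflows job (whose tasks point at notebooks in a Repo) through the Databricks Jobs 2.1 REST API; the Jobs API itself is not part of the outline above, and the host, token, and job id are placeholders read from environment variables.

    import os
    import requests

    host = os.environ["DATABRICKS_HOST"]           # e.g. https://<workspace>.azuredatabricks.net
    token = os.environ["DATABRICKS_TOKEN"]         # personal access token from CI/CD secrets
    job_id = int(os.environ["DATABRICKS_JOB_ID"])  # id of the Workflows job to start

    response = requests.post(
        f"{host}/api/2.1/jobs/run-now",
        headers={"Authorization": f"Bearer {token}"},
        json={"job_id": job_id},
        timeout=30,
    )
    response.raise_for_status()
    print("Started run:", response.json()["run_id"])
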
7. Final project

  • Design and run a Silver → Gold pipeline using batch and stream load, Delta Live Tables, quality control rules, and Git integration

Prerequisites:

– Completion of the Databricks Lakehouse training or equivalent knowledge

– Knowledge of SQL and PySpark

– Basic experience implementing data pipelines

  • Access to the Altkom Akademia student portal

Training method:

The training is conducted in the Databricks cloud environment. Each participant receives their own workspace with access to Unity Catalog, SQL Editor, Notebooks, and a catalog with test data.

  • Training: English

  • Materials: English