Databricks (Explorer) Data Exploration

training code: DBX-EXP / ENG DL 1d

The Databricks Explorer training is the second step in the structured training path Fundamental → Explorer → Lakehouse → Transformation. Participants dive deeper into data analysis and exploration using SQL and PySpark. They will learn to combine declarative (SQL) and imperative (PySpark) approaches, perform aggregations, joins, data profiling, and present insights in interactive Databricks notebooks.

For more information, please contact the sales department.
2 500,00 PLN net / 3 075,00 PLN gross

The training is designed for data engineers, analysts, and BI specialists who want to explore and analyze data in Databricks using SQL and PySpark. It is the natural next step after completing Databricks Fundamentals.

After completing the training, participants:

– understand the differences between SQL and the PySpark DataFrame API

– can load and explore data from multiple sources

– are able to perform aggregations, joins, and logical transformations

– can profile data quality and detect missing values, outliers, and inconsistencies

– understand the trade-offs of different file formats (CSV, JSON, Parquet, Delta)

– can create visualizations and simple dashboards in Databricks

– are prepared for the next stage of the training path: Databricks Lakehouse

1. Data analysis in the Databricks environment

  • Environment recap: Unity Catalog, notebooks, SQL Editor

  • Loading data from different sources: CSV, JSON, Parquet, Delta

  • Creating tables and views in Unity Catalog

  • Data exploration: display(), show(), summary(), describe() (see the sketch below)
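
A minimal sketch of what this block covers, assuming a Databricks notebook where spark and display() are predefined; the file path and the table name main.demo.sales are hypothetical placeholders:

  # Load a CSV file and take a first look at the data
  df = (spark.read.format("csv")
        .option("header", "true")
        .option("inferSchema", "true")
        .load("/Volumes/main/demo/files/sales.csv"))

  df.show(5)             # first rows as plain text
  df.printSchema()       # inferred column types
  display(df.summary())  # count, mean, stddev, min, max per column

  # Register the result as a Unity Catalog table (catalog.schema.table)
  df.write.saveAsTable("main.demo.sales")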

2. SQL and PySpark in data analysis

  • Differences between SQL and DataFrame API

  • Creating SQL queries in Databricks notebooks

  • Combining SQL and PySpark in a single analysis (mixed cells; see the sketch below)

  • Common mistakes and query optimization techniques
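
As a hedged illustration of mixing the two approaches, the sketch below expresses the same aggregation declaratively via spark.sql() and imperatively via the DataFrame API; the table main.demo.sales is a hypothetical example:

  from pyspark.sql import functions as F

  # Declarative: SQL text, returned as a DataFrame
  sql_df = spark.sql("""
      SELECT country, SUM(amount) AS total
      FROM main.demo.sales
      GROUP BY country
  """)

  # Imperative: the equivalent DataFrame API chain
  api_df = (spark.table("main.demo.sales")
            .groupBy("country")
            .agg(F.sum("amount").alias("total")))

  sql_df.show()  # both produce the same result
  api_df.show()

In a notebook, the SQL half can also live in a separate %sql cell; both routes compile to the same execution plan.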

3. Data operations and transformations

  • Filtering (WHERE, filter) and sorting (ORDER BY, sort)

  • Grouping and aggregations (GROUP BY, agg, count, sum, avg)

  • Creating and modifying columns (withColumn, alias)

  • Joins: INNER, LEFT, RIGHT, FULL, SEMI, ANTI

  • Conditional logic: CASE WHEN in SQL, when/otherwise in PySpark (see the sketch below)
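
The sketch below strings these operations together in one pipeline, assuming a Databricks notebook; the tables main.demo.orders and main.demo.customers and their columns are hypothetical placeholders:

  from pyspark.sql import functions as F

  orders = spark.table("main.demo.orders")
  customers = spark.table("main.demo.customers")

  result = (orders
      .filter(F.col("amount") > 100)                      # WHERE
      .join(customers, on="customer_id", how="left")      # LEFT JOIN
      .withColumn("segment",                              # CASE WHEN
          F.when(F.col("amount") > 1000, "high")
           .when(F.col("amount") > 500, "medium")
           .otherwise("low"))
      .groupBy("country", "segment")                      # GROUP BY
      .agg(F.count("*").alias("orders"),
           F.avg("amount").alias("avg_amount"))
      .orderBy(F.col("orders").desc()))                   # ORDER BY

  result.show()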

4. Data quality analysis and profiling

  • Analyzing missing values, uniqueness, and data types (na, distinct, count)

  • Data validation and type casting (cast, printSchema)

  • AI Functions – assisting in data analysis and cleansing

  • Generating statistics and column profiling (see the sketch below)
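
A minimal profiling sketch under the same assumptions (hypothetical table main.demo.sales with columns order_id and amount):

  from pyspark.sql import functions as F

  df = spark.table("main.demo.sales")

  # Missing values per column
  df.select([F.sum(F.col(c).isNull().cast("int")).alias(c)
             for c in df.columns]).show()

  # Uniqueness of a candidate key
  print(df.count(), df.select("order_id").distinct().count())

  # Fix a numeric column that arrived as a string
  df = df.withColumn("amount", F.col("amount").cast("double"))
  df.printSchema()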

5. File format trade-offs (overview)

  • CSV, JSON, Parquet, Delta – differences and use cases

  • Delta vs Parquet: ACID, schema evolution, time travel

  • Cost and performance characteristics of each format (see the sketch below)
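
A short sketch of writing the same data in two formats and reading an older Delta version via time travel, assuming a Databricks runtime where Delta is available; the table and output paths are hypothetical placeholders:

  df = spark.table("main.demo.sales")  # hypothetical source table

  # Same data, two formats
  df.write.mode("overwrite").format("parquet").save("/Volumes/main/demo/out/parquet")
  df.write.mode("overwrite").format("delta").save("/Volumes/main/demo/out/delta")

  # Delta adds a transaction log: ACID writes, schema evolution, time travel
  v0 = (spark.read.format("delta")
        .option("versionAsOf", 0)
        .load("/Volumes/main/demo/out/delta"))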

6. Visualization and presentation of results

  • Creating charts and dashboards in the Databricks GUI

  • Comparing distributions and key values

  • Documenting results and insights in notebooks

7. Final project

  • Prepare a data quality analysis and transformation that combines SQL and PySpark, supplemented with a simple visualization of the results in a Databricks notebook

Prerequisites:

– Completion of Databricks Fundamentals or equivalent knowledge

– Basic SQL knowledge

– Basic experience working with data

  • Access to the Altkom Akademia student portal

Training method:

The training is conducted in the Databricks cloud environment. Each participant receives their own workspace with access to Unity Catalog, SQL Editor, Notebooks, and a catalog with test data.

  • Training: English

  • Materials: English