Pentaho Data Integration Fundamentals (PDI1000L)


(3 days / 3 credits) Scheduled, instructor-led training with a virtual lab environment for hands-on practice.


This instructor-led course introduces the Pentaho Data Integration (PDI) platform. It covers the platform's basic functions, explains its capabilities, and describes best practices for using it successfully. Course demonstrations and practice sessions prepare you to apply PDI to real-world cases.

Pentaho Data Integration prepares and blends data to create a complete picture of your business that drives actionable insights. The complete data integration platform delivers accurate, analytics-ready data to end users from any source. With visual tools that eliminate coding and complexity, Pentaho puts big data and all data sources at the fingertips of business and IT users alike.

Students benefit from learning with an experienced instructor, coupled with hands-on practice using a full implementation of Pentaho in a virtual lab environment. The course can be run as a physical class at one of our training sites or as an online session.


This course will help students to:

  • Describe the Pentaho Data Integration (PDI) platform, its components, and their common uses.
  • List the components that make up a transformation and explain how they execute.
  • Create, preview, run, and troubleshoot a transformation using best practices and modular design principles.
  • Read data from and write data to various file formats.
  • Perform calculations, merges, and lookups.
  • Use PDI’s enterprise repository, scheduling, and monitoring capabilities.
  • Log execution metrics to database tables.



Prior experience with Pentaho is not required; however, some experience using ETL (Extract, Transform, and Load) to build data pipelines is preferred.


This course includes:


  • Introduction to Pentaho Data Integration
    • Objectives and class logistics
    • Pentaho Platform and architecture
  • Transformations
    • Transformation concepts
    • Learning the PDI user interface
    • Creating and running transformations
    • Introduction to repositories
  • Reading and writing files
    • Input and output steps
    • PDI's home directory
    • Parameterization
  • Working with databases
    • Connecting to and exploring a database
    • Table input and output steps
    • Insert / update and delete steps
    • Filtering and sorting data
    • Variables and unnamed parameters in SQL
  • Data flow and lookups
    • Data movement and step copies
    • Lookups and merge
  • Calculations
    • Grouping
    • Calculation and scripting steps
  • Job orchestration
    • Introduction to jobs
    • Exploring common job entries
  • Exploring the Pentaho Repository
    • The Pentaho Repository
  • Scheduling and monitoring
    • Setting up the scheduler
    • Monitoring scheduled tasks
  • Logging
    • Introduction to logging
    • File-based logging
    • Logging execution metrics to databases


3 Days

Upcoming Classes


Instructor-led online training

Scheduled classes, Oct 2021 – Apr 2022:

  • Virtual - Americas: Oct 25 – Oct 27, 2021; Jan 17 – Jan 19, 2022
  • Virtual - EMEA: Nov 1 – Nov 3, 2021; Jan 24 – Jan 26, 2022

Classes in bold are guaranteed to run!

Onsite Training

For groups of three or more

