Pentaho Tools :

Pentaho C-Tools(CDE,CDF,CDA),Pentaho CE & EE Server,OLAP-Cubes,Analysis using Pivot4J, Saiku Analytics, Saiku Reporting, Ad-hoc Reporting using Interactive Reporting Tool,Dashboards,Reports using PRD, PDD,Data Integration using Kettle ETL,Data Mining usign WEKA,Integration of Servers with Databases,Mobile/iPad compatible Dashboards using Bootstrap Css,Drilldown dashboards,Interactive Dashboards

Thursday, 25 September 2014

Kettle : Stream Lookup Step explained with a Sample Transformation in Pentaho Kettle - Pentaho Data Integration

Hi Guys,

This simple transformation explains the "Stream Look Up" step example in Pentaho Kettle as part of my learning & documenting here for community developers.

In next articles I'll be sharing more interesting topics with End to End examples.

Example developed on : 
5.1.0 Kettle
Reference :
 D:\pdi-ce-\data-integration\samples\transformations\Stream lookup - basics.ktr

Scenario : 
There are 2 tables lets say Employee and Department and the data is as follows in both of the tables.
There is a common field lets say DeptID_Emp and DeptID_Dept in both the tables. 

Lookup the DeptID_Emp field with DeptID_Dept and get the records of all employees with department names.

Employee Table:
EmpID    Name    DeptID_Emp
110    Sadakar    10
111    Hasini    10
112    Dolly    20
113    Kutti    20
114    Jikky    30

Department Table:
DeptID_Dept    DeptName
 10    Mathematics
 20    Computers

1. Drag and drop Data Grid step (Input->Data Grid) and Meta(Column names) and Data(Insert the employee data shown above) name it as "Employee".

2. Drag and drop Data Grid step (Input->Data Grid) and double click the step to open it's properties.
    Give meta data (i.e., column names with length) Insert the data shown in Department table.

3. Drag and drop Stream Lookup step (Look Up -> Stream Lookup).
   Connect Employee -> Stream Lookup and Department -> Stream Lookup
and open the properties of Stream Value Lookup.
 As shown in below figure set the configuration.

Lookup Step : Department
Filed : DeptID_Emp ( Lookup field from the source stream).
LookupFiled : DeptID_Dept (lookup the DeptID_Emp field with DeptID_Dept).

i.e., matching the values of the fields ( Internally it'll compared with = operator).

Get Fields : If you click on it, it will fetch all the fields from source stream (here it is Employee table) on.
Get Lookup fields: It'll fetch all the rows from lookup file(here it is Departmet table).

We can also provide default values to the fields that are coming from lookup file/table. 

4. Connect "Stream lookup" to a dummy step (Flow->Dummy) and have a preview.(right click on the dummy step and see the preview).

What the Stream Lookup do here in the transformation is : It'll look for a match from the source stream. If the values matches from the lookup file then associated department name is added to the dummy step else it'll take NULL value.

If we add select values step we can get the desired fields with new names.

Output is shown below image.

The sample transformation shown in below image :

Download the .ktr file here : Click me