Pentaho Tools :

Pentaho C-Tools(CDE,CDF,CDA),Pentaho CE & EE Server,OLAP-Cubes,Analysis using Pivot4J, Saiku Analytics, Saiku Reporting, Ad-hoc Reporting using Interactive Reporting Tool,Dashboards,Reports using PRD, PDD,Data Integration using Kettle ETL,Data Mining usign WEKA,Integration of Servers with Databases,Mobile/iPad compatible Dashboards using Bootstrap Css,Drilldown dashboards,Interactive Dashboards

Sunday, 8 May 2016

Stream lookup Step sample example using oracle tables in PDI


This is re-blogging from below my previous post :

Just thought of doing the same example using Oracle 11g Express database and here are the steps how I did.
You will learn , how to connect to Oracle 11g database and Stream lookup step capability with tables. 

Software Used for this example :  
1) Oracle 11g Express Edition
2) Pentaho Data Integration 6.1 CE

First, we will see how to connect to Oracle 11g Express edition in Pentaho data integration. 

Before connecting oracle xe database(or database of your choice) download ojdbc6 driver from site and paste in "lib" folder that can found at kettle installation folder at E:\2_Archive_Installed\Pentaho\6.1\pdi-ce-\data-integration\lib
Now,  restart the kettle if it is already running. 
1) File -> New Transformation
2) Right click on "Database Connections" node and then click on "New"
3) As shown in below image give all the properties
Host Name : localhost, database name : xe, username and password : sadakar/sadakar
Click on "Test" button to confirm the connection :
Click on "Share Connection" for future usage.

Aim of the post : 
 lookup the data coming from employee(source) table with department (lookup) table  and insert the location field from department table in empolyee_with_locations (target) table.

Source table : employee
Target table : employee_with_locations
Lookup table : department

 DDL and sample data for employee and department table is as follows 
employee table
  EmpID INTEGER primary key
, Name VARCHAR2(30)

SELECT * FROM employee
empid name    deptid_emp
110    Sadakar    10
111    Hasini    10
112    Dolly    20
113    Kutti    20
114    Jikky    30

department table
  DeptID_Dept INTEGER primary key
, DeptName VARCHAR2(30)
, Location VARCHAR2(2000)
SELECT * FROM department
deptid_dept deptname location
10    Mathematics    Netherlands
20    Computers    USA

Also create DDL for target table 
CREATE TABLE SADAKAR.employee_with_locatioins
  empid INTEGER primary key
, name VARCHAR2(30)
, dept_id INTEGER
, location VARCHAR2(2000)

Now, design the transformation as shown in below image. 
Double click on Stream lookup step and specify the lookup ids and get the retrieve field as shown in below image

Now save the transformation and preview the output. 
Output should be :

Lastly, the difference between "Stream lookup" and "database lookup" is here. 

Stream lookup stores the data in memory and then it will start doing all of the lookups, data can come from anywhere (from file or table)
Database lookup looks up data from a database, and only touches the data you lookup.

References :

I hope it helps some one.! 

Sadakar Pochampalli