Pentaho Tools :

Pentaho C-Tools(CDE,CDF,CDA),Pentaho CE & EE Server,OLAP-Cubes,Analysis using Pivot4J, Saiku Analytics, Saiku Reporting, Ad-hoc Reporting using Interactive Reporting Tool,Dashboards,Reports using PRD, PDD,Data Integration using Kettle ETL,Data Mining usign WEKA,Integration of Servers with Databases,Mobile/iPad compatible Dashboards using Bootstrap Css,Drilldown dashboards,Interactive Dashboards

Tuesday 17 February 2015

Pentaho Data Integration : Basics-2 : Sending an Email from Pentaho Kettle ETL (Attach Files, Send Log Files to email )

Kettle : Basics-2 : Sending an Email from Pentaho Kettle ETL (Attach Files, Send Log Files to email )


In this post we will see the basics of sending an email with attached files and logging files using Kettle ETL tool.
Version : 5.2.0 CE & this is tutorial is written for beginners. 

 Problem Statement : 

Send the below shown files to email along with log file.
Target Email : gmail account  (username : test4mail4kettle@gmail.com and password : Takeitall )

*You can access this account for testing purpose, kept open



 Sample Output:



Download Example transformation & Job : Click Me..!!

* Change your directory & the files


Step by Step procedure :
Design transformation for data flow -> i.e., file(s) information in the form of  rows.


 Design job for high level flow control:  sending mails on failure, sending files to mail and etc.


Transformation Design & how the flow of rows will be with steps
1) Drag and drop "Get File Names" step from "Input" category.
2) Browse for the folder location where you kept the files & click on "Add" button . Use .* [dot with star] wildcard to get all the files.

4) Preview the step 
  

5) 
* Drag & drop "Filter rows" step from "Flow" category & give below condition to get what type of files you want to pick from the folder ( folder may contain different types files. for ex : .txt, .docx , .csv and etc. 

* Make hop connection from "Get File Inputs" to "Filter rows"  

Condition to fetch txt file (text type files) from the folder (In this example I kept only txt files but there could be another type of files also in the folder)


 
 6) Now, drag & drop "Set files in result" step from "Job" category and give below shown settings

   
Filename field = filename
Type of file to = General

7) Now save the transformation & and have a preview. ( You will not find any result).

 Job Design & how the high level flow control is stetted with job entries 
1) Drag & release the "Start" job entry on to the canvas.
2) Drag & drop transformation entry & locate the transformation file just now created.
3) Connect "start" & "transformation entries".
4) Take a "Mail validator" entry from "Mail" category connect "transformation entry to it.
5) Inside "Mail validator" entry provide the email - Whether you are giving the correct format or not - it will check.
6) Now, take "Mail" entry from "Mail" category and connect "Mail validator" to this entry.
7) Open "Mail" entry settings by double clicking it.
8) You will find 4 tabs namely "Address", "Server", "Email Message", "Attached files".
Now, we will be setting properties in all the tabs which ever required for our purpose.

Address tab : 

Provide Target email address details(Destination) & sender email details with Name.

* Target email in this example is :  test4mail4kettle@gmail.com
* If you want to sent to multiple emails use space to add more emails.
* Sender address in this example is : test4mail4kettle@gmail.com
* Sender name is : Test Sender

Server Tab :

Gmail SMPT details information for google account : click here

SMTP Server
SMTP Server : smtp.gmail.com
Port : 465

Authentication
Authentication user : test4mail4kettle@gmail.com
password : Takeitall
 Seucure connection type : SSL ( check the secure connection)

Email Message Tab

 Play with various option in this settings

Attached Files tab

9) Save the job &run it.


Additional Notes :
Enable Log information as a file 
"attached files tab"  -> Select file type -> Multiselect General & log.
* Save the transformation & run it. In the destination mail along with files you can find log file information. 

With gmail:
You need to give less secure for your gmail .. It'll block the connection.

Resolution for gmail :

If this was you
You can switch to an app made by Google such as Gmail to access your account (recommended) or change your settings at https://www.google.com/settings/security/lesssecureapps so that your account is no longer protected by modern security standards.

Click on the link and set "Turn on"  for Access for less secure apps

References:


1) Pentaho Cook book 4
2) http://stackoverflow.com/questions/19302855/send-email-with-pentaho-pdi


Improvements & suggestions are most welcome on this post :-) Post your comments to improve....

Thank you for reading it. 

Sadakar Pochampalli - Business Intelligence
Hyderabad, INDIA

Friday 13 February 2015

PDI Basics-1 : Command Line Arguments in Pentaho Kettle ETL

In this post we will see the basics on how to work with command line arguments in Kettle ETL.

Software setup : 

Kettle 5.2.0.0 CE

Quick points to Navigate through this post :
1) Command Line Arguments - definition & points

2) Ways to work with Command line arguments
                   1) From command line
                   2) From Kettle GUI

3) Example : With Transformation    
         Scenario : Store argument values(Strings) in a text file

4) Example : With Job  
          Scenario : Store argument values(Strings) in a text file5) Download section 

Download Example here :  Click Me..!!!

Download pdf version of the same article here :  Click Me...!!!! 

Go to "download" tab of this site to get the documents topic wise.  

1) Command Line Arguments :


Command line
Arguments

* A named, user-supplied, single-value input.
* Each transformation or job can have a max of 10 arguments. 
* It is declared as space-separated values given after the Pan or Kitchen line
* Arguments could be numbers, words (strings), or variables 
(system or script variables, not PDI variables).

Reference 

Blog Reference 





2) There are 2 ways to work with command line arguments
1) From command line
2) From Kettle GUI

1) From command line
Command line arguments syntax with transformation 
Win
pan.bat /file:<directory>\<transformation name> arg1 arg2 arg3
Linux
sh pan.sh -file=<directory>/<transformation name> arg1 arg2 arg3

Command line arguments examples with transformations 
Win
pan.bat /file:E:\Explore\Kettle\1_CommandLineArgumentsBasic.ktr 10 20 30
Linux
sh pan.sh -file=/Explore/Kettle\1_CommandLineArgumentsBasic.ktr 10 20 30

Command line arguments syntax with jobs 
Win
pan.bat /file:<directory>\<transformation name> arg1 arg2 arg3
Linux
sh pan.sh -file=<directory>/<transformation name> arg1 arg2 arg3

Command line arguments examples with jobs
Win
kitchen.bat /file:E:\Explore\Kettle\1_CommandLineArgumentsBasic.kjb 10 20 30
Linux
sh kitchen.sh -file=/Explore/Kettle\1_CommandLineArgumentsBasic.kjb 10 20 30

2) From Kettle GUI

l Arguments can be handled at the time of execution of transformation or job.
l In Transformation : At “Execute a Transformation” dialogue window
l Job : At “Excute a Job” dialogue window.

3) Example : With Transformation 
Scenario : Store argument values(Strings or numbers) in a text file

1) Take a new Transformation(Ctrl+N) save it as 1_CommandLineArgumentsBasic.ktr

2) Drag and drop “Get System Info” step from “Input” category.

3) Double click on it get it properties window as shown in below image and Type 3 filed names 
as Arg1,Arg2 and Arg3 and give type as command line argument1 and 2 and 3.

4) Take a “Text output file”  from “Output” category and connect it from “Get Sys Info” step. 

5) Give file path= E:\Explore\Kettle and file name =CommanLineArgumentsOutput and say
  its type = text .. Go to “Fields” tab and click on “Get Fields” button to get the fields coming 
 from “Get System Info” step. 


Execution of transformation
We will see 2 types of executions 
1) Execution from GUI &
2) Execution from command line

Execution from GUI

a) Save the transformation and click on Run button to get it “Execute a transformation window” 
b) In the arguments tab provide 3 arguments as shown in image
c) Now check whether the file is generated with the provided input or not at E:\Explore\Kettle 

Execution from command line
1) Open command prompt and navigate to the folder where you can find pan.bat file
 ( remember to work with transformation on command we need pan.bat file).

2) For example : In my local environment kettle is installed in below location. 
 D:\Softwares Archive Installed\Pentaho\pdi-ce-5.2.0.0-209\data-integration


3) Give below command with your arguments lets say 100 200 300 (space separated arguments) 
as shown in below image. 

D:\Softwares Archive Installed\Pentaho\pdi-ce-5.2.0.0-209\data-integration>
Pan.bat /file:E:\Explore\Kettle\1_CommandLineArgumentsBasic.ktr 100 200 300


 4) Now check whether the output file is updated or not with new arguments. 

4) Example : With Job
Scenario : Store argument values(Strings) in a text file

1) Clt+Alt+N to create a new job and save it as 1_CommandLineArgumentsBasic.kjb

2) Take a START job entry & transformation entry on to the canvas and connect them with HOP. 
In transformation settings give the just 
created transformation path as 
${Internal.Job.Filename.Directory}/1_CommandLineArgumentsBasic.ktr

3)Execution 
Here we can pass command line arguments in 3 ways 
1) GUI execution to give arguments at “Execute a job” window.
2) GUI execution - Provide arguments in “Transformation” entry..(Ignore giving agreements at “Execution a job” window) 
3) Command line arguments 

** GUI execution to give arguments at “Execute a job” window.
1) Save the job and click on Run Button to get “ Execute a job” window..
 Here find Arguments tab and provide values as shown in image 1) Hello 2) Kettle  3) Argument


2) Go back to the file and check for the result...

** GUI execution - Provide arguments in “Transformation” entry..(Ignore giving arguments at “Execution a job” window)
1) As shown in below image go to the transformation entry settings and click on “Arguments” tab 
and give 3 values 
2) Click on “Exucte” button and go back to the file location to check the output. 


The file with newly updated argument values is as follows


** Command line arguments 

1) Open command prompt and navigate to the folder where you can find kitchen.bat file 
( remember to work with jobs on command we need kitchen.bat file).

2) For example : In my local environment kettle is installed in below location. 
 D:\Softwares Archive Installed\Pentaho\pdi-ce-5.2.0.0-209\data-integration


 3) Give below command with your arguments lets say 500 600 700(space separated arguments)
 as shown in below image. 

NOTE :  while passing arguments from command line for job remove arguments in "Transofrmation
entry" if any. Also remove from "Execute a Job" window. 

D:\Softwares Archive Installed\Pentaho\pdi-ce-5.2.0.0-209\data-integration>
Kitchen.bat /file:E:\Explore\Kettle\1_CommandLineArgumentsBasic.kjb 600 700 800


4) Now check whether the output file is updated or not with new arguments. 


 I hope this information will helpful to some one :-)


Command Line Arguments as Place holders in SQL Queries : Read it  here http://diethardsteiner.blogspot.de/2011/03/pentaho-data-integration-scheduling-and.html


 Cheers..!!!








Thursday 5 February 2015

Playing with Date Format in Date selector component in Pentaho CDE - Example : Converting 2015-02-04 to 4-Feb-15

In this post, we will see how to customize date input control for default value & for it's format.

Thank you Sam Kumar( A community guy) for asking me to explore this and hope this solution might be useful in your project.

Though, there is a direct property of setting format for date in Date selector , that will not work when you set a default value for date type parameter.

Scenario : 

 Lets say  you have given "yesterday" as default value for date parameter and in Date selector if you give "dd-M-yy" format then you can able to see the selected date in that format BUT for the first time you will not find anything in Date selector because your default parameter value is "yesterday"

Software Setup : 

Pentaho BA Server version : 5.3.0.0.196 ( test server - not stable one)
C-Tools (CDE,CDF,CDA & CGG) : 5.3.0.0.196(5.3) - Formally TRUNK version with 5.3 server

Problem statement : 

Default values like "today", "yesterday", "One week ago", "One month ago" should take 5-Feb-15, 4-Feb-15 and etc format in Date selector when dashboard loads ...


Step by Step Screenshot solution :


1)  Design the lay out for Date label & for Date selector (i.e., Html Objects for label & selector  - lets assume these html objects are Col1 & Col2).

2) Creating Date parameter : parameter name = date_param
    Go to components -> Click on Generic from left side -> Click on Date parameter
   Give default value as "yesterday" ( or today or other from the drop down).

3) Creating Date selector for Date parameter created in Step 2 : Date selector name = date_picker
  *    Components section -> selections -> Date input component
  *    Give parameter(date_param) for listeners & parameter section.
  *    Give html object for this (i.e., Col2 )

4) Test -1 : Default view



5) Now, Converting the format
  * Go to Advanced properties of Date selector (date_picker)
  * Write below code in Post Execution section 


function f(){
       var date1 = Dashboards.getParameterValue("date_param").toString();

    testdate = $.datepicker.formatDate("d-M-y",new Date(date1));


 //alert(testdate);


 document.getElementById('render_date_picker').value=testdate;

}



6) Save the dashboard and test it : Test -2

parameter default value is : yesterday


That's it... We are done

Download Example here :  Click Me

NOTES :

1) From the stack overflow link below , as the CDE is already integrated with jQuery technology we have to use formatDate function for formatting the date. 
2) The code should go in Post Execution section. 
3) render_date_picker ----> render_ is the keyword for which we attach Date input component name. 
4) Code is independent for specific date control as it is going into Date input component's own Post Execution section. 
for example : I have taken another date component and tested..

  5) Find jQuery supported formats using this link https://jqueryui.com/resources/demos/datepicker/date-formats.html


Additional Notes : 

What about Date Range Picker Customization ?

Take 2 parameters : lets say param1_FromDate & param2_ToDate and write below code in "Post Execution" code area of "Date Range Input Component".

function f(){
 var date1 = Dashboards.getParameterValue("param1_FromDate").toString();
 var date2 = Dashboards.getParameterValue("param2_ToDate").toString();

 testdate = $.datepicker.formatDate("d-M-y",new Date(date1))+">"+ $.datepicker.formatDate("d-M-y",new Date(date2));

 alert(testdate);
 document.getElementById('render_date_range_picker').value=testdate;

}

  

Download :  Updated One 

 References : 
1) http://stackoverflow.com/questions/3552461/how-to-format-javascript-date 
2) https://jqueryui.com/resources/demos/datepicker/date-formats.html

R&D links from Sam :-)
3)
http://stackoverflow.com/questions/3552461/how-to-format-javascript-date
4)
http://api.jqueryui.com/datepicker/#option-dateFormat
:-) Sadhakar Pochampalli :-) BI developer :-)