Pentaho Tools :

Pentaho C-Tools (CDE, CDF, CDA), Pentaho CE & EE Server, OLAP Cubes, Analysis using Pivot4J, Saiku Analytics, Saiku Reporting, Ad-hoc Reporting using the Interactive Reporting tool, Dashboards, Reports using PRD, PDD, Data Integration using Kettle ETL, Data Mining using WEKA, Integration of Servers with Databases, Mobile/iPad-compatible Dashboards using Bootstrap CSS, Drill-down Dashboards, Interactive Dashboards

Thursday 28 December 2017

Pentaho DWH BI POC - UNECE Analytics

Hi folks,

This is a document-based POC article developed in early 2017 using Pentaho open source tools. The source data was taken from the UNECE website for demonstration purposes; it can be found at http://w3.unece.org/PXWeb/en


Disclaimer: This article is strictly an experiment and not production-ready. It is to be used for educational purposes only; this approach may not be suitable for, or work in, your environment, and the author and reviewers are not responsible for the correctness, completeness, or quality of the information provided. However, redistributing this content on other sites is not permitted; all rights reserved.

                                           

The sample DWH life-cycle architecture from the POC is as follows:

Tool set

- ETL: Pentaho Data Integration (PDI) - pdi-ce-7.0.0.0-25
- DWH: Kimball Star Schema
- OLAP Analysis: Pentaho Schema Workbench (PSW) and Saiku Analytics - psw-ce-3.12.0.1-196
- BA Server: Pentaho BI Server 7
- Reporting: Pentaho Report Designer (PRD) - prd-ce-6.1.0.1-196
- Dashboards: Pentaho C-Tools 7
- Source of data: Excel files (MS-Office 2016)
- Source database: MySQL 5.6.25
- Target database: MySQL 5.6.25
- OS: Windows 10

Below is the architectural approach for the POC

1) Prepare the source database from the downloaded Excel files.
   (Staging data is prepared from the downloaded Excel files; in the process of creating the source database, the data is profiled and cleansed.)
2) Populate the data mart by identifying dimensions and facts from the source database.
3) Populate the warehouse with the PDI tool using an incremental data-load approach.
4) Create OLAP cubes for data analysis.
5) Visualize the data using reporting and dashboard tools.
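To make step 3 a little more concrete, a timestamp-based incremental load picks up only source rows changed since the last load. The sketch below is illustrative only; the table and column names (stg_product, dim_product, last_updated) are hypothetical and do not come from the POC document:

```sql
-- Illustrative only: table and column names are hypothetical.
-- Select only the staging rows changed since the most recent row
-- already loaded into the target dimension (timestamp-based CDC).
SELECT s.product_id,
       s.product_name,
       s.price,
       s.last_updated
FROM   stg_product s
WHERE  s.last_updated >
       (SELECT COALESCE(MAX(t.last_updated), '1900-01-01')
        FROM   dim_product t);
```

In PDI this comparison is typically done with a Table Input step whose query is parameterized by the maximum timestamp already in the target.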

Download the full document and source code here : Click Me

I hope this helps beginners and those at an intermediate stage in the DWH community.

Thank you
Sadakar Pochampalli



Thursday 21 December 2017

Cloud computing and DevOps - how have these trends paved the way for next-generation Information Technology?

This is re-blogged from my other blog site, "Exploring Cloud and DevOps".
The new buzzwords in this dog-eat-dog IT industry are cloud computing and DevOps. The goal of this article is to give a brief overview of cloud computing and of automation using DevOps tools, and to show how organizations are adopting and leveraging these technologies and practices to make handsome profits.

Cloud Computing


Cloud computing refers to the online provisioning of computing services instead of utilizing traditional on-premise resources. It is occupying an ever-bigger slice of the market thanks to features such as security, scalability, reliability, and maintainability. Firms like Amazon, Google, Microsoft, and Oracle are market leaders in providing cloud services. These organizations maintain highly available data centers with high-end connectivity in various parts of the world, called regions, and offer services such as compute instances, storage, and networking over the internet. The cloud serves as a replacement for on-premise data centers built on in-house infrastructure. Vendors offer ready-to-use services on a rental basis, termed "pay as you use": you pay only for the resources or services you actually consume.
New entrepreneurs and big players alike are adopting this trend in leaps and bounds to gain a single pane of glass over their infrastructure, so they can concentrate mainly on product road maps and code development and worry less about infrastructure and its management. By adopting cloud practices, companies stay ahead of the curve in the cloud game, run profitable businesses, and invest in endeavoring to solve new problems in sectors such as education, health care, and finance.

Looking at the workforce on cloud platforms, the term "cloud" may sound like a breeze, but one should remember that it is free as in speech, not free as in beer once you get involved. From a billing perspective there are no ifs, ands, or buts about the excess or misuse of services; hence, to work on cloud platforms, one ought to be experienced in networking and server management beforehand, and should possess architectural skills with the tools of the respective service providers. In other words, one must know the ins and outs of the entire flow and the precise usage of the tool(s).

In this way, cloud computing has become of paramount importance in next-generation IT and has paved the way for new business trends in the IT sector, lowering investment in purchasing and maintaining infrastructure and letting companies concentrate on development.

DevOps


The other popular tech buzzword turning heads is DevOps. It is the practice of automating the development and operations of project(s) using tools. It comprises continuous development, continuous integration, continuous deployment, and configuration management of a product, which together form a continuous pipeline. The concept can be implemented using open source or enterprise tools and technologies; for instance, Jenkins, TeamCity, Puppet, Chef, etc. are widely used.

Initially, this approach requires building a pipeline that starts with all the developers committing product code to a centralized repository, which is then built and tested in various phases. The other part of the pipeline deploys the code to hundreds or thousands of servers, which can be done using deployment tools such as Ansible. For instance, social applications like Facebook, Twitter, or YouTube need to be available to millions of users daily, so these vendors use the tools mentioned above whenever a product release takes place, without any downtime.


In a nutshell, this approach saves a lot of time in developing and deploying projects and eventually reduces manual effort, delighting customers and making for a profitable, smoothly running business.

- Sadakar Pochampalli

Tuesday 28 November 2017

Re-blogging: Amazon VPC with a single public subnet

Hi, 
This post is re-blogged from my new blog site, "Exploring Cloud and DevOps".

I've kept it as a video tutorial (without sound at this point in time) so readers can quickly learn the topics.






Disclaimer: This article is strictly an experiment on AWS and not production-ready. It is to be used for educational purposes only; this approach may not be suitable for, or work in, your environment, and the author and reviewers are not responsible for the correctness, completeness, or quality of the information provided.

Thursday 2 November 2017

Where to download Pentaho 8.0 CE products on the Hitachi Vantara community website?

Hi,

Pentaho is now an umbrella product of Hitachi Vantara, and the new website is quite cumbersome to navigate.
Hitachi planned to release version 8.0 in November 2017 (read Pedro's article on it at http://pedroalves-bi.blogspot.in/2017/10/pentaho80.html).

The URL below lists the Pentaho community products; from there you can download them as usual.


URL: https://community.hds.com/docs/DOC-1009931-downloads


Pentaho 8.0 Login Page : First Look

Wednesday 2 August 2017

Tip: Query to get the latest records from an SCD Type 2 product dimension table in a DWH using Kettle


-- All rows in the SCD Type 2 product dimension, including history:
SELECT * FROM cdc_timestamp_scd_product_target;

-- Only the latest version of each product: pick the highest surrogate
-- key (sk) per product_id, excluding the default/unknown row sk = 0.
SELECT * FROM cdc_timestamp_scd_product_target
WHERE sk != 0
  AND sk IN (SELECT MAX(sk)
             FROM cdc_timestamp_scd_product_target
             WHERE sk != 0
             GROUP BY product_id);
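An equivalent way to fetch the latest row per product, assuming the same table and columns as above, is a self-join against the per-product maximum surrogate key. On MySQL 5.6, which lacks window functions, this join form often performs better than the IN subquery:

```sql
-- Latest version per product via a derived table of max surrogate keys.
SELECT t.*
FROM cdc_timestamp_scd_product_target AS t
JOIN (
    SELECT product_id, MAX(sk) AS max_sk
    FROM cdc_timestamp_scd_product_target
    WHERE sk != 0          -- skip the default/unknown dimension row
    GROUP BY product_id
) AS latest ON t.sk = latest.max_sk;
```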


Monday 17 July 2017

Jenkins: Create a first build of a simple Java program stored in GitHub

This post shows how to build a Jenkins job from a simple Java program pulled from GitHub.

Technology Stack :
1) Jenkins
2) GitHub
3) Git Client


1) Create a new repository in GitHub  (Assuming you have a GitHub account)


2) Give the repository a name, say JavaHelloWorld.

3) Observe the commands shown for the newly created repository in the image below.
We use the add, commit, and push commands to send files to GitHub.

4) (Assuming the Git client is installed.)
Create a folder on the D drive, say D:\JavaHelloWorld, and put the JavaHelloWorld.java program in it, along with a batch file, say run.bat, containing the javac and java commands to compile and run the program.

Now use Git Bash to add, commit, and push the two files.

git add JavaHelloWorld.java
git commit -m "Committing Java Hello World program - first"
git push -u origin master








Repeat the same for the run.bat file (this file is added after pushing the JavaHelloWorld.java file).




5) Start Jenkins in your environment and create a new project as shown below.
6) Copy the clone URL of the GitHub repository.

7) In the Jenkins Source Code Management tab, provide the Git URL
https://github.com/sadakar/JavaHelloWorld.git and your GitHub credentials, as shown below.

8) In the Build tab, provide the batch command to run the example.
In this example, the commands are written in the run.bat file to compile and execute the hello world program.

9) Now, click on Build Now as shown below.
10) See the execution log in the Console Output.

11) Navigate to where the pulled Git code is stored internally on the OS drive.



Thursday 13 July 2017

Jenkins: Creating a first build with a HelloWorld Java program on Windows 7 (compile & run)

This post shows how to create a basic job in Jenkins and how to build it, using a HelloWorld Java program.

Environment : Windows 7,  Jenkins 2.60.1

Below are the steps to do.

1) Open jenkins dashboard : http://localhost:8080/
2) Create a new job by clicking on "New Item".

3) Enter a name for the project/job, let's say "HelloWorldJob", and choose "Freestyle project".

4) Navigate to the "Build" tab, select "Execute Windows batch command", and enter the two commands below to compile and run the HelloWorld program that we will put in the job workspace in the next step:

javac HelloWorld.java
java HelloWorld

Click on Apply and OK to save the job.


5) Put HelloWorld.java file in Project workspace

Location of workspace: C:\Program Files (x86)\Jenkins\workspace

NOTE: Until you build the job at least once, Jenkins will not create the job/project workspace in the above location. So, build the job for the first time without the HelloWorld.java file to get the workspace created.

Click on "Build Now" as shown below.


Check whether the workspace is created or not

6) Now, put "HelloWorld.java" file in "HelloWorldJob" workspace

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("I'm dancing");
    }
}
7) Now, if you look at the Jenkins dashboard, the file is loaded in the job workspace.

8) Now click on "Build Now" to run the job

The light-blue ball indicates that the job ran successfully.

9) Now, click on "Console Output" as shown below to see the output of the build.


10) Console output


References
https://stackoverflow.com/questions/15020034/how-to-compile-and-run-a-simple-java-file-in-jenkins-on-windows

Tuesday 11 July 2017

Kubernetes: Deploying and scaling a container in Kubernetes (using a YAML file)

Hello,

Before learning how to deploy containerized image(s) to a single-node Kubernetes cluster, you should go through the two articles below.

1) Build a Docker image and push it to Docker Hub: click here
2) Learn how to deploy the container to a single-node Kubernetes cluster using commands: Click here

Now, learn how to deploy the container to Kubernetes using a YAML file by following the commands explained below.



Docker to Kubernetes: deployment through a YAML file

.\kubectl.exe create -f sadakarkubernetes2.yaml

Contents of sadakarkubernetes2.yaml:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: sadakarkubernetes2
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: get-started
    spec:
      containers:
        - image: "sadakar/get-started:part2"
          name: sadakar
          ports:
            - containerPort: 80

Expose the deployment as a service:

.\kubectl.exe expose deployment sadakarkubernetes2 --type=NodePort

Launch the service:

.\minikube.exe service sadakarkubernetes2

URL of the service:

.\minikube.exe service --url=true sadakarkubernetes2
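As an alternative to the kubectl expose command above, the service could also be declared in a YAML manifest. This is a minimal sketch, assuming the same pod label (app: get-started) and container port 80 as the deployment; the service name sadakarkubernetes2-svc is hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: sadakarkubernetes2-svc   # hypothetical name
spec:
  type: NodePort                 # same as --type=NodePort in the expose command
  selector:
    app: get-started             # matches the pod template label in the Deployment
  ports:
    - port: 80                   # port the service listens on
      targetPort: 80             # containerPort of the pods
```

It would be created the same way as the deployment, with .\kubectl.exe create -f on the file.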


kubernetes dashboard: 


 output:

Thank you
Sadakar Pochampalli