Pentaho Tools :

Pentaho C-Tools (CDE, CDF, CDA), Pentaho CE & EE Server, OLAP-Cubes, Analysis using Pivot4J, Saiku Analytics, Saiku Reporting, Ad-hoc Reporting using Interactive Reporting Tool, Dashboards, Reports using PRD, PDD, Data Integration using Kettle ETL, Data Mining using WEKA, Integration of Servers with Databases, Mobile/iPad compatible Dashboards using Bootstrap CSS, Drilldown dashboards, Interactive Dashboards

Thursday 28 December 2017

Pentaho DWH BI POC - UNECE Analytics

Hi folks,

This is a document-based POC article developed in early 2017 using Pentaho open source tools. The source data is taken from the UNECE website for demonstration purposes. It can be found at http://w3.unece.org/PXWeb/en


Disclaimer: This article is strictly an experiment, not a production-ready solution. It is to be used for educational purposes only; following this approach may not be suitable for, or work in, your environment, and the author and reviewers are not responsible for the correctness, completeness or quality of the information provided. Redistributing this content on any other site is not permitted; all rights reserved.

                                           

The sample architecture DWH life cycle from the POC is as follows

Tool set

| Component | Tool | Version |
|---|---|---|
| ETL | Pentaho Data Integration (PDI) | pdi-ce-7.0.0.0-25 |
| DWH | Kimball Star Schema | Kimball Star Schema |
| OLAP Analysis | Pentaho Schema Workbench (PSW) and Saiku Analytics | psw-ce-3.12.0.1-196 |
| BA Server | Pentaho BI Server | 7 |
| Reporting | Pentaho Report Designer (PRD) | prd-ce-6.1.0.1-196 |
| Dashboards | Pentaho Ctools | 7 |
| Source of Data | Excel files | MS-Office 2016 |
| Source database | MySQL | 5.6.25 |
| Target database | MySQL | 5.6.25 |
| OS | Windows | 10 |

Below is the architectural approach for the POC:

1) Prepare the source database from the downloaded Excel files.
    (The staging data is prepared from the downloaded Excel files; while the source database is being created, the data is profiled and cleansed.)
2) Populate the data mart by identifying dimensions and facts from the source database.
3) Populate the warehouse with the PDI tool, using an incremental data load approach.
4) Create OLAP cubes for data analysis.
5) Visualize the data using report and dashboard tools.
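
The incremental load in step 3 is built with PDI transformations in the POC itself. Purely as an illustration of the idea, here is a minimal Python sketch using an in-memory SQLite database in place of MySQL. The table names (src_table, fact_target), the updated_at column and the function names are hypothetical and do not come from the POC. It assumes an append-only source where every new row carries a higher updated_at value than the rows already loaded.

```python
import sqlite3


def incremental_load(conn):
    """Copy only the source rows that are newer than what the target already holds."""
    cur = conn.cursor()
    # Watermark: highest timestamp already loaded (0 when the target is empty).
    (watermark,) = cur.execute(
        "SELECT COALESCE(MAX(updated_at), 0) FROM fact_target"
    ).fetchone()
    cur.execute(
        "INSERT INTO fact_target (id, value, updated_at) "
        "SELECT id, value, updated_at FROM src_table WHERE updated_at > ?",
        (watermark,),
    )
    loaded = cur.rowcount  # number of rows inserted by this run
    conn.commit()
    return loaded


def demo():
    """Run three loads against hypothetical tables and return the row counts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src_table (id INTEGER, value REAL, updated_at INTEGER);
        CREATE TABLE fact_target (id INTEGER, value REAL, updated_at INTEGER);
        INSERT INTO src_table VALUES (1, 10.0, 100);
        INSERT INTO src_table VALUES (2, 20.0, 200);
        """
    )
    first = incremental_load(conn)   # initial load picks up both rows
    conn.execute("INSERT INTO src_table VALUES (3, 30.0, 300)")
    second = incremental_load(conn)  # only the new row is copied
    third = incremental_load(conn)   # nothing new, nothing copied
    conn.close()
    return first, second, third
```

Calling demo() runs the load three times; only rows above the current watermark are copied on each run, so rows that were already loaded are not duplicated.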

Download the full document and source code here : Click Me

I hope this helps beginners and intermediate-level practitioners in the DWH community.

Thank you
Sadakar Pochampalli



Thursday 21 December 2017

Cloud computing and DevOps - How these trends paved the way to next generation Information Technology?

This is a re-post from my other blog site - "Exploring Cloud and DevOps"
The new buzzwords in this dog-eat-dog IT industry are cloud computing and DevOps. The goal of this article is to give a brief overview of cloud computing and of automation using DevOps tools, and of how organizations are adopting and leveraging these technologies and practices to make handsome profits.

Cloud Computing


Cloud computing refers to the online provisioning of computing services instead of utilizing traditional on-premise resources. It is taking a bigger slice of the market thanks to features such as security, scalability, reliability and maintainability. Firms such as Amazon, Google, Microsoft (Azure) and Oracle are market leaders in providing cloud services. These organizations maintain high-availability data centers with high-end internet connectivity in various parts of the world, called regions, and offer services such as instance creation (compute), storage and networking over the internet. Cloud computing is seen as a replacement for on-premise data centers, which consist of in-house infrastructure. The vendors offer ready-to-use services on a rental basis, termed “pay as you use”, i.e., you pay only for the resources or services that are actually in use.
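
To make the “pay as you use” idea concrete, here is a small Python sketch of the arithmetic. The hourly rate and the usage hours are made-up numbers for illustration only, not the pricing of any real provider, and the function name is hypothetical.

```python
def pay_as_you_use_cost(hours_used: float, hourly_rate: float) -> float:
    """Bill only for the hours an instance was actually running."""
    return round(hours_used * hourly_rate, 2)


# Hypothetical rate per instance-hour; not a real provider's price.
HOURLY_RATE = 0.10

# An instance used 8 hours a day on 22 working days...
office_hours_cost = pay_as_you_use_cost(8 * 22, HOURLY_RATE)

# ...versus the same instance left running all month (24 hours x 30 days).
always_on_cost = pay_as_you_use_cost(24 * 30, HOURLY_RATE)
```

Under these assumed numbers, the always-on instance costs about four times as much as the one that runs only during working hours, which is why unused resources matter on the bill.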
New entrepreneurs and big players alike are adopting this trend by leaps and bounds to get a single pane of cloud glass, so that they can concentrate mainly on product road maps and code development and worry less about infrastructure and its management. By adopting cloud practices, companies stay ahead of the curve in the cloud game, run profitable businesses and invest in solving new problems in sectors such as education, health care and finance.

Looking at the workforce on cloud platforms, the term cloud has come to sound easy to many people, but once you get involved you find that it is free as in speech, not free as in beer. There are no ifs, ands or buts when it comes to being billed for excess or misused services. Hence, to work on cloud platforms, one ought to have prior experience in networking and server management, along with architectural skills in the tools of the respective service provider. In other words, one must know all the ins and outs of the entire flow and the precise usage of the tool(s).

This way, cloud computing has become of paramount importance in next-generation IT and has paved the way for new business trends in the IT sector, by lowering the investment in purchasing and maintaining infrastructure and letting organizations concentrate on development.

DevOps


The other popular tech buzzword on people's minds is DevOps. It is a process of automating the development and operations of project(s) using tools. It consists of continuous development, continuous integration, continuous deployment and configuration management of a product, which together form a continuous pipeline. This concept can be implemented using open source or enterprise tools and technologies; for instance, Jenkins, TeamCity, Puppet and Chef are widely used.

Initially, this approach requires building a pipeline that starts with all the developers committing product code to a centralized repository, and then builds and tests it in various phases. The other part of the pipeline deploys the code to tens, hundreds or thousands of servers, which can be done using deployment tools such as Ansible. For instance, social applications like Facebook, Twitter or YouTube need to be available to millions of users daily, so these vendors use tools such as those mentioned above to release products without any downtime.
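
The flow of such a pipeline can be sketched in a few lines of Python. This is only a toy model of the idea, not how Jenkins or Ansible work internally; the stage names and the run_pipeline function are hypothetical. Each stage is a function that reports success or failure, and the pipeline stops at the first failure so that broken code never reaches the deploy stage.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a function that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run the stages in order and stop at the first one that fails."""
    log = []
    for name, step in stages:
        ok = step()
        log.append(f"{name}: {'ok' if ok else 'failed'}")
        if not ok:
            break  # later stages (such as deploy) are skipped
    return log


# Hypothetical stages; real ones would call a build tool, a test runner, etc.
healthy = [("build", lambda: True), ("test", lambda: True), ("deploy", lambda: True)]
broken = [("build", lambda: True), ("test", lambda: False), ("deploy", lambda: True)]

healthy_log = run_pipeline(healthy)
broken_log = run_pipeline(broken)
```

In the broken run, the log ends at the failed test stage and the deploy stage never executes, which is the safety property a continuous pipeline is meant to provide.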


In a nutshell, this approach saves a lot of time in developing and deploying projects and reduces manual effort, which keeps customers delighted and helps run a profitable, thriving business.

- Sadakar Pochampalli