Pentaho Tools :

Pentaho C-Tools(CDE,CDF,CDA),Pentaho CE & EE Server,OLAP-Cubes,Analysis using Pivot4J, Saiku Analytics, Saiku Reporting, Ad-hoc Reporting using Interactive Reporting Tool,Dashboards,Reports using PRD, PDD,Data Integration using Kettle ETL,Data Mining usign WEKA,Integration of Servers with Databases,Mobile/iPad compatible Dashboards using Bootstrap Css,Drilldown dashboards,Interactive Dashboards

Monday, 3 February 2014

Part - I :: data warehousing --- Star schema from orcle documentation.

This post is for collecting data ware housing concepts quick understanding to work with BI tools such as Talend ETL or Pentaho Kettle.
 Few of the other basic concepts will be included in Part-II to Part -V 
Schema 
A schema is a collection of database objects, including tables, views, indexes, and synonyms.

Schema models are used for designing data warehousing.

The Star schema
  1. It  is the simplest data warehouse schema. 
  2. Why it is called star schema ?
    B'z the diagram of star schema resembles as a star with points radiating from a center. 
  3. The center of the star consists of one or more fact tables and the points of the star are the dimension tables.
  4.  A star schema is characterized by one or more very large fact tables that contain the primary information in the data warehouse and a number of much smaller dimension tables (or lookup tables).
  5. Each dimension tables contains information about the entries for a particular attribute in the fact table.
Star Query:
  1. A star query is a join between a fact table and a number of lookup tables.
  2. Each lookup table is joined to the fact table using a primary-key to foreign-key join, but the lookup tables are not joined to each other.
Star join :
  1. A star join is a primary-key to foreign-key join of the dimension tables to a fact table. 
  2. The fact table normally has a concatenated index on the key columns to facilitate this type of join.

Advantages of Star Schema:
* Star schemas are denormalized
* That is ,the normal rules of normalization applied to transactional relational databases are relaxed during star schema design and implementation.
  • Simpler queries:
    •  Star schema join logic is generally simpler than the join logic required to retrieve data from a highly normalized transactional schemas.
  • Simplified business reporting logic
    •  when compared to highly normalized schemas, the star schema simplifies common business reporting logic, such as period-over-period and as-of reporting.
  • Query performance gains 
    • star schemas can provide performance enhancements for read-only reporting applications when compared to highly normalized schemas.
  • Fast aggregations 
    •  The simpler queries against a star schema can result in improved performance for aggregation operations.
  • Feeding cubes 
    •   star schemas are used by all OLAP systems to build proprietary OLAP cubes efficiently.
    •  most major OLAP systems provide a ROLAP mode of operation which can use a star schema directly as a source without building a proprietary cube structure.
Example-1:
  



References:
http://docs.oracle.com/cd/A87860_01/doc/server.817/a76994/schemas.htm
http://en.wikipedia.org/wiki/Star_schema