In a previous article, we discussed SQL Server Integration Services, its components and how to create package by using Import Export Wizard. In this article we will take an overview of SSAS and its various components.
With SSIS, we discussed how data can be fetched and cleansed before it is put into a uniform platform. With SSAS, as the name suggests, we can use the data for analysis.
With any organization we have two kinds of data, one which stores transactional details and the other which has historical (Data Warehouse - DW) details. The data in DW will be populated periodically from the data from transactional (OLTP – OnLine Transactional Processing). The data in OLTP is optimized for faster insertion and updations, but the data in DW is meant for faster querying. With DW we can store duplicate data, computed columns, non normalized data as the storage is not important. The important aspect is that when a query is fired, it should give faster results.
When we want to take any decisions based on data, the data needs to be in an analyzed format. For e.g. we want to find out the code churn for developers in an organization. We need to have all the details for each developer; such as how many lines of code were written, modified or deleted. When we need the report, if we start calculating the numbers for each developer for a day, a week, a month or a quarter, it will be time consuming job. If I already have this aggregated data in cache, my job will be very easy. This aggregated data is put in cubes which are termed as multidimensional databases. As the normal cube has multiple dimensions, the data can be viewed with various axes. In this case one axis can be for each developer; another can be for a particular period and third can be for a particular department or for a particular project. Thus we can create different slices for looking at the same data. These slices are termed as dimensions and the actual data is the fact. The sales related data (like the sales person information, data for sales, quantity, and price) will be termed as fact and the axes (like sales person wise information, location wise details, quarterly details) will be termed as dimensions.
The thumb rule says that facts are numeric measures while dimension has primary keys. Thus primary keys in dimension tables are related to foreign keys in fact tables so as to get the details as and when required. There are 2 dimensional models namely star schema and snowflake schema. In star schema, single fact table is related to multiple dimension tables based on primary key in dimension tables (which resembles start like shape) as follows.
In snowflake schema, one of the dimension acts as a fact for another dimension thus giving snowflake like appearance as below
With SSAS, we first need to specify the source from which data is to be fetched. Later the data can be analyzed. After the source, we can create a Data Source View (DSV) to the source. If the source database has 50 tables, we need not want to work with all of them in a given situation. So we create a window to the data with required tables in it. We can also group tables from different sources and put it in the same DSV.
We can even create logical relationships amongst the table in the DSV where the source database remains intact. We are just creating a logical view to the tables.
In this article, we took an overview of SSAS and its various components. In the next article, we will create a cube by using Business Intelligence Development Studio (BIDS).
2 comments:
excellent job! as i read certification book 70-448 and then read this article. u cemented the facts i had in my head. i would simply say, weldone and thanks a lot - regards - Adnan Khan
Very nice and clear explanation. I really like your post. Thank you.
Bhavna P.
Post a Comment