Monday, September 8, 2008

Data Warehouse Architectures

Data warehouses and their architectures vary depending upon the specifics of an organization's situation. Three common architectures are:
  • Data Warehouse Architectures(Basic)
  • Data Warehouse Architectures(with a staging area)
  • Data Warehouse Architectures(with astaging area and Data marts)
Data Warehouse Architecture (Basic)
Figure 1 shows a simple architecture for a data warehouse. End users directly access data derived from several source systems through the data warehouse.
Figure 1 Architecture of a Data Warehouse

In Figure 1, the metadata and raw data of a traditional OLTP system is present, as is an additional type of data, summary data. Summaries are very valuable in data warehouses because they pre-compute long operations in advance. For example, a typical data warehouse query is to retrieve something like August sales. A summary in Oracle is called a materialized view.
Data Warehouse Architecture (with a Staging Area)
In Figure 2, you need to clean and process your operational data before putting it into the warehouse. You can do this programmatically, although most data warehouses use a staging area instead. A staging area simplifies building summaries and general warehouse management. Figure 3 illustrates this typical architecture.
Figure 3 Architecture of a Data Warehouse with a Staging Area


Data Warehouse Architecture (with a Staging Area and Data Marts)
Although the architecture in Figure 3 is quite common, you may want to customize your warehouse's architecture for different groups within your organization. You can do this by adding data marts, which are systems designed for a particular line of business. Figure 4 illustrates an example where purchasing, sales, and inventories are separated. In this example, a financial analyst might want to analyze historical data for purchases and sales.
Figure 1-4 Architecture of a Data Warehouse with a Staging Area and Data Mar

Benefits of Data Warehousing

  • A data warehouse provides a common data model for all data of interest regardless of the data's source. This makes it easier to report and analyze information than it would be if multiple data models were used to retrieve information such as sales invoices, order receipts, general ledger charges, etc.
  • Prior to loading data into the data warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis
  • Information in the data warehouse is under the control of data warehouse users so that, even if the source system data is purged over time, the information in the warehouse can be stored safely for extended periods of time
  • Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.
  • Data warehouses facilitate decision support system applications such as trend reports (e.g., the items with the most sales in a particular area within the last two years), exception reports, and reports that show actual performance versus goals.
  • Data warehouses can work in conjunction with and, hence, enhance the value of operational business applications, notably customer relationship management (CRM) systems.

Data warehouse

A data warehouse is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis. This classic definition of the data warehouse focuses on data storage. However, the means to retrieve and analyze data, to extract, transform and load data, and to manage the data dictionary are also considered essential components of a data warehousing system. Many references to data warehousing use this broader context. Thus, an expanded definition for data warehousing includes business intelligence tools, tools to extract, transform, and load data into the repository, and tools to manage and retrieve metadata.In contrast to data warehouses are operational systems which perform day-to-day transaction processing