The term data warehouse denotes a database whose data sources are the operational systems of a company and which serves as the data basis for
decision support systems (DSS). The data warehouse concept can be described as a strict separation of
operational and decision-support data and systems.
In 1990, Bill Inmon coined the term "data warehouse". His definition reads:
A (data) warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process.
Subject-oriented - Data are organized according to subject instead of application; e.g., an insurance company using a data warehouse would organize its data by customer, premium, and claim, instead of by different products (auto, life, etc.). The data organized by subject contain only the information necessary for decision support processing.
Integrated - When data reside in many separate applications in the operational environment, the encoding of data is often inconsistent. For instance, in one application gender might be coded as "m" and "f", in another as 0 and 1. When data are moved from the operational environment into the data warehouse, they assume a consistent coding convention; e.g., gender data are transformed to "m" and "f".
Time-variant - The data warehouse contains a place for storing data that are 5 to 10 years old, or older, to be used for comparisons, trends, and forecasting. These data are not updated.
Non-volatile - Data are not updated or changed in any way once they enter the data warehouse, but are only loaded and accessed.
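The integration and non-volatility properties above can be illustrated with a minimal sketch. It is not part of Inmon's definition. The source names `billing` and `crm`, the mapping `GENDER_CODES`, the class `Warehouse`, and the load dates are hypothetical, and the assignment 0 = "m", 1 = "f" is an assumption, since the text does not say which number stands for which value.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical per-source coding tables; the target convention is "m"/"f".
# Assumption: in the numeric coding, 0 means "m" and 1 means "f".
GENDER_CODES = {
    "billing": {"m": "m", "f": "f"},
    "crm": {0: "m", 1: "f"},
}


@dataclass(frozen=True)  # frozen: a record cannot be changed after creation
class CustomerRecord:
    customer_id: int
    gender: str
    loaded_on: date  # load date kept so that older snapshots stay comparable


class Warehouse:
    """Append-only store: rows are loaded and read, never updated."""

    def __init__(self):
        self._rows = []

    def load(self, source, customer_id, raw_gender, loaded_on):
        # Integration step: translate the source coding to the shared one.
        gender = GENDER_CODES[source][raw_gender]
        self._rows.append(CustomerRecord(customer_id, gender, loaded_on))

    def rows(self):
        # Return an immutable copy so callers can only read.
        return tuple(self._rows)


warehouse = Warehouse()
warehouse.load("billing", 1, "f", date(1995, 1, 31))
warehouse.load("crm", 2, 0, date(1995, 1, 31))
```

After loading, both records carry the same "m"/"f" coding regardless of their source, and the class deliberately offers no update or delete method.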
Data warehouse definition as taught at Stanford University:
A Data Warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. ... This makes it much easier and more efficient to run queries over data that originally came from different sources.
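A small sketch of this idea, assuming two invented sources (a CSV export and a JSON feed) with different field names, and an in-memory SQLite database standing in for the repository. The table name `claims`, the field names, and the sample values are all hypothetical.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources with different formats and field names.
CSV_SOURCE = "customer,claim_amount\nA,100\nB,250\n"
JSON_SOURCE = '[{"cust": "A", "amount": 40}, {"cust": "C", "amount": 75}]'

# The in-memory database plays the role of the integrated repository.
repo = sqlite3.connect(":memory:")
repo.execute("CREATE TABLE claims (customer TEXT, amount REAL, source TEXT)")

# Extract from each source and map it onto the shared schema.
csv_rows = [
    (row["customer"], float(row["claim_amount"]), "csv")
    for row in csv.DictReader(io.StringIO(CSV_SOURCE))
]
json_rows = [
    (item["cust"], float(item["amount"]), "json")
    for item in json.loads(JSON_SOURCE)
]
repo.executemany("INSERT INTO claims VALUES (?, ?, ?)", csv_rows + json_rows)

# A single query now spans data that came from both sources.
totals = repo.execute(
    "SELECT customer, SUM(amount) FROM claims GROUP BY customer ORDER BY customer"
).fetchall()
```

Once the rows share one schema, the per-customer total needs no knowledge of where each row originated; the `source` column is kept only for traceability.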