When it comes to your data warehousing design, we can all agree that
if business requirements are not the driving force, “all bets are
off” for a successful project. Now, especially when managing the
bottom line is critical, data warehousing must offer enhanced visibility
into all aspects of costs, and quick access to what’s profitable
(and what’s not!), while offering this information to a wider audience
than ever before.
But if your organization is like most,
a certain amount of “analysis paralysis” can hinder the early
stages of data warehousing design and even implementation, brought on
by questions such as:
- Should we design top-down or bottom-up?
- Should we build quick-hit data marts, or plan for an enterprise architecture?
- Are there technical limits to how much data we could, should, or
would store because of our design choices?
These questions are enough to give any
data warehouse architect a good case of “FUDD” (Fear, Uncertainty,
Doubt, and Dismay)! On a more serious note, these issues can reflect a
debate typically between two camps: those who look to rapidly built, subject-focused
data marts for quick ROI, and those who advocate the need for
a central data warehouse that will ensure consistent data across
the enterprise.1 More and more, IT teams
are looking for a “best-of-both” approach that meets the need
for rapid, non-technical access to accurate, consistent information at
all levels of the enterprise.
For those considering SAP Business Information
Warehouse (SAP BW), the good news is that you won’t necessarily
have to choose one model over the other: SAP BW is fully
equipped to incorporate a hybrid approach to data warehouse design.
A Brief Overview of the Data Warehousing Debate
Although this article assumes familiarity with data warehousing design,
it’s useful to highlight some of the basics of these two camps as
they pertain to this discussion of SAP BW:
Corporate information factory (CIF): centralized data warehouse
William Inmon’s concept of a corporate information factory (CIF)
is most associated with a centralized data warehouse model. In this top-down
enterprise-wide approach, the CIF ultimately subsumes what is typically
known as a “data warehouse” (depending on your definition,
of course!) and is the nexus for your legacy systems. The CIF includes
the ODS (the Operational Data Store for granular, volatile, operational
data), a Central Data Warehouse (subject-oriented, non-volatile,
detailed data), and Data Marts, among other elements. Most large
enterprise systems have traditionally favored this approach for its ability
to handle large amounts of data.
Multidimensional data warehousing: architected data marts
connected by an enterprise bus architecture.2 In Ralph Kimball’s
approach, the architecture is divided into the Operational Source
Systems, the Data Staging Area, the Data Presentation Area, and the
Data Access Tools Area.3
Kimball takes a bottom-up approach to data
warehousing, where the central feature is
a coordinated bus architecture of star schemas
that allows the data warehouse to be designed
in manageable, bite-size chunks often referred
to as data marts. Each new data mart is plugged
into the bus. Individual stars in the overall
constellation share conformed dimensions
— dimensions that apply cross-functionally
within the organization and therefore have
applicability to multiple star schemas. In
the end, an overall data warehouse is built
data mart by data mart. The distinct separation
of layers into a “front room” (the
presentation layer) and “back room” (the
staging area, generally hands-off to information
consumers) is also key. Latitude is provided
for ODS functionality, but is not detailed
in Kimball’s architecture.
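The idea of conformed dimensions can be made concrete with a minimal sketch. The tables, keys, and function names below are hypothetical and exist only for illustration; they do not come from Kimball's books or from SAP BW. The sketch assumes two small data marts (sales and shipments) that share a single product dimension, which is what allows their measures to be compared side by side.

```python
from collections import defaultdict

# Conformed dimension: one shared product table used by every data mart.
# All names and values here are hypothetical illustration data.
PRODUCT_DIM = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Manual", "category": "Print"},
}

# Two star schemas (data marts), each with its own fact table.
# Both reference PRODUCT_DIM through the same product_key.
SALES_FACTS = [
    {"product_key": 1, "revenue": 120.0},
    {"product_key": 2, "revenue": 80.0},
    {"product_key": 3, "revenue": 15.0},
]
SHIPMENT_FACTS = [
    {"product_key": 1, "units_shipped": 10},
    {"product_key": 3, "units_shipped": 5},
    {"product_key": 3, "units_shipped": 2},
]


def total_by_category(facts, measure):
    """Roll a fact table up to the category attribute of the shared dimension."""
    totals = defaultdict(float)
    for row in facts:
        category = PRODUCT_DIM[row["product_key"]]["category"]
        totals[category] += row[measure]
    return dict(totals)


def drill_across():
    """Combine both marts on the conformed category attribute."""
    revenue = total_by_category(SALES_FACTS, "revenue")
    shipped = total_by_category(SHIPMENT_FACTS, "units_shipped")
    return {
        category: {
            "revenue": revenue.get(category, 0.0),
            "units_shipped": shipped.get(category, 0.0),
        }
        for category in sorted(set(revenue) | set(shipped))
    }


if __name__ == "__main__":
    print(drill_across())
```

Because both fact tables resolve product_key against the same dimension, a third data mart could be "plugged into the bus" later by reusing PRODUCT_DIM rather than defining its own product list.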
Putting It All Together with SAP BW
With SAP BW to facilitate convergence and foster compatibility, an IT
team can avoid the quandary of having to select one approach over the
other. In many companies, a hybrid approach combines the drive for an
overall, enterprise architecture with the flexibility of an “island-hopping”
campaign of coordinated data marts (thus sustaining the corporate will
to fund the data warehouse!). With SAP BW, cooperative scenarios are possible
where the ODS and data warehouse can play the role of the bus, back room,
or both at the discretion of the data warehouse architect.
That said, there are still gaps between
the two approaches. So let’s take a closer look into how SAP BW
helps bridge those to attain a “best of both worlds” approach.
As part of SAP’s larger NetWeaver
solution, SAP BW 3.0B/3.1 (released in June 2002/December 2002 respectively)
is tightly integrated with SAP Web Application Server and SAP Enterprise
Portal to form an unmatched, scalable, extensible, and user-friendly data
warehousing solution.4 While SAP BW is recognized
as a CIF-compliant, comprehensive data warehousing solution (see the sidebar
on page 30), SAP BW is also able to comply with the multidimensional approach
— or with a complementary, hybrid approach.
In SAP BW, diagrammed in Figure
1, data first flows into the Persistent Staging Area (PSA).5
Most customers tend to leverage PSA functionality in SAP BW, even though
its presence is an option, not a requirement. From here, ODS objects6
can be used to build a logical ODS or data warehouse. Since the granularity
and temporality of data stored in an ODS object are customer-defined, customers
are free to model a data warehouse (granular, non-volatile, integrated)
or ODS (granular, volatile, operational) by forming logical collections
of ODS objects. This capability allows SAP BW to adapt to the prevailing
definition of “ODS” (as contrasted with the notion of the
“data warehouse”), even within the same data warehousing project.
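The distinction between a volatile ODS and a non-volatile data warehouse can be sketched in a few lines. This is a conceptual illustration under stated assumptions, not SAP BW code: the OdsCollection class, its volatile flag, and the order records are all hypothetical names invented for the example. It assumes the same staged records are loaded into two collections that differ only in whether a new version of a key overwrites the old one.

```python
from dataclasses import dataclass, field


@dataclass
class OdsCollection:
    """Hypothetical stand-in for a logical collection of ODS objects.

    This is an illustration only, not an SAP BW API.
    """

    key_fields: tuple
    volatile: bool  # True: ODS-style overwrite; False: warehouse-style history
    rows: list = field(default_factory=list)

    def _key(self, record):
        return tuple(record[name] for name in self.key_fields)

    def load(self, record):
        if self.volatile:
            # Operational view: drop any earlier version of the same key.
            key = self._key(record)
            self.rows = [row for row in self.rows if self._key(row) != key]
        # In both modes the incoming record is stored as a new row.
        self.rows.append(dict(record))


def run_example():
    # Records as they might arrive from the staging area, oldest first.
    staged = [
        {"order_id": 4711, "status": "open", "amount": 250.0},
        {"order_id": 4712, "status": "open", "amount": 90.0},
        {"order_id": 4711, "status": "shipped", "amount": 250.0},
    ]
    ods = OdsCollection(key_fields=("order_id",), volatile=True)
    warehouse = OdsCollection(key_fields=("order_id",), volatile=False)
    for record in staged:
        ods.load(record)
        warehouse.load(record)
    return ods.rows, warehouse.rows


if __name__ == "__main__":
    ods_rows, warehouse_rows = run_example()
    print(len(ods_rows), "current rows;", len(warehouse_rows), "historical rows")
```

The volatile collection ends up holding only the latest state of each order, while the non-volatile one keeps every version, which mirrors the "volatile, operational" versus "non-volatile, integrated" contrast drawn above.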
Figure 1: SAP BW Can Bridge the Gaps Between Differing Approaches to Architecture
Whether the ODS object collections form
an enhanced staging area or a fully functional reporting area, SAP BW
accommodates directing the data to multidimensional storage. If preferred,
very granular data may be directed to multidimensional storage. Alternatively,
summarized data could flow from the data warehouse into the data
mart (possibly multidimensional) storage area. In either case, conformed
dimensions are supported across star schemas.
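The two loading options described here, granular versus summarized, can be illustrated with a short sketch. The row layout, the function names, and the sample values are hypothetical assumptions made for the example and are not part of SAP BW. The sketch assumes the data warehouse layer holds order-level rows and that a data mart is fed either with those rows unchanged or with totals per month and product.

```python
from collections import defaultdict

# Hypothetical granular rows as they might sit in the data warehouse layer.
WAREHOUSE_ROWS = [
    {"month": "2003-01", "product": "Widget", "order_id": 1, "amount": 40.0},
    {"month": "2003-01", "product": "Widget", "order_id": 2, "amount": 60.0},
    {"month": "2003-01", "product": "Gadget", "order_id": 3, "amount": 25.0},
    {"month": "2003-02", "product": "Widget", "order_id": 4, "amount": 10.0},
]


def load_granular(rows):
    """Option 1: pass every detailed row through to the mart unchanged."""
    return [dict(row) for row in rows]


def load_summarized(rows, group_by=("month", "product"), measure="amount"):
    """Option 2: aggregate the measure before loading the mart."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[name] for name in group_by)] += row[measure]
    return [
        {**dict(zip(group_by, key)), measure: total}
        for key, total in sorted(totals.items())
    ]


if __name__ == "__main__":
    print(len(load_granular(WAREHOUSE_ROWS)), "granular rows")
    for row in load_summarized(WAREHOUSE_ROWS):
        print(row)
```

The trade-off is the usual one: the granular load preserves order-level detail for drill-down at the cost of volume, while the summarized load is smaller but can only answer questions at or above the chosen grouping level.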
In all cases, with SAP BW there are many
options for placement of system boundaries, ranging from assigning one
system for each element in Figure 1 (e.g., the ODS can occupy its own
system), to separating the data warehouse from multidimensional “data
marts,” to housing the entire diagram on a single system.8 SAP BW
also provides robust tuning and optimization tools for all aspects of
the data warehouse (ETL processes, PSA, ODS, star schemas, Web and Windows
queries, etc.) on platforms as divergent as Windows NT, AS/400, and Unix.
Support for the Corporate Information Factory
SAP BW provides a solution that extends beyond the enterprise data
warehouse and attendant ODS/data marts. Based on feedback from customers
and industry analysts, SAP has brought to market a fully realized
corporate information factory (CIF).7
CIF compliance in SAP BW encompasses an unprecedented array of services
within a data warehousing context, including:
- A wide range of packaged Decision Support Systems (DSS) applications
available from SAP — ranging from sophisticated Supply Chain
Supply Network and Demand Planning features to complex financial
consolidations, planning, and balanced scorecarding capabilities
— that integrate directly with SAP BW. These hybrid (part
functional, part analytical) applications build on the unique
strengths of SAP BW’s open, robust architecture.
- Support for non-SAP DSS applications via the built-in “Open
- Support for e-analytics via XML interfaces, along with bundled
ETL for SAP’s own e-business applications.
- Data marting, supported in a variety of ways: physical and
logical, global and local, integrated and distributed. (In a CIF,
data marts are a component of the overall architecture. Their
role in SAP BW can be customer-defined.)
- An integrated data mining engine, plus integration interfaces
to industry-recognized data mining tools.
- An integrated metadata repository publishable to the Web.
- Archiving capabilities.
- Near line storage integration with select industry-leading
near line storage solutions.
- A comprehensive, integrated, platform-independent monitoring
environment for all aspects of the data warehouse — from
ETL processes to report performance.
- Robust, packaged ETL for SAP applications.
- Published, open interface compliance in every aspect of the
data warehouse, in addition to open interfaces and access to SAP
BW’s frontend for third parties.
Whatever configuration you use, SAP provides a proven Accelerated SAP
(ASAP) implementation methodology for gathering business requirements
for the right-sized data warehouse. What’s more, SAP BW’s
ASAP methodology includes Business Content for rapid data warehouse development.
Business Content consists of preconfigured, extensible ETL processes, data elements,
ODS objects, star schemas, and role-based reports integrated with SAP
Enterprise Portal. SAP BW Business Content represents SAP’s years
of best business practice experience, with coverage of both cross-industry
functional content and vertical industry-specific content. All of this
is layered on top of rock-solid, CIF-compliant technology within SAP BW.
SAP BW provides a robust, adaptable solution
to your enterprise data warehousing needs. With its ability to conform
to industry-recognized architectural and philosophical data warehousing
tenets, you can move beyond debates about design, and focus on meeting
business and user requirements for your analytic and reporting needs.
For more information on SAP BW, visit
1 This article addresses some of the key points
of deliberation in what’s known as the “Inmon vs.
Kimball” debate — referring to
William Inmon and Ralph Kimball, two pioneers
of data warehouse design. See William Inmon’s
publications, including Building the
Data Warehouse (Wiley, 2002), “Data
Warehouse, ODS And Data Marts: The Corporate
Information Factory,” and “What
Is a Data Warehouse?,” and Ralph Kimball’s
books The Data Warehouse Lifecycle Toolkit (Wiley,
1998) and The Data Warehouse Toolkit,
Second Edition (Wiley 2002).
2 Also known as a “conformed
architecture.” Note that “multidimensional” here
refers to the overall architecture of the
data warehouse, not the underlying
physical data storage mechanism. Many debates
rage over the use of Multidimensional OLAP
(MOLAP), Relational OLAP (ROLAP), Hybrid
OLAP (HOLAP), and a host of other (xOLAP?)
related approaches to physical data storage.
3 Ralph Kimball, The
Data Warehouse Toolkit, Second Edition (Wiley, 2002).
4 See the article in this issue of SAP Insider (www.sapinsider.com),
along with www.sap.com/netweaver or http://service.sap.com/netweaver.
5 The “persistence” of
the PSA is customer-defined.
6 In SAP BW, individual Third Normal Form (3NF)
building blocks are labeled “ODS objects.” These
are not to be confused with the “ODS” described
in Inmon’s corporate information factory
(which can be “global,” “local,” or “web” ODSs),
or the ODS Kimball allows for (a system between
operational systems or a hot partition of
the data warehouse, as two examples). In
fact, collections of SAP BW ODS objects could
form an “ODS” as defined by either
camp, or they could comprise an “Enterprise
Data Warehouse” as defined by Inmon.
7 See "SAP and
the Corporate Information Factory" at www.billinmon.com/library/whiteprs/SAP_CIF.pdf.
8 BW even supports scenarios where the "Source Systems" are
hosted in the same physical system as BW
itself. See SAP's MCOD (Multiple Components
in One Database) functionality discussed
Glen Leslie is product manager with
the U.S. Business Intelligence product
management group at SAP Labs. He has
been involved in data warehousing design
and implementation projects since 1996.
He can be contacted at email@example.com.