What is OLAP?
Users of information come in all shapes and sizes, and they certainly have different reasons or objectives for looking for a specific data or piece of information. Certainly, a single set of data may be used by thousands of people with a hundred different purpose in mind, so it shouldn’t come as a surprise that their way of approaching the data will also vary.
In recent years, processing information has become easier, thanks largely to technology playing a huge role. The computer, for example, has become one of the most essential technological tools or devices today, so much so that it is now quite difficult to imagine a time when we did not have the computer to rely on. Undoubtedly, business and industry are only two of the many fields or areas that greatly benefited from computer and information technology, particularly when it comes to its core operations as well as other business processes.
Business intelligence, or BI, is one of those aspects that an enterprise or organization focuses a considerable amount of its resources on. You will find businesses investing on computer systems and applications – both software and hardware – to be used in its business processes.
Basically, business intelligence encompasses all the tools, systems and infrastructure that an organization or business enterprises uses in its strategic planning and management process. These include software applications, programs, and other technologies that facilitate the collection, analysis and presentation of business information.
Aside from being described as a set of utilities, BI is also often used to refer to the discipline for handling or managing information. This is why it also covers the best practices and activities such as data mining, querying, reporting, relational database, and – what we will focus on – online analytical processing.
AN INTRODUCTION TO OLAP
BI is a very broad category, and Online Analytical Processing is only one of the many tools contained in it. However, it is also one of the more useful and powerful tools, and proof of that is the large number of BI applications that make use of this technology.
Online Analytical Processing, or OLAP, is a powerful data discovery tool designed to enable users to perform multidimensional analysis of data. It aids users to access, extract, and view data (using a computer) from different perspectives or points of view.
OLAP demonstrates a slight variation from the “Online Transaction Processing” (or OLTP), which has a more traditional, simpler and more limited view, considering how it is primarily focused on transactions and transaction-oriented applications. It doesn’t look past the surface, so users are more focused on data entry, transaction processing, and retrieval. Think of the typical business applications used in day-to-day business operations, and it is highly probable that OLTP is used.
With OLAP, however, the focus is much wider, going beyond a straight unidirectional transaction to consider multiple dimensions. Thus, it is the technology designed primarily for decision support systems and management information systems of organizations.
OLAP and OLTP are two different beans, but they belong on the same pod, which means that they are complementary technologies, working together to further solidify an enterprise’s business intelligence. Businesses require both, not just one or the other.
We can gain better understanding of OLAP by taking note of how it differs from OLTP.
- OLAP application. As mentioned earlier, OLAP is mainly for the use of ‘knowledge workers’, such as managers and executives, who will require information for theirdecision-making. OLTP is more operational in its application, with employees and staff (for example, an IT professional) being the main users.
- OLAP outlook. OLAP takes a strategic stance, thinking long-term and historical instead of just a few weeks or months, which is the horizon that OLTP is looking at. Operational information is not seen to have enough of an impact after, say, 10 or even 5 years from the time the transaction took place. Therefore, OLAP looks at the underlying information of these operational data to identify or establish trends over 5, 10, or more years.
- OLAP storage. OLAP data is stored in a multidimensional database, which makes sense, really, considering how users will be approaching the same data from different directions, with different objectives for their analyses. The “dimension” we are referring to in a multidimensional database is the data attribute. Users may look at the same data set, but will focus on different data attributes, depending on their objectives. For example, data on the annual sales of an automobile manufacturing company may be accessed by User A to analyze the sales trend within the 12-month period. User B is more regional in his approach, since his objective is to compare sales levels for the year in the different geographical locations or regions and recognize the best-performing country or region. User C, on the other hand, may want to use the data as basis for forecasting sales in units for the next five to ten years.
- OLAP emphasis. The emphasis of OLAP is on retrieval of information, which will be used in strategic decision-making. In contrast with operational or transactional decision-making, strategic decision-making is not something that is performed frequently, so as to require information in a snap. Thus, OLAP may be refreshed daily, weekly, or any frequency or timing for collecting and cleaning data for analysis later on. This will also allow for better indexing of information that can be easily retrieved even several years into the horizon. This is clearly not the case with OLTP, which puts emphasis on updating instantly and automatically.
THE OLAP CUBE
At the core of the OLAP concept – and any OLAP system, for that matter – is the OLAP Cube. The cubic shape – and the arrangement of data into that shape – represents how OLAP adheres to the principle of multidimensional analysis, resulting in a data structure designed for fast and effective data analysis, as opposed to the limitations attached to relational databases.
Also called the “hypercube”, the OLAP Cube is described as consisting of “numeric facts (measures) categorized by facets (dimensions). Dimensions refer to the attributes that define a business issue or problem; to put it simply, a dimension is a label describing the measure. For example, in sales reporting, the measure will be the amount of sales, while the dimensions will often include the Sales Period, the Salespersons, the Product or Service Sold, and the Sales Region.In reporting manufacturing operations, the measure may be the Total Production Costs and the Units of Production. The dimensions will be the Date or Time of production, theProduction Stage or Phase, and maybe even the Workers involved in the production process.
Usually, analysis of operational information obtained from transactions may be performed through the use of simple spreadsheet, with the data values presented in rows and columns. This is fine, considering the two-dimensional nature of the data. It’s different in the case of OLAP, which involves a multidimensional data array, with data often obtained from different and unrelated sources – something a spreadsheet won’t be able to handle effectively and efficiently. The Cube, on the other hand, addresses that problem, and also ensures the storage of information in a logical and orderly manner.
How does it work?
A business collects data from different sources and presented in different formats, such as text files, multimedia files, Excel spreadsheets, an Access database, and even an OLTP database. All the data is gathered in a Data Warehouse, which is essentially a large repository or storage of data accumulated straight from sources.
In the Data Warehouse, the raw data obtained from OLTP and other sources will be cleaned of any erroneous, incomplete, and inconsistent transactions or data. After cleaning and transformation, the cleaned data will be stored in a relational database.
The data will then be loaded into the multidimensional OLAP server (or the OLAP Cube) for analysis.
End users responsible for business applications, data mining and other data-intensive business operations will then access the information they require from the OLAP Cube.
BASIC ANALYTICAL OPERATIONS OF OLAP
There are three basic analytical operations in OLAP:
Also known as “consolidation”, roll-up involves accumulating all data that can be collected, and computing all their relationships in one or more dimensions. More often than not, this may require the application of a mathematical formula.
As an example, take a look at a retail chain, with sales outlets in various cities within the state. In order to identify sales patterns and anticipate future sales trends, sales data from all of the outlets are “rolled up” to the company’s main sales department for consolidation and computation.
This is the reverse of roll-up. It starts with a large set of data, and then broken down into its smaller parts, thereby allowing users to go through the details.
In the retail chain example, the analyst will drill down by looking at the sales data and viewing the individual brands or products that are considered to be bestsellers in each of the sales outlets in the different cities.
Slice and Dice
This analytical operations involves two actions: to take out a specific data set from the OLAP Cube (this is the “slicing” aspect of the analysis) and viewing it from various perspectives or angles (this is where “dicing” comes in).
This can happen when all data – not just the sales data – of all sales outlets are obtained and entered into the hypercube. Management or the analyst will slice from the OLAP Cube the data set pertaining to sales, which will then be diced, or viewed in the analysis of unit sales per region, while other users may focus in evaluating the cost-effectiveness and efficiency of the sales process. On the other hand, another can use the same data set in evaluating the effectiveness of the company’s marketing and advertising campaign.
IMPACT OF OLAP ON BUSINESS
You will find OLAP in majority of business applications, across industries, benefiting not only the businesses but all other stakeholders as well.
Speaking in general terms, we often hear OLAP being described as a vital tool for data mining operations. Some of its most common applications include:
- Financial Reporting, which also covers reporting on Sales and Costs
- Budgeting and Financial Planning
- Business Process Management
- Marketing Analysis
- Sales Analysis
- Database Marketing
- Management Reporting
- and more…
Industries continue to grow, which means that soon we will be seeing more applications of OLAP. Before that, though, let us try to take a look at the benefits of OLAP, in order to better understand why businesses integrate it into their business intelligence initiatives.
- OLAP systems are designed specifically for analysis. OLTP systems are highly valued by businesses, and you will find almost all businesses and enterprises using OLTP systems in their execution of transactions. However, they are not ideal for use in data analysis, and that is where OLAP comes in.
- OLAP serves as a “one-stop shop” for all business information needs. Imagine a set up where you, as the manager and decision-maker of a company, will have to use separate databases for every decision you make. For example, there will be a separate database for information to be used in planning and budgeting. Similarly, there will be another database for when the business will have to prepare business and financial reports. And then again, for monitoring and analysis purposes, there will be another database. That can be quite tedious, not to mention costly, having to maintain more than one or two databases. OLAP works to streamline that, creating only one platform for all your information requirements.
- OLAP ensures consistency of information. High-quality information – which is basically information that is reliable and, therefore, may be used with confidence as basis in making business decisions – must have consistency, and OLAP takes care of that aspect.
- OLAP provides predictive advantages. The multidimensional processing adapted by OLAP makes for a more dynamic analysis. This is the reason why OLAP systems and technologies are used to evaluate “what if” and alternate business scenarios.
TYPES OF OLAP SYSTEMS
Now let us take a look at the various types of OLAP Systems, their advantages and disadvantages, and examples of OLAP tools or products currently being used by businesses.
Relational OLAP, or ROLAP
This system marries the standard operation of a relational database and the multidimensional nature of the data to be subjected to analysis. Analysis of the multidimensional data will also be conducted using other relational tools.
In this method, the multidimensional database will obtain or load data from the relational database only when there is a need for those specific data. That means there is no data directly stored in the multidimensional database.
If the database is medium to large, ROLAP is the appropriate OLAP system for you.
- High data efficiency. This is partly because access language and query performance are optimized specifically for the analysis of multidimensional data.The fact that the multidimensional basis will only draw data or information as and when needed means the database won’t be as vulnerable to speed problems as other system types.
- Scalability. ROLAP supports scalability in handling large volumes of data, and even when the amount of data is steadily increasing. This is also partly thanks to how the database is not too bogged down with too many unnecessary data loaded into it.
- More flexible tools. Even when the data is mostly textual or narrative, meaning they cannot be aggregated, computed, or even clearly quantified, ROLAP tools will still be able to handle them. It also offers convenience, such as when it comes to accessing data stored within the relational database. It does not require any special tool – not even an OLAP tool – to access the data, since any SQL reporting tool will do the job just as easily and effectively.
- High demand for resources. ROLAP entails high utilization of various resources such as manpower, fixed assets throughout the analytical process (although not as demanding as other types). This is especially true when the ROLAP data model is subjected to fine-tuning and code customizations to improve or enhance its capabilities.
- Aggregatable data limitations. ROLAP tools make use of SQL for all computations of aggregatable data. But there are limits to the capability of SQL in handling computations, especially heavy ones, such as those involved in budgeting and financial reporting. Risks are high that results returned by SQL will have inaccuracies. This limitation may be addressed by customizations and recoding, but even that will require a lot.
- Slow query performance. Its query performance is slow, generally, when compared with MOLAP, regardless of the fact that ROLAP deals with larger data volumes. Often, this is attributed to the number of users simultaneously running ROLAP tools.
- Security issues. ROLAP makes it possible for anyone to have direct access to the data cells.
Multi-dimensional OLAP, or MOLAP
Basically, this is the typical cube-based OLAP that we have learned and talked about in the definition. It implements analytical processing of multidimensional data by using the OLAP Cube, or any data cube, to enable the analytic process to add dimensions as required.
Any data loaded into the multidimensional database will be stored or archived within it, and will only be called up in the future when they are required.
MOLAP is more effective when we are talking of small- to medium-sized databases.
- High functionality while remaining compact. Performance is the strongest point on MOLAP, since it can manage, analyze and store considerable amounts of multidimensional data. Storage is optimized, thanks to compression techniques that keep data size on disk small as to be minimal.
- Fast database query performance. It can get quite irritating when analytical processing is bogged down by a slow database mostly due to inefficient, or lack, of storage. That is not an issue with MOLAP, where the analytical process can take place in the OLAP Cube. Compared to a ROLAP, MOLAP databases are considerably faster, with the duration for indexing becoming shorter.
- Automation of data calculation. Most MOLAP tools come packed with pre-calculated data sets, and this also contributes to the database’s fast overall performance.
- Fast and effective data retrieval. MOLAP is designed to pre-structure aggregated data and, combined with the automation of computation of data aggregates, a lot of the work is cut down, and the data retrieval process is also made easier. Users won’t have to jump through hoops to extract the data or information needed.
- High demand for resources. If ROLAP already has a high demand for resources, MOLAP proves to be more demanding, requiring more resources from the client.
- Limited scalability. MOLAP may be scaled up, but only to a certain extent.
- Increased possibility of data redundancy. Although this is not a general and frequent occurrence, there is still a risk of data redundancy.
Hybrid OLAP, or HOLAP
In this system, there are two databases used: the aggregate data, or the totals, are stored in the multidimensional database. Meanwhile, the details, or the smaller bits and pieces of data, will be in the relational database. It is only when more detailed data is needed that they will be unloaded from the relational database and subsequently loaded into the multidimensional database.
Basically, HOLAP combines the best parts of ROLAP and HOLAP, resulting in a flexibility that allows it to use both relational data sources and the OLAP Cube, with its pre-calculation capabilities. This combination also works well in addressing the individual weaknesses of ROLAP and MOLAP.
- Optimized aggregate maintenance and storage. Its MOLAP features ensure that management of aggregated data is easy, and disk space is economized to ensure that it remains compact and won’t cause any issues with regard to access speed and convenience.
- Fast query performance. HOLAP’s usage of cube technology, which is apparent in MOLAP, enables faster performance for both broad and detailed data. In some HOLAP systems, the database storage is divided into two: large volumes of detailed data are stored in the relational section of the storage, while the smaller volumes of aggregated historical data are placed in the other (multidimensional) section.
- Dynamic updates. ROLAP requires instant updates, and that is incorporated into the HOLAP, allowing users real-time access to the data, while the MOLAP component guarantees that data were still subjected to cleaning and conversion for relevance in order to accurately reflect evolving structures. Even if the updates are instantaneous, the information will still be deemed useful and relevant for decision-making purposes.
- High level of complexity. This is probably the biggest disadvantage in HOLAP systems, arising from the fact that the server is expected to support both MOLAP and ROLAP tools and applications. This is likely to require higher technical know-how on the part of the staff assigned to maintain it, not to mention other resources required to run and operate a HOLAP for business intelligence.
- Potential overlaps. We are talking about two different system engines here. Somewhere along the way, there is bound to be some overlapping, especially when it comes to their functionalities. The only difference will be in the level of severity of these overlaps. The most common overlap is seen in the optimization techniques and potential redundancy of data in the storage. Data duplication is highly possible, with data appearing on both the ROLAP and MOLAP components.
Hybrid Transaction/Analytical Processing
HTAP is still relatively new as a technology, having been formally named and introduced only in 2014 by technology research firm, Gartner Inc. However, the concept has been around for as long as the idea of processing data through a computerized system has been introduced.
While HOLAP combined the best features of MOLAP and ROLAP into a single architecture, HTAP boasts the combination of the capabilities of OLTP and OLAP into a single database, to ultimately enable real-time processing of business operational intelligence. In short, it introduces data systems that can perform online transaction AND analytical processing of data.
- More powerful processing. HTAP often entails the usage of new hardware and software platforms and, combined with the latest in-memory technologies, we are looking at a decidedly more powerful processing system with both OLTP and OLAP features and capabilities.
- Speeds up decision-making. With conventional OLAP alone, it will take time before data can be converted into a form that allows an informed decision to be made. OLTP, on the other hand, is not meant for analysis and decision-making, so using transactional data as basis for planning and other similar business processes is out of the question. HTAP bridges that gap by allowing transactional data obtained through OLTP to immediately undergo advanced analytics live so that managers, leaders and executives can make decisions immediately.
- Guaranteed data uniqueness. Replication and redundancy of data are eliminated, and this is made possible by the seamless integration of OLTP and OLAP features into the HTAP.
- High technological and other resource requirements. If you thought HOLAP demands a lot, then the higher level of complexity of HTAP means that it is even more demanding. Aside from the obvious requirement of advanced and updated technologies, hardware and software, HTAP also requires people with highly specialized experience and skills.
There are other, lesser known and utilized OLAP systems or technologies, but these four are the major ones, currently being used by large corporations, business entities, and even governments. Other OLAP systems have come and gone, but these proved to be the more established technologies.
Clearly, online processing in business intelligence is here to stay. There is no doubt that, sometime – maybe soon – in the future, other, better, technologies will be developed to supplant OLAP. Until then, business and even government leaders are bound to find many uses for it.
If you have ever heard of the 80/20 rule, there is a good chance that the person talking about it …