A Holistic View on DBMS / Database Technologies
Data is at the centre of most of today’s businesses and companies must pay close attention to how they store, access and use data. One of the systems at the heart of current data management is a database management system.
But as with most technology products out there, people can choose from a wide range of database technologies and systems. Picking the most convenient and relevant one for your business can be difficult.
This guide will explain what database management systems are and what the benefits of using one are. We’ll also explain some of the most common distinctions between these technologies and provide a few tips on how to select the best system for your business.
INTRODUCTION TO DATABASE MANAGEMENT SYSTEMS
What are database management systems (DBMSs)? First, it’s good to understand the different components of DBMSs. At the core of a DBMS is a database, which is essentially an organized collection of data. The data in the database is modeled in a manner that helps support processes that seek information.
Creating a database is simple, but you also need to be able to use the database for different functions. This is where database management systems come into play.
A DBMS is a software application that interacts with the users of the database, with other applications, and with the database itself in order to gather and analyze data. A DBMS allows interaction with the data, whether it is to create, analyze, delete or change the data within a database.
While the basic functions of DBMSs are essentially the same, there are certain distinctions between different solutions. Therefore, DBMSs are often further categorized into separate groups based on:
- the model they support,
- the type of device they run on,
- the language they use to access the database,
- the internal engineering of the software.
What do database management systems do?
Understanding the idea behind database management systems is easier if you understand the use of these systems. The DBMS provides users four core functions. These are the ability to create, retrieve, update and manage data. These functions are enabled by the way in which a DBMS helps manage three core aspects:
- the data,
- the database engine, which allows data to be accessed, locked and modified,
- the database schema, which defines the logical structure of the database.
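As a minimal sketch of the four core functions, the example below uses SQLite through Python's built-in `sqlite3` module. The choice of SQLite, the `customers` table and its columns are assumptions made for illustration; they are not taken from any particular product discussed in this guide.

```python
import sqlite3

# In-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")

# Schema: defines the logical structure of the data.
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Create: add new rows.
conn.execute("INSERT INTO customers (name, city) VALUES (?, ?)", ("Ada", "London"))
conn.execute("INSERT INTO customers (name, city) VALUES (?, ?)", ("Grace", "New York"))

# Update: change an existing row.
conn.execute("UPDATE customers SET city = ? WHERE name = ?", ("Paris", "Ada"))

# Manage: here, removing a row that is no longer needed.
conn.execute("DELETE FROM customers WHERE name = ?", ("Grace",))

# Retrieve: read the current state back through the database engine.
rows = conn.execute("SELECT name, city FROM customers ORDER BY id").fetchall()
print(rows)
```

The application never touches the stored file format directly. It only states what it wants, and the engine works against the schema to carry it out.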
Management of these foundational elements allows DBMSs to perform different procedures related to the four core functions mentioned above. With a DBMS, users can perform a variety of actions including:
- Data management
- Data definition
- Transaction support
- Concurrency control
- Facilities to import and export data
- User management
- Performance analysis
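Transaction support from the list above can be sketched with Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper and the balances are hypothetical and exist only for this illustration. The point is that a group of changes either succeeds as a whole or is rolled back as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint rejects any negative balance (illustrative rule).
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the failed transfer the credit to the second account has already been applied when the constraint is violated, yet the rollback removes it again, so no partial update survives.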
In essence, a DBMS provides users a centralized view of data. For example, businesses might use a DBMS to collect information on customers, but also for operational purposes such as accounting. As we’ll see in the section below, this can have many benefits.
Examples of DBMSs
We’ll be looking at different database technologies later on in the guide, but it’s helpful to mention some known examples of DBMSs. Well-known DBMSs include applications like:
- IBM DB2
WHY ARE DATABASE MANAGEMENT SYSTEMS IMPORTANT?
As the above shows, a DBMS can have a number of functions. The complexity of these systems has evolved and, in general, the software is divided into two main categories: general-purpose DBMSs and special-purpose DBMSs. Since DBMSs can conduct complex tasks or perform more specialized roles, countless groups of people benefit from them.
Typical database management system users include:
- Database administrators
- Database designers
- Application programmers
- End users
Yet it would be too easy to simply write off DBMSs as software only IT personnel need to use. There are huge benefits to using DBMS technologies, which means understanding the different systems is crucial for any business person or individual working with data.
The benefits of database management systems
Perhaps the most crucial advantage of DBMSs is how they allow the end users of an application, as well as the programmers, to access and use the same data. More importantly, this happens without jeopardizing data integrity. A DBMS provides data independence: it offers flexible access to data without forcing the user to understand where the data is located. To users, this kind of data independence can remove the concern over any possible changes to the structure of data.
A business could easily add new information to the database, as its operations develop and change, without disruption of the existing system. Data loss or operational problems will be limited by using a DBMS.
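The idea of adding new information without disrupting the existing system can be sketched as follows, again using Python's built-in `sqlite3` module. The `customers` table, the `email` column and the `customer_names` helper are hypothetical names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES ('Ada')")


def customer_names(conn):
    """An 'existing' query, written before the structure changed."""
    return [row[0] for row in conn.execute("SELECT name FROM customers ORDER BY id")]


before = customer_names(conn)

# The business starts collecting e-mail addresses: extend the structure.
conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
conn.execute(
    "INSERT INTO customers (name, email) VALUES ('Grace', 'grace@example.com')"
)

# The old query keeps working unchanged against the new structure.
after = customer_names(conn)
print(before, after)
```

Because the existing query names only the column it needs, it is unaffected by the new column. Rows created before the change simply hold an empty (NULL) e-mail value.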
A DBMS can also enhance the integrity and security of the data. Since it’s possible to allow different levels of access to the same data, the data won’t be easily compromised. As most DBMSs also mean data is located in a physically different location from the user, the security of the data increases further.
In addition, the core functions include tasks such as backup and recovery of the data. This guarantees data isn’t jeopardized while it’s being accessed and modified. Overall, it ensures uniform administration procedures for data.
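Backup and recovery can be illustrated with a small sketch based on the `Connection.backup` method of Python's built-in `sqlite3` module (available since Python 3.7). Both databases are kept in memory here and the `orders` table is a made-up example. A real deployment would back up to a separate file or machine.

```python
import sqlite3

# The "live" database.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
source.execute("INSERT INTO orders (item) VALUES ('keyboard')")
source.commit()

# Copy the full contents of the live database into a second one.
backup = sqlite3.connect(":memory:")
source.backup(backup)

# Simulate an accidental deletion in the live database.
source.execute("DELETE FROM orders")
source.commit()

# The backup still holds the data and could be used for recovery.
lost = source.execute("SELECT item FROM orders").fetchall()
recovered = backup.execute("SELECT item FROM orders").fetchall()
print(lost, recovered)
```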
Finally, it must be mentioned that since DBMSs can be operated remotely, it’s possible to outsource data administration. This can be especially useful for small businesses, which might not have the resources to conduct data administration themselves.
TYPES OF DATABASE TECHNOLOGIES
As mentioned briefly above, there are different database technologies. Each technology offers its own pros and cons and below is a look at some of the most common technologies.
Single- vs. multi-file databases
A notable difference between technologies can be whether they are single-file or multi-file databases.
A single-file database is the simplest database structure, as it consists of unified information, which can often be used and accessed in a pre-determined manner. Complexity in a single-file database is not commonplace.
While this type of technology can work well for narrowly defined data that is organized in a limited manner, it offers limited possibilities when dealing with large sets of data. For example, different single-file databases don’t interact with each other, even though they might consist of the same information. Therefore, when you update one database, your other databases won’t automatically include these changes and could thus contain wrong information. Single-file databases must also all be in the same location, which can cause difficulties in storage and increase the security risk of the data.
As we’ve increased collection of data, the complexity of data has also risen. Much of the data we use is not in the above, unified format. A multi-file database links different data formats together and allows a more flexible way of organizing and using data. It provides users with the ability to link between different data sets and make sense of them.
Furthermore, a multi-file database provides the advantage of splitting the database and using it from different locations. This provides the technical advantage of speed, as multiple users can access the information quicker than if it was located in the same physical location or disk.
Relational vs. non-relational databases
As the above distinction might have shown, a majority of today’s DBMSs are multi-file databases. But these can be further organised in a different manner. One of the most common options to choose between is relational vs. non-relational databases.
The most basic model for DBMSs is the relational database model, and a DBMS built on it is often referred to as an RDBMS. This means that the multi-file databases mentioned above are linked together, and data from separate files can be used and accessed from different locations.
The strengths of this model are similar to the strengths of DBMSs. The relational database structure is flexible and reliable. Since it’s such an established technology, the costs and risks associated with it are small.
Nonetheless, there are weaknesses to RDBMSs, with the majority of them relating to specific performance issues. Relational models require pre-determined formats, which can limit the complexity and often result in issues if information is added in a different format. There’s also a lack of support for complex base types, such as drawings.
Note that the common relational databases are often referred to as SQL databases. The SQL refers to the programming language (Structured Query Language) and many RDBMSs, such as Oracle and MySQL, use this language for creating and processing databases.
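To make the idea of linked tables concrete, here is a small sketch using SQL through Python's built-in `sqlite3` module. SQLite stands in for the larger systems named above, and the `customers` and `orders` tables are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),  -- link to customers
        item TEXT
    );
    INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders (customer_id, item)
        VALUES (1, 'keyboard'), (1, 'monitor'), (2, 'mouse');
""")

# A join follows the link between the two tables to combine their data.
orders_by_customer = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.name, o.item
""").fetchall()
print(orders_by_customer)
```

Each customer's name is stored once, and every order refers to it by `id`. Changing the name in `customers` is therefore reflected in every joined result.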
The opposite of a relational database is a non-relational database. Since relational databases are often referred to by their programming language, SQL, the non-relational databases are known as NoSQL databases.
These models are designed to handle data that does not fit a standard, structured model. They can therefore provide more scalability and flexibility than relational databases. In fact, non-relational databases can often be stored in a single-file format, because the databases are document-oriented rather than structured.
Non-relational technologies allow the use of non-structured data, such as videos, or photos. These datasets can be categorized in a number of ways, with pre-determined fields.
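The document-oriented idea can be sketched with plain Python dictionaries. This is a toy stand-in rather than the API of any real NoSQL product, and the `collection` list, the `find` helper and the field names are all hypothetical.

```python
import json

# A toy in-memory "collection": each document is a free-form dictionary,
# and documents do not have to share the same set of fields.
collection = [
    {"_id": 1, "type": "photo", "file": "beach.jpg", "tags": ["holiday", "sea"]},
    {"_id": 2, "type": "video", "file": "demo.mp4", "duration_s": 95},
    {"_id": 3, "type": "photo", "file": "office.png"},
]


def find(collection, **criteria):
    """Return the documents whose fields match every given criterion."""
    return [
        doc
        for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


photos = find(collection, type="photo")
photo_ids = [doc["_id"] for doc in photos]

# Documents are commonly exchanged as JSON text.
serialized = json.dumps(collection[1])
print(photo_ids, serialized)
```

Note that the video document carries a `duration_s` field the photos lack, and nothing had to be declared up front to allow that.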
The downside is that this kind of database categorization requires extra processing power. The requirements of the technology can add a strain, not only for the physical requirements, but also for the cost effectiveness of the system.
Centralized vs. distributed databases
Databases can also be stored in different manners. The most common distinction between different ways database systems are organized is between centralized and distributed databases.
Databases were typically centralized, which means stored, located and maintained in a single location. In the past, this was due to the technological limitations of computers. Under this approach, the data can be accessed from different places, but the data itself is stored in a central computer or database, in a single database file.
Centralized databases benefit from enhanced data integrity and minimized data redundancy: since the data is only found in a single location, there’s only one primary record of it. Securing this database can be easier, as you only need to secure the single data location. Overall, the maintenance cost of the data is minimized.
On the other hand, the database access relies quite heavily on network connectivity. Since the data is only stored in a single location, problems in access can result in total loss of data access. Naturally, faults in storage could potentially lead to full loss of data, which could be catastrophic for businesses.
To counter some of the problems with centralized databases, distributed databases have become a popular option. In these systems, the data is stored in multiple physical locations. Furthermore, the distributed databases can be divided into homogeneous and heterogeneous databases. The system includes a multi-file system, controlled by a single, central DBMS.
Distributed systems provide more security in case there’s a fault in the system. Retrieving lost data is typically easier and faster in a distributed system. Often they also provide better access, as data can be accessed via multiple networks.
The downside is that the creation of a distributed system can be more complex. These DBMSs are based on a hierarchical structure, making it harder to maintain data. Data redundancy can increase in a distributed system. Furthermore, since the data is scattered and provides multiple access points, securing the system can be harder.
Column-oriented vs. row-oriented databases
Finally, DBMSs can differ in how the data is stored. In most instances, DBMSs are either column-oriented or row-oriented.
The differences are more evident once you understand that relational databases typically provide data in a two-dimensional table. This database is essentially a selection of columns and rows. But in order to access this data, DBMSs have to use either a column-oriented or row-oriented approach of collecting the data. This means data is either read as a column format or a row format.
A column-oriented system means the data is stored as sections of columns of data. This means a single column consists of the values and datasets of that column and there can be a number of these columns within the database.
It’s typically used in database management dealing with data warehouses, clinical data and customer relationship management (CRM) systems. The approach is beneficial for these types of tasks because column-orientation helps normalize the data and read different data sets relating to the same field efficiently.
But since a column-oriented database focuses on a whole column, the more complexity your data queries involve, the harder it can be to perform using this approach.
On the other hand, a row-oriented database stores the data in sections of rows, rather than columns. The aim is to limit the effort of retrieving the data for a particular query. In fact, many relational database management systems tend to favor a row-oriented approach. For example, online shopping websites often use row-oriented databases for fetching product information.
While this approach can provide data efficiently and quickly, it isn’t efficient when you require the use of a whole dataset. For example, if you were looking for information on personnel earning a specific salary, gathering this data would take an enormous amount of time under a row-oriented approach.
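The difference between the two layouts can be sketched with ordinary Python data structures. This is a simplified in-memory picture, not how any particular DBMS lays out its storage, and the personnel records are invented for the example.

```python
# Row-oriented layout: all the fields of one record are stored together.
rows = [
    {"id": 1, "name": "Ada", "salary": 5200},
    {"id": 2, "name": "Grace", "salary": 6100},
    {"id": 3, "name": "Linus", "salary": 5200},
]

# Column-oriented layout: all the values of one field are stored together.
columns = {
    "id": [1, 2, 3],
    "name": ["Ada", "Grace", "Linus"],
    "salary": [5200, 6100, 5200],
}

# Fetching one complete record is natural in the row layout.
record = next(r for r in rows if r["id"] == 2)

# Scanning a single field is natural in the column layout: only the
# salary list is read, and the names are looked up for the matches alone.
matches = [i for i, salary in enumerate(columns["salary"]) if salary == 5200]
names = [columns["name"][i] for i in matches]

print(record, names)
```

In the row layout the salary search would have to walk through every complete record, which is the cost the personnel example above describes.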
SELECTING A DATABASE MANAGEMENT SYSTEM
As the above shows, there are a number of options available when it comes to database management systems. Whilst the main benefits of DBMSs remain the same, the tasks and needs of the user are essential in determining the right type of technology.
Overall, when you are deciding on a DBMS, you should consider these three aspects:
- The complexity of data – if you are operating multi-file databases, you should consider one of the more complex DBMSs available. In fact, under these circumstances it might be a good idea to consider outsourcing your database management. On the other hand, if you are dealing with a single-file database, you can most likely manage it without a complex system or a deep understanding of DBMSs.
- The structure of data – you also need to consider the way the data is structured. As the above showed, if you are looking for a DBMS for customer database management, column-oriented software can be a better fit than row-oriented software. It is important to determine not just how the data is stored, but also how you’ll use it.
- The feature requirements – naturally, you must also keep in mind feature requirements, such as operational requirements. Certain DBMSs don’t run on every platform, such as Linux or Windows, while others could use a programming language you are not familiar with. Furthermore, since certain approaches, such as distributed databases, require much more from the hardware itself, they might not be a cost-effective option for your business.
Finally, you should consider the kind of development and support the DBMS provides for you. This is especially important for business users, as your needs might evolve as the years go by. You don’t want the DBMS to restrict your ability to upgrade later, and additional software support can be crucial to ensure your business isn’t damaged while you sort out any issues with the DBMS.