In the business world, whether you are a small company or a thriving enterprise, data holds the keys to success. It does not matter if your company sells the best product in your market or if your firm offers the best services available, without data to track things such as sales, costs, opportunities, and other performance factors, you are losing valuable insights and possibly a competitive edge.
The concept of a data warehouse is older than the Internet, but what is it exactly? A data warehouse, also referred to as an enterprise data warehouse (EDW), is a central database consisting of normalized data from a collection of disparate databases and other data sources. They are often used as the basis for business intelligence systems in companies when used with business intelligence tools such as Power BI and Tableau.
In most cases, since building a data warehouse can get quite technical, they are normally built and maintained by the IT team at a company. However, data warehouses are meant to be consumed by reporting and business intelligence tools to present the data in a usable manner by and to the business users. It is important to remember that in the end, it will be business people consuming the data and it should be presentable in a way that makes sense to them.
It is also worth mentioning, that while the term data warehouse includes a grand term such as warehouse, they are not limited to large enterprises. Small companies are able to utilize such constructs as well, depending on their needs.
Now that we have an idea of what a data warehouse is, there are so many other questions that arise. While the subject is way too large to fit into a single blog post, I will address some common questions and concerns I hear when discussing data warehousing.
The first thing I want to point out is these systems are meant to be a central repository of data. They will not replace your systems such as your Enterprise Resource Planning (ERP) or Customer Resource Management (CRM) system. Instead, data warehouses are meant to be a bridge between these systems that allows you to aggregate the data in your environment.
The data warehouse also does not necessarily have all the same information that your other systems will have in them either. Normally your data warehouse will have a subset of data from every system. Duplicating your full database will likely be a waste of time and resources. For those reasons, you should only bring over relevant data that you plan on analyzing, reporting on, or is descriptive data that helps describe other data present in the data warehouse.
For example, you may want to bring in sales data from your ERP system and opportunity data from your CRM system. In order to be able to relate this data in a meaningful way, you will need to have the customer information from both systems so that you can relate the data to each other. It is quite likely things such as customer numbers will be quite different between systems which can make it normally very difficult to relate this data. Since the data warehouse is its own independent system, you have the flexibility to build tables to map this data to each other to save end users time when trying to read the data.
Other issues you will likely face when pulling data from different sources is the fact many sources store data in different formats. A good example of this is percentages. Some systems will store them as whole numbers (90 = 90%) and other systems will store them as decimals (.9000 = 90%). For this reason, when the data is loaded it also has to be normalized. This is done through a process called an ETL (Extract, Transform, Load). During this process, you can also normalize things such as currency and units of measure.
While most of your data will be loaded from remote sources, the data warehouse is a good place to store that important data that doesn’t have anywhere else to live. An example of this would be things such as exchange rates if you are a company that deals with multiple currencies. This is particularly the case if you have multiple systems that are located in different countries and need a standard way to report across them.
In the world of data, security is always a major topic, as it should be. This is one of my favorite parts about data warehousing. Implementing a data warehouse can greatly increase the control of access to your data.
First, let’s look at a use case. Company Small Business uses an ERP system that has minimal reporting built into it. The only way that the business users and finance team can get good data is by having the IT team build them reports that pull data directly from that database by using SSRS (SQL Server Reporting Services) or Crystal Reports. The problem you then run into is the business teams are then reliant on the IT team to build these reports which takes them away from doing other important tasks. This also takes time away from the business users because they have to work with the IT team to produce usable reports in a format that suits the business. At this point, you are either hiring more IT members or giving the business a way to access data themselves.
Let us address a few issues here. The first one being the self-service approach. This approach requires users to directly access the data at the source. Users directly accessing data means that the user has to have permissions to those databases. While you can set up permissions to that data, it is very high maintenance to keep track of and to audit. This holds especially true as the company grows.
The second issue with this situation is whether it is the IT team creating reports, or it is users accessing data, both are directly hitting your production database. Beyond this being a security issue, this can also cause performance issues with your applications. Especially with smaller ERP and CRM type systems where data is constantly being read and written. Depending on the database used, it can cause things such as table locks while the system is trying to read or write data to the database while users are reading data from the table. It compounds the issue worse when you are running large database queries which take seconds to minutes to run. This becomes a nightmare rather quickly.
A simple solution to this is using a data warehouse. By moving the data to a centralized database, you now can report and analyze without users constantly hitting the production databases directly. Instead, you can schedule data pulls from the database via the ETL process during times when there is little system use or pull smaller subsets of data at a time. One caveat to point out is by using this construct, the data is not real time. Depending on your systems, you can possibly create the data warehouse to be near real time, but not actual real time. This being the case, this is not ideal for reporting data that must be up to the second or even up to the minute.
In addition to performance, by moving the reporting to a separate server, you can now lock down your production databases so that people can not access them directly. To further add security, you can utilize systems such as SQL Server Analysis Services (SSAS) to add a layer between the actual data warehouse and the users consuming the data. SQL Server Analysis Services allows you to control access to a data to a very finite manner based upon the mode you are running it in.
The last point of security I want to address revolves around auditing. While eliminating user access to the databases will greatly improve your security and ability to audit who has access to the data, utilizing the ETL process to move data to the data warehouse will ensure that data being reported has not been modified by users. This is a great utility when it comes to audit time if you have to go through audits such as a SEC audit.
In the world of data there is no silver bullet to put you ahead of the pack. However, data warehousing can be a great tool when it comes to gaining insights in your company’s data by bringing it all into one place and normalizing it so it can be consumed in a standard format.
Building a data warehouse also takes a deep understanding of the data and also how all of the systems work together. Depending on the amount of systems and their complexity, it can be quite an undertaking to build one. However, the end result will give your company a great resource to make important business decisions with.
If this sounds like something you would like to explore for your company, we invite you to contact us so we can help you discover what we can do to create a competitive edge for you and your company.