Do You Need a Data Warehouse?

Over the past few days, I have been at a conference. It was an insurance-focused conference, and there was a common theme in some of the questions I was asked as I was talking to people walking by our booth. That theme is “Do I need a data warehouse?” This comes in several different forms. Some people would come and ask if we build data warehouses. Someone asked me if I thought they needed a data warehouse. Various forms of this question would come up, so I thought I would address this head-on today.

I have talked about this in the past. I think I even have YouTube videos out there that address this as well as blog posts. It’s so important that we really understand, first, what we’re trying to do. Why we’re doing it. What is a data warehouse? We should try to figure out if a data warehouse is appropriate for your organization. So that’s what I’m going to try to do.

First, let’s break down the problem and the vision. That’s step one. Let’s just figure out what the problem is and then talk about what an ideal situation would be in relation to that problem. The problem is that your organization is unable to capture value opportunities due to the lack of insight. That’s what we’re saying. You recognize that you could do better in simple terms. You could do better. You could reach your goal. Your people could be more efficient. You could capture more revenue. You could reduce expenses. You could improve quality. You could add value if you had the insights needed to do so. That’s the problem. If we had these things, we could capture value opportunities, but we don’t and that’s the problem.

That is self-qualifying. Is that your scenario? Do you have the opportunity to capture value? If you had information available in people’s hands, could they operate more efficiently? Could they succeed? Could they improve their productivity? That is the case today.

Then, what’s the vision? If that’s the problem, what’s the vision? The vision is simple. It’s reportopia. I’ve talked about this and I’m going to continue to talk about it. Reportopia is the vision. Reportopia is a state of being. It’s where all of us have all the information we could possibly want at our fingertips in real-time, and we’re able to use that information to succeed.

So we’ve got all the information we could possibly imagine. It’s readily available, it’s in the format we need it and we’re going to use that information to succeed. This is a utopia. It is the utopia of reporting, which is why I call it reportopia. If you are in marketing this means you know exactly who your target is, you know where they live, you know their phone number, you know their address, you know they want your product. You know they’re ready to buy now. We’re talking about reportopia. You have all the information you could possibly imagine or possibly want at your fingertips.

I’ve mentioned before, reportopia is not something we ever fully achieve. It’s something we’re striving for. We’re going to do our best to get as close as we possibly can to reportopia. And it’s not as if it’s a project. It’s a function of the business. It’s something that we’re always striving for. We’re looking for ways to improve. So, that’s the vision, reportopia.

The problem again, we know we’ve got value opportunities, but we’re not capturing them because we don’t have the right information in the right people’s hands. The vision is that we solve that problem. Everybody has all the information they could possibly need to always succeed.

Now that we’ve talked about the problem and the vision, let’s take a little bit of a detour. Let’s talk about data solutions in general. The topic of whether or not you need a data warehouse first needs to go into what a data solution is and what the different types of data solutions are.

First, what is a data solution? “Data solution” is what we call these things at LeapFrogBI; you’ll hear this referred to in different ways. I’m going to use our terms for how we describe this type of thing. It is the thing, the data storage area, which lives between your reports and something else. It’s going to empower your reporting mechanism. The report, of course, is the front-end of these solutions. It’s the data visualizations. It’s the data feed and so on. It’s the front-end. It’s what’s being delivered. The data solution is empowering that front-end report.

What are the types of data solutions? We break this down into six categories. I’m going to move through these quickly because this is not the focus of this podcast, but it is important to understand before I start moving into the ultimate question.

The first data solution is what we call direct system of record reporting. This is where your front-end, your report solution, is pulling data directly from your business application. If you’re using an ERP system, this means that your report is connected directly to that business application’s database and it’s pulling data out from that database in order to serve up a report. That’s direct system of record reporting.

I’m moving up a pyramid so you can imagine that’s the simplest, cheapest, least capable solution. The next level is a stage process. This is basically where you’re taking data out of your database and you’re replicating it. You’re just simply copying it from point A, which is your production business application. This application could be a CRM system, an ERP system, a policy admin system, a health record system, etc. Whatever that system is, we’re going to replicate that information and move it into a stage area. Just copying. And I can’t go into all the details of why we do these different things in this podcast. I’ll just say that staging at least takes the load off your business application so that your reporting solution is not deteriorating the performance of your business application.

The next level up from a stage process is a persistent process. A stage process is typically volatile, meaning we’re just copying all the data from your business application and dropping it into a staging area. Then, we’re deleting all that stuff or truncating it, and then overwriting it with the new data from the next day. Persisting means we’re going to persist that information, possibly storing a history of changes to that information. So, we’ve gotten to a persistent data store at this point. We’re at level three of the data solution.
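To make the difference between a volatile stage and a persistent store concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (stage_policy, persist_policy), the columns, and the sample policy rows are all hypothetical and invented for illustration; a real load process would look different depending on your source system and tooling.

```python
import sqlite3


def stage_load(conn, rows):
    """Volatile stage: wipe the staging table, then copy in the latest extract."""
    conn.execute("DELETE FROM stage_policy")  # SQLite has no TRUNCATE statement
    conn.executemany(
        "INSERT INTO stage_policy (policy_id, status) VALUES (?, ?)", rows
    )


def persist_load(conn, rows, load_date):
    """Persistent store: keep history by appending a row only when a value changed."""
    for policy_id, status in rows:
        latest = conn.execute(
            "SELECT status FROM persist_policy WHERE policy_id = ? "
            "ORDER BY load_date DESC LIMIT 1",
            (policy_id,),
        ).fetchone()
        if latest is None or latest[0] != status:
            conn.execute(
                "INSERT INTO persist_policy (policy_id, status, load_date) "
                "VALUES (?, ?, ?)",
                (policy_id, status, load_date),
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stage_policy (policy_id INTEGER, status TEXT)")
conn.execute(
    "CREATE TABLE persist_policy (policy_id INTEGER, status TEXT, load_date TEXT)"
)

# Hypothetical daily extracts from a policy admin system.
day1 = [(1, "active"), (2, "active")]
day2 = [(1, "active"), (2, "cancelled")]

for load_date, extract in [("2024-01-01", day1), ("2024-01-02", day2)]:
    stage_load(conn, extract)
    persist_load(conn, extract, load_date)

stage_count = conn.execute("SELECT COUNT(*) FROM stage_policy").fetchone()[0]
persist_count = conn.execute("SELECT COUNT(*) FROM persist_policy").fetchone()[0]
print(stage_count, persist_count)
```

After two daily loads, the stage table holds only the latest copy of each policy, while the persistent table also keeps the earlier status of the policy that changed.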

Level four is what we call an operational data store. This is where you’re not just copying data over or even persisting data, but you’re manipulating that data to create data sets that are designed to support whatever reporting requirements you might have. It’s now beyond again the simple copying of information. You’re now applying business logic. You’re restructuring, you’re transforming, you’re doing a variety of things to meet the needs of your reporting requirements.

The next level is a data warehouse. We are going to talk more about what a data warehouse is in just a minute. That’s level five in this little pyramid. This is moving data out of your source systems, your business applications and into a data structure that’s going to persist, it’s going to apply business logic. It’s also going to create data structures that are designed specifically to support not just one business need, but your overall business. Not just today, but in the future. It’s designed to be a scalable solution.

One more level on top of that is artificial intelligence. This is where you’re getting into the predictive side of things. This could be modeling or many other forms.

A data solution, just to recap, is the thing that is going to be used to empower your front-end reporting solution. Your reports are going to connect to this data solution so that they can create data visualizations or whatever it is you need. That way, you can achieve the value that you know is available with that information in the right people’s hands.

Now, I want to go a little further into data solutions and break it down into two categories: point solutions and data warehouses. A point solution is a data structure that’s designed to support a very particular need. If someone has a reporting need, and they come to the information technology group and say, “I need this report and I need these columns in this report. I need these data visuals. I need to interact in this way. I need to deliver to these people. I need a refresh on this interval.” They have a set of needs, someone is focused on that need, and they’re going to solve it. They’re not thinking about the overarching problem of a data solution that’s going to be able to scale. A data solution that might be able to reuse some of this same business logic for other purposes. They’re just solving the point solution that’s in front of them at that moment.

Point solutions are going to grow. You’re going to have point solution A, then you have a point solution B, and a point solution C. You’re going to keep having more and more point solutions that solve very particular needs. That’s one approach. And in reference to what I was talking about a minute ago, this would encompass your direct system of record reporting. Your stage, persist, and operational data store are all point solutions. Operational data store gets us a little beyond point solutions depending on the type of “cowboy coding” that might be going on. I would consider all of those to be point solutions.
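As a rough sketch, the six levels of the pyramid and the point-solution grouping just described could be written down like this. The names DataSolution, POINT_SOLUTIONS, and is_point_solution are made up for illustration, and the AI level is deliberately left out of the grouping because it was not placed in either category above.

```python
from enum import IntEnum


class DataSolution(IntEnum):
    """The six levels of the pyramid, from simplest (1) to most capable (6)."""
    DIRECT_SYSTEM_OF_RECORD = 1
    STAGE = 2
    PERSISTENT = 3
    OPERATIONAL_DATA_STORE = 4
    DATA_WAREHOUSE = 5
    ARTIFICIAL_INTELLIGENCE = 6


# Levels one through four are treated as point solutions.
POINT_SOLUTIONS = frozenset({
    DataSolution.DIRECT_SYSTEM_OF_RECORD,
    DataSolution.STAGE,
    DataSolution.PERSISTENT,
    DataSolution.OPERATIONAL_DATA_STORE,
})


def is_point_solution(level: DataSolution) -> bool:
    """Return True when the level falls in the point-solution category."""
    return level in POINT_SOLUTIONS


for level in DataSolution:
    print(level.value, level.name, is_point_solution(level))
```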

Data warehouses, on the other hand, are data structures that are designed to support your business needs not only from a point solution standpoint, but they’re building a foundation that encompasses what we know we need to support. It enables us to scale in a way. We don’t have to throw away what we did yesterday and rebuild in a different way today, because we learned of a new need. It’s not designed in that way. Instead, it’s designed to create a foundation. Something that’s scalable. Something that can adjust with your changing business needs both from the source system standpoint and your reporting requirements.

So those are the two categories: point solutions and data warehouses. The other thing I would say about a data warehouse is that it should be designed, as I mentioned, with your business needs in mind, but it should also be designed so that your business users can intuitively navigate it.

In today’s technologies, this gets a little bit more complicated. It’s not as if people are navigating a data warehouse schema necessarily. There’s a lot of layers in these solutions. Regardless, one of the objectives is that your business user shouldn’t have to go in and navigate tens of thousands of tables in all your business applications in order to come up with a simple list. Instead, we’ve got this entity readily available, it’s easily defined, it’s easy to find and navigate, and use in reporting.

Also, I want to mention regarding data warehouses, you might notice that I did not mention technology in that definition. The technology doesn’t matter. There’s a lot of great technologies. A technology is not a data warehouse regardless of what the vendors want to make it sound like, it really does not matter if they call their product a data warehousing product. It does not matter.

There are a lot of ways to build data warehouses. Of course, there’s some that are more or less appropriate depending on the particular circumstances. The technology itself does not define a data warehouse. Again, if you want to do this on index cards, you can do this on index cards. It’s not going to be very efficient. I don’t recommend it, but it isn’t about technology, is the point. It’s about building something that’s going to help your business succeed today. Create a foundation that’s going to be able to expand and build upon in the future.

Now I want to go into what I hope to talk about, which is: Do you need a data warehouse? It would be very easy to say as a business intelligence and data warehousing company, yes, you need a data warehouse. That is often what people hear. They need a data warehouse because data warehouses are awesome. They enable you to do all these great things. I am a huge advocate of data warehouses. I spent my career building, maintaining, extending and seeing how successful these solutions can enable companies to be.

At the same time, I’m not going to tell anybody they need a data warehouse. First, I need to understand the scenario. This is where things get a little bit gray. I’m going to run through a few questions that I think can answer that for you. So, you can self-answer that question by asking yourself these few questions. First, does your organization have a data-driven decision-making culture? You might say yes, we do. We want data and that’s what we’re going to use to make decisions.

But let’s look at that question again and break it down. Does your organization have a data-driven decision-making culture? How do you know if your organization has a data-driven decision-making culture? Or, how do you know if your organization can shift its culture so that it is a data-driven decision-making culture? That’s what you need to think about because unless you have a data-driven decision-making culture, forget about a data warehouse. Forget about any of this stuff.

If you don’t have a data-driven decision-making culture, you don’t need a data warehouse, step one. You probably don’t need a data solution at all because without that culture, none of this is going to matter. You could build whatever you want, but people aren’t going to use it. If they don’t use it, at the end of the day, it’s useless. It’s a waste of money.

So, let’s qualify ourselves. How do we know if we have a data-driven decision-making culture? Well, here are a few things that you might use as evidence. One is that you can see people out there building little ad hoc solutions in whatever way they possibly can. For example, they most often will use Excel or Access to create the bits of information they need to operate more efficiently. If you’re seeing that go on in your organization, that’s a pretty darn good indication that you have people that are data-driven. They want information to make better decisions.

Another direct indicator is you have people complaining that they don’t have the information they need to make good decisions. They may not put it in that way, but that’s a direct bit of information that you can use to say you have people that are already thinking this way.

Another bit of evidence is you having to hire people and just throw people at a problem when you know that there are more efficient ways of doing things. Whoever that person is that has that feeling that there’s a more efficient way to do something as opposed to just throwing more people at a problem, that’s also an indicator that you have data-driven thinkers.

Also, do you rely on opinions, or do you ask for proof? This, of course, takes forms. If you’re in a meeting, and you have someone talking, are you relying on their opinions, their bias, and their experience? Or are you also saying, “Do we have any real facts to back it up? How can we prove that?” That is an indication of data-driven culture.

Does proof change minds? If you’re in that meeting and you have someone who is expressing a theory and you have the proof, do you have people that are thinkers and able to put their own bias aside? They can put their own opinion aside and let the outcomes of the data, well-prepared, accurate, timely, relevant data change their minds. If not, again, it doesn’t matter if you have that information unless people are going to use it. Can this information overcome opinions and bias?

These are some of the questions that I would ask you to think about when you answer that question. Does my organization have a data-driven decision-making culture or does it want to? Do we want to empower the organization to have that data-driven decision-making culture? If the answer to that is yes, then let’s go to the next step. If it’s no, forget about going to the next step. In my experience, most people are critical thinkers. Most people recognize that having information to make better decisions is a good thing. Most of the time, the answer to that is going to be yes.

The next thing I would ask is, “Are your existing reporting solutions adequate?” Now, this is a roundabout way of trying to figure out whether we have people that are thinking about how to improve and look for those value opportunities. Suppose you, and your organization as a whole, believe that what you have today is adequate, whatever it is. It could be nothing, or it could be that you have a financial system and it’s spitting out whatever reports are available. If whatever you have today is adequate, if the answer to that is yes, then don’t move to the next step. If it’s adequate, that means you have not identified opportunities to improve. You may not have value opportunities. It’s only at a certain point in an organization that you’ll begin to run into these opportunities. From my experience, it’s not the first six months of business. At some point, organizations do have opportunities to operate more efficiently. This question is focused on trying to uncover whether people are looking for those value opportunities, whether they recognize that there are opportunities to improve, and whether we have the information available to do so.

The third question is, does your organization have a long-term vision? And this gets to a point solution type of approach versus a long-term vision. Are you thinking of this as a little project where we carve out this little thing that we know we want to improve on or are you thinking about this more holistically where this is a business function?

Do you want to use this well-prepared information to just support this one little need that we know about today, or do we want to build an infrastructure that’s going to be able to help us succeed overall? A business function just like HR and marketing. This is another business function, the business intelligence and data warehouse, business function. If your answer to that question is no, we are not looking at this as a business function, we just want to solve this one problem, then I would say absolutely do not try to build a data warehouse. That’s not going to work out well for anyone.

But if you have a data-driven decision-making culture, you know that people are willing to use data to change minds, and overcome opinions and bias. You’re looking at this long-term. You want to build this foundation. Then, let’s consider something like building a data warehouse.

The last question is, do your data source topology and your reporting requirements warrant a data warehouse? I know. That’s kind of a cop-out because that’s a hard one to answer unless you are knee-deep in this stuff. At the end of the day, whether you need a data warehouse depends on your data source topology. How many data sources do you have? How many need to be integrated so that you can get to an enterprise view? How complex are those data sources, and what are those data sources’ current capabilities? That’s one side of the equation: the complexity level of your data sources.

On one end of the spectrum, you may have a very simple data source topology. It might be one data source. It might be a very simple one. On the other end of the spectrum, you might have several data sources. You might have reference data, master data, or external data. They might all be very complex and not capable of doing the things that you need to do.

When we’re determining if you need a data warehouse, we’re going to look at your reporting requirements. Do you have very simple reporting requirements? This goes along with vision as well. You may have some reporting requirements that you know are very simple, but do you want to grow? Do you want to support not only the requirements that you know about today, but do you want to create this foundation that’s going to enable you to support more complicated requirements?

This is a little bit difficult to describe, but you can envision a matrix with the rows representing the complexity of your data sources, with the top row being very simple and the bottom row being very complicated. The columns represent your reporting requirements. If your reporting requirements are very simple, then that would be on the far-left side. And if they’re very complicated, they’ll be on the far-right side. The intersection of those two things helps you determine if a data warehouse is the right solution for you.
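Since the matrix is hard to picture from a description alone, here is a small sketch of it as a lookup table. This is only an illustration: the episode does not say what belongs in each cell, so the cell labels below are placeholder assumptions, as are the names Complexity, MATRIX, and lookup. The only things taken from the description are the axes (rows are data source complexity, columns are reporting requirement complexity).

```python
from enum import IntEnum


class Complexity(IntEnum):
    SIMPLE = 0
    COMPLICATED = 1


# Rows: data source complexity (simple at the top, complicated at the bottom).
# Columns: reporting requirements (simple on the left, complicated on the right).
# The cell text is an illustrative assumption, not a rule from the episode.
MATRIX = [
    ["point solution may be enough", "judgment call"],
    ["judgment call", "data warehouse likely warranted"],
]


def lookup(sources: Complexity, reporting: Complexity) -> str:
    """Return the cell at the intersection of the two complexity ratings."""
    return MATRIX[sources][reporting]


print(lookup(Complexity.SIMPLE, Complexity.SIMPLE))
print(lookup(Complexity.COMPLICATED, Complexity.COMPLICATED))
```

In practice you would want more than two grades on each axis, and you would rate the reporting requirements you expect in the future, not only the ones you have today.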

I just went through a lot of stuff. I want to make sure that I clearly answered the question. So, I’m going to recap. If someone asks, “Do you need a data warehouse?” I would say number one, do you have a data-driven culture? If the answer to that is yes, then move forward. Number two, are your existing reporting solutions adequate? If the answer is no, let’s move forward. Number three, do you have a long-term vision? Do you want your organization to be built on top of data-driven decision-making? Not just today, not just in this little problem we know about today, but long-term. We’re looking at this as a business function. If the answer is yes, then let’s move forward. And then number four, let’s look at what the situation is on the ground. What are our data sources? What do those look like? What’s the complexity level of our data sources today? Then what do your reporting requirements look like? Not just today’s reporting requirements, but what do we expect those reporting requirements to look like in the future? Let’s get the intersection of those two things and figure out the appropriate data solution. Most likely, if you’re looking at things long-term and you know you want to support a data-driven decision-making culture, then even if you have a simple data source topology and simple reporting requirements today, it’s probably right for you to go straight into a data warehouse solution.

On the other hand, if you just have a simple reporting need and you’re not looking at this long-term, then you do not need a data warehouse. That’s not the right approach. It might succeed, but it would be just simply the wrong tool for the job.
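The four questions in the recap can be read as a simple checklist that stops at the first answer pointing away from a data warehouse. The sketch below is one way to write that down. The class name, field names, and reason strings are hypothetical, and the fourth answer is a judgment you would still have to make yourself by looking at your data sources and reporting requirements.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Answers:
    """Yes/no answers to the four questions, in the order they are asked."""
    data_driven_culture: bool
    existing_reporting_adequate: bool
    long_term_vision: bool
    sources_and_requirements_warrant_it: bool


def consider_data_warehouse(a: Answers) -> Tuple[bool, str]:
    """Walk the questions in order and stop at the first one that says no."""
    if not a.data_driven_culture:
        return False, "no data-driven decision-making culture"
    if a.existing_reporting_adequate:
        return False, "existing reporting is already adequate"
    if not a.long_term_vision:
        return False, "no long-term vision, so solve the single need instead"
    if not a.sources_and_requirements_warrant_it:
        return False, "data sources and reporting requirements do not warrant it"
    return True, "all four questions point toward a data warehouse"


decision, reason = consider_data_warehouse(
    Answers(
        data_driven_culture=True,
        existing_reporting_adequate=False,
        long_term_vision=True,
        sources_and_requirements_warrant_it=True,
    )
)
print(decision, reason)
```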

I hope this has been helpful in determining whether you need a data warehouse. When it comes to business owners, process managers, and individual contributors, we are trying to build reportopia. We want to enable individuals to do better, to succeed by providing them with the information that they need to do so. Whether that means you need a data warehouse, or if it means you’re going to do direct system of record reporting, it’s important, but it’s the ingredients, not the end-product. Keep that in mind. Again, I hope this has been helpful. Thank you so much for listening.

