Kind of a dramatic quote to begin a discussion about data, but this is one of the most important lessons I’ve learned in my 20+ years as a data analyst. I had no idea how lucky I was early in my training to be taught by some great leaders; along the way I discovered that you can learn a lot from examples of bad leadership as well. The truth is that most organizations realize how important good data is for making decisions. However, in organizations large and small, I still see people making decisions from, and sharing, data that is incomplete, lacks proper context, or is otherwise flawed. Without effective database leadership, this can and does lead to poor decisions with wide-reaching implications for the success of your organization.
The person who has not only access to, but command of, the data within an organization is arguably one of the most powerful people in that organization, provided they manage their data responsibilities well. Here are two of the most basic but vital rules for wielding that “power” responsibly.
- Fully understand the problem you are trying to solve, or the question you are trying to answer.
Objective: “How many bike buyers did we have in the months of June and July and summarize their subsequent behavior for the following 4 months.” I passed the request on to a data analyst, who pulled the data, did the QC work, and felt confident we had answered the question. I then prepared to present the data in a large meeting with our executive team. Thankfully, prior to the meeting I shared my data with a colleague who worked in the bike division. He told me my data was flat-out wrong, with numbers that were way too high. “I don’t see how we can have 21K bike buyers from that period when we only sold 9K bikes.” I was baffled. I had triple-checked my work. It turns out my data was not wrong; it was dead-on accurate. The problem was that we didn’t understand the context of the question we were being asked. The disconnect was in the way we pulled the data. Our method was:
- Isolate all bike buyers
- Isolate all June and July transactions
- Match bike buyers to those who transacted in June and July and report on their subsequent behavior
The question being asked was how many people bought bikes in June and July, and what their subsequent behavior was. It was not how many people who had ever bought a bike also made a purchase of any kind in those months. The method should have been:
- Isolate all transactions from June and July where a bicycle was included on the purchase
- Isolate all buyers from those transactions
- Report on their subsequent behavior
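To make the difference between the two methods concrete, here is a minimal sketch in Python. The transaction records, customer IDs, product categories, and the year are all made up for illustration; they are assumptions, not the actual data or tooling from this story. It only shows how the two readings of “bike buyers in June and July” select different groups of customers.

```python
from datetime import date

# Hypothetical records: (customer_id, purchase_date, category).
# All values, including the year, are invented for illustration.
transactions = [
    ("A", date(2023, 3, 10), "bike"),    # bought a bike BEFORE the window
    ("A", date(2023, 6, 5), "helmet"),   # non-bike purchase inside the window
    ("B", date(2023, 6, 20), "bike"),    # bike purchase inside the window
    ("C", date(2023, 7, 2), "bike"),     # bike purchase inside the window
    ("C", date(2023, 9, 15), "tires"),   # subsequent behavior
    ("D", date(2023, 7, 9), "gloves"),   # never bought a bike
    ("E", date(2023, 8, 1), "bike"),     # bought a bike AFTER the window
]

WINDOW_START = date(2023, 6, 1)
WINDOW_END = date(2023, 7, 31)


def in_window(d):
    """True if the date falls in June or July of the assumed year."""
    return WINDOW_START <= d <= WINDOW_END


def ever_bike_buyers_active_in_window(rows):
    """First method: anyone who EVER bought a bike, matched to anyone
    who made ANY purchase in June or July."""
    bike_buyers = {cust for cust, _, cat in rows if cat == "bike"}
    window_customers = {cust for cust, d, _ in rows if in_window(d)}
    return bike_buyers & window_customers


def bike_buyers_in_window(rows):
    """Second method: only customers whose June or July transaction
    actually included a bike."""
    return {cust for cust, d, cat in rows if cat == "bike" and in_window(d)}


broad = ever_bike_buyers_active_in_window(transactions)
narrow = bike_buyers_in_window(transactions)

print(sorted(broad))   # ['A', 'B', 'C']
print(sorted(narrow))  # ['B', 'C']
```

Customer A is the kind of record that inflated the count: a past bike buyer who came back in June for a helmet. Both queries are accurate; only the second answers the question that was actually asked.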
In both cases, the analyst would have answered a question I put in front of them. It was my responsibility as a leader to make sure the context of the question was understood and that we had the correct set of data prior to the meeting, and I nearly failed. Had I not verified my data with a colleague, I would have lost face for providing incorrect data, and nothing else I reported would have had credibility. Even though the data pulled was accurate, it was the wrong data, and therefore it would have been considered garbage. This was both a leadership failure on my part for not clearly defining the problem, and a teachable moment for the analyst: asking a few insightful questions when handed a request can be valuable.
- Only provide data that has context, and do everything you can to ensure that the data is being interpreted appropriately
The following real-world example offers many lessons about data and responsibility. I had recently been hired to oversee database analytics and customer strategy for a large company, and my first responsibility was to direct and manage the onboarding of a new CRM analytics platform. Adding to the complexity, major shifts occurred within our senior leadership as the company went through a transformative time with a new ownership group. With the marketing department losing its senior leadership, a member of the board of directors (and a partner with the ownership group) was asked to oversee our department and my project on an interim basis. Additionally, this board member was very outspoken with their opinions about major shifts in strategic branding and customer targeting, a potentially radical departure from the core audience that had driven the majority of our sales and growth.
I was pressured to provide data before I felt comfortable that our quality control process had verified the database’s integrity, and before we had fully vetted the accuracy of the reporting. To make matters worse, our team was not fully trained on how to use the database querying tools. Despite my insistence that we needed to better verify the data before sharing it, or risk telling an invalid story, I was under pressure to provide reports before they were ready. I then saw reports prepared, and interpretations of the data offered, that I did not support, with statistics used to back theories that, in proper context, they did not support. It was a politically challenging position for a new employee still building credibility. The database was brand new and we had not worked out the bugs, and the person to whom I was reporting was powerful within the organization yet either did not have a full grasp of the data or was driven to support their beliefs by sharing it out of context. As uncomfortable as my situation was, I took a risk and phoned the CEO, asking if I could speak in confidence. I outlined my concerns and tried to detach from ego and emotion. I shared my beliefs about the importance of data validity, neutrality in data interpretation, and context. I told him that I did not fully support the interpretations of the data I had provided, the way it was presented, or the conclusions suggested as a result. From this experience I learned or reinforced a few things:
- Data should not be shared without an explanation of what you think it means within the proper context
- “I don’t know” or “the data does not tell us anything” are sometimes the best answers to give. Don’t force the data to tell a story it does not.
- Do not let ego or emotion interfere with how you interpret or share data. The role of a database professional is not to pick sides, but to provide accurate, actionable data
- Do not let your theories dictate how you interpret data; neutrality is key. Often data will debunk what you expected it to tell you, and that is a good thing.
- Context is crucial
- Sharing data comes with responsibility. It’s better not to share data than to share bad data!
If you have any interesting stories similar to this one, I’d love to hear them. Shoot me an email at tblake@cohereone.com