My Ground Rules – Data Analysis “Before & After”
In my opinion there are many different ways & techniques to analyze data, and every individual brings their own flavor to it, which is unique & exciting to learn from. Through this article I want to focus on the key ground rules (non-technical ones, in my view) to keep in mind before & after data analysis to make the entire process effective, enjoyable & enriching.
Over the years, understanding the data in hand through simple analysis has added a new dimension to my thought process and has gradually helped me change my vision/direction for the better. Personally, knowing the key trends emerging from the data boosts my confidence in the approach I want to pursue and eliminates second thoughts (it’s important to keep them at bay) while conceptualizing potential changes/solutions to implement. When trying to bring in any change, the key is not just to feel comfortable yourself but to discuss it with the team & communicate it in a way which is clear, concise & easy for them to interpret. Yes, visuals definitely help, but more importantly, translating the observations from the data into potential opportunities to overcome the issue in hand (or something which may happen in future) immediately helps the team connect and come on board to work on/implement a solution. In my experience it’s important to create an atmosphere where all key stakeholders eventually own the idea/thought (it cannot be your idea/one person’s thought anymore once it leaves the drawing board) and feel it’s theirs to drive.
I strongly feel that life-altering ideas and innovative thoughts originate from an issue/problem ascertained by data (yes, data does drive the point home and helps us understand the significance of the issue/problem), and most importantly data helps in understanding the emerging trends which fuel the engines of innovation. In my experience innovation is more of an after-thought (with all due respect), a step which comes later, preceded by the burning thought/idea and our efforts to strengthen the vision through data analysis.
My fascination with data started early in my career when I was reviewing my team’s performance on different metrics to understand the key strengths & opportunities. The initial idea was to identify a few call-outs and have them handy in case of a performance review; however, once I started digging deep, I gradually noticed some key trends evolving from the data (a few on expected lines, but others took me by surprise). Since this was more of a casual analysis (and honestly I didn’t expect to find anything curious), I was all over the place and didn’t have a structured approach, so while I was stumbling upon some unexpected trends I realized I might not be able to replicate these steps if I had to redo the analysis. Very early in my data analysis journey I realized the need for a few ground rules (sounds heavy, but isn’t) before I bury myself in data.
It’s important to know your data (but be ready to be surprised): In my experience, randomly going over the data and trying to analyze it without understanding it clearly leads nowhere; in fact it can be frustrating to graze the data-set looking for trends when we don’t really understand all the features. So to avoid wasting time shooting in the dark (or my favorite – trying to find a hay in a haystack, hope you got my point), it’s always advisable to first understand the data we are working on, that is, to know everything we can about it before we kick off the analysis. This also helps in devising a purpose statement in your mind (if not already given) or breaking down the larger goal into smaller objectives. In my experience, brainstorming with a team of subject matter experts to gain insights on the data has been extremely valuable, as it helps in learning more about the data without even starting to work on it. Moreover, it helps in understanding any potential challenges we might face down the line while slicing & dicing the data. During such discussions I have realized it’s best to forget whatever you know (or think you know) about the data/goal and just guide the discussion by being inquisitive, as this helps you listen keenly & frame leading questions to gain the required insights. I generally spend time on every feature during these discussions till I feel comfortable & understand where it originates from, what it represents and its importance in the larger scheme of things.
Now for the “be ready to be surprised” bit: knowing all about the data in hand through the discussions and reviewing the insights obtained from it are two different things. There has always been an element of surprise which I couldn’t/didn’t see coming during the discussions, so it’s crucial to be open-minded about it and not dismiss it as an anomaly. Yes, it could be an emerging trend or it could be just an anomaly, but that’s what is most exciting about any analysis, isn’t it?
So yes, it’s important to know your data to relate to the analysis as we go along and to break the purpose statement into smaller objectives, while consciously keeping an open mind about the surprises we might encounter.
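For readers who like to pair the feature-by-feature discussions with a quick look at the data itself, here is a minimal Python sketch of a per-feature profile. It is only an illustration: the function name `profile_features` and the sample records are hypothetical and do not come from any real data-set, and the summary it produces (missing count, distinct count, most common value) is just one possible starting point for the conversation with subject matter experts.

```python
from collections import Counter


def profile_features(rows):
    """Summarize each feature: missing values, distinct values, most common value."""
    profile = {}
    # Assumption: every record is a dict sharing the keys of the first record.
    features = rows[0].keys() if rows else []
    for name in features:
        values = [row.get(name) for row in rows]
        # Treat None and empty strings as missing.
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        profile[name] = {
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "most_common": counts.most_common(1)[0][0] if counts else None,
        }
    return profile


# Hypothetical records, invented purely for illustration.
rows = [
    {"agent": "A", "calls": 12, "region": "North"},
    {"agent": "B", "calls": None, "region": "North"},
    {"agent": "C", "calls": 9, "region": ""},
]

for name, summary in profile_features(rows).items():
    print(name, summary)
```

A summary like this does not replace the discussion; it simply gives a list of features with gaps or unexpected values to ask about.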
Assumptions & Subconscious bias: Like I mentioned earlier, it’s important to understand all the key features in the data and what they mean/represent. Yes, I understand this rule may seem trivial and is a basic expectation before anyone gets into data, but trust me, I have had a few challenges (or let me call them missteps) over the years. How, you ask? Let me break it down.
Assumptions (just to clarify, I am not specifically talking about statistical assumptions here; that’s a good topic too, maybe for another time) could be many, but they are best avoided. Over the years I have realized that a quick validation to see if I am working on the right set of data can save a heartache or two halfway through the process. Unfortunately yes, I have had those heartaches, and it really can be devastating to learn that the analysis you have done is on a data-set which doesn’t truly represent the larger population, and hence the insights derived (while truly exciting) don’t really hold true.
Subconscious bias generally happens when you have good knowledge of the data you are analyzing or have worked on similar data in the past; based on that prior knowledge, there is a good chance we ignore certain features. Our brain is very quick to recognize commonalities & patterns and to recall that these features weren’t of much help or didn’t add value earlier, so we may subconsciously start ignoring them, assuming this data is largely the same as before. The data could be similar yet not exactly the same, so the risk here is that we miss newer trends which might have emerged had we included these features.
In my experience it’s best not to leave any stone unturned; you just don’t know what you might find under one of them.
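The quick validation mentioned under Assumptions can take many forms. One possible version, sketched below in Python, compares how each category is represented in the data-set being analyzed against the larger population. Everything here is an assumption made for illustration: the function names, the 10-percentage-point tolerance and the region labels are hypothetical, and a real check would depend on which features matter for the analysis.

```python
from collections import Counter


def share_by_category(values):
    """Return each category's share of the total, as a fraction between 0 and 1."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def representation_gaps(sample, population, tolerance=0.10):
    """List categories whose share in the sample differs from the
    population share by more than `tolerance` (an arbitrary threshold)."""
    sample_share = share_by_category(sample)
    population_share = share_by_category(population)
    gaps = {}
    for category in set(sample_share) | set(population_share):
        diff = sample_share.get(category, 0.0) - population_share.get(category, 0.0)
        if abs(diff) > tolerance:
            gaps[category] = round(diff, 2)
    return gaps


# Hypothetical data, invented purely for illustration.
population = ["North"] * 50 + ["South"] * 30 + ["East"] * 20
sample = ["North"] * 9 + ["South"] * 1

# Positive values mean over-represented in the sample, negative under-represented.
print(representation_gaps(sample, population))
```

An empty result does not prove the sample is representative; it only means this one feature shows no large gap, which is still a cheaper discovery than finding the problem halfway through the analysis.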
Outline the summary/steps as you proceed: I have always believed that analysis of any kind is not a one-off exercise; it should be something I can replicate at a later time, or at least something that makes it easy for me/someone else to revisit the steps and get a holistic view. Yes, there are certain elements which could be more of a black box (certain machine learning/deep learning algorithms), but whatever is not a black box must be outlined for ease of review. Personally, I do a lot of small analyses on different data-sets, and when I need to revisit something which is not summarized it’s a more difficult and cumbersome process. The other key aspect is that when I revisit something which has a summary, I immediately get new ideas/thoughts on how to take a different approach based on what I have learnt between then & now, thereby further refining the analysis and improving the output.
Just remember, it’s easy to make quick notes while actually performing the analysis, and they help not just in review but in further refinement too.
Don’t forget your initial goal/intent: You might be wondering what this is all about and why we are discussing it now, so let me explain. There have been times when I got carried away with my analysis, which eventually landed me far beyond the initial goal/intent. This isn’t a bad thing in itself, because analysis is more of an experience than a process or routine; however, it ended up delaying what I had initially set out to achieve. There could be many things we discover during an analytical journey (and get carried away by), but it’s best to remind ourselves of the goal from time to time while making a note of the other interesting aspects/approaches to revisit. I honestly don’t have a hard and fast rule to restrict myself, so when I realize I am moving off target, I go with the flow till I am comfortable that I can replicate the same when I revisit. Now, if time is of the essence, my advice is to avoid the temptation, make a quick note to come back to it later and do a course correction.
Don’t feel that the initial goal/intent is restrictive; it helps in shaping the analysis while also providing an opportunity to go beyond, one step at a time though.
Story-telling: In my initial years I truly felt this part (I used to refer to it as presentation till I realized it’s not just that) was extremely restrictive, because after hours and hours of analysis, translating all of those exciting insights into a one-slider or a small document didn’t feel natural to me. The one question which always bothered me was: if it took me all this data analysis and visualization to identify the key insights, how is it possible to replicate the same in a tiny confined space? There was no way I could communicate with conviction when so much detail is lost in translation. Over the years I have gradually realized the true importance of effective story-telling and how it can be done without losing the essence, even without translating everything from our analysis board to the slides.
Yes, it took me a while, but the realization struck when I was presenting the analysis to my team (I was doing a mock presentation before taking it forward). It started well and I was very energetic, calling out the key insights and going through loads of graphs & data, but eventually I came to a point where I, the speaker, was unable to keep up the intensity through the long presentation and the audience (my team) was confused about the point of it all and where it was headed. This helped me understand that even if I am presenting to a select set of people who have some understanding of what I am talking about, their attention span & enthusiasm (both of which are extremely short-lived) will dwindle rapidly after the first few minutes. So how can I make it better? What should I do to keep the listeners engaged/connected so that they can relate to the discussion?
This is when I realized it’s not presentation but effective story-telling that keeps the audience engaged, and I started dividing the entire effort into three parts.
Opening act – Overview: A quick summary of the issue along with a brief overview of the data (for example, data source, date range of the data analyzed, features in the data-set) and a summary of the groundwork (for example, discussions with the team or subject matter experts to understand the data, and the tools or applications used).
Second act – Insights from analysis: An overview of the key insights obtained from the data which lead us to the final act. Yes, it could be tempting to discuss & showcase all the analysis done, but don’t go overboard, as the audience will lose interest quicker than we realize; pick the key elements which help in transitioning from the opening act into the final act.
Final act – Proposal & Questions: Key conclusions to address the issue/opportunity from the opening act, and a proposed way forward using the insights from the analysis. While you will be taking questions throughout the three acts, it’s crucial to invite them during the final act, since the proposal has been put forward and you definitely need a sense check on acceptance or any additional expectations. Also, some of the questions may help bring out analysis you have performed but didn’t feature in the second act.
A few key things to remember: the narrative must be cohesive and the transitions smooth to hold the audience through the story-telling process. I am more of a visual person, and in my experience most people relate more quickly to visuals than to summaries, so choose a few visuals which will resonate with the group. Which ones may depend purely on whom you are narrating to, so it’s better if you have some idea of the expected audience; while the story needs to appeal to all, the three acts can be further customized if we know the audience.
Keep it simple: Just remember, the problem/issue in hand could be daunting, as will be the data analysis (not if you enjoy the journey, but yeah, it can be exhausting), but we need to keep the proposal/solution simple. Yes, you might argue this may not always be the case and the actual solution might be a complex one; what I mean is, break it down so that it’s easy to comprehend and easy to communicate, so that people are on board to try it. Please remember simple doesn’t mean weak or a compromise; it means dynamic, scalable and easy to implement, something which can be done with the existing infrastructure or with small changes to it. In the Final act, while presenting the proposal, it’s advisable to present more than one (up to three, not more) with the simplest one on top. The reason is simple: this way the stakeholders understand that while Proposal 1 can be implemented now, some changes to the infrastructure might be required down the line to implement Proposal 2/3, and this will resonate with them when they are reviewing/revising the strategies to bring in the changes.
Well, these are some of My Ground Rules which have helped me make data analysis an enriching experience rather than a monotonous task; these rules take care of the monotony and help us focus more on the analysis itself. Thank you!
Author – Prakash Sivaraman