How to decide what data to analyse (a five-step framework)
- Kat Greenbrook

- Apr 2
One of the most common questions in data work isn't about statistics or software. It's much more fundamental: where do I even start?
Most people working with data have access to more of it than they can meaningfully use. Not knowing which data to look at, which questions to ask, or how to connect the numbers to the outcomes that actually matter can lead to what's often called analysis paralysis, and it's completely normal.
The answer is a more intentional approach to framing the analysis before you begin. Here is a five-step process that provides that structure.

Step 1: Identify the opportunity
Start with what's not working. What would you like to improve, even if you don't yet know why it's happening? This is a practical question, not a technical one. You don't need to think like a data expert here. Think like a practitioner in your field.
In an education context, this might sound like: attendance is slipping in a particular year group, or some students aren't achieving at the same level as their peers. The opportunity doesn't need to be precisely defined, just named.
Naming the opportunity gives your analysis a direction. Without it, you're exploring data without knowing what you're looking for.
Step 2: Zoom in or out to the right level
Once you've named the opportunity, decide what level of data you need to look at. Are you focused on an individual student? A class? A year group? A demographic? Or the school or system as a whole?
This is sometimes called the "grain" of the data or the unit of analysis. Your opportunity might involve more than one level. If you're concerned about Year 10 attendance, you might look at individual students and the year group as a whole. Being explicit about this stops the analysis from sprawling in too many directions at once.
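To make the idea of grain concrete, here is a minimal Python sketch that summarises the same attendance records at two levels: per student and per year group. The record layout, the student IDs, and every number are invented for illustration; real school data will look different.

```python
from collections import defaultdict

# Hypothetical attendance records (all values invented for illustration):
# (student_id, year_group, days_present, days_enrolled)
records = [
    ("s01", "Year 9", 45, 50),
    ("s02", "Year 9", 48, 50),
    ("s03", "Year 10", 30, 50),
    ("s04", "Year 10", 44, 50),
]


def rate_by_student(rows):
    """Finest grain: one attendance rate per student."""
    return {sid: present / enrolled for sid, _, present, enrolled in rows}


def rate_by_year_group(rows):
    """Coarser grain: pool days across all students in each year group."""
    present_total = defaultdict(int)
    enrolled_total = defaultdict(int)
    for _, year_group, present, enrolled in rows:
        present_total[year_group] += present
        enrolled_total[year_group] += enrolled
    return {yg: present_total[yg] / enrolled_total[yg] for yg in present_total}


student_rates = rate_by_student(records)
group_rates = rate_by_year_group(records)
```

In this made-up data the Year 10 rate is 0.74, which hides the fact that one student sits at 0.60 and the other at 0.88. That is why it can help to look at more than one level when the opportunity calls for it.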
Step 3: Define your questions
With a clear opportunity and data level in mind, write down the specific questions you want answered. These also don't need to sound technical. The more they reflect your real goals, the more useful they'll be.
For the attendance example: which subjects have the lowest attendance rates? Are there patterns by demographic group? Has the trend changed over the past three years? Each question should connect back to the opportunity you've identified. If a question doesn't link to the opportunity, it belongs in a different analysis.
Step 4: Connect your questions to data
Now move from ideas to evidence. For each question you've written, ask: what kind of data would help answer this?
This is where you identify the specific metrics you'll need, such as attendance rate, consecutive days absent, or median days absent per student. It's fine to list more than one metric per question, and it's fine if you don't have access to all of the data yet. This step helps you see clearly what data you have and what you still need to find.
You don't necessarily need to create new data here. In most cases, the data already exists, but now you have a clearer purpose for using it.
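As a rough illustration, the three attendance metrics named in this step could be calculated from daily attendance marks like this. The data structure and function names are assumptions made for the sketch, and the marks themselves are invented.

```python
from statistics import median

# Hypothetical daily marks per student (invented): True = present, False = absent.
marks = {
    "s01": [True, True, False, True, True, True, True, True, True, True],
    "s02": [True, False, False, False, True, True, True, True, False, True],
    "s03": [True, True, True, True, True, True, True, True, True, True],
}


def attendance_rate(days):
    """Share of recorded days on which the student was present."""
    return sum(days) / len(days)


def longest_absence_streak(days):
    """Largest number of consecutive days absent."""
    longest = current = 0
    for present in days:
        current = 0 if present else current + 1
        longest = max(longest, current)
    return longest


def days_absent(days):
    """Total days absent over the recorded period."""
    return len(days) - sum(days)


rates = {sid: attendance_rate(days) for sid, days in marks.items()}
streaks = {sid: longest_absence_streak(days) for sid, days in marks.items()}
median_absent = median(days_absent(days) for days in marks.values())
```

Writing the metrics out this way also shows what raw data each one depends on. Here, all three need day-level marks, so a termly attendance percentage alone would not be enough to answer a question about consecutive absences.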
Step 5: Compare your data
Data gets its meaning from comparison. A number on its own tells you very little. The same number compared to a target, a different group, or itself over time starts to tell a story.
For attendance data, you might compare year groups against each other, track a cohort's attendance over time, or measure against a national benchmark. The comparison you choose should be driven by your questions, which were driven by your opportunity. This is what stops your analysis going down too many rabbit holes.
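The three kinds of comparison described above (against a target, against another group, and against itself over time) can be sketched in a few lines of Python. The rates and the benchmark value below are invented for illustration; the benchmark is an assumed target, not a real national figure.

```python
# Hypothetical attendance rates by year group and school year (invented values).
rates = {
    "Year 9": {2022: 0.93, 2023: 0.92, 2024: 0.91},
    "Year 10": {2022: 0.90, 2023: 0.86, 2024: 0.82},
}

BENCHMARK = 0.90  # assumed target for the sketch, not a real national figure


def gap_to_benchmark(rate, benchmark=BENCHMARK):
    """Comparison to a target: positive means above the benchmark."""
    return round(rate - benchmark, 4)


def change_over_time(series):
    """Comparison to itself: latest year minus earliest year."""
    years = sorted(series)
    return round(series[years[-1]] - series[years[0]], 4)


def gap_between_groups(data, group_a, group_b, year):
    """Comparison to another group in the same year."""
    return round(data[group_a][year] - data[group_b][year], 4)


year10_vs_target = gap_to_benchmark(rates["Year 10"][2024])
year10_trend = change_over_time(rates["Year 10"])
year9_vs_year10 = gap_between_groups(rates, "Year 9", "Year 10", 2024)
```

On its own, the Year 10 figure of 0.82 says little. Each comparison adds context: in this invented data it sits below the assumed target, it has fallen since 2022, and it trails Year 9 in the same year.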

Why this matters for data storytelling
Working through these five steps before touching the data changes what you find and how you communicate it. When you start with an opportunity rather than a dataset, your analysis has direction. When your questions are explicit, your findings are easier to explain. When your comparisons are deliberate, the story in the data is easier to see.
Most data communication problems start before the communication, when the analysis itself lacks a clear purpose. This framework is a way of building clarity in from the start.
Understanding data storytelling can also sharpen your analysis. When you're clear on the questions you need to answer and the audience you're answering them for, you look in the right places, and you're less likely to get lost in data that doesn't serve the work. The Data Storyteller's Handbook and the Rogue Penguin workshops cover that full process.
Kat Greenbrook is a data storytelling consultant, author, and workshop facilitator based in Wellington, New Zealand. She is the founder of Rogue Penguin and the author of The Data Storyteller's Handbook.



