What is social data literacy?

Kat Greenbrook
Apr 21

Every dataset is the product of choices. Someone decided what to measure. Someone decided how to categorise the people in it. Someone decided which questions were worth asking and which weren't. Someone decided how to present the findings.

Most of the time, those choices go unnoticed because they have become defaults, and defaults are invisible. When you've always measured something a particular way, it just feels like how it's done.

Social data literacy is the ability to see those choices: to recognise how social context shapes the way data is created, interpreted, and communicated, and to factor that into your work. It's a concept I've been developing over the past year, and it's the foundation of my current writing.

Figure: Venn diagram with four circles, "Social Context," "Data Creation," "Data Interpretation," and "Data Communication," overlapping at "Social Data Literacy."

Where the choices hide

The choices show up at every stage of the data process: creation, interpretation, and communication.

Data creation: someone decides what gets measured and what doesn't, which communities are counted, and which categories are used to group people. See "The myth of neutral data."
Data interpretation: someone decides what the data means. The same numbers can support very different conclusions depending on the questions an analyst brings to them.
Data communication: someone decides what to include and what to leave out. These choices shape what an audience understands and what actions become available to them.

At every stage, you can make these choices actively or you can rely on defaults. Defaults aren't necessarily wrong. Sometimes the standard approach is the right one. The problem is when defaults are applied without awareness, and assumptions go unexamined.

What it looks like in practice

Social data literacy requires a habit of pausing at each stage of the data process to ask whose perspective is shaping what you're seeing.

A survey that groups everyone over 65 together feels like a neutral design choice. But it makes invisible the very different experiences of someone who is 66 and someone who is 95. And any patterns within that range disappear from the data entirely. Noticing that is a creation question: whose experience does this make invisible?
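To make the age-band example concrete, here is a minimal sketch in Python. The responses, the 1 to 10 wellbeing score, and the band boundaries are all invented for illustration; they are assumptions, not figures from any real survey. The sketch simply averages the same responses twice, once with a single "65+" band and once with narrower bands.

```python
# Hypothetical survey responses as (age, wellbeing score from 1 to 10).
# These values are invented purely to illustrate the point.
responses = [(66, 8), (68, 7), (71, 8), (79, 6), (84, 4), (88, 3), (95, 2)]


def average_by_band(responses, bands):
    """Return the mean score for each (label, low, high) age band, inclusive."""
    result = {}
    for label, low, high in bands:
        scores = [score for age, score in responses if low <= age <= high]
        if scores:  # skip bands that contain no respondents
            result[label] = round(sum(scores) / len(scores), 1)
    return result


# Default design: everyone over 65 in one category.
single_band = [("65+", 65, 120)]

# An alternative design choice: narrower bands (also an assumption).
finer_bands = [("65-74", 65, 74), ("75-84", 75, 84), ("85+", 85, 120)]

coarse = average_by_band(responses, single_band)
fine = average_by_band(responses, finer_bands)

print("One band:   ", coarse)
print("Three bands:", fine)
```

With these invented values, the single band reports one middling average, while the narrower bands show scores falling steadily with age. Neither grouping is the correct one. The narrower bands are a choice too, and with small samples they bring their own problems.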

When one team consistently scores lower than others in an engagement survey, the most familiar interpretation is a performance problem. But if you ask what the organisational context tells you (three restructures in two years), this changes the question from what is wrong with this team to what has been done to this team. That's an interpretation question: what might I be missing?

A report showing that exits from emergency housing increased after a policy change can accurately describe what happened. But it can also leave out where people went after they exited. Adding rough sleeping data from the same period (which had also increased) changes what the audience understands and what they can act on. That's a communication question: what becomes invisible based on what I choose to include?

You can make these choices actively, or you can rely on defaults. Social data literacy is knowing the difference.

Why social data literacy matters now

Data has traditionally been created, interpreted, and communicated by people. People made choices at every stage, consciously or not. As AI takes over more of these stages, the choices become harder to see, which makes the ability to notice them more important, not less.

This is the focus of my current research and writing. If you'd like to follow along as these ideas develop, the best place to do that is my newsletter.

Kat Greenbrook is a data storytelling consultant, author, and workshop facilitator based in Wellington, New Zealand. She is the founder of Rogue Penguin and the author of The Data Storyteller's Handbook.