
The Design Psychologist | Psychology for UX, Product, Service, Instructional, Interior, and Game Designers
Welcome to The Design Psychologist, a podcast where we explore the intersection of psychology and design. The show is hosted by Thomas Watkins, a design psychologist who has spent years applying behavioral science principles to the creation of digital products.
We sit down with a variety of experts who apply psychology in different ways to the design of the world around us. Thomas uses his expertise to guide conversations that provide practical advice while illuminating the theory behind why designs succeed.
Tune in if you are a design practitioner who seeks to understand your work on a deeper level and craft experiences that are intuitive, effective, and delightful.
How Well Do Our Words Reflect Our Inside World? A psychological perspective on the limits of self-report, introspection, and understanding the human mind
How much can you trust what users tell you?
In this solo episode, we dive into one of the most slippery yet essential tools in UX research: self-reporting. From interviews to surveys, self-reports are everywhere—but they come with hidden psychological traps.
We explore:
- Why self-reported data can be both useful and misleading
- The psychological reasons people often misrepresent their own behavior
- When to trust what users say—and when to dig deeper
- The subtle difference between described and observed behavior
If you’ve ever relied on user quotes to justify a design decision—or been burned by data that didn’t translate to real-world outcomes—this episode will give you a sharper lens for interpreting what users say versus what they do.
Tune in to sharpen your research instincts and make your design decisions more psychologically grounded.
WEBVTT
00:00:00.017 --> 00:00:04.637
Suppose you're designing something new that helps people accomplish a goal.
00:00:04.837 --> 00:00:11.877
Maybe it's a website that sells a product or a procedure someone follows to deliver a service.
00:00:12.097 --> 00:00:16.037
Or maybe it's an instructional curriculum or a mobile app.
00:00:16.217 --> 00:00:20.777
And let's say you want to know whether it really works, if it's clear,
00:00:21.057 --> 00:00:24.857
usable, and actually helpful. So you ask people.
00:00:25.137 --> 00:00:30.997
Seems logical, right? But what if I told you that people often cannot accurately
00:00:30.997 --> 00:00:33.717
explain why they do what they do?
00:00:34.097 --> 00:00:39.437
Today we'll be getting into the idea of self-report data, the stuff people tell
00:00:39.437 --> 00:00:43.237
us when we ask them about their choices, actions, and feelings.
00:00:43.637 --> 00:00:49.257
In this episode, we'll unravel one of the biggest challenges in design research.
00:00:49.517 --> 00:00:53.997
Can we trust what users say about their own experiences?
00:00:54.977 --> 00:01:00.377
For well over a century, psychologists have tried to understand what's really
00:01:00.377 --> 00:01:05.497
going on inside people's heads, how they think, decide, and behave.
00:01:05.977 --> 00:01:11.957
Brain scans, a relatively new-ish technology, offer a peek inside the brain,
00:01:12.097 --> 00:01:15.557
but they're expensive, complex, they're hard to access.
00:01:16.077 --> 00:01:23.117
Other methods exist, like lab-based experiments that use reaction times or priming
00:01:23.117 --> 00:01:25.697
to infer what people are thinking,
00:01:25.817 --> 00:01:31.377
but they're not always practical to use in the real world and applied research settings.
00:01:31.697 --> 00:01:34.657
The easiest option is just ask.
00:01:34.917 --> 00:01:41.097
That's the self-report method. But while it's simple, it's also notoriously unreliable.
00:01:41.317 --> 00:01:46.577
People often misremember, they rationalize and they invent things,
00:01:46.597 --> 00:01:49.897
or sometimes they just don't know why they do what they do.
00:01:50.277 --> 00:01:56.217
In today's episode, we're exploring when self-report data can give us useful
00:01:56.217 --> 00:02:01.617
insights versus when it can seriously mislead our design process.
00:02:01.917 --> 00:02:06.117
So I'd like you to ruminate on a few questions throughout this episode.
00:02:06.377 --> 00:02:11.597
Why do people misjudge their own behavior? And what's the difference between
00:02:11.597 --> 00:02:17.477
asking the user of a product what they think versus observing what they do?
00:02:17.477 --> 00:02:22.477
In this episode, I'll break down the research on self-report data and I'll discuss
00:02:22.477 --> 00:02:28.657
techniques that can help us conduct user research that gives real insight into
00:02:28.657 --> 00:02:30.597
what drives people's behavior.
00:02:31.565 --> 00:02:36.325
Let's start our discussion with introspection. I want you to imagine this.
00:02:36.545 --> 00:02:43.005
In the quiet of Wilhelm Wundt's laboratory, Heinrich sits patiently at a wooden table.
00:02:43.285 --> 00:02:48.805
Beside him, a research assistant stands holding the first of three objects,
00:02:49.005 --> 00:02:52.005
ready to hand them to Heinrich one by one.
00:02:52.465 --> 00:02:58.605
The research assistant carefully places the first object, a rock, into Heinrich's hand.
00:02:58.905 --> 00:03:06.265
Heinrich grips it gently, and in his trained response, he begins a methodical introspective report.
00:03:07.245 --> 00:03:11.965
It's heavier than I expected, he says. The weight presses down immediately into
00:03:11.965 --> 00:03:14.725
my palm, creating the sensation of density.
00:03:14.985 --> 00:03:18.605
I can feel the rock's mass pulling slightly as I lift it.
00:03:19.305 --> 00:03:23.445
Wundt listens intently, and his assistant jots down notes with precision.
00:03:24.285 --> 00:03:27.865
Heinrich continues, his attention now focused on the texture.
00:03:28.625 --> 00:03:32.485
The surface is uneven, rough in some places, and smoother in others.
00:03:33.105 --> 00:03:38.365
There's a prominent groove here along the side. He traces his finger over it,
00:03:38.645 --> 00:03:40.985
describing the sensation as he experiences it.
00:03:41.205 --> 00:03:43.905
The groove feels deeper than the rest of the surface.
00:03:44.285 --> 00:03:49.125
It's like it's been cut away or worn away. The rock was cool in my hand,
00:03:49.225 --> 00:03:50.545
but it's slightly warming now.
00:03:50.905 --> 00:03:56.425
He places the rock on the table, and the assistant hands him a second rock, a larger piece.
00:03:57.245 --> 00:04:01.165
Heinrich takes it and immediately notices the difference. This one is larger,
00:04:01.405 --> 00:04:03.885
but it's lighter than the first, he reports.
00:04:04.465 --> 00:04:08.385
It feels hollow or less dense than I expected for its size.
00:04:08.565 --> 00:04:13.325
There's no strain in holding it. I can feel a distinct lightness compared to the previous rock.
00:04:13.785 --> 00:04:18.185
His fingers begin exploring the surface again. The texture is much rougher.
00:04:18.505 --> 00:04:21.845
There's a sharp ridge along the edge. It's much more jagged.
00:04:22.345 --> 00:04:26.765
He turns the rock around in his hand. It's colder than the first rock, and it's
00:04:26.765 --> 00:04:31.785
staying cold longer. After finishing with the second rock, Heinrich sets it down,
00:04:31.785 --> 00:04:33.265
and the assistant hands him a third.
00:04:34.140 --> 00:04:38.500
Now, this story was a dramatized version of what a laboratory scene might have
00:04:38.500 --> 00:04:44.780
been like for Wilhelm Wundt, the father of psychology, in Leipzig, Germany, in the 1860s.
00:04:44.940 --> 00:04:49.540
If that session didn't sound terribly scientific to you, then you're right.
00:04:49.840 --> 00:04:51.820
Introspection was riddled with problems.
00:04:52.260 --> 00:04:54.040
But we had to start somewhere.
00:04:54.760 --> 00:04:59.380
Psychology has this challenge of trying to study something that is unseen.
00:04:59.760 --> 00:05:03.980
While technically it is the study of behavior which can be seen,
00:05:04.140 --> 00:05:08.660
and that's why the behaviorist movement of the early 1900s wanted to focus on
00:05:08.660 --> 00:05:10.560
behavior to be more scientific,
00:05:11.220 --> 00:05:16.520
in psychology we're still trying to theorize or conjecture about what's going
00:05:16.520 --> 00:05:20.720
on inside the proverbial black box of the human mind.
00:05:21.120 --> 00:05:25.540
So for example, we can see the inputs going in the black box,
00:05:25.660 --> 00:05:30.180
that's the stimuli, and we can see the outputs that came out of the black box,
00:05:30.360 --> 00:05:31.120
and that's the behavior,
00:05:31.480 --> 00:05:35.660
but we can't see what's inside the black box, which is everything in between
00:05:35.660 --> 00:05:36.940
the input and the output.
00:05:37.720 --> 00:05:41.220
Nowadays, we've come a lot closer to being able to see the black box.
00:05:41.300 --> 00:05:46.080
For example, we have fMRI machines and other brain imaging techniques where
00:05:46.080 --> 00:05:50.620
we can see brain activity going on inside the black box in real time.
00:05:51.080 --> 00:05:55.160
For years, we didn't have that. And even though we do have that,
00:05:55.300 --> 00:05:56.920
it still doesn't tell us everything.
00:05:57.140 --> 00:06:02.440
For example, if you're in an experiment and the experimenter gives you a complicated
00:06:02.440 --> 00:06:08.000
math problem to solve while you're in an fMRI machine, we might see images of
00:06:08.000 --> 00:06:10.140
your prefrontal cortex lighting up.
00:06:10.220 --> 00:06:13.440
That means it's being activated while you're trying to solve that problem,
00:06:13.440 --> 00:06:17.900
but we still can't see what the neurons are doing on the neural level.
00:06:18.200 --> 00:06:22.940
For example, are they computing a mean or are they comparing slopes?
00:06:23.200 --> 00:06:26.040
What are they doing? It's still somewhat of a black box.
00:06:27.001 --> 00:06:31.441
So we're stuck with clever laboratory techniques most of the time in order to
00:06:31.441 --> 00:06:33.381
understand the human mind.
00:06:33.941 --> 00:06:39.501
In fact, the history of the psychology lab could be described as a long series
00:06:39.501 --> 00:06:46.001
of scientists inventing clever ways of drawing conclusions about something that's invisible,
00:06:46.261 --> 00:06:52.221
human cognition or the mind, by observing something that is visible, behavior.
00:06:52.581 --> 00:06:56.101
So it's difficult to understand what's inside the black box.
00:06:56.101 --> 00:07:01.841
It also turns out that there's this attitude versus behavior problem that makes
00:07:01.841 --> 00:07:05.421
it ineffective to ask people what's going on inside their minds.
00:07:05.801 --> 00:07:11.521
For example, sociologists started discovering in the 1930s that the opinions people
00:07:11.521 --> 00:07:15.881
gave on surveys didn't necessarily match their behavior.
00:07:16.721 --> 00:07:20.941
Leon Festinger developed the theory of cognitive dissonance,
00:07:21.081 --> 00:07:26.361
which is a tension between attitude and behavior that a person attempts to reconcile.
00:07:26.981 --> 00:07:31.761
So attitude and behavior are not the same thing, and it's usually attitude that
00:07:31.761 --> 00:07:38.021
you get when you ask someone how they feel, rather than an accurate description of their behavior.
00:07:38.601 --> 00:07:43.341
And then there's the fact that it's just generally difficult for people to be
00:07:43.341 --> 00:07:47.561
able to articulate their true reasons for doing things.
00:07:48.061 --> 00:07:54.121
There was a famous study in the 1970s where social psychologists Richard Nisbett
00:07:54.121 --> 00:08:00.161
and Timothy Wilson demonstrated that people can be very bad at explaining why
00:08:00.161 --> 00:08:01.481
they made certain decisions.
00:08:01.941 --> 00:08:08.061
In this experiment, participants sat down and had several pairs of stockings placed in front of them.
00:08:08.281 --> 00:08:13.681
The task was to choose the best pair. People would pick them up, touch them, compare
00:08:13.681 --> 00:08:18.881
them, and they'd make a choice and say something like, oh, I picked these, these are way better.
00:08:19.673 --> 00:08:24.273
When asked why, people confidently reported all kinds of different reasons.
00:08:24.433 --> 00:08:27.133
They might have talked about the softness or the texture.
00:08:27.393 --> 00:08:30.193
They might have even talked about the quality of the stitching.
00:08:30.813 --> 00:08:34.093
The only problem was that all the stockings were identical.
00:08:35.133 --> 00:08:39.033
People unknowingly made up reasons for picking one of the two pairs.
00:08:39.433 --> 00:08:42.293
And they also tended to pick the pair that was on the right.
00:08:42.473 --> 00:08:47.693
This was an effect that Nisbett and Wilson ended up calling right-hand bias.
00:08:48.373 --> 00:08:53.753
Nisbett and Wilson concluded that there are certain high-level cognitive processes
00:08:53.753 --> 00:08:58.353
that we just don't have access to when trying to describe what we're thinking
00:08:58.353 --> 00:09:00.113
and why we're doing what we're doing.
00:09:00.713 --> 00:09:05.753
So, returning to our question of introspection, we walked through our first
00:09:05.753 --> 00:09:11.413
story about Wundt in his laboratory doing introspection in the late 1860s.
00:09:11.433 --> 00:09:17.233
But let's take a look at what happened 100 years later in a laboratory on the other side of the world.
00:09:17.693 --> 00:09:23.513
In the late 1960s, inside a cognitive psychology laboratory at Carnegie Mellon,
00:09:24.413 --> 00:09:29.053
Allen Newell and Herbert Simon sat at a table with a research participant.
00:09:29.053 --> 00:09:35.333
This experiment was intended to offer new insights into human problem-solving.
00:09:36.473 --> 00:09:40.973
Newell and Simon were pioneers in artificial intelligence, and they were very
00:09:40.973 --> 00:09:48.153
interested in the logical steps and the elements involved in the problem-solving process.
00:09:48.633 --> 00:09:54.833
In front of the participant were three wooden pegs and a stack of circular disks on the first peg.
00:09:55.153 --> 00:09:59.833
This is the Tower of Hanoi problem. It's a classic problem-solving test that
00:09:59.833 --> 00:10:04.833
requires moving the disks from the first peg to the third, and you've got to
00:10:04.833 --> 00:10:06.253
follow a strict set of rules.
00:10:06.613 --> 00:10:11.873
You can only move one disk at a time, and you can't put a larger disk on top of a smaller one.
00:10:12.985 --> 00:10:19.225
A tape recorder was nearby, ready to capture every moment of the participant's verbalizations.
00:10:20.265 --> 00:10:24.045
Newell presses the record button and instructs the participant by saying,
00:10:24.265 --> 00:10:27.465
tell us everything you're thinking as you work through the problem,
00:10:27.685 --> 00:10:29.765
and he signals the start of the session.
00:10:30.225 --> 00:10:34.965
The participant first stares at the Tower of Hanoi puzzle, thinking out loud,
00:10:35.245 --> 00:10:39.245
I need to move the largest disk, but it's blocked by the smaller ones.
00:10:40.185 --> 00:10:44.325
Newell and Simon listened for the first sign of strategy to be verbalized.
00:10:44.565 --> 00:10:46.425
This was the heart of their research,
00:10:46.605 --> 00:10:51.505
using verbal protocols to capture cognitive processes in real time.
00:10:51.685 --> 00:10:55.825
As the participant moved the first small disk and continued to speak,
00:10:56.025 --> 00:10:58.685
his thoughts flowed as an uninterrupted stream.
00:10:58.985 --> 00:11:04.265
He says, I'll start by moving the smallest disk to the second peg so I can free
00:11:04.265 --> 00:11:10.545
up space to move the others, then carefully picking up the small piece to put it on the middle peg.
00:11:10.725 --> 00:11:17.465
He continues, now I can move the second smallest one to the third peg. And he continues.
00:11:18.065 --> 00:11:22.825
Now, as you listen to those two different scenes, the introspection scene from
00:11:22.825 --> 00:11:29.785
the 1860s and the verbal protocol scene from the 1960s, they might seem awfully similar.
00:11:30.125 --> 00:11:34.825
They both involve a person talking as they do something, and the researcher is
00:11:34.825 --> 00:11:37.945
recording their words and using their words as data.
00:11:38.205 --> 00:11:43.685
So why is it that one of them is considered a long-debunked methodology while
00:11:43.685 --> 00:11:46.885
the other one is embraced as a valid research technique?
00:11:47.385 --> 00:11:49.665
There's a few differences to keep in mind.
00:11:50.551 --> 00:11:56.851
The first one is that introspection was making some big claims that the verbal protocol is not.
00:11:57.431 --> 00:12:03.191
Introspection was claiming to identify core elements of the human mind,
00:12:03.371 --> 00:12:08.131
kind of in the same way that physics can identify elements and build them up in pieces.
00:12:08.131 --> 00:12:11.171
That's what the structuralist movement, that's the
00:12:11.171 --> 00:12:13.931
movement that was using introspection, that's what
00:12:13.931 --> 00:12:17.511
they were trying to do by picking apart
00:12:17.511 --> 00:12:20.551
the human mind into little elements and then
00:12:20.551 --> 00:12:27.131
building it back up into higher-level cognitive constructs. Verbal protocol isn't
00:12:27.131 --> 00:12:31.551
trying to do that. In verbal protocol, you're making a much more modest claim.
00:12:31.551 --> 00:12:36.911
You're simply saying that as somebody is doing this particular behavior that
00:12:36.911 --> 00:12:39.711
I'm observing, and that's the main piece of data,
00:12:39.711 --> 00:12:46.271
I also have an insight into maybe what step they're on or whether or not they're confused.
00:12:46.991 --> 00:12:52.211
So in introspection, the words are the main data and they're presumed to have
00:12:52.211 --> 00:12:56.951
a direct and strong attachment to real things that are going on inside the mind.
00:12:57.231 --> 00:12:59.671
And it was a very unreliable method.
00:12:59.971 --> 00:13:01.991
Different labs got very different results.
00:13:02.431 --> 00:13:06.871
When it comes to self-report data, you've got to be careful.
00:13:07.191 --> 00:13:13.571
Popular business activities like focus groups and customer feedback can be very
00:13:13.571 --> 00:13:17.091
misleading if they're used in the wrong way, and they often are.
00:13:17.331 --> 00:13:22.051
There are more reliable methods of obtaining self-report data.
00:13:22.051 --> 00:13:27.751
Consider things like well-constructed questionnaires and in-depth interviews.
00:13:28.391 --> 00:13:33.991
However, these should be conducted by people who are knowledgeable and skillful
00:13:33.991 --> 00:13:36.211
about how to use those methods.
00:13:36.511 --> 00:13:41.831
You want good data that reflects reality because it was gathered with methods
00:13:41.831 --> 00:13:45.831
that have built-in safeguards against human bias.
00:13:46.511 --> 00:13:51.731
Bias comes from all directions. So when an interviewer is asking questions,
00:13:52.071 --> 00:13:54.051
they can have bias in the way they're asking.
00:13:54.751 --> 00:13:58.671
And when someone's responding, they can also respond with bias,
00:13:58.671 --> 00:14:05.611
and it takes an expert to recognize that they're responding with some kind of a bias and tease it out.
00:14:06.091 --> 00:14:12.751
Now, how do we ensure that the user feedback we gather is both accurate and
00:14:12.751 --> 00:14:15.751
useful to our design process?
00:14:16.351 --> 00:14:22.791
One strategy is to pair self-report data with observable behavioral data.
00:14:22.791 --> 00:14:26.191
Imagine watching someone browse a shopping website.
00:14:26.471 --> 00:14:30.191
If they suddenly stop scrolling, that's a behavioral cue.
00:14:30.671 --> 00:14:35.211
Ask them what's going on and you'll get their perspective, but you cross-check
00:14:35.211 --> 00:14:37.091
it with an observable action.
00:14:37.331 --> 00:14:42.671
That's how you bridge the gap between what users say and what they actually do.
00:14:44.097 --> 00:14:50.337
To hone the skill of interpreting self-report data, there's some good readings and ways to practice.
00:14:50.357 --> 00:14:55.697
I recommend investing a little time in learning qualitative research methods.
00:14:55.937 --> 00:15:01.557
For example, a book called Observing the User Experience by Mike Kuniavsky.
00:15:01.917 --> 00:15:07.677
Or check out the book Time to Listen by Indi Young, who I interviewed for this
00:15:07.677 --> 00:15:09.497
season of The Design Psychologist.
00:15:09.497 --> 00:15:16.317
She fascinatingly redefined qualitative research and she proposes a method for
00:15:16.317 --> 00:15:21.717
extracting concepts from people's minds in a way that's more effective and empathetic.
00:15:22.657 --> 00:15:26.877
So remember that self-report data is far from perfect.
00:15:27.077 --> 00:15:32.697
It has its limitations, but self-reporting can be valuable if you have skilled
00:15:32.697 --> 00:15:40.377
researchers guiding the process and if you combine that self-report insight with real behavior.
00:15:41.237 --> 00:15:48.997
The key is to refine your methods, watch for bias, and validate findings with multiple data sources.
00:15:49.877 --> 00:15:55.757
In the end, self-report data is a tool, imperfect but powerful when we use it with care.
00:15:56.117 --> 00:16:00.837
It reminds us that understanding people is never as simple as just asking.
00:16:01.197 --> 00:16:07.937
Our ability to articulate what we need is often limited, but great design shouldn't be.
00:16:08.097 --> 00:16:15.177
It takes nuance, empathy, and skill to uncover what people truly think, feel, and do.
00:16:15.917 --> 00:16:20.617
That's a challenge of design psychology, seeing the full human picture.
00:16:20.957 --> 00:16:26.557
So don't just take answers at face value because often the quality of what we
00:16:26.557 --> 00:16:30.837
design is tied to the depth of what we understand about people.
00:16:31.137 --> 00:16:36.337
The more clearly we see the people we're designing for, the better we can shape
00:16:36.337 --> 00:16:38.397
experiences that truly resonate.