A follow-up to the Mission to Mars post, which describes the experimental work. This one is about the methodology layer underneath it — specifically, what I got wrong.
The Setup
My background is in physics. I ended up in physics education research sideways, through the astro-lab project and through a genuine interest in why students find physics so alienating and what might help. When it came time to frame that work as a thesis, I had to choose a methodology.
I chose design thinking. Or more precisely, I chose something that borrowed heavily from design-based research and design thinking frameworks and that felt, at the time, like the obvious match for what I was doing. I was designing experiments. I was iterating on them. I was testing them with students and refining them. Design thinking is a framework for exactly this process. What could be more natural?
Several people told me I was making a mistake: colleagues with more qualitative research experience, and a supervisor who had been through the methodology debates in education research more times than he wanted to count. The consistent advice was: use grounded theory. Be systematic about your data. Let the categories emerge from what you actually observe rather than from what you designed the experiment to produce.
I thought I understood what they were saying. I did not understand what they were saying.
What I Thought Design Thinking Gave Me
Design thinking, as a research framing, offered what felt like a clean correspondence between method and subject matter. The thing I was producing was a designed artifact — a teaching experiment. The process I was following was inherently iterative: run it, observe what happens, revise, run it again. The framework had a vocabulary for this (empathise, define, ideate, prototype, test) that matched my actual working process.
Design-based research, the academic version of this approach in education, has a real literature behind it. It is used in educational technology research and in curriculum development. It is not a made-up category. The argument for it is reasonable: if you are trying to design effective educational interventions, then designing and studying those interventions at the same time is a coherent research strategy.
What I told myself was: I am doing design-based research. The methodology matches the work. The thesis will describe the design process, the rationale for each design decision, the iterative refinements, and the evidence that the final design works. This is a contribution to knowledge because it produces a principled, evidence-informed design that other practitioners can use and adapt.
This is not wrong. But it is not enough for a thesis. And I only understood why it is not enough after I had spent considerable time trying to make it be enough.
The Reckoning in the Methodology Chapter
The methodology chapter of a thesis is where you have to be explicit about the epistemological status of your claims. You are not just describing what you did. You are explaining why the thing you did counts as knowledge production, what kind of knowledge it produces, and how someone else could evaluate whether you did it correctly.
This is where design thinking started to come apart.
The generalisation problem. What kind of claim does a design study make? The honest answer is: it makes a claim about this design, in these contexts, with these students. It does not easily generalise beyond that. If I show that the Mission to Mars experiment produces measurable improvements in students’ understanding of air pressure in a student lab context at the University of Cologne in 2019, the implication for other teachers in other contexts is… unclear. The design worked here. Maybe it will work for you. Good luck.
A thesis contribution needs to be something more transferable than that. It needs to produce knowledge about a phenomenon, not just knowledge about a specific designed object. “Here is a well-designed experiment” is a practitioner contribution, which is genuinely valuable, but it is not the same as a theoretical contribution to the field.
The iteration problem. Design thinking celebrates iterative refinement. But in a thesis, every iteration needs to be motivated by evidence, and both the nature of that evidence and how it maps onto the design changes need to be made explicit. If I changed something between version 1 and version 2 of the experiment, the methodology chapter must explain: what data told me to make that change? How did I analyse it? What coding framework did I apply? What alternative changes did I consider and rule out, and on what grounds?
Design thinking has no systematic answer to these questions. It has process descriptions (“we tested with users and gathered feedback”) but not research methodology answers (“I applied open coding to the think-aloud protocols and the following categories emerged, which pointed toward this specific revision”). Without that precision, the “iteration” in the methodology chapter looks like: I tried it, it did not quite work, I made it better. Which is honest but not a researchable process.
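To make the contrast concrete, here is a minimal sketch of what an evidence-motivated revision could look like as data. Everything in it is hypothetical: the excerpts, the codes, and the category names are invented for illustration, not taken from my actual protocols.

```python
# A sketch of open coding think-aloud excerpts and grouping them into
# categories that can justify a specific design revision. All data invented.

from collections import defaultdict

# Each excerpt gets one or more open codes during a first pass.
excerpts = [
    ("S03", "the air outside sucked the lid off", ["vacuum-as-agent"]),
    ("S07", "the air inside pushed it out",       ["internal-push"]),
    ("S11", "I don't know which way it went",     ["direction-unclear"]),
    ("S12", "the vacuum pulled it, like a straw", ["vacuum-as-agent", "suction-analogy"]),
]

# Group coded excerpts into categories (axial coding, crudely).
categories = defaultdict(list)
for student, text, codes in excerpts:
    for code in codes:
        categories[code].append((student, text))

# The point: a revision is justified by pointing at a named category with
# counted instances, not at a vague impression that "it did not quite work".
for code, instances in sorted(categories.items()):
    print(f"{code}: {len(instances)} instance(s)")
    # e.g. a large 'vacuum-as-agent' category -> revise the demo script to
    # force an explicit push-vs-pull prediction before the lid pops.
```

The coding itself remains human judgment; the value of the structure is that each design change can be traced back to a category with evidence behind it.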
The validation problem. Design-based research often validates its designs against the criteria that motivated the design. I designed the experiment to address specific student misconceptions about air pressure. I then tested whether students who did the experiment had fewer of those misconceptions afterward. If the answer is yes, the design is validated.
But this is circular in a way that becomes visible under examination. The misconceptions I targeted were the ones I identified at the start. The students I studied were the ones who came to my lab. The measurement instrument I used was one I designed to detect the specific changes I expected the design to produce. The whole system is oriented toward confirming the design rather than discovering something about the phenomenon.
Grounded theory cuts this loop. You start with the data — the students’ actual responses, their misconceptions as they express them, the things that confuse them that you did not anticipate — and you build categories from the bottom up. What you end up with is a theory of how students actually think about air pressure (or whatever the topic is), which may or may not match what you assumed when you designed the experiment. The cases where it does not match are precisely where the theoretical contribution lives.
What Grounded Theory Would Have Required
Grounded theory, done properly, is laborious. The Glaserian version (open coding, theoretical sampling until saturation, constant comparative method) requires treating every interview, every observation, every student response as a data source to be systematically analysed, compared, and connected into a coherent theory.
Theoretical sampling means you do not decide in advance how many students to study or what contexts to observe. You keep gathering data until new cases stop producing new categories — until the theory is saturated. This is methodologically sound and practically painful, because you cannot know in advance when you will be done.
Memoing — writing ongoing analytical notes about the emerging categories and their relationships — is a discipline that forces you to be explicit about your reasoning at every step. Not just “these two responses seem similar” but “these two responses are similar because both students are treating pressure as a property of moving air, and here is how that connects to the misconception documented by [citation].”
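Put together, the workflow looks roughly like the sketch below. It is a caricature: the human judgment of open coding is faked with a keyword lookup, and the batch structure, the stopping rule, and all the category names are my own illustrative assumptions, not a faithful rendering of Glaser's procedure.

```python
# Theoretical sampling with a saturation check and memo records. All
# batches, codes, and the stopping rule are hypothetical illustrations.

known_categories: set[str] = set()
memos: list[str] = []
SATURATION_WINDOW = 2  # stop after this many consecutive batches with no new category

def code_batch(batch: list[str]) -> set[str]:
    """Stand-in for the human work of open coding one batch of responses."""
    # In reality this is analytic judgment; here we fake it with a lookup.
    lexicon = {"sucked": "vacuum-as-agent", "pushed": "internal-push",
               "straw": "suction-analogy", "unsure": "direction-unclear"}
    return {code for resp in batch for word, code in lexicon.items() if word in resp}

batches = [
    ["it sucked the lid off", "the air pushed it"],
    ["like drinking from a straw", "it sucked it"],
    ["I'm unsure which way it went"],
    ["it pushed out", "it sucked"],     # nothing new: saturation approaching
    ["the inside air pushed the lid"],  # nothing new again -> stop
]

quiet_batches = 0
for i, batch in enumerate(batches, start=1):
    new = code_batch(batch) - known_categories
    known_categories |= new
    memos.append(f"batch {i}: new categories {sorted(new) or 'none'}")
    quiet_batches = 0 if new else quiet_batches + 1
    if quiet_batches >= SATURATION_WINDOW:
        print(f"saturated after batch {i}; categories: {sorted(known_categories)}")
        break
```

The uncomfortable part is visible even in the toy version: you cannot know how many batches you will need until the data stops surprising you.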
I did not want to do this. I wanted to design experiments. Grounded theory felt like a detour from the thing I was actually interested in.
The advice I received was: this is not a detour. A systematic analysis of what students think about air pressure, and how they think about it, and what experiences shift their thinking, is a theoretical contribution that would make the experiments more useful to everyone — not just a record of experiments that worked in one lab in one city in one year.
They were right about this.
What I Actually Learned (Too Late to Use in the Thesis)
The most useful student responses in the Mission to Mars experiment were not the ones that confirmed the design was working. They were the unexpected ones.
The PVC pipe failure — the moment when the lid pops off and students hear the sound — was included because I thought it would demonstrate the direction of the pressure force in a visceral way. What I observed, which I noted but did not systematically analyse, was that different students interpreted the pop differently. Some immediately understood it as the internal air pushing out. Others interpreted it as the external vacuum pulling the lid. A few were unsure which way the force had been directed even after the event.
A grounded theory analysis of those responses would have produced something genuinely interesting: a typology of how students process a demonstrable physical event when it conflicts with their existing pressure intuitions. That typology would have been transferable to other experimental contexts, other pressure scenarios, other situations where students encounter the vacuum-suction confusion.
Instead I noted it, described it qualitatively, and moved on because it was not what the design was optimised to produce.
That is the design thinking trap. You are so focused on the designed outcome that you treat unexpected observations as noise rather than as data. Grounded theory treats them as the most valuable data you have.
A Note for Other Physicists Entering Education Research
If you are coming from a natural science background and you are starting work in education research, the methodology question will feel foreign at first. In physics, methodology is largely a matter of technical choice — which instrument, which statistical test, which model. The epistemological questions (what kind of knowledge does this produce? how does it generalise?) are handled by the experimental framework itself, which is a known, shared, peer-reviewed practice.
In qualitative education research, those questions are not handled in advance. You have to work them out explicitly, for your specific study, in writing. This is uncomfortable for people trained in a tradition where you do the experiment and then write up what happened.
The temptation, for a physicist, is to choose a methodology that feels like a framework for doing things rather than one that feels like a framework for thinking about what you found. Design thinking is a framework for doing things. Grounded theory is a framework for thinking about what you found.
Both are legitimate. But a thesis needs to make a theoretical contribution, and theoretical contributions come from systematic analysis of phenomena, not from documentation of designed objects.
I would have finished faster and understood more if I had done the uncomfortable thing from the start.
The experimental work this post is commenting on is described in Mission to Mars. For a more successful later use of qualitative methodology in a related context, see AI Transcription and Grounded Theory.
References
Glaser, B. G., & Strauss, A. L. (1967). The Discovery of Grounded Theory: Strategies for Qualitative Research. Aldine.
Strauss, A., & Corbin, J. (1998). Basics of Qualitative Research: Techniques and Procedures for Developing Grounded Theory (2nd ed.). SAGE Publications.
The Design-Based Research Collective (2003). Design-based research: An emerging paradigm for educational inquiry. Educational Researcher, 32(1), 5–8. https://doi.org/10.3102/0013189X032001005
Brown, T. (2008). Design thinking. Harvard Business Review, 86(6), 84–92.