This post is a follow-up to the June 2020 post on ViLLA and video in teacher education. That post was about why classroom video is useful and what the ViLLA project found. This one is about the practical question that post sidestepped: what does it actually take to film a real lesson?

The manual (Kramer, C., Spicker, S. J., & Kaspar, K. (2023). Manual zur Erstellung von Unterrichtsvideographien) is open access and freely downloadable at kups.ub.uni-koeln.de/65599. The work was funded by the BMBF under the ZuS Qualitätsoffensive Lehrerbildung programme (grant 01JA1815).


Why a Manual Exists

The argument for classroom video in teacher education is not hard to make. The evidence that video-based learning improves the perceptual and interpretive skills of student teachers is solid enough that “should we use video?” is no longer a particularly interesting question. The interesting questions are downstream: which kind of video, for what purpose, produced how, stored where, used under what conditions.

The third of those, produced how, turns out to be the one that most programmes have the least guidance on. There is a reasonably large research literature on the effects of classroom video, and a smaller but growing literature on design principles for video-based learning environments. There is much less on the practical production side: what you need to decide before you enter a school building, what can go wrong during filming, and what the post-processing work actually involves.

The gap matters because it creates a reproducibility problem. If every research group that wants classroom video has to figure out independently how to handle consent across four institutional levels, how to position two cameras in a classroom with a window on the wrong side, and how much post-processing time to budget per lesson, a lot of effort goes into re-solving problems that have already been solved. The manual is an attempt to make that accumulated knowledge explicit and shareable.


Three Phases, and Why Preparation Is the Most Important One

The manual is structured around the production lifecycle: preparation, production, and post-processing. Each section ends with a practical checklist. The structuring is not original — it follows Thomson (2019) and draws on Herrle and Breitenbach (2016) and several other methodological guides — but the synthesis reflects what we learned from actually running videography sessions at the University of Cologne over several years.

The strongest claim in the manual is that preparation is the most important phase. This sounds obvious and is consistently underestimated.

Methodical preparation: the question before the camera question

Before any equipment decisions, the manual asks you to work through a prior question: is video actually the right medium for what you want to know?

This is not a rhetorical check. Classroom video is excellent at capturing dynamic processes — movement, gesture, voice, simultaneous events — and works well for constructs like classroom management and communication patterns. It works less well for constructs where the relevant data is not visible on the surface, like a student’s prior knowledge activation or the cognitive demands of a task. Using video for those questions is possible, but you need more sessions, more annotation work, and supplementary instruments. Building that into your timeline before you start is considerably better than realising it after you have sixty hours of footage.

The manual also distinguishes four decisions about what kind of video you are making:

  • Authentic vs. staged: real everyday teaching vs. deliberately constructed cases. Authentic footage gives you ecological validity; staged footage lets you control which situations appear.
  • Own vs. others’ teaching: self-recording for reflection vs. observing others for general analysis.
  • Typical vs. best practice: real-world teaching in its ordinary form vs. exemplary demonstration material.
  • Sequence vs. full lesson: a targeted extract sufficient for a specific analytic focus vs. a complete lesson for contextualised, developmental analysis.

None of these are neutral technical choices. They are methodological decisions that determine what the resulting footage can be used for and what it cannot.

Organisational preparation: permissions take the longest

The most time-consuming part of any real videography project is not the filming. It is obtaining the permissions.

You need written consent from pupils, parents or guardians (separately, depending on age — the threshold is 14 in the German legal framework the manual follows), the class teacher, school leadership, the school authority, and in some states the relevant ministry. The scope of the consent you obtain determines the scope of use you can put the footage to: footage filmed under a narrow research-project-only consent cannot be uploaded to ViLLA; footage filmed with broad usage rights can. The broader the rights you request, the higher the barrier for participants to agree.

The practical implication: decide early what you want to do with the footage, because what you put in the information letters and consent forms determines what is possible for the lifetime of the data. This is a decision you cannot easily undo.
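One way to keep that decision visible for the lifetime of the data is to store the consent scope with each recording and check it before every new use. The following is a minimal sketch in Python. The scope names, the use labels, and the mapping between them are assumptions made for illustration, not categories taken from the manual; real consent forms define their own wording.

```python
from dataclasses import dataclass
from enum import Enum


class ConsentScope(Enum):
    # Hypothetical scope labels, chosen for this example only.
    RESEARCH_PROJECT_ONLY = "research_project_only"
    BROAD_USAGE = "broad_usage"


# Assumed mapping from consent scope to permitted uses (illustrative only).
PERMITTED_USES = {
    ConsentScope.RESEARCH_PROJECT_ONLY: {"project_analysis"},
    ConsentScope.BROAD_USAGE: {"project_analysis", "teaching", "villa_upload"},
}


@dataclass(frozen=True)
class Recording:
    label: str
    scope: ConsentScope


def is_use_permitted(recording: Recording, use: str) -> bool:
    """Return True if the recorded consent scope covers the intended use."""
    return use in PERMITTED_USES[recording.scope]


narrow = Recording("lesson_01", ConsentScope.RESEARCH_PROJECT_ONLY)
broad = Recording("lesson_02", ConsentScope.BROAD_USAGE)
print(is_use_permitted(narrow, "villa_upload"))  # False
print(is_use_permitted(broad, "villa_upload"))   # True
```

The point of the sketch is only that the scope is fixed when the consent form is signed: nothing in the later workflow can turn a narrow recording into a broad one.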

The manual also addresses the case where some pupils do not consent: in that situation, it is often possible to position non-consenting pupils in a “blind spot” — an area of the room where neither camera nor microphone captures them. But this requires knowing the room layout and the planned seating arrangement in advance, which is another reason organisational preparation starts earlier than you think.
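If you have a floor plan, a rough check of which seats fall outside every camera's view can be done before the visit. The sketch below is a simplified illustration in Python, not a procedure from the manual. It assumes a flat two-dimensional floor plan, ignores walls, furniture and lens distortion, and does not model microphone pickup at all. The coordinates and field-of-view angles are made-up example values.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Seat = Tuple[float, float]


@dataclass(frozen=True)
class Camera:
    x: float           # position on the floor plan, in metres
    y: float
    facing_deg: float  # lens direction: 0 = along +x, counter-clockwise
    fov_deg: float     # horizontal field of view (assumed; check the lens spec)


def sees(camera: Camera, seat: Seat) -> bool:
    """True if the seat lies inside the camera's horizontal field of view."""
    dx, dy = seat[0] - camera.x, seat[1] - camera.y
    if dx == 0 and dy == 0:
        return True
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed angle between lens direction and seat, in [-180, 180).
    diff = (bearing - camera.facing_deg + 180.0) % 360.0 - 180.0
    return abs(diff) <= camera.fov_deg / 2.0


def blind_spot_seats(cameras: Sequence[Camera], seats: Sequence[Seat]) -> List[Seat]:
    """Seats that no camera can see."""
    return [s for s in seats if not any(sees(c, s) for c in cameras)]


# Example room, 8 m wide and 6 m deep, with two cameras facing each other.
cameras = [
    Camera(x=4.0, y=0.0, facing_deg=90.0, fov_deg=70.0),   # front, on the students
    Camera(x=4.0, y=6.0, facing_deg=270.0, fov_deg=50.0),  # back, on the teacher
]
seats = [(0.5, 0.5), (4.0, 3.0), (3.0, 4.0)]
print(blind_spot_seats(cameras, seats))  # [(0.5, 0.5)]
```

A seat that passes this check is only a candidate. Audio pickup and the actual framing on site still have to be verified in the room.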

Technical preparation: as much as necessary, as little as possible

The guiding principle for equipment selection is stated directly in the manual: so viel wie nötig, so wenig wie möglich — as much as necessary, as little as possible.

This matters because there is a pull toward technical elaboration that does not always serve the research purpose. More cameras capture more perspectives; more microphones capture more of the acoustic space; 360° cameras give you everything. But more equipment means more setup time, more opportunities for failure during filming, and substantially more post-processing work. And more visual complexity in the final video does not automatically mean more analytically useful material — it can mean more cognitive load for the students watching it.

The baseline setup the manual recommends is two static cameras positioned facing each other: one centred on the students, one centred on the teacher. This configuration, with lavalier microphones on teachers and boundary microphones for student audio at the cameras, captures most of what you need for classroom management research and teacher education at a level of complexity that is manageable. Extensions — pan cameras for interaction analysis, additional cameras for group work, mobile eye-tracking for teacher perspective, 360° cameras — are described as additions for specific purposes, not as defaults.


What Happens During Filming

The production section of the manual is the most specific and in some ways the most useful part if you are planning a session for the first time. Some things worth knowing:

Start the cameras before the lesson. An authentic lesson starts only once, and you cannot go back to re-record its opening. Events that happen before the official start of the lesson (how a teacher enters, how students settle, how the first few minutes of a lesson are framed) can be analytically relevant. And any technical problems that surface before teaching begins can still be fixed. Footage filmed before the lesson is easy to cut in post; lost footage from the opening of a lesson is gone.

The camera operator’s job is to be boring. The manual is explicit that operators should neither engage with the lesson content nor conspicuously attend to the equipment. A relaxed posture, eyes on the monitor, not reacting to what is happening in the room — this is the technique that allows pupils and teachers to stop registering the cameras, which typically happens within the first few minutes if operators are not drawing attention to themselves.

Use a clapper. When running multiple cameras or separate audio recorders, a handclap or clapperboard after all devices are rolling gives you a synchronisation point for later editing. This is known to everyone who has ever synchronised footage, but it is the kind of thing that is easy to forget in the scramble of setting up during a ten-minute break.
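To make the idea of a synchronisation point concrete, here is a minimal sketch in Python of how a clap can be turned into a time offset between two recordings. It is an illustration under simple assumptions, not the manual's workflow: both tracks are already decoded into lists of samples at the same sample rate, and the clap is the loudest event in the stretch of audio you pass in. Editing software typically does this for you, often with more robust methods.

```python
def clap_index(samples):
    """Index of the loudest sample, taken here to be the clap."""
    return max(range(len(samples)), key=lambda i: abs(samples[i]))


def sync_offset_seconds(track_a, track_b, sample_rate):
    """Seconds by which the clap appears later in track_b than in track_a."""
    return (clap_index(track_b) - clap_index(track_a)) / sample_rate


# Toy data: silence with one loud spike per track (1000 samples per second).
sample_rate = 1000
track_a = [0.0] * 2000
track_b = [0.0] * 2000
track_a[500] = 0.9    # clap at 0.50 s on device A
track_b[1250] = -0.8  # same clap at 1.25 s on device B

print(sync_offset_seconds(track_a, track_b, sample_rate))  # 0.75
```

In this toy case device B started recording 0.75 seconds before device A, so its track has to be shifted by that amount to line up with A.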

Backlighting is the enemy. Windows behind subjects produce the most common image quality problem in classroom footage. The manual discusses ND filters for cases where backlighting cannot be avoided, but the first-choice solution is room scouting in advance to know where the windows are and plan camera placement accordingly.


Post-Processing: The Hidden Cost

The post-processing chapter is the one I think is most likely to recalibrate expectations productively.

Post-processing is time-intensive in proportion to the number of camera angles, the number of audio tracks requiring synchronisation or correction, and the extent of image and sound quality work needed. The manual is explicit that editing should be done by people with content knowledge, not just technical skill, because the person in the edit suite is constantly making decisions about what to include, how to cut between perspectives, when to show the teacher’s face vs. the students’ faces. Those decisions are not editorially neutral. They determine what a viewer of the finished video can perceive.

This is the point in the manual where the methodological problem I mentioned in the previous post becomes concrete: the videography setting is not a neutral window onto the classroom. The two-camera cross-cut convention (cut to the face of whoever is speaking) is widely used and convenient for teaching purposes, but it is also an editorial choice that foregrounds spoken exchange and makes other information — spatial position, background activity, gestural communication between students — less visible. Knowing that this choice was made is part of what a researcher or educator needs to know in order to use the footage responsibly.

Data security deserves its own mention. Video files are large, they contain images of minors, and they need to be stored under conditions that comply with current data protection law. In practice that means redundant backup, restricted access, purpose limitation, and keeping track of the legal requirements as they change. The manual recommends checking applicable regulations before starting rather than after, and treating data security as part of the workflow design rather than an administrative afterthought.
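For the backup part of that workflow, a small check that every original file has an identical copy is easy to automate. The sketch below uses only the Python standard library. The function names and the directory layout are assumptions for illustration; it covers backup integrity only and says nothing about access restriction or purpose limitation, which are organisational matters.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """SHA-256 hex digest of a file, read in chunks so large videos fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched_backups(original_dir, backup_dir):
    """Relative paths of files that are missing from the backup or differ from it."""
    original_dir, backup_dir = Path(original_dir), Path(backup_dir)
    problems = []
    for source in sorted(p for p in original_dir.rglob("*") if p.is_file()):
        relative = source.relative_to(original_dir)
        copy = backup_dir / relative
        if not copy.is_file() or sha256_of(source) != sha256_of(copy):
            problems.append(str(relative))
    return problems
```

Calling `mismatched_backups` with the folder of original recordings and the backup folder returns an empty list when every file has an identical copy, and otherwise the list of files that need to be copied again.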


What Is Coming Next

The manual’s final chapter points toward three developments that are worth tracking:

360° video and VR. Gold and Windscheid (2020) found that 360° classroom video produces higher presence in student teacher observers than conventional video, though without differences in learning outcomes measured by events noticed or ratings of teaching quality. Whether the presence effect translates into something measurable is an open empirical question. The VR version of this — using 360° classroom footage as an immersive training environment where student teachers can observe without the pressure of having to act — is methodologically interesting and practically plausible at costs that are no longer prohibitive.

Animated classroom video. The handful of studies on animated (as opposed to filmed) classroom situations suggests that student teachers notice similar learning-relevant events in animated and real footage (Smith et al., 2012; Chieu et al., 2011). If that holds up, animation offers a way to construct specific scenarios that would be hard to capture or ethically complex to film — situations involving conflict, failure, or particular forms of student difficulty — without requiring access to actual classrooms or consent from real pupils.

Mobile eye-tracking. The combination of classroom videography with mobile eye-tracking worn by teachers (Rüth, Zimmermann, & Kaspar, 2020) opens the teacher’s-perspective angle that a fixed camera cannot capture. It is a technically more demanding addition to the setup but an analytically distinctive one, and the hardware costs have come down substantially.


A Note on Open Access

The manual is freely available at kups.ub.uni-koeln.de/65599. We made it open access deliberately. The practical obstacles to classroom videography — not knowing how to handle consent, not knowing what equipment configuration works for a standard lesson, not knowing how long post-processing will actually take — are not obstacles that should be higher for researchers at institutions without an existing videography infrastructure. The knowledge exists; it should be findable.

If you are at the University of Cologne and want to run a videography session but do not have your own equipment, the ZuS Media Labs project has a lending programme. Contact the team at zus-kontakt@uni-koeln.de for the current equipment catalogue.


For the specific challenges the manual doesn’t address — recording in music education, instrument acoustics, one-to-one lessons, and practice-session documentation — see the follow-up post on filming music education.


References

Chieu, V. M., Herbst, P., & Weiss, M. (2011). Effect of an animated classroom story embedded in online discussion on helping mathematics teachers learn to notice. Journal of the Learning Sciences, 20(4), 589–624. https://doi.org/10.1080/10508406.2011.528324

Gold, B., & Windscheid, J. (2020). Observing 360-degree classroom videos — effects of video type on presence, emotions, workload, classroom observations, and ratings of teaching quality. Computers & Education, 156, 103960. https://doi.org/10.1016/j.compedu.2020.103960

Herrle, M., & Breitenbach, S. (2016). Planung, Durchführung und Nachbereitung videogestützter Beobachtungen im Unterricht. In U. Rauin, M. Herrle & T. Engartner (Eds.), Videoanalysen in der Unterrichtsforschung (pp. 30–49). Beltz Juventa.

Kramer, C., König, J., Strauß, S., & Kaspar, K. (2020). Classroom videos or transcripts? A quasi-experimental study to assess the effects of media-based learning on pre-service teachers’ situation-specific skills of classroom management. International Journal of Educational Research, 103, 101624. https://doi.org/10.1016/j.ijer.2020.101624

Rüth, M., Zimmermann, D., & Kaspar, K. (2020). Mobiles Eye-Tracking im Unterricht. In K. Kaspar et al. (Eds.), Bildung, Schule, Digitalisierung (pp. 222–228). Waxmann.

Smith, D., McLaughlin, T., & Brown, I. (2012). 3-D computer animation vs. live-action video. Contemporary Issues in Technology and Teacher Education, 12(1), 41–54.

Thomson, A. (2019). The creation and use of video-for-learning in higher education. Master’s thesis, Queensland University of Technology. https://doi.org/10.5204/thesis.eprints.130743