<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Latency on Sebastian Spicker</title>
    <link>https://sebastianspicker.github.io/tags/latency/</link>
    <description>Recent content in Latency on Sebastian Spicker</description>
    <image>
      <title>Sebastian Spicker</title>
      <url>https://sebastianspicker.github.io/og-image.png</url>
      <link>https://sebastianspicker.github.io/og-image.png</link>
    </image>
    <generator>Hugo -- 0.160.0</generator>
    <language>en</language>
    <lastBuildDate>Thu, 08 Feb 2024 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://sebastianspicker.github.io/tags/latency/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>When Musicians Lock In: Coupled Oscillators and the Physics of Ensemble Synchronisation</title>
      <link>https://sebastianspicker.github.io/posts/kuramoto-ensemble-sync/</link>
      <pubDate>Thu, 08 Feb 2024 00:00:00 +0000</pubDate>
      <guid>https://sebastianspicker.github.io/posts/kuramoto-ensemble-sync/</guid>
      <description>Every ensemble faces the same physical problem: N oscillators with slightly different natural frequencies trying to synchronise through a shared coupling channel. The Kuramoto model — developed by a statistical physicist to describe fireflies, neurons, and power grids — applies directly to musicians. It predicts a phase transition between incoherence and synchrony, quantifies why latency destroys networked ensemble performance, and connects to recent EEG studies of inter-brain synchronisation.</description>
      <content:encoded><![CDATA[<p><em>The problem is ancient and the language for it is recent. In any ensemble — a
string quartet, a jazz rhythm section, an orchestra — musicians with slightly
different internal tempos must stay together. They do this by listening to each
other. But what, exactly, does &ldquo;listening to each other&rdquo; do to their timing? And
what happens when the listening channel is imperfect — delayed by the speed of
sound across a wide stage, or by a network cable crossing a continent? The answer
involves a differential equation that was not written to describe music.</em></p>
<p><em>This post extends the latency analysis in <a href="/posts/nmp-latency-lola-mvtp/">Latency in Networked Music
Performance</a> with the dynamical systems framework
that underlies it.</em></p>
<hr>
<h2 id="two-clocks-on-a-board">Two Clocks on a Board</h2>
<p>The first documented observation of coupled-oscillator synchronisation was made
not by a musician but by a physicist. In 1665, Christiaan Huygens, confined to
bed with illness, was watching two pendulum clocks mounted on the same wooden
beam. Over the course of the night, the pendulums had synchronised into
<em>anti-phase</em> oscillation — swinging in opposite directions in exact unison.
He reported it to his father:</p>
<blockquote>
<p>&ldquo;I have noticed a remarkable effect which no-one has observed before&hellip; two
clocks on the same board always end up in mutual synchrony.&rdquo;</p>
</blockquote>
<p>The mechanism was mechanical coupling through the beam. Each pendulum&rsquo;s swing
imparted a small impulse to the wood; the other pendulum felt this as a
perturbation to its rhythm. Small perturbations, accumulated over hours, drove
the clocks into a shared frequency and a fixed phase relationship.</p>
<p>This is the prototype of every ensemble synchronisation problem. Each musician
is a clock. The acoustic environment — the air in the room, the reflected sound
from the walls, the vibrations through the stage floor — is the wooden beam.</p>
<hr>
<h2 id="the-kuramoto-model">The Kuramoto Model</h2>
<p>Yoshiki Kuramoto formalised the mathematics of coupled oscillators in 1975,
motivated by biological synchronisation problems: firefly flashing, circadian
rhythms, cardiac pacemakers. His model considers $N$ oscillators, each with a
phase $\theta_i(t)$ evolving according to:</p>
$$\frac{d\theta_i}{dt} = \omega_i + \frac{K}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i), \qquad i = 1, \ldots, N.$$<p>The first term, $\omega_i$, is the oscillator&rsquo;s <em>natural frequency</em> — the tempo it
would maintain in isolation. These are drawn from a distribution $g(\omega)$, which
in a real ensemble reflects the spread of individual preferred tempos among the
players. The second term is the coupling: each oscillator is attracted toward the
phases of all others, with strength $K/N$. The factor $1/N$ keeps the total
coupling intensive (independent of ensemble size) as $N$ grows large.</p>
<p>Musically: $\theta_i$ is the phase of musician $i$&rsquo;s internal pulse at a given
moment, $\omega_i$ is their preferred tempo if playing alone, and $K$ is the
coupling strength — how much they adjust their tempo in response to what they
hear from the others.</p>
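<p>As a worked special case (a sketch for illustration, not part of Kuramoto&rsquo;s original presentation), take a duo, $N = 2$. Writing $\phi = \theta_1 - \theta_2$ for the phase difference and $\Delta\omega = \omega_1 - \omega_2$ for the tempo mismatch, subtracting the two equations of motion gives:</p>
$$\frac{d\phi}{dt} = \Delta\omega - K \sin\phi.$$
<p>A constant phase difference requires $\sin\phi = \Delta\omega / K$, which has a solution only if $|\Delta\omega| \leq K$. The two players can lock only when the coupling is at least as large as their tempo mismatch; otherwise they drift apart.</p>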
<hr>
<h2 id="the-order-parameter-and-the-phase-transition">The Order Parameter and the Phase Transition</h2>
<p>To measure the degree of synchronisation, Kuramoto introduced the complex order
parameter:</p>
$$r(t)\, e^{i\psi(t)} = \frac{1}{N} \sum_{j=1}^{N} e^{i\theta_j(t)},$$<p>where $r(t) \in [0, 1]$ is the <em>coherence</em> of the ensemble and $\psi(t)$ is the
collective mean phase. When the phases are spread uniformly around the unit
circle, $r \approx 0$ and the ensemble is incoherent. When all phases coincide,
$r = 1$: perfect synchrony. In a live ensemble, $r$ is a direct measure of rhythmic
cohesion, though of course not one you can read off a score.</p>
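<p>A minimal sketch of how $r$ and $\psi$ could be computed from a list of measured phases. The function name <code>order_parameter</code> and the example phase values are illustrative assumptions, not taken from any of the cited studies:</p>

```python
import cmath
import math


def order_parameter(phases):
    """Return (r, psi): coherence and mean phase of a list of phases in radians."""
    z = sum(cmath.exp(1j * theta) for theta in phases) / len(phases)
    return abs(z), cmath.phase(z)


# Five players at exactly the same point in the beat cycle: full coherence.
r_locked, psi_locked = order_parameter([0.3] * 5)

# Four players spread evenly around the cycle: the unit phasors cancel.
r_spread, _ = order_parameter([k * math.pi / 2 for k in range(4)])

print(f"locked: r = {r_locked:.3f}, spread: r = {r_spread:.3f}")
```

<p>In practice the phases would have to be estimated from note onsets or a tapped pulse, which is a measurement problem in its own right.</p>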
<p>Substituting the order parameter into the equation of motion:</p>
$$\frac{d\theta_i}{dt} = \omega_i + K r \sin(\psi - \theta_i).$$<p>Each oscillator now interacts only with the mean-field quantities $r$ and $\psi$,
not with every other oscillator individually. The coupling pulls each musician
toward the collective mean phase with a force proportional to both $K$ (how
attentively they listen) and $r$ (how coherent the group already is).</p>
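<p>The substitution takes one line of algebra, spelled out here for completeness. Multiply the definition of the order parameter by $e^{-i\theta_i}$:</p>
$$r\, e^{i(\psi - \theta_i)} = \frac{1}{N} \sum_{j=1}^{N} e^{i(\theta_j - \theta_i)}.$$
<p>Taking the imaginary part of both sides gives</p>
$$r \sin(\psi - \theta_i) = \frac{1}{N} \sum_{j=1}^{N} \sin(\theta_j - \theta_i),$$
<p>and multiplying by $K$ turns the right-hand side into the coupling term of the original model. The mean-field equation is therefore an exact rewriting, not an approximation.</p>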
<p>This mean-field form reveals the essential physics. For small $K$, oscillators
with widely differing $\omega_i$ cannot follow the mean field — they drift at
their own frequencies, and $r \approx 0$. Once $K$ exceeds a critical coupling
strength $K_c$, a cluster of oscillators near the centre of the distribution
locks to a shared frequency, and $r$ grows continuously from zero as the
cluster recruits more members. For a unimodal, symmetric frequency distribution
$g(\omega)$ with density $g(\bar\omega)$ at the mean:</p>
$$K_c = \frac{2}{\pi\, g(\bar\omega)}.$$<p>Just above $K_c$, the coherence grows with the square-root scaling:</p>
$$r \propto \sqrt{\frac{K - K_c}{K_c}}, \qquad K \gtrsim K_c,$$<p>with a prefactor that depends on the curvature of $g$ at its peak. This is a <strong>second-order (continuous) phase transition</strong>, with the same
mathematical structure as a ferromagnet cooled through the Curie temperature,
where spontaneous magnetisation grows continuously from zero once the effective
coupling exceeds its critical value. In mean-field theory the musical ensemble
and the magnetic material share the same order-parameter exponent $\frac{1}{2}$.</p>
<p>Above $K_c$, the fraction of oscillators that are <em>locked</em> (synchronised to the
mean-field frequency) can be computed explicitly. An oscillator with natural
frequency $\omega_i$ locks to the mean field if $|\omega_i - \bar\omega| \leq
Kr$. For a Lorentzian distribution $g(\omega) = \frac{\gamma/\pi}{(\omega -
\bar\omega)^2 + \gamma^2}$, this yields:</p>
$$r = \sqrt{1 - \frac{K_c}{K}}, \qquad K_c = 2\gamma,$$<p>which is the exact solution of the self-consistency equation for the Kuramoto model with
Lorentzian frequency spread, valid for all $K \geq K_c$ (Strogatz, 2000).</p>
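<p>The closed form is easy to tabulate. The sketch below evaluates it for a few coupling strengths; the function name <code>lorentzian_coherence</code> and the choice $\gamma = 1$ (arbitrary units) are assumptions made for illustration:</p>

```python
import math


def lorentzian_coherence(coupling, gamma):
    """Steady-state coherence r for a Lorentzian frequency spread of half-width gamma."""
    k_c = 2.0 * gamma  # critical coupling for the Lorentzian case
    if coupling > k_c:
        return math.sqrt(1.0 - k_c / coupling)
    return 0.0  # at or below threshold the incoherent state r = 0 persists


# Illustrative values: gamma = 1, so K_c = 2.
for k in (1.0, 2.0, 2.5, 4.0, 10.0):
    print(f"K = {k:5.1f}  r = {lorentzian_coherence(k, gamma=1.0):.3f}")
```

<p>The table this prints shows the steep rise just above $K_c$ followed by slow saturation toward $r = 1$: doubling the coupling beyond the threshold buys much more coherence than doubling it again.</p>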
<p>The physical reading is direct: whether an ensemble locks into a shared pulse or
drifts apart is a threshold phenomenon. A group of musicians with similar
preferred tempos has a narrow $g(\omega)$ and therefore a large peak value
$g(\bar\omega)$, giving a low $K_c$: they
synchronise easily with minimal attentive listening. A group with widely varying
individual tempos needs stronger, more sustained coupling to cross the threshold.
This is not a matter of musical discipline; it is a material property of the
ensemble.</p>
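<p>The threshold can be seen in a small simulation. This is a sketch under stated assumptions: forward-Euler integration, Gaussian natural frequencies with standard deviation $\sigma$ (for which the formula above gives $K_c = 2\sigma\sqrt{2/\pi} \approx 1.6\,\sigma$), and the mean-field form of the coupling. The function name <code>simulate_kuramoto</code>, the step size, and all parameter values are illustrative choices:</p>

```python
import cmath
import math
import random


def simulate_kuramoto(n, coupling, freq_std, steps=2000, dt=0.02, seed=0):
    """Integrate the Kuramoto model with forward Euler and return the final coherence r.

    Natural frequencies are drawn from a zero-mean Gaussian, i.e. the model is
    written in the frame rotating at the mean tempo.
    """
    rng = random.Random(seed)
    omega = [rng.gauss(0.0, freq_std) for _ in range(n)]
    theta = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    for _ in range(steps):
        # Mean field: r * exp(i * psi) = (1/N) * sum_j exp(i * theta_j)
        z = sum(cmath.exp(1j * t) for t in theta) / n
        r, psi = abs(z), cmath.phase(z)
        # Each oscillator is pulled toward psi with strength K * r.
        theta = [t + dt * (w + coupling * r * math.sin(psi - t))
                 for t, w in zip(theta, omega)]
    return abs(sum(cmath.exp(1j * t) for t in theta) / n)


# One run well below and one well above the predicted threshold (about 1.6 here).
weak = simulate_kuramoto(n=200, coupling=0.2, freq_std=1.0)
strong = simulate_kuramoto(n=200, coupling=4.0, freq_std=1.0)
print(f"r at K = 0.2: {weak:.2f}   r at K = 4.0: {strong:.2f}")
```

<p>With a finite ensemble the incoherent state never gives exactly $r = 0$; residual coherence of order $1/\sqrt{N}$ is expected from random phase alignment alone.</p>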
<hr>
<h2 id="concert-hall-applause-neda-et-al-2000">Concert Hall Applause: Neda et al. (2000)</h2>
<p>The Kuramoto model is not only a theoretical construction. Neda et al. (2000)
applied it to concert hall applause — one of the most direct real-world
demonstrations of coupled-oscillator dynamics in a musical context.</p>
<p>They recorded applause in Romanian and Hungarian theatres and found that audiences
spontaneously alternate between two distinct states. In the <em>incoherent</em> regime,
each audience member claps at their own preferred rate (typically 2–3 Hz). Through
acoustic coupling — each person hears the room-averaged sound and adjusts their
clapping — the audience gradually synchronises to a shared, slower frequency
(around 1.5 Hz): the <em>synchronised</em> regime.</p>
<p>The transitions between the two regimes are consistent with the
Kuramoto phase transition, with the spread of clapping rates acting as the control
parameter. Neda et al. report that synchrony emerges when audience members roughly
double their clapping period: slower clapping is more regular across individuals,
so the frequency distribution narrows, $g(\bar\omega)$ rises, and $K_c$ drops below
the available coupling $K$. Synchrony breaks down when individuals speed up again,
approximately <em>doubling</em> their clapping frequency to raise the overall noise
level. The frequency spread widens, $K_c$ climbs back above $K$, and the audience
returns to incoherent applause.</p>
<p>The paper is a useful pedagogical artefact: every music student has experienced
concert hall applause, and hearing that it undergoes a physically measurable phase
transition makes the connection between physics and musical experience concrete.</p>
<hr>
<h2 id="latency-and-the-limits-of-networked-ensemble-performance">Latency and the Limits of Networked Ensemble Performance</h2>
<p>In standard acoustic ensemble playing, the coupling delay is the propagation time
for sound to cross the ensemble: at $343\ \text{m/s}$, across a ten-metre stage,
roughly 30 ms. This is why orchestral seating is arranged with attention to who
needs to hear whom first.</p>
<p>In networked music performance (NMP), the coupling delay $\tau$ is much larger:
tens to hundreds of milliseconds depending on geographic distance and network
infrastructure. The Kuramoto model generalises naturally to include this delay:</p>
$$\frac{d\theta_i}{dt} = \omega_i + \frac{K}{N} \sum_{j=1}^{N} \sin\!\bigl(\theta_j(t - \tau) - \theta_i(t)\bigr).$$<p>Each musician hears the others&rsquo; phases as they were $\tau$ seconds ago, not as
they are now.</p>
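<p>In a synchronised state where all oscillators share the collective frequency
$\bar\omega$ and phase $\psi(t) = \bar\omega t$, the delayed phase signal is
$\psi(t - \tau) = \bar\omega t - \bar\omega\tau$. Writing $\Delta = \psi(t) - \theta_i(t)$,
the coupling term becomes
$\sin(\Delta - \bar\omega\tau) = \sin\Delta\,\cos(\bar\omega\tau) - \cos\Delta\,\sin(\bar\omega\tau)$.
Only the first part acts as a restoring force toward the mean phase, and it is
scaled by $\cos(\bar\omega\tau)$: the delay introduces a phase shift that reduces
the useful component of the coupling. To this approximation, the critical coupling
with delay is:</p>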
$$K_c(\tau) = \frac{K_c(0)}{\cos(\bar\omega \tau)}.$$<p>As $\tau$ increases, $K_c(\tau)$ grows: synchronisation requires progressively
stronger coupling (more attentive adjustment) to compensate for the information
lag. The denominator $\cos(\bar\omega\tau)$ reaches zero when
$\bar\omega\tau = \pi/2$. At this point $K_c(\tau) \to \infty$: no finite coupling
strength can maintain synchrony. The critical delay is:</p>
$$\tau_c = \frac{\pi}{2\bar\omega}.$$<p>For an ensemble performing at 120 BPM, the beat frequency is
$\bar\omega = 2\pi \times 2\ \text{Hz} = 4\pi\ \text{rad/s}$:</p>
$$\tau_c = \frac{\pi}{2 \times 4\pi} = \frac{1}{8}\ \text{s} = 125\ \text{ms}.$$<p>This is a remarkably clean result. The Kuramoto model with delay predicts that
ensemble synchronisation collapses at around 125 ms one-way delay for a standard
performance tempo. The empirical literature on NMP — from LoLa deployments across
European conservatories to controlled latency studies in the lab — consistently
finds that rhythmic coherence degrades noticeably above 50–80 ms and becomes
essentially unworkable above 100–150 ms one-way. The model and the data agree.</p>
<p>The derivation also shows why faster tempos are harder in NMP: $\tau_c \propto
1/\bar\omega$, so doubling the tempo halves the tolerable latency. An ensemble
performing at 240 BPM in a distributed setting faces a theoretical ceiling of
62.5 ms, which rules out transcontinental performance for most repertoire.</p>
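<p>The two formulas of this section can be evaluated for any tempo. A minimal sketch, assuming the idealised model above (identical delay on every link, one beat per oscillator cycle); the function names and the 30 ms example delay are illustrative:</p>

```python
import math


def critical_delay_ms(bpm):
    """Delay at which cos(omega * tau) reaches zero, in milliseconds."""
    omega = 2.0 * math.pi * bpm / 60.0  # beat frequency in rad/s
    return 1000.0 * math.pi / (2.0 * omega)


def coupling_penalty(bpm, delay_ms):
    """Ratio K_c(tau) / K_c(0) = 1 / cos(omega * tau), defined below the critical delay."""
    omega = 2.0 * math.pi * bpm / 60.0
    phase_lag = omega * delay_ms / 1000.0
    if phase_lag >= math.pi / 2.0:
        return math.inf  # no finite coupling restores synchrony in this model
    return 1.0 / math.cos(phase_lag)


for bpm in (60, 120, 240):
    print(f"{bpm} BPM: tau_c = {critical_delay_ms(bpm):.1f} ms, "
          f"penalty at 30 ms = {coupling_penalty(bpm, 30.0):.2f}")
```

<p>The penalty column makes the practical point: well below $\tau_c$ the required extra coupling is modest, and it rises steeply only as the delay approaches a quarter of the beat period.</p>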
<hr>
<h2 id="brains-in-sync-eeg-hyperscanning">Brains in Sync: EEG Hyperscanning</h2>
<p>The Kuramoto framework has recently been applied at a neural level.
EEG hyperscanning — simultaneous EEG recording from multiple participants during
a shared musical activity — has shown that musicians performing together exhibit
<em>inter-brain synchronisation</em>: coherent cortical oscillations at the frequency of
the music are measurable between players (Lindenberger et al., 2009; Müller et
al., 2013). The phase coupling between brains during joint performance is
significantly higher than during solo performance and higher than for musicians
playing simultaneously but without acoustic coupling.</p>
<p>This suggests that the Kuramoto coupling operates at two levels: the acoustic
(each musician hears the other and adjusts physical timing) and the neural (each
musician&rsquo;s cortical oscillators entrain to the shared musical pulse). The
question of which level is primary — whether neural synchrony causes or follows
from acoustic synchrony — remains open.</p>
<p>A 2023 review by Demos and Palmer argues that pairwise Kuramoto-type coupling is
insufficient to capture full ensemble dynamics. Group-level effects — the
differentiation between leader and follower roles, the emergence of collective
timing that no individual would produce alone — require nonlinear dynamical
frameworks that go beyond mean-field averaging. The model that adequately
describes a string quartet may need to be richer than the one that describes a
population of identical fireflies.</p>
<hr>
<h2 id="what-this-means-for-teaching">What This Means for Teaching</h2>
<p>The Kuramoto model reframes standard rehearsal intuitions in physical terms.</p>
<p><strong>&ldquo;Listen more&rdquo;</strong> translates to &ldquo;increase your effective coupling constant $K$.&rdquo;
A musician who plays without attending to others has set $K \approx 0$ and will
drift freely according to their own $\omega_i$. Listening — actively adjusting
tempo in response to what you hear — is not metaphorical. It is the physical
mechanism of coupling, and its effect is to pull you toward the mean phase $\psi$
with a force $Kr\sin(\psi - \theta_i)$.</p>
<p><strong>&ldquo;Our tempos are too different&rdquo;</strong> is a claim about $g(\bar\omega)$ and therefore
about $K_c$. A group with a wide spread of natural tempos needs more and stronger
listening to synchronise. This is not a moral failing but a parameter; it
suggests that ensemble warm-up time or explicit tempo negotiation before a
performance serves to reduce the spread of natural frequencies before the coupling
has to do all the work.</p>
<p><strong>Latency as a rehearsal experiment</strong> can be made explicit. Artificially delaying
the acoustic return to one musician in an ensemble — via headphone monitoring with
variable delay — allows students to experience directly how the coordination
degrades as $\tau$ increases toward $\tau_c$. They feel the system approaching
the phase transition without the theoretical framework, but the framework makes
the experience interpretable afterward.</p>
<p><strong>The click track</strong> replaces peer-to-peer Kuramoto coupling with an external
forcing term: each musician locks to a shared reference with fixed $\omega$
rather than adjusting dynamically to the group mean. This eliminates the phase
transition but also eliminates the adaptive dynamics — the micro-timing
fluctuations and expressive rubato — that characterise live ensemble playing. It
is a pedagogically important distinction, even if studios routinely make the
pragmatic choice.</p>
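<p>One common way to write such a forcing term is sketched below. The forcing strength $F$ and the click frequency $\Omega$ are symbols introduced here for illustration, not taken from the sources above:</p>
$$\frac{d\theta_i}{dt} = \omega_i + F \sin(\Omega t - \theta_i).$$
<p>Each musician then locks to the click whenever $|\omega_i - \Omega| \leq F$, independently of what the other players do. No collective threshold $K_c$ appears, because the reference does not depend on the ensemble&rsquo;s own coherence $r$.</p>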
<hr>
<h2 id="references">References</h2>
<ul>
<li>
<p>Demos, A. P., &amp; Palmer, C.
(2023). Social and nonlinear dynamics unite: Musical group synchrony. <em>Trends
in Cognitive Sciences</em>, 27(11), 1008–1018.
<a href="https://doi.org/10.1016/j.tics.2023.08.005">https://doi.org/10.1016/j.tics.2023.08.005</a></p>
</li>
<li>
<p>Huygens, C. (1665). Letter to his father Constantijn Huygens, 26 February
1665. In <em>Œuvres complètes de Christiaan Huygens</em>, Vol. 5, p. 243. Martinus
Nijhoff, 1893.</p>
</li>
<li>
<p>Kuramoto, Y. (1975). Self-entrainment of a population of coupled non-linear
oscillators. In H. Araki (Ed.), <em>International Symposium on Mathematical
Problems in Theoretical Physics</em> (Lecture Notes in Physics, Vol. 39,
pp. 420–422). Springer.</p>
</li>
<li>
<p>Kuramoto, Y. (1984). <em>Chemical Oscillations, Waves, and Turbulence.</em> Springer.</p>
</li>
<li>
<p>Lindenberger, U., Li, S.-C., Gruber, W., &amp; Müller, V. (2009). Brains swinging
in concert: Cortical phase synchronization while playing guitar.
<em>BMC Neuroscience</em>, 10, 22. <a href="https://doi.org/10.1186/1471-2202-10-22">https://doi.org/10.1186/1471-2202-10-22</a></p>
</li>
<li>
<p>Müller, V., Sänger, J., &amp; Lindenberger, U. (2013). Intra- and inter-brain
synchronization during musical improvisation on the guitar. <em>PLOS ONE</em>, 8(9),
e73852. <a href="https://doi.org/10.1371/journal.pone.0073852">https://doi.org/10.1371/journal.pone.0073852</a></p>
</li>
<li>
<p>Neda, Z., Ravasz, E., Vicsek, T., Brechet, Y., &amp; Barabási, A.-L. (2000).
Physics of the rhythmic applause. <em>Physical Review E</em>, 61(6), 6987–6992.
<a href="https://doi.org/10.1103/PhysRevE.61.6987">https://doi.org/10.1103/PhysRevE.61.6987</a></p>
</li>
<li>
<p>Strogatz, S. H. (2000). From Kuramoto to Crawford: Exploring the onset of
synchronization in populations of coupled oscillators. <em>Physica D: Nonlinear
Phenomena</em>, 143(1–4), 1–20.
<a href="https://doi.org/10.1016/S0167-2789(00)00094-4">https://doi.org/10.1016/S0167-2789(00)00094-4</a></p>
</li>
<li>
<p>Strogatz, S. H. (2003). <em>Sync: How Order Emerges from Chaos in the Universe,
Nature, and Daily Life.</em> Hyperion.</p>
</li>
</ul>
<hr>
<h2 id="changelog">Changelog</h2>
<ul>
<li><strong>2026-01-14</strong>: Updated the author list for the Demos (2023) <em>Trends in Cognitive Sciences</em> reference to the published two authors (Demos &amp; Palmer). The five names previously listed were from a different Demos paper.</li>
<li><strong>2026-01-14</strong>: Changed &ldquo;period-doubling&rdquo; to &ldquo;frequency-doubling.&rdquo; When the clapping frequency doubles, the period halves; &ldquo;frequency-doubling&rdquo; is the precise term in this context.</li>
</ul>
]]></content:encoded>
    </item>
    <item>
      <title>How Low Can You Go? Measuring Latency for Networked Music Performance Across Europe</title>
      <link>https://sebastianspicker.github.io/posts/nmp-latency-lola-mvtp/</link>
      <pubDate>Sat, 26 Aug 2023 00:00:00 +0000</pubDate>
      <guid>https://sebastianspicker.github.io/posts/nmp-latency-lola-mvtp/</guid>
      <description>We measured end-to-end audio and video latency for LoLa and MVTP across six European research-network links. One-way audio latency ranged from 7.5 to 22.5 ms. Routing topology mattered more than geographic distance. Enterprise firewalls were a disaster. Here is what we found.</description>
      <content:encoded><![CDATA[<p><em>This post summarises a manuscript submitted with Benjamin Bentz and colleagues
from the RAPP Lab network. The paper is not yet peer-reviewed; numbers and
conclusions are based on operational measurements collected 2020–2023.
Feedback welcome — particularly from anyone who has run similar measurements
on non-European or wireless-last-mile links.</em></p>
<hr>
<h2 id="the-problem">The Problem</h2>
<p>Musicians playing together in the same room experience acoustic propagation
delay of roughly 3 ms per metre of separation — essentially free latency that
most ensembles never consciously register. When you distribute musicians across
a network, you inherit that propagation cost plus everything the signal chain
adds on top: buffers, codec processing, routing hops, switching overhead.</p>
<p>Conventional video-conferencing (Zoom, Teams, etc.) operates at end-to-end
delays of roughly 100–300 ms. That is comfortable for speech — human
conversation tolerates round-trip delays up to about 250 ms before it starts
to feel wrong — but it is well above the threshold at which ensemble timing
breaks down. The NMP literature generally puts the upper bound for
synchronous rhythmic playing somewhere between 20 and 30 ms one-way, with
considerable variation by tempo, instrument, and whether the performers can
see each other [Carôt 2011; Tsioutas &amp; Xylomenos 2021; Medina Victoria 2019].</p>
<p>Specialised low-latency systems cut the processing overhead by avoiding
compression, using hardware-accelerated video pipelines, and riding
research-and-education networks that offer better jitter characteristics than
commodity internet. Two of the better-known ones are <strong>LoLa</strong> (Low Latency
Audio Visual Streaming System, developed at Conservatorio G. Tartini Trieste)
and <strong>MVTP</strong> (Modular Video Transmission Platform, developed at CESNET in
Prague). We deployed both at Hochschule für Musik und Tanz Köln as part of
the RAPP Lab collaboration and spent about two and a half years measuring them.</p>
<hr>
<h2 id="the-latency-budget">The Latency Budget</h2>
<p>End-to-end latency in NMP is cumulative and non-recoverable. Once delay enters
the chain, nothing downstream can subtract it. The budget looks like:</p>
\[
  L_\text{total} = L_\text{capture} + L_\text{buffer} + L_\text{network} + L_\text{playback}
\]<p>Network latency \( L_\text{network} \) includes propagation (roughly
\( d / (2 \times 10^8) \) seconds for a fibre link of distance \( d \) metres,
accounting for the refractive index of glass) plus per-hop processing.
Everything else is system-dependent.</p>
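<p>A minimal sketch of the budget as arithmetic. The propagation term uses the \( 2 \times 10^8 \) m/s figure above and ignores per-hop processing; the capture, buffer and playback values are placeholders chosen for illustration, not measurements from this study, and the function names are hypothetical:</p>

```python
FIBRE_SPEED_M_PER_S = 2.0e8  # approximate signal speed in glass fibre


def propagation_delay_ms(path_km):
    """One-way propagation delay over a fibre path of the given length."""
    return path_km * 1000.0 / FIBRE_SPEED_M_PER_S * 1000.0


def total_latency_ms(capture, buffer, network, playback):
    """Sum the four budget terms; all arguments are in milliseconds."""
    return capture + buffer + network + playback


# Hypothetical budget for a 1,000 km fibre path (placeholder component values).
network = propagation_delay_ms(1000.0)
total = total_latency_ms(capture=1.0, buffer=2.0, network=network, playback=1.0)
print(f"propagation: {network:.1f} ms, total: {total:.1f} ms")
```

<p>Note that the fibre path is usually longer than the air distance between two sites, so a map-based estimate of this term is a lower bound at best.</p>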
<p>The key insight is that \( L_\text{buffer} \) is not fixed — it is a
consequence of jitter. A jittery link forces larger buffers to avoid
underruns, which directly adds to perceived latency. This is why raw bandwidth
is almost irrelevant for NMP: a 1 Gbps link with erratic jitter will perform
worse than a 100 Mbps link with deterministic behaviour.</p>
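<p>To make the jitter-to-latency link concrete, here is a sketch under a deliberately simple assumption: the receive buffer must hold at least as much audio as the worst-case jitter it has to absorb, rounded up to whole audio blocks. The 48 kHz sample rate, the 64-frame block size and the jitter values are illustrative, not the settings used in our deployments:</p>

```python
import math


def buffer_latency_ms(frames, sample_rate_hz):
    """Duration of audio held by a buffer of the given length."""
    return 1000.0 * frames / sample_rate_hz


def frames_to_cover_jitter(jitter_ms, sample_rate_hz, block_frames):
    """Smallest whole number of blocks whose duration is at least jitter_ms."""
    block_ms = buffer_latency_ms(block_frames, sample_rate_hz)
    blocks = max(1, math.ceil(jitter_ms / block_ms))
    return blocks * block_frames


for jitter_ms in (0.5, 3.0):
    frames = frames_to_cover_jitter(jitter_ms, 48_000, 64)
    print(f"jitter {jitter_ms} ms: {frames} frames, "
          f"{buffer_latency_ms(frames, 48_000):.2f} ms of added latency")
```

<p>Real systems size their buffers adaptively or by operator choice, but the direction of the effect is the same: more jitter means more frames held back, and every held frame is delay the musician hears.</p>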
<hr>
<h2 id="what-we-measured-and-how">What We Measured and How</h2>
<p><strong>Network RTT.</strong> ICMP ping, 1,000 packets per run. We report the median as a
robust summary; the mean is too sensitive to the occasional rogue packet.</p>
<p><strong>End-to-end audio latency.</strong> An audio signal-loop: transmit a test signal
from site A to site B, have site B return it immediately, estimate round-trip
delay by cross-correlation. One-way latency = signal-loop RTT / 2. This method
captures local processing and buffering at both ends in addition to the network
leg, which is what actually matters for a musician.</p>
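<p>The cross-correlation step can be sketched on synthetic data. This is an illustration of the principle only, not our measurement tooling: the test signal is a noise burst, the loop is simulated by a fixed delay with attenuation and added noise, and the sample rate, delay and function name are assumptions:</p>

```python
import random


def estimate_delay_samples(sent, received, max_lag):
    """Return the lag (in samples) that maximises the cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlate the sent signal with the received signal advanced by lag.
        score = sum(s * r for s, r in zip(sent, received[lag:]))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


SAMPLE_RATE_HZ = 48_000  # assumed for illustration
TRUE_LOOP_DELAY = 480    # synthetic loop delay in samples (10 ms at 48 kHz)

rng = random.Random(1)
sent = [rng.uniform(-1.0, 1.0) for _ in range(1000)]          # noise burst
received = [0.0] * TRUE_LOOP_DELAY + [0.5 * s for s in sent]  # delayed, attenuated copy
received = [x + rng.gauss(0.0, 0.05) for x in received]       # measurement noise

lag = estimate_delay_samples(sent, received, max_lag=1000)
loop_rtt_ms = 1000.0 * lag / SAMPLE_RATE_HZ
one_way_ms = loop_rtt_ms / 2.0
print(f"loop RTT = {loop_rtt_ms:.2f} ms, one-way = {one_way_ms:.2f} ms")
```

<p>Halving the loop RTT assumes the two directions are symmetric, which is reasonable on a research backbone but worth checking on asymmetric paths.</p>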
<p><strong>Video latency.</strong> Component-based estimation (capture frame cadence +
processing pipeline + display). We did not have a frame-accurate video
loopback method, so treat these numbers as estimates rather than precision
measurements. That caveat matters less than it might seem because, as you will
see, video was always slower than audio by a wide enough margin that it did not
drive the operational decisions.</p>
<p><strong>Firewall impact.</strong> A controlled 4-hour session on the Cologne–Vienna link,
alternating between a DMZ configuration (direct research-backbone access) and
a transparent enterprise firewall, logging packet loss and decoder instability.</p>
<p>Six partner institutions, air distances from 175 to 1,465 km, measurements
collected between October 2020 and March 2023.</p>
<hr>
<h2 id="results">Results</h2>
<h3 id="audio-latency">Audio latency</h3>
<table>
  <thead>
      <tr>
          <th>Partner (from Cologne)</th>
          <th>Air distance (km)</th>
          <th>Median RTT (ms)</th>
          <th>One-way audio latency (ms)</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Prague</td>
          <td>535</td>
          <td>5.0</td>
          <td>7.5</td>
      </tr>
      <tr>
          <td>Vienna</td>
          <td>745</td>
          <td>7.0</td>
          <td>9.5</td>
      </tr>
      <tr>
          <td>Detmold</td>
          <td>175</td>
          <td>7.5</td>
          <td>10.0</td>
      </tr>
      <tr>
          <td>Trieste</td>
          <td>775</td>
          <td>10.0</td>
          <td>12.5</td>
      </tr>
      <tr>
          <td>Rome</td>
          <td>1,090</td>
          <td>17.5</td>
          <td>20.0</td>
      </tr>
      <tr>
          <td>Tallinn</td>
          <td>1,465</td>
          <td>19.5</td>
          <td>22.0–22.5</td>
      </tr>
  </tbody>
</table>
<p>The number that jumps out immediately: <strong>Detmold (175 km away) has higher
latency than Vienna (745 km away).</strong> This is a routing issue, not a physics
one. The Detmold link was traversing a less efficient campus path that added
extra hops before reaching the research backbone. Prague, by contrast, was
connected via a particularly short routing path and achieved the lowest latency
of any link despite not being the geographically closest.</p>
<p>The practical implication: geographic distance is a poor predictor of
achievable latency. Measure RTT; do not estimate from a map.</p>
<h3 id="video-latency">Video latency</h3>
<p>Estimated one-way video latency was 20–35 ms across all configurations,
with the dominant contributions coming from frame cadence (at 60 fps, you wait
up to 16.7 ms for a frame to be captured regardless of what the network is
doing) and buffering at the decoder. In every deployment, video consistently
lagged audio. Musicians unsurprisingly fell back on audio for synchronisation
and treated video as a supplementary cue — useful for expressive and social
information, not for timing.</p>
<h3 id="the-firewall-experiment">The firewall experiment</h3>
<p>This is the result I find most important for anyone planning a similar
deployment.</p>
<table>
  <thead>
      <tr>
          <th>Metric</th>
          <th>DMZ (no firewall)</th>
          <th>With enterprise firewall</th>
          <th>Change</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td>Dropped audio packets</td>
          <td>0.002%</td>
          <td>0.052%</td>
          <td>+26×</td>
      </tr>
      <tr>
          <td>Audio buffer realignments/hour</td>
          <td>0.3</td>
          <td>3.9</td>
          <td>+13×</td>
      </tr>
      <tr>
          <td>Dropped video frames</td>
          <td>0.04%</td>
          <td>0.74%</td>
          <td>+18×</td>
      </tr>
      <tr>
          <td>Additional latency</td>
          <td>—</td>
          <td>0.5–1.0 ms</td>
          <td>—</td>
      </tr>
  </tbody>
</table>
<p>The raw latency increase (0.5–1.0 ms) is small and largely irrelevant. The
packet loss and buffer event increases are not. A 26-fold increase in dropped
audio packets on an otherwise uncongested link means the firewall is doing
something — likely deep packet inspection or stateful tracking — that
introduces enough irregularity to destabilise small audio buffers. This forces
you to either accept dropouts or increase buffer size, and increasing buffer
size increases latency.</p>
<p>The message is: if your institution requires traffic inspection for
security policy compliance, you are paying a tax that shows up as lost
<em>stability</em> rather than as raw delay, and that tax is substantial.</p>
<hr>
<h2 id="discussion">Discussion</h2>
<p>Based on the measured latencies and reported musical tolerances from the
literature, I would roughly characterise the links as follows:</p>
<ul>
<li>
<p><strong>Prague, Vienna, Detmold, Trieste (7.5–12.5 ms):</strong> Compatible with
most repertoire including rhythmically demanding chamber music.
Musicians in our sessions reported the interaction as &ldquo;natural&rdquo; or
&ldquo;like being in the same room&rdquo; at these latencies.</p>
</li>
<li>
<p><strong>Rome (20 ms):</strong> Usable with attention to repertoire and tempo.
Slower movements and music where tight rhythmic locking is not the
primary aesthetic concern work well. Rhythmically dense passages at
fast tempi become harder.</p>
</li>
<li>
<p><strong>Tallinn (22–22.5 ms):</strong> At the upper edge of the comfortable range.
Still usable — we ran a concert collaboration in March 2023 — but
musicians adapt their interaction strategies, leaning more on musical
anticipation than reactive synchronisation.</p>
</li>
</ul>
<p>What is notably absent from this data: anything outside the European
research-network context. All six links ran on GÉANT or national backbone
equivalents with favourable jitter characteristics. The numbers almost
certainly do not transfer directly to commodity internet, satellite links, or
mixed-topology paths.</p>
<p><strong>Limitations I want to be explicit about.</strong> The video latency estimates are
component-based, not directly measured, so treat that 20–35 ms range with
appropriate scepticism. The firewall comparison is a single 4-hour session on
a single link; I would not want to extrapolate too aggressively to other
firewall vendors or configurations. And this is an operational measurement
study, not a controlled perceptual experiment — I cannot tell you from this
data at precisely what latency threshold a given ensemble will declare a
session unusable, because that depends on the music, the musicians, and
factors I did not measure.</p>
<hr>
<h2 id="practical-takeaways">Practical Takeaways</h2>
<p>For anyone setting up a similar system:</p>
<ol>
<li><strong>Measure RTT before committing to a partner institution.</strong> Routing
differences can swamp hundreds of kilometres of air distance, as the Detmold and Vienna links show.</li>
<li><strong>Get DMZ placement if at all possible.</strong> The firewall results suggest
this matters more than any other single configuration decision.</li>
<li><strong>Minimise campus hops between your endpoint and the research backbone.</strong>
Each additional switching layer adds jitter risk.</li>
<li><strong>Use small audio buffers and monitor for underruns.</strong> If your baseline
RTT is good, your buffer can be small; if underruns increase, that is an
early warning that network stability is degrading before packet loss
becomes audible.</li>
<li><strong>Accept that video will lag audio and design your session accordingly.</strong>
This is not a system failure; it is a consequence of how video pipelines
work at low latency. Plan for it.</li>
</ol>
<hr>
<h2 id="references">References</h2>
<p>Carôt, A. (2011). Low latency audio streaming for Internet-based musical
interaction. <em>Advances in Multimedia and Interactive Technologies</em>.
<a href="https://doi.org/10.4018/978-1-61692-831-5.ch015">https://doi.org/10.4018/978-1-61692-831-5.ch015</a></p>
<p>Drioli, C., Allocchio, C., &amp; Buso, N. (2013). Networked performances and
natural interaction via LOLA. <em>LNCS</em>, 7990, 240–250.
<a href="https://doi.org/10.1007/978-3-642-40050-6_21">https://doi.org/10.1007/978-3-642-40050-6_21</a></p>
<p>Medina Victoria, A. (2019). <em>A method for the measurement of the latency
tolerance range of Western musicians</em>. Ph.D. dissertation, Cork Institute
of Technology (now Munster Technological University).</p>
<p>Rottondi, C., Chafe, C., Allocchio, C., &amp; Sarti, A. (2016). An overview on
networked music performance technologies. <em>IEEE Access</em>, 4, 8823–8843.
<a href="https://doi.org/10.1109/ACCESS.2016.2628440">https://doi.org/10.1109/ACCESS.2016.2628440</a></p>
<p>Tsioutas, K. &amp; Xylomenos, G. (2021). On the impact of audio characteristics
to the quality of musicians experience in network music performance. <em>JAES</em>,
69(12), 914–923. <a href="https://doi.org/10.17743/jaes.2021.0041">https://doi.org/10.17743/jaes.2021.0041</a></p>
<p>Ubik, S., Halak, J., Kolbe, M., Melnikov, J., &amp; Frič, M. (2021). Lessons
learned from distance collaboration in live culture. <em>AISC</em>, 1378, 608–615.
<a href="https://doi.org/10.1007/978-3-030-74009-2_77">https://doi.org/10.1007/978-3-030-74009-2_77</a></p>
<hr>
<h2 id="changelog">Changelog</h2>
<ul>
<li><strong>2026-01-20</strong>: Updated the Drioli et al. (2013) LNCS volume number to 7990 (ECLAP 2013 proceedings). Updated the Ubik et al. (2021) AISC volume number to 1378 and page range to 608–615. Updated the fifth author&rsquo;s surname to &ldquo;Frič.&rdquo;</li>
</ul>
]]></content:encoded>
    </item>
  </channel>
</rss>
