Every piece of tracking data you've ever worked with is wrong. Not by a lot — but GPS chips drift, optical systems lose players behind each other, and even the best sensors add noise. When SkillCorner or Metrica reports a player at (34.2, 18.7), the real position might be (34.5, 18.4). The question is: how do you get the best possible estimate of where a player actually is?
GPS has ±0.5–2m error. Optical tracking loses players during occlusions. Broadcast-derived systems have variable framerate and perspective distortion. If you naively use raw sensor readings, you get jittery trajectories, impossible accelerations, and velocity estimates that spike wildly between frames.
Here's the key idea. A sensor tells you "the player is at (34.2, 18.7)." But you also know physics: the player was at (33.8, 18.9) last frame, moving at 5 m/s northeast. They can't teleport. Combining what you predict from physics with what you observe from the sensor gives a better answer than either alone.
This is exactly what a Kalman filter does. Invented by Rudolf Kálmán in 1960, it's an algorithm that optimally fuses a model prediction ("where should the player be based on physics?") with a noisy measurement ("where does the sensor say they are?"), weighting each by how uncertain it is.
Where it's used:
- Apollo moon navigation
- GPS receivers
- Self-driving cars
- Drone flight controllers
- Player tracking systems

What you get:
- Smooth trajectories
- Velocity estimates for free
- Gap interpolation
- Uncertainty bounds
- Sensor fusion

What you need:
- A motion model (simple physics)
- Sensor noise estimate
- Process noise estimate
- Initial state guess
- Linear or linearisable system
The Kalman filter operates in a loop with exactly two steps. First, you predict where you think the state will be next, using a model (usually simple physics). Then, when a sensor measurement arrives, you correct your prediction by blending it with the measurement. The blend ratio is called the Kalman gain.
Imagine you're a commentator tracking a striker. Between camera cuts, you predict where they are: "They were sprinting towards the box at 8 m/s, so they're probably near the penalty spot now." When the camera cuts back, you correct: "Ah, they actually slowed down — they're a few metres behind where I expected." Your brain naturally does predict-update. The Kalman filter does it mathematically optimally.
Before we can filter anything, we need to formalise our problem. Kalman filters operate on state space models — a way of separating the true hidden state (the player's actual position and velocity) from the noisy observations (what the sensor reports).
The Two Equations

A state space model is defined by two equations: one describing how the hidden state evolves, one describing how it is observed.

xₖ = F xₖ₋₁ + wₖ   (state transition: the physics model, with process noise wₖ ~ N(0, Q))
zₖ = H xₖ + vₖ    (observation: what the sensor reports, with measurement noise vₖ ~ N(0, R))
Concrete Example: Tracking a Player
For a footballer moving on a 2D pitch, a common state vector tracks position and velocity:
x = [x, y, vₓ, vᵧ]ᵀ

The state transition matrix F encodes constant-velocity physics, and the observation matrix H picks out what the sensor actually reports:

F = [1 0 dt 0 ]     H = [1 0 0 0]
    [0 1 0  dt]         [0 1 0 0]
    [0 0 1  0 ]
    [0 0 0  1 ]
Notice the sensor only reports position, but the state includes velocity. The Kalman filter infers velocity from position changes, giving you smooth speed/acceleration estimates for free. This is far better than naive finite differences (vₓ ≈ Δx/Δt) which amplify noise.
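The noise amplification is easy to check numerically. A small sketch at the article's 25 Hz rate; the 0.3 m noise level is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
dt = 0.04        # 25 Hz tracking
sigma = 0.3      # assumed measurement noise in metres (illustrative)

# True motion: constant 5 m/s along x
t = np.arange(0, 4, dt)
true_x = 5.0 * t
measured_x = true_x + rng.normal(0, sigma, t.size)

# Naive finite-difference velocity: noise std blows up to sqrt(2)*sigma/dt
v_naive = np.diff(measured_x) / dt
print(f"true speed 5.0 m/s, finite-difference std: {v_naive.std():.1f} m/s")
```

With these numbers the velocity noise (roughly √2 · 0.3 / 0.04 ≈ 10 m/s) is larger than the signal itself, which is exactly why inferring velocity inside the filter is preferable.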
The predict step uses your motion model to project the current state estimate forward in time. No sensor data is used — this is pure physics. You also need to project the uncertainty forward, because predictions become less certain over time.
A midfielder is at (40.0, 25.0) moving at (3.0, 1.5) m/s. At 25 Hz (dt = 0.04s), the prediction step says: "Next frame they should be at (40.12, 25.06)." The Q matrix accounts for the fact that footballers don't move at constant velocity — they accelerate, decelerate, and change direction. A higher Q means "I expect the player to be unpredictable" (e.g., during a dribble).
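The predict step for this midfielder can be sketched in NumPy; the P and Q values below are illustrative placeholders, not tuned numbers:

```python
import numpy as np

dt = 0.04  # 25 Hz
# State [x, y, vx, vy]: midfielder at (40, 25) moving at (3.0, 1.5) m/s
x = np.array([40.0, 25.0, 3.0, 1.5])

F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]])

P = np.eye(4) * 0.5   # current uncertainty (illustrative)
Q = np.eye(4) * 0.05  # process noise (illustrative)

# Predict: project the state and its uncertainty forward one frame
x_pred = F @ x                # position becomes (40.12, 25.06)
P_pred = F @ P @ F.T + Q      # uncertainty grows until a measurement arrives
```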
When a sensor measurement zₖ arrives, we correct our prediction. The magic is in the Kalman gain K — it automatically decides how much to trust the prediction vs. the measurement, based on their relative uncertainties.
Understanding the Kalman Gain
The Kalman gain K is the heart of the filter. It's a number between 0 and 1 (for scalar systems) that answers: "how much should I trust the sensor vs. my prediction?"
When K ≈ 0: the sensor is very noisy (R is large) relative to the prediction uncertainty (P⁻ is small). The filter mostly ignores the measurement.
When K ≈ 1: the prediction is very uncertain (P⁻ is large) relative to the sensor noise (R is small). The filter jumps to the measurement.
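For a scalar state the gain reduces to K = P⁻ / (P⁻ + R), which makes both regimes easy to verify:

```python
# Scalar Kalman gain: K = P_pred / (P_pred + R)
def gain(P_pred, R):
    return P_pred / (P_pred + R)

# Noisy sensor, confident prediction -> K near 0 (ignore the measurement)
k_low = gain(P_pred=0.01, R=1.0)
# Uncertain prediction, precise sensor -> K near 1 (jump to the measurement)
k_high = gain(P_pred=1.0, R=0.01)
```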
For linear systems with Gaussian noise, the Kalman filter gives the minimum variance unbiased estimate. No other linear filter can do better. This is why it's been used for everything from Moon landings to your phone's GPS.
That's it. Five equations, repeated every frame. The elegance is that all the complexity — sensor noise, motion uncertainty, variable trust — is captured in these matrix operations. The filter automatically adapts: when the sensor is reliable, K is high and it tracks measurements closely. When the sensor is noisy or missing, K drops and the filter coasts on its physics model.
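A minimal sketch of the full cycle with the five equations labelled in order. The matrices follow the constant-velocity model above; the noise values and the measurement are illustrative:

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict-update cycle: the five Kalman filter equations."""
    # Predict
    x_pred = F @ x                             # 1. project state forward
    P_pred = F @ P @ F.T + Q                   # 2. project uncertainty forward
    # Update
    S = H @ P_pred @ H.T + R                   # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)        # 3. Kalman gain
    y = z - H @ x_pred                         # innovation (the "surprise")
    x_new = x_pred + K @ y                     # 4. blend prediction and measurement
    P_new = (np.eye(len(x)) - K @ H) @ P_pred  # 5. shrink uncertainty
    return x_new, P_new

# Constant-velocity model at 25 Hz (noise values are illustrative)
dt = 0.04
F = np.array([[1, 0, dt, 0], [0, 1, 0, dt], [0, 0, 1, 0], [0, 0, 0, 1]])
H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]])
Q = np.eye(4) * 0.05
R = np.eye(2) * 0.01

x, P = np.array([40.0, 25.0, 3.0, 1.5]), np.eye(4)
z = np.array([40.15, 25.10])   # noisy position measurement
x, P = kalman_step(x, P, z, F, H, Q, R)
```

After one step the estimate lands between the prediction (40.12) and the measurement (40.15), and the trace of P shrinks relative to the prediction, exactly the blending behaviour described above.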
The Kalman filter is optimal if Q and R are correct. In practice, you rarely know the exact noise values. Tuning these is where the engineering judgement comes in.
Measurement noise R: how much you distrust the sensor.
- Large R: smoother output, slower to react — filter "ignores" jitter
- Small R: responsive but jittery — trusts every reading
For optical tracking: R ≈ diag(0.01, 0.01), i.e. ±0.1 m standard deviation.
Process noise Q: how much the player deviates from constant velocity.
- Large Q: expects lots of acceleration — more responsive to changes
- Small Q: assumes smooth motion — over-smooths sudden turns
Some systems use an adaptive Q that increases during high-acceleration phases.
The ratio Q/R is what really matters. High Q/R = trust sensor more, track aggressively. Low Q/R = trust model more, smooth aggressively. For football tracking, you typically want moderate smoothing — enough to remove GPS jitter, but responsive enough to catch a sudden change of direction.
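Rather than guessing every entry of Q, a common construction is the discrete white-noise-acceleration model: pick a single assumed acceleration variance and the dt structure falls out. A sketch, with illustrative variance values:

```python
import numpy as np

def make_q(dt, accel_var):
    """Discrete white-noise-acceleration Q for state [x, y, vx, vy].

    accel_var is the assumed variance of unmodelled acceleration in (m/s^2)^2:
    raise it for dribbles and sprints, lower it for steady jogging.
    """
    q = np.array([[dt**4 / 4, dt**3 / 2],
                  [dt**3 / 2, dt**2]]) * accel_var
    # Same 2x2 block per axis, interleaved into the [x, y, vx, vy] ordering
    Q = np.zeros((4, 4))
    Q[np.ix_([0, 2], [0, 2])] = q   # x / vx block
    Q[np.ix_([1, 3], [1, 3])] = q   # y / vy block
    return Q

Q_calm = make_q(dt=0.04, accel_var=1.0)    # steady motion
Q_burst = make_q(dt=0.04, accel_var=25.0)  # expect sharp accelerations
```

Scaling one physical parameter rather than individual matrix entries keeps the position and velocity noise terms consistent with each other.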
The standard Kalman filter assumes everything is linear: state transitions, observations, Gaussian noise. But real football motion is nonlinear — players turn, accelerate in curves, and the relationship between raw camera coordinates and pitch coordinates involves perspective transforms.
The Extended Kalman Filter (EKF) handles nonlinear systems by linearising the model at each time step using a first-order Taylor expansion (i.e., the Jacobian). Instead of fixed matrices F and H, you compute their Jacobians at the current state estimate.
Other Variants
Unscented Kalman Filter. Uses sigma points instead of Jacobians — better for highly nonlinear systems. No derivatives needed.
Particle Filter. Uses Monte Carlo sampling. Handles arbitrary distributions (not just Gaussian). Computationally expensive but very flexible.
Kalman Smoother (RTS). Runs forward and backward through the data, so it uses future information too. Perfect for offline processing of tracking data.
For most football tracking, the standard Kalman filter with a constant-velocity model works surprisingly well. Players move roughly linearly between 25 Hz frames (0.04s). Use EKF if you're fusing sensors with nonlinear observation models (e.g., raw camera pixel coordinates → pitch coordinates). Use the Kalman smoother for offline analysis when you have the complete match data.
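The smoother can be sketched in a few lines: run the ordinary filter forward while storing its predictions, then sweep backward blending in information from the future. A 1D constant-velocity example; the run, speed, and noise levels are made-up illustrations:

```python
import numpy as np

dt = 0.04
F = np.array([[1.0, dt], [0.0, 1.0]])      # 1D constant-velocity state [x, vx]
H = np.array([[1.0, 0.0]])                 # sensor sees position only
Q = np.array([[dt**4 / 4, dt**3 / 2],
              [dt**3 / 2, dt**2]])         # white-noise-acceleration process noise
R = np.array([[0.09]])                     # 0.3 m measurement std (illustrative)

rng = np.random.default_rng(1)
t = np.arange(100) * dt
zs = 5.0 * t + rng.normal(0, 0.3, t.size)  # noisy positions of a 5 m/s run

# Forward pass, storing predicted and filtered states/covariances
x, P = np.array([0.0, 5.0]), np.eye(2)
xps, Pps, xfs, Pfs = [], [], [], []
for z in zs:
    xp, Pp = F @ x, F @ P @ F.T + Q                   # predict
    K = Pp @ H.T @ np.linalg.inv(H @ Pp @ H.T + R)    # gain
    x = xp + K @ (np.array([z]) - H @ xp)             # update state
    P = (np.eye(2) - K @ H) @ Pp                      # update covariance
    xps.append(xp); Pps.append(Pp); xfs.append(x); Pfs.append(P)

# Rauch-Tung-Striebel backward pass: correct each frame using the future
xs, Ps = [xfs[-1]], [Pfs[-1]]
for k in range(len(zs) - 2, -1, -1):
    C = Pfs[k] @ F.T @ np.linalg.inv(Pps[k + 1])      # smoother gain
    xs.insert(0, xfs[k] + C @ (xs[0] - xps[k + 1]))
    Ps.insert(0, Pfs[k] + C @ (Ps[0] - Pps[k + 1]) @ C.T)
```

The backward pass never sees the measurements again; it only redistributes uncertainty, which is why the smoothed covariances are never larger than the filtered ones.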
Kalman filters are everywhere in football analytics infrastructure — you're almost certainly using Kalman-filtered data even if you don't realise it:
Trajectory smoothing. The most direct application: run one Kalman filter per player (and one for the ball), with state = [x, y, vₓ, vᵧ]. The filter removes GPS jitter, interpolates through short dropouts, and provides smooth trajectories suitable for downstream models.
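A per-player filter with dropout handling might look like the following sketch. The class name and noise values are assumptions for illustration; the key detail is that predict always runs, while update runs only when a measurement arrives:

```python
import numpy as np

class PlayerFilter:
    """Per-player constant-velocity Kalman filter that coasts through dropouts."""

    def __init__(self, x0, y0, dt=0.04):
        self.x = np.array([x0, y0, 0.0, 0.0])   # [x, y, vx, vy]
        self.P = np.eye(4)
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * 0.05   # illustrative noise levels
        self.R = np.eye(2) * 0.01

    def step(self, z=None):
        # Predict (always runs) ...
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # ... update only when a measurement arrived this frame
        if z is not None:
            S = self.H @ self.P @ self.H.T + self.R
            K = self.P @ self.H.T @ np.linalg.inv(S)
            self.x = self.x + K @ (np.asarray(z) - self.H @ self.x)
            self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]

f = PlayerFilter(40.0, 25.0)
f.step([40.12, 25.06])
f.step([40.24, 25.12])
pos_during_gap = f.step(None)   # dropout: coast on the velocity estimate
```

During the gap the position keeps advancing along the inferred velocity, and P grows until the next measurement arrives.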
Ball tracking is much harder than player tracking — the ball is small, fast, and often occluded. A Kalman filter with a ballistic motion model (including gravity for aerial balls) can maintain estimates during occlusions and predict where the ball will land.
In computer vision pipelines (like the video analysis pipeline), Kalman filters are the backbone of tracking-by-detection. Each detected bounding box is a measurement. The Kalman filter predicts where each player should be in the next frame, then Hungarian assignment matches predictions to detections.
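The matching step can be sketched with SciPy's Hungarian-algorithm implementation. The positions below are made up; the cost is plain Euclidean distance from each Kalman prediction to each detection:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Kalman-predicted positions for three tracked players (hypothetical frame)
predictions = np.array([[40.1, 25.0], [12.3, 60.2], [75.0, 33.3]])
# Detections from the current frame, arriving in arbitrary order
detections = np.array([[75.2, 33.1], [40.2, 25.1], [12.4, 60.0]])

# Cost matrix: distance from every prediction to every detection
cost = np.linalg.norm(predictions[:, None, :] - detections[None, :, :], axis=2)
row, col = linear_sum_assignment(cost)   # Hungarian algorithm
# track row[i] is matched to detection col[i]
```

Each matched detection then becomes the measurement zₖ for that track's update step; unmatched tracks simply coast on their predictions.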
Some systems combine multiple sensors: GPS + accelerometer, or optical tracking + LPS (Local Positioning System). The Kalman filter naturally fuses these by giving each sensor its own observation matrix H and noise R. More sensors = lower uncertainty.
Rather than computing speed as Δposition/Δtime (which amplifies noise), the Kalman filter state includes velocity directly. You can extend it to include acceleration too: state = [x, y, vₓ, vᵧ, aₓ, aᵧ]. This gives reliable speed, acceleration, and even jerk estimates for physical load monitoring.
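Extending the model is just a bigger F: position integrates velocity, and velocity integrates acceleration. A sketch of the constant-acceleration matrices:

```python
import numpy as np

dt = 0.04
# Constant-acceleration model: state [x, y, vx, vy, ax, ay]
F = np.eye(6)
F[0, 2] = F[1, 3] = dt            # position gains velocity * dt
F[2, 4] = F[3, 5] = dt            # velocity gains acceleration * dt
F[0, 4] = F[1, 5] = 0.5 * dt**2   # position gains 1/2 * a * dt^2
H = np.zeros((2, 6))
H[0, 0] = H[1, 1] = 1.0           # the sensor still reports position only

# Sanity check: a player at rest accelerating at 4 m/s^2 along x
x = np.array([0.0, 0.0, 0.0, 0.0, 4.0, 0.0])
x = F @ x   # after one frame: x = 0.5 * 4 * dt^2, vx = 4 * dt
```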
The innovation ỹₖ = zₖ − Hx̂ₖ⁻ tells you how surprising each measurement is. Large innovations (normalised by Sₖ) indicate sensor glitches, ID swaps, or genuine sudden events. You can flag these for manual review or automatic correction.
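A standard way to operationalise this is gating on the normalised innovation squared (NIS) against a chi-square threshold; 9.21 is the 99th percentile for 2 degrees of freedom. The noise values below are illustrative:

```python
import numpy as np

def innovation_gate(z, x_pred, P_pred, H, R, threshold=9.21):
    """Flag measurements whose normalised innovation is implausibly large.

    threshold 9.21 = chi-square 99th percentile with 2 degrees of freedom.
    """
    y = z - H @ x_pred                       # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    nis = float(y @ np.linalg.inv(S) @ y)    # normalised innovation squared
    return nis > threshold, nis

H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
R = np.eye(2) * 0.01
x_pred = np.array([40.12, 25.06, 3.0, 1.5])
P_pred = np.eye(4) * 0.05

# A plausible reading passes; a 6-metre jump (likely an ID swap) is flagged
plausible_flag, _ = innovation_gate(np.array([40.2, 25.1]), x_pred, P_pred, H, R)
glitch_flag, _ = innovation_gate(np.array([46.0, 25.1]), x_pred, P_pred, H, R)
```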
If you've read the earlier articles in this series on RNNs and Transformers, you might notice something familiar. The Kalman filter's predict-update cycle is structurally very similar to an RNN — both maintain a hidden state that gets updated with each new observation.
Kalman filter:
- Explicit physics model (F, H)
- Optimal for linear + Gaussian
- Interpretable: you know what each state means
- No training data needed
- Runs in microseconds
- One filter per player (independent)

RNN:
- Learned dynamics (no explicit physics)
- Handles nonlinear, non-Gaussian patterns
- Black box: hidden state is opaque
- Needs lots of training data
- Slower inference
- Can model inter-player interactions
In practice, you use both. The Kalman filter is your preprocessing layer — clean raw sensor data into smooth trajectories. Then feed that clean data into your neural network (STGNN, foundation model, etc.) for high-level analysis. The Kalman filter handles the physics; the neural network handles the tactics.
Recent research replaces the hand-tuned Q and R matrices with neural networks that predict them. The Kalman structure (predict-update with gain) is kept for its optimality guarantees, but the noise parameters adapt to context. A player sprinting gets different Q than one standing still. This is an active research frontier.
- ✓ Why raw tracking data needs filtering (sensor noise, gaps)
- ✓ State space models: hidden state vs. observations
- ✓ The predict-update cycle
- ✓ All five Kalman filter equations
- ✓ Kalman gain: automatic trust weighting
- ✓ Tuning Q and R for football tracking
- ✓ Extended Kalman Filter for nonlinear systems
- ✓ Six football applications
The Kalman filter is the unsung hero of football analytics. Every time you load "clean" tracking data from SkillCorner, Metrica, or any provider, a Kalman filter (or something very like it) has already smoothed the noise, filled the gaps, and estimated the velocities. Understanding it gives you intuition for why tracking data looks the way it does, when to trust it, and how to improve it when you're working with noisier sources.