Research Deep Dive · arXiv 2026
GenTac: Generative Tactics for Open-Play Football
A diffusion-based framework that generates realistic, controllable open-play football tactics — conditioning multi-agent trajectories on team identity, opponent, league style, and high-level tactical objectives.
Diffusion Models · Multi-Agent Generation · Conditional Sampling · TacBench Benchmark · 40 min read
Authors: Jiayuan Rao et al. — Shanghai Jiao Tong University (2026)
The Tactical Problem

Most trajectory models in football tackle deterministic sub-problems: predict where a defender will be 4 seconds from now, or impute a missing player's position. But open-play tactics are inherently non-deterministic — there are many plausible ways a possession could unfold, and tactical analysis cares about the distribution of plausible futures, not a single average.

GenTac reframes the task as generation, not prediction: given a short context window of 22 player + ball trajectories, sample multiple realistic continuations of open play that obey football's tactical regularities. Crucially, the user can steer the generation with conditioning signals — e.g. "simulate how Manchester City would attack against a Burnley low block".

Why Existing Methods Fall Short

The paper argues that prior trajectory work — TranSPORTmer, Diffoot, CausalTraj, SportsNGEN — solves pieces of the problem, but none of them give analysts a single controllable knob over tactical style:

❌ Single-Mode Outputs

Point-prediction models (Social-LSTM, TranSPORTmer's forecasting head) collapse a multi-modal distribution into one trajectory. Tactical analysis needs "what could happen", not just "what is most likely".

❌ No Tactical Control

Existing diffusion approaches (Diffoot) generate diverse samples but are conditioned only on the observed past — there's no way to ask the model for a specific style, opponent, or objective.

❌ No Standard Evaluation

There is no shared benchmark that measures whether a generated possession is actually tactically meaningful — recognisable as a counter-attack, a build-up, a press-trigger, etc.

GenTac's Solution

Build a single conditional diffusion backbone that supports five conditioning modes over the same generative process — and ship a paired benchmark (TacBench) that scores both trajectory realism and downstream tactical-event recognisability.

GenTac Architecture: Conditional Diffusion over 23 Agents
Causal sliding window + spatiotemporal denoiser + multi-modal conditioning

The denoiser predicts the joint trajectory of 22 players + 1 ball over a short horizon, advancing a causal sliding window in 0.2s hops. Each sampling run (several denoiser passes, not one) takes a chunk of trajectories from Gaussian noise back to a tactically coherent rollout. The design has two parts, a denoising backbone and a conditioning scheme:

Backbone: Spatiotemporal Denoiser

Set-attention over the 23 agents at each frame, plus a temporal transformer along the time axis. Permutation-invariant in the player dimension; ball is treated as a special token with its own embedding.

  • 0.2s causal window (5 frames at 25 FPS)
  • v-prediction loss (a parameterisation generally found to train stably)
  • DDIM sampling at inference
  • K = 20 samples per context for evaluation
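As a rough illustration of the v-prediction and DDIM bullets above, the sketch below writes out the v-prediction target and one deterministic DDIM update in NumPy. The cosine schedule, the continuous time variable `t`, and the function names are assumptions made for this example; the article does not state GenTac's actual noise schedule or step count.

```python
import numpy as np

def alpha_sigma(t):
    """Assumed variance-preserving cosine schedule: alpha^2 + sigma^2 = 1, t in [0, 1]."""
    return np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)

def v_target(x0, eps, t):
    """Training target under v-prediction: v = alpha * eps - sigma * x0."""
    a, s = alpha_sigma(t)
    return a * eps - s * x0

def ddim_step(x_t, v_pred, t, t_next):
    """Deterministic DDIM update (eta = 0) from time t to t_next, given a v estimate."""
    a, s = alpha_sigma(t)
    x0_hat = a * x_t - s * v_pred      # implied clean trajectory
    eps_hat = s * x_t + a * v_pred     # implied noise
    a_n, s_n = alpha_sigma(t_next)
    return a_n * x0_hat + s_n * eps_hat

# Toy chunk: 5 frames x 23 agents x (x, y), matching the 0.2s window at 25 FPS.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(5, 23, 2))
eps = rng.normal(size=x0.shape)
t = 0.6
a, s = alpha_sigma(t)
x_t = a * x0 + s * eps                 # forward (noising) process

# With a perfect v estimate, a single step to t_next = 0 recovers the clean chunk.
v = v_target(x0, eps, t)
x_recovered = ddim_step(x_t, v, t, 0.0)
```

In practice the exact `v` is replaced by the denoiser's prediction and the update is repeated over a descending sequence of time steps.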
Conditioning: Five Modes, One Model

The same backbone is trained with classifier-free guidance over a multiplexed conditioning vector, so a single checkpoint serves five inference regimes:

  • Unconditioned — pure prior over open play
  • Opponent-conditioned — given opposing team ID
  • Team-conditioned — generate in a chosen team's style
  • League-conditioned — EPL vs. La Liga vs. Bundesliga, etc.
  • Objective-conditioned — e.g. counter-attack, sustained build-up
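A minimal sketch of how classifier-free guidance could serve the modes listed above from one model: conditioning fields are randomly dropped during training, and at inference the conditional and unconditional predictions are blended. The dictionary layout, `toy_denoiser`, and the guidance convention (`scale = 1` reproduces the plain conditional prediction) are assumptions for illustration, not GenTac's actual interface.

```python
import numpy as np

def drop_conditions(cond, p_drop, rng):
    """Training-time dropout: null each field with probability p_drop,
    so the same network also learns the unconditioned prior."""
    return {k: (None if rng.random() < p_drop else v) for k, v in cond.items()}

def guided_prediction(denoiser, x_t, t, cond, scale):
    """Blend unconditional and conditional outputs: u + scale * (c - u)."""
    uncond = denoiser(x_t, t, {k: None for k in cond})
    conditioned = denoiser(x_t, t, cond)
    return uncond + scale * (conditioned - uncond)

def toy_denoiser(x_t, t, cond):
    """Hypothetical stand-in: output shifts by 1 for every active conditioning field."""
    active = sum(1.0 for v in cond.values() if v is not None)
    return np.zeros_like(x_t) + active

rng = np.random.default_rng(0)
x_t = np.zeros((5, 23, 2))
# Team- plus objective-conditioned request; the other fields stay empty.
cond = {"opponent": None, "team": "team_A", "league": None, "objective": "counter-attack"}

plain = guided_prediction(toy_denoiser, x_t, 0.5, cond, scale=1.0)    # conditional output
pushed = guided_prediction(toy_denoiser, x_t, 0.5, cond, scale=1.5)   # extrapolated past it
all_dropped = drop_conditions(cond, p_drop=1.0, rng=rng)
```

Setting every field to `None` gives the unconditioned mode, so the five regimes differ only in which fields are filled.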
Why Five Modes Matter

An unconditioned model is a prior. An opponent-conditioned model is a scout report. A team-conditioned model is a style transfer. A league-conditioned model captures tactical culture. An objective-conditioned model is closest to what coaches actually want — "show me what a press-bypass through midfield looks like for this side". GenTac is the first paper to put all five behind a unified API.

TacBench: A Benchmark for Tactical Generation
Two paired tasks — trajectory forecasting and tactical-event recognition

Generative trajectory models have historically been evaluated with ADE/FDE — average and final displacement error against ground truth. But ADE/FDE penalises a sample for being different from the one rollout that actually happened, even when it's perfectly plausible. TacBench fixes this by adding a tactical-recognisability task on top.

TacBench-Trajectory

2,838 open-play segments with held-out continuations. Models generate K = 20 samples per context; metrics include best-of-K ADE/FDE plus diversity / coverage scores.

  • Trajectory realism (kinematic plausibility)
  • Min-of-K ADE/FDE against ground truth
  • Sample diversity within each context
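The min-of-K metrics above can be sketched as follows. The array layout `(K, T, A, 2)`, taking the ADE and FDE minima independently, and mean pairwise displacement as the diversity score are assumptions for this example; TacBench's exact definitions may differ.

```python
import numpy as np

def min_of_k_ade_fde(samples, truth):
    """samples: (K, T, A, 2) generated rollouts; truth: (T, A, 2) held-out continuation.
    Returns the best (lowest) ADE and FDE over the K samples."""
    dist = np.linalg.norm(samples - truth[None], axis=-1)  # (K, T, A) point errors
    ade = dist.mean(axis=(1, 2))        # average over frames and agents
    fde = dist[:, -1, :].mean(axis=1)   # final frame only, averaged over agents
    return float(ade.min()), float(fde.min())

def mean_pairwise_distance(samples):
    """Simple diversity proxy: average displacement between every pair of samples."""
    k = samples.shape[0]
    pairs = [np.linalg.norm(samples[i] - samples[j], axis=-1).mean()
             for i in range(k) for j in range(i + 1, k)]
    return float(np.mean(pairs))

# Toy data: ground truth at the origin, sample k shifted by (k + 1) along x.
K, T, A = 20, 5, 23
truth = np.zeros((T, A, 2))
samples = np.zeros((K, T, A, 2))
samples[..., 0] = np.arange(1, K + 1)[:, None, None]

best_ade, best_fde = min_of_k_ade_fde(samples, truth)
diversity = mean_pairwise_distance(samples)
```

Because only the closest sample counts, a model is not penalised for the other 19 plausible-but-different rollouts, which is the point of reporting min-of-K alongside a diversity score.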
TacBench-Event

423 segments labelled with 5 tactical event types covering 15 sub-types (e.g. counter-attack, half-space combination, switch of play, high press trigger). Generated rollouts are scored on whether a downstream classifier still recognises the intended tactic.

  • 5 high-level event types
  • 15 fine-grained sub-types
  • Top-1 type / sub-type accuracy on samples
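For concreteness, top-1 accuracy at both levels could be computed as in the sketch below. The label names and the `SUBTYPE_TO_TYPE` mapping are placeholders invented for the example; they are not the TacBench taxonomy.

```python
# Placeholder taxonomy: each sub-type belongs to exactly one event type.
SUBTYPE_TO_TYPE = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}

def top1_accuracy(intended, predicted):
    """Fraction of generated rollouts whose predicted label matches the intended one."""
    if len(intended) != len(predicted):
        raise ValueError("label lists must have the same length")
    return sum(i == p for i, p in zip(intended, predicted)) / len(intended)

# Intended tactic per generated rollout vs. what a downstream classifier recognised.
intended_sub = ["a1", "a2", "b1", "c1"]
predicted_sub = ["a2", "a2", "b1", "b1"]

sub_acc = top1_accuracy(intended_sub, predicted_sub)
type_acc = top1_accuracy([SUBTYPE_TO_TYPE[s] for s in intended_sub],
                         [SUBTYPE_TO_TYPE[s] for s in predicted_sub])
```

When types are derived from sub-types like this, a sub-type confusion inside the same type still counts as a correct type prediction, which is one reason the type score sits above the sub-type score.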
Headline Results
Trajectory Forecasting

Best-of-K ADE/FDE on TacBench-Trajectory beats Diffoot, TranSPORTmer and Social-STGCNN baselines, particularly in the longer-horizon regime where multi-modality matters most.

Event Recognition

71.2% top-1 accuracy on tactical event type and 53.7% on the harder sub-type task — generated rollouts remain recognisable as the intended tactic to a downstream classifier.

Conditioning Works

Team-, league- and objective-conditioned samples shift summary statistics (possession shape, defensive line height, vertical progression rate) in directions consistent with the requested style.
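One of the summary statistics named above, defensive line height, might be measured along these lines. The definition used here (mean x-coordinate of the four deepest outfield players, with the deepest player assumed to be the goalkeeper) and the coordinate convention are assumptions for illustration, not the paper's definition.

```python
import numpy as np

def defensive_line_height(team_xy, n_back=4):
    """team_xy: (T, 11, 2) positions, x measured in metres from the team's own goal line.
    Returns the mean x of the n_back deepest outfield players, averaged over frames."""
    x_sorted = np.sort(team_xy[..., 0], axis=1)   # per frame, deepest player first
    back_line = x_sorted[:, 1:1 + n_back]         # skip index 0 (assumed goalkeeper)
    return float(back_line.mean())

# Toy frame: goalkeeper at 5 m, back four at 20-26 m, the rest further up the pitch.
rng = np.random.default_rng(0)
x_positions = np.array([5.0, 20.0, 22.0, 24.0, 26.0, 40.0, 45.0, 50.0, 60.0, 70.0, 80.0])
frame = np.stack([rng.permutation(x_positions), np.zeros(11)], axis=-1)  # shuffled player order
team_xy = np.repeat(frame[None], 3, axis=0)       # same frame repeated for 3 time steps

height = defensive_line_height(team_xy)
```

Comparing this number between, say, samples conditioned on a pressing objective and samples conditioned on a low block is one simple way to check that the conditioning moves the output in the expected direction.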

Cross-Sport Generalisation
The same backbone runs on basketball, American football and ice hockey

One of the more striking results: because the denoiser is permutation-invariant over agents and only mildly assumes pitch geometry, GenTac fine-tunes onto other team-sport tracking data with minimal changes. The paper reports plausible generations on:

🏀 Basketball

10 agents + ball, half-court tracking. Generated possessions reproduce pick-and-roll geometry and motion-offense spacing.

🏈 American Football

22 agents on a longer field. Captures route concepts and coverage shells from snap to first read.

🏒 Ice Hockey

12 agents + puck, faster cycle times. Conditional samples respect zone-entry and forecheck structure.
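The permutation property behind this portability can be checked on a toy attention layer. The sketch below is a generic single-head self-attention without positional encodings, not GenTac's denoiser, and the feature width `D` is arbitrary. Strictly, the per-agent outputs reorder together with the inputs (often called permutation equivariance), and the same weights accept any number of agents.

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Single-head self-attention over agents, no positional encoding. x: (A, D)."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
D = 8
wq, wk, wv = (rng.normal(size=(D, D)) for _ in range(3))

# Agent counts from the text: football 22 + ball, basketball 10 + ball, hockey 12 + puck.
results = {}
for n_agents in (23, 11, 13):
    x = rng.normal(size=(n_agents, D))
    perm = rng.permutation(n_agents)
    out_then_perm = self_attention(x, wq, wk, wv)[perm]
    perm_then_out = self_attention(x[perm], wq, wk, wv)
    results[n_agents] = np.allclose(out_then_perm, perm_then_out)
```

Nothing in the weight shapes depends on the number of agents, which is why switching sports mainly requires new data and coordinate normalisation rather than a new architecture.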

Why This Matters

Cross-sport portability is a soft argument that GenTac has learned something about multi-agent coordination under spatial constraints, not just memorised the geometry of one league's pitch. That is precisely the kind of representation a foundation model for team-sport tracking should learn.

GenTac vs. Diffoot vs. TranSPORTmer vs. CausalTraj
Aspect | TranSPORTmer | Diffoot | CausalTraj | GenTac
--- | --- | --- | --- | ---
Output Type | Point | Multi-modal samples | Joint multi-modal | Joint multi-modal + controllable
Agents Modelled | 22 + ball | Defenders only (11) | 22 + ball | 22 + ball
Conditioning | Past trajectories | Past + graph | Past + causal structure | 5 modes (team / opp / league / obj.)
Evaluation | ADE/FDE + impute/classify | ADE/FDE + direction | ADE/FDE + coherence | TacBench (trajectory + event)
Cross-Sport | Football / basketball | Football only | Football only | Football, basketball, NFL, hockey
Best Use Case | Real-time multi-task | Defensive scouting | Coherent rollouts | Tactical scenario design, "what-if" analysis
Limitations & Open Questions
1. Short Horizon

The 0.2s causal window keeps generation tractable but means GenTac is really a stitched short-rollout model rather than a true minute-long tactical simulator. Long horizons compound sampling noise.
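The stitching described above can be sketched as an autoregressive loop: sample one short chunk, append it, slide the context forward, repeat. `toy_sampler` is a hypothetical constant-velocity-plus-noise stand-in for a diffusion sampler, and the 10-frame context length is an assumption; only the 0.2s hop at 25 FPS comes from the text.

```python
import numpy as np

FPS = 25
HOP_SECONDS = 0.2
HOP_FRAMES = round(FPS * HOP_SECONDS)   # 5 frames per hop

def toy_sampler(window, rng, noise=0.1):
    """Stand-in for one sampling run: extrapolate the last velocity and add Gaussian noise."""
    velocity = window[-1] - window[-2]                     # (23, 2)
    steps = np.arange(1, HOP_FRAMES + 1)[:, None, None]    # (5, 1, 1)
    jitter = rng.normal(scale=noise, size=(HOP_FRAMES,) + window.shape[1:])
    return window[-1] + steps * velocity + jitter

def stitched_rollout(context, sample_chunk, n_hops, rng):
    """Generate n_hops chunks, each conditioned on the most recent context-length frames."""
    ctx_len = context.shape[0]
    history = context.copy()
    for _ in range(n_hops):
        chunk = sample_chunk(history[-ctx_len:], rng)
        history = np.concatenate([history, chunk], axis=0)
    return history[ctx_len:]                               # generated frames only

rng = np.random.default_rng(0)
context = np.zeros((10, 23, 2))          # 10 observed frames, 23 agents, (x, y)
rollouts = np.stack([stitched_rollout(context, toy_sampler, 25, rng) for _ in range(20)])
spread = rollouts.std(axis=0).mean(axis=(1, 2))   # per-frame spread across the 20 rollouts
```

Each chunk inherits the noise of the chunks before it, so the spread across rollouts grows with the number of hops; 25 hops here cover only 5 seconds of play.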

2. Event Sub-types Are Hard

53.7% sub-type accuracy is impressive but leaves a lot on the table — fine-grained tactical concepts (e.g. half-space underlap vs. third-man combination) are still partially out of reach.

3. Inference Cost

Multi-step DDIM sampling × K = 20 samples is meaningfully slower than a single-shot point predictor. Fine for off-line scouting; tight for real-time use.
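A back-of-the-envelope count makes the cost concrete. The 50 DDIM steps and the doubling for classifier-free guidance are assumed values, since the article does not state them; only the 0.2s hop and K = 20 come from the text.

```python
def denoiser_passes(rollout_seconds, hop_seconds=0.2, ddim_steps=50, k_samples=20, guided=True):
    """Per-sample denoiser forward passes needed for one stitched rollout.
    ddim_steps is an assumed value; guided=True counts a conditional and an
    unconditional pass per step, as in classifier-free guidance."""
    hops = round(rollout_seconds / hop_seconds)
    passes_per_step = 2 if guided else 1
    return hops * ddim_steps * k_samples * passes_per_step

total = denoiser_passes(10.0)                       # 10 s of play, 20 samples
single_shot = denoiser_passes(10.0, ddim_steps=1, k_samples=1, guided=False)
```

Samples can be batched on a GPU, so wall-clock time does not scale linearly with this count, but the gap to a predictor that needs one pass per hop remains large.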

4. Conditioning Granularity

"Team identity" and "objective" are coarse handles. Coaches typically think in terms of structures and triggers; bridging that to a useful conditioning vocabulary is open work.

Resources & Further Reading
Key References

DDPM (Ho et al., 2020) — Denoising Diffusion Probabilistic Models

DDIM (Song et al., 2021) — Faster deterministic sampling

Classifier-Free Guidance (Ho & Salimans, 2022) — Conditioning trick used by GenTac

Diffoot (2025) — Graph-conditioned diffusion for defensive trajectories

CausalTraj — Coherent multi-agent forecasting via temporal causality

TranSPORTmer — Set-attention unified trajectory model

Where GenTac Sits in the Landscape

Diffoot showed diffusion works for football trajectories. CausalTraj showed how to keep joint rollouts coherent. TranSPORTmer showed one architecture can do many tasks. GenTac is the natural next step: a single conditional generative model over the whole 11-vs-11 system (22 players plus the ball), with a benchmark that explicitly grades whether the samples are tactically meaningful.