Literature Review: Spatiotemporal Graph Neural Networks
From foundational neural networks to state-of-the-art architectures for modeling dynamic relational data in football analytics.
Deep Learning · Graph Theory · Sports Analytics
Abstract

Spatiotemporal Graph Neural Networks (STGNNs) have emerged as a powerful paradigm for learning from data exhibiting both spatial structure and temporal dynamics. This review traces the development from foundational neural networks through graph-based methods, with particular focus on applications in football analytics where player interactions form natural evolving graph structures. We examine seminal architectures including GCNs (Kipf & Welling, 2017), attention mechanisms (Vaswani et al., 2017), and their synthesis in spatiotemporal domains, concluding with current football-specific applications and future research directions.

1. Neural Network Foundations

The perceptron (Rosenblatt, 1958) computes a weighted sum of its inputs followed by a non-linear activation. Multi-Layer Perceptrons (MLPs) stack these units; Hornik (1991) proved such networks are universal function approximators. Training via backpropagation (Rumelhart et al., 1986) made optimizing deep stacks of layers practical.

h^{(l)} = \sigma(W^{(l)} h^{(l-1)} + b^{(l)})
MLP layer computation with weights W^{(l)}, bias b^{(l)}, and activation σ
Common activations:
ReLU: \max(0, x)
Sigmoid: 1 / (1 + e^{-x})
Softmax: e^{x_i} / \sum_j e^{x_j}
GELU: x \cdot \Phi(x), where Φ is the standard normal CDF
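
The layer equation above maps directly onto a few lines of code. A minimal sketch, assuming PyTorch and illustrative layer sizes (the review does not prescribe a framework):

```python
# Minimal MLP matching h(l) = σ(W(l) h(l-1) + b(l)); sizes are illustrative.
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, d_in=32, d_hidden=64, d_out=10):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)   # W(1), b(1)
        self.fc2 = nn.Linear(d_hidden, d_out)  # W(2), b(2)
        self.act = nn.ReLU()                   # σ = max(0, x)

    def forward(self, x):
        h1 = self.act(self.fc1(x))  # h(1) = σ(W(1) x + b(1))
        return self.fc2(h1)         # logits; apply softmax for probabilities

x = torch.randn(8, 32)   # a batch of 8 input vectors
print(MLP()(x).shape)    # torch.Size([8, 10])
```
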
2. Convolutional & Recurrent Networks

CNNs for Spatial Patterns

CNNs (LeCun et al., 1998) exploit spatial structure via weight-sharing convolutions. ResNet (He et al., 2016) enabled very deep networks through skip connections.

(X * K)_{i,j} = \sum_{m,n} X_{i+m, j+n} \, K_{m,n}
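
For completeness, the same weight-sharing idea plus a ResNet-style skip connection in a short PyTorch sketch (channel count and input size are illustrative, not from the review):

```python
# Residual block: two 3x3 convolutions with an identity skip connection.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU()

    def forward(self, x):
        # x + F(x): the skip path lets gradients bypass the convolutions
        return self.act(x + self.conv2(self.act(self.conv1(x))))

x = torch.randn(1, 16, 32, 32)   # (batch, channels, height, width)
print(ResidualBlock()(x).shape)  # torch.Size([1, 16, 32, 32])
```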

RNNs for Temporal Sequences

RNNs maintain hidden states across time. LSTMs (Hochreiter & Schmidhuber, 1997) solve vanishing gradients via gating mechanisms.

h_t = \sigma(W_h h_{t-1} + W_x x_t + b)
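
A runnable sketch of the gated variant using PyTorch's built-in LSTM (all dimensions are illustrative):

```python
# LSTM over a toy sequence; gating replaces the plain recurrence above.
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
x = torch.randn(2, 25, 4)    # (batch, time steps, features)
out, (h_n, c_n) = lstm(x)    # out holds the hidden state at every step
print(out.shape, h_n.shape)  # torch.Size([2, 25, 8]) torch.Size([1, 2, 8])
```
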
3. Attention Mechanisms

Attention (Bahdanau et al., 2015) lets a model access all sequence positions directly instead of compressing history into a single hidden state. The Transformer (Vaswani et al., 2017) uses scaled dot-product attention, enabling parallel training and long-range dependency modeling.

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left( QK^\top / \sqrt{d_k} \right) V
Scaled dot-product attention with queries Q, keys K, values V

Multi-head attention runs parallel attention operations, enabling the model to attend to different representation subspaces. This forms the backbone of BERT (Devlin et al., 2019) and GPT models.
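
The attention formula is itself only a few lines; a hedged single-head sketch without masking (shapes are illustrative):

```python
# Scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.
import math
import torch

def attention(Q, K, V):
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / math.sqrt(d_k)  # (batch, queries, keys)
    return torch.softmax(scores, dim=-1) @ V

Q = torch.randn(2, 5, 16)        # (batch, queries, d_k)
K = torch.randn(2, 7, 16)        # (batch, keys, d_k)
V = torch.randn(2, 7, 32)        # (batch, keys, d_v)
print(attention(Q, K, V).shape)  # torch.Size([2, 5, 32])
```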

4. Graph Neural Networks

GNNs extend deep learning to non-Euclidean domains. The Message Passing framework (Gilmer et al., 2017) unifies various architectures: nodes aggregate information from neighbors and update representations.

m_v = \bigoplus_{u \in N(v)} M(h_v, h_u, e_{uv})
h_v' = U(h_v, m_v)
Message passing: aggregate neighbor messages, then update node state
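
As a concrete illustration, one message-passing step with sum aggregation over a plain edge list; using linear layers for M and U and omitting the edge features e_uv are simplifying assumptions:

```python
# One message-passing step: aggregate messages from neighbors, then update.
import torch
import torch.nn as nn

def message_passing_step(h, edge_index, M, U):
    # h: (num_nodes, d) node states; edge_index: (2, num_edges) as (src, dst)
    src, dst = edge_index
    messages = M(torch.cat([h[dst], h[src]], dim=-1))     # M(h_v, h_u)
    m = torch.zeros_like(h).index_add_(0, dst, messages)  # ⊕ = sum over N(v)
    return U(torch.cat([h, m], dim=-1))                   # h_v' = U(h_v, m_v)

d = 8
M = nn.Linear(2 * d, d)                                  # message function
U = nn.Linear(2 * d, d)                                  # update function
h = torch.randn(4, d)
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])  # a 4-node cycle
print(message_passing_step(h, edge_index, M, U).shape)   # torch.Size([4, 8])
```
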
GCN

Kipf & Welling (2017) — spectral convolutions simplified to spatial operations

GAT

Veličković et al. (2018) — attention-weighted neighbor aggregation

GraphSAGE

Hamilton et al. (2017) — inductive learning for unseen nodes
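
A dense-matrix sketch of the GCN propagation rule H' = σ(D^{-1/2} (A + I) D^{-1/2} H W); the dense formulation is for readability only, as practical implementations use sparse operations:

```python
# GCN layer with symmetric normalization over an adjacency with self-loops.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out, bias=False)  # weight matrix W

    def forward(self, H, A):
        A_hat = A + torch.eye(A.size(0))      # add self-loops
        d = A_hat.sum(dim=1)                  # node degrees
        D_inv_sqrt = torch.diag(d.pow(-0.5))  # D^{-1/2}
        return torch.relu(D_inv_sqrt @ A_hat @ D_inv_sqrt @ self.lin(H))

A = torch.tensor([[0., 1., 0.],
                  [1., 0., 1.],
                  [0., 1., 0.]])   # 3-node path graph
H = torch.randn(3, 4)              # node features
print(GCNLayer(4, 2)(H, A).shape)  # torch.Size([3, 2])
```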

5. Spatiotemporal Graph Neural Networks

STGNNs combine spatial graph learning with temporal modeling for dynamic relational data (Wu et al., 2020). They address scenarios where both node features and graph structure evolve over time.

[X^{(t+1)}, \ldots, X^{(t+\tau)}] = f(G^{(t-T'+1)}, \ldots, G^{(t)}; \theta)
Given T' historical graph snapshots, predict τ future states

Key Architectures

DCRNN (Li et al., 2018)

Diffusion convolution within GRU cells; models traffic as diffusion on directed graphs.

STGCN (Yu et al., 2018)

Interleaves 1D temporal convolutions with graph convolutions; efficient and parallelizable.

Graph WaveNet (Wu et al., 2019)

Learns adaptive adjacency matrices; discovers hidden spatial dependencies.

ST-GCN (Yan et al., 2018)

Skeleton-based action recognition; directly applicable to player tracking data.
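
To make the STGCN pattern concrete, the sketch below interleaves 1D temporal convolutions with a graph-mixing step in the "sandwich" layout of Yu et al. (2018); the dense adjacency, channel sizes, and the 22-player/25-step shapes are illustrative assumptions:

```python
# STGCN-style block: temporal conv -> graph conv -> temporal conv.
import torch
import torch.nn as nn

class STBlock(nn.Module):
    def __init__(self, d):
        super().__init__()
        self.tconv1 = nn.Conv2d(d, d, kernel_size=(1, 3), padding=(0, 1))
        self.theta = nn.Linear(d, d)  # graph-convolution weights
        self.tconv2 = nn.Conv2d(d, d, kernel_size=(1, 3), padding=(0, 1))

    def forward(self, x, A_norm):
        # x: (batch, d, num_nodes, time); A_norm: normalized adjacency
        x = torch.relu(self.tconv1(x))
        x = torch.einsum("nm,bdmt->bdnt", A_norm, x)  # mix over neighbors
        x = self.theta(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        return torch.relu(self.tconv2(torch.relu(x)))

x = torch.randn(2, 8, 22, 25)  # 22 players, 25 steps (1 s at 25 Hz)
A = torch.softmax(torch.randn(22, 22), dim=1)  # placeholder adjacency
print(STBlock(8)(x, A).shape)  # torch.Size([2, 8, 22, 25])
```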

6. Applications in Football Analytics

Football provides an ideal STGNN testbed: matches generate rich spatiotemporal data with natural graph structure (22 players as nodes, interactions as edges) evolving continuously. Modern tracking systems capture positions at 25 Hz.
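
As an illustration of how such frames become graphs, a hedged sketch that connects players within a fixed distance; the 15 m threshold and the pitch dimensions are assumptions, not a convention from the literature:

```python
# Build one frame's adjacency matrix from player positions.
import torch

def frame_to_graph(positions, threshold=15.0):
    # positions: (22, 2) player (x, y) coordinates in metres
    dists = torch.cdist(positions, positions)        # pairwise distances
    A = ((dists < threshold) & (dists > 0)).float()  # no self-loops
    return A

positions = torch.rand(22, 2) * torch.tensor([105.0, 68.0])  # pitch extents
A = frame_to_graph(positions)
print(A.shape, int(A.sum()))  # torch.Size([22, 22]) and the edge count
```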

6.1 Existing Research

Expected Possession Value (Fernández et al., 2019)

Introduced EPV models that value every game state probabilistically. While not explicitly using GNNs, this work established the framework for valuing spatiotemporal football data that graph methods can enhance.

Graph Networks for Basketball (Yeh et al., 2019)

Applied graph networks to model player-ball interactions for trajectory prediction in basketball. The methodology transfers directly to football tracking data.

SoccerNet (Deliège et al., 2021)

Large-scale dataset with action spotting annotations enabling STGNN research on event detection, highlight generation, and broadcast analytics.

Pitch Control Models (Spearman, 2018)

Probabilistic models of space control. Graph-based extensions can capture how control emerges from coordinated player movements and interactions.

Expected Passes (Anzer & Bauer, 2022)

Used spatiotemporal features to model pass difficulty, demonstrating the value of considering player positions and movements jointly.

6.2 Application Areas

Trajectory Prediction

Predicting future player positions from historical movement patterns.

Pressing Analysis

Modeling coordinated team pressing as graph dynamics.

Formation Recognition

Classifying tactical formations from positional data.

Event Prediction

Anticipating shots, passes, and other actions from game state.

7. Challenges & Future Directions

Scalability

Full spatiotemporal attention over n nodes and T time steps scales as O(n²T²). Sparse attention and graph sampling are active research areas.

Interpretability

For coaching applications, outputs must be explainable. GNNExplainer (Ying et al., 2019) offers promising directions.

Uncertainty Quantification

Sports outcomes are stochastic. Bayesian GNNs can quantify prediction confidence—crucial for betting applications.

Real-Time Inference

Live tactical recommendations require efficient architectures suitable for edge deployment.

Next Steps for This Project

Implement baseline STGCN on publicly available tracking data (e.g., Metrica Sports open dataset)

Explore adaptive adjacency learning to discover latent player relationships (a minimal sketch follows this list)

Integrate event data with tracking for multi-modal STGNN architectures

Develop probabilistic prediction heads for match outcome forecasting with calibrated uncertainty
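
For the adaptive-adjacency step above, a minimal sketch in the style of Graph WaveNet (Wu et al., 2019): two learned node-embedding tables whose normalized product acts as a soft adjacency matrix (the embedding size is an assumption):

```python
# Learned adjacency: softmax(relu(E1 @ E2^T)) over 22 player nodes.
import torch
import torch.nn as nn

class AdaptiveAdjacency(nn.Module):
    def __init__(self, num_nodes=22, d_embed=10):
        super().__init__()
        self.E1 = nn.Parameter(torch.randn(num_nodes, d_embed))
        self.E2 = nn.Parameter(torch.randn(num_nodes, d_embed))

    def forward(self):
        # Row-normalized soft adjacency, trained end to end with the STGNN
        return torch.softmax(torch.relu(self.E1 @ self.E2.T), dim=1)

A = AdaptiveAdjacency()()
print(A.shape, A.sum(dim=1))  # (22, 22); each row sums to 1
```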

References

Anzer, G., & Bauer, P. (2022). Expected passes: Determining the difficulty of a pass in football using spatio-temporal data. Data Mining and Knowledge Discovery, 36(1), 295-317.

Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural machine translation by jointly learning to align and translate. ICLR 2015.

Deliège, A., et al. (2021). SoccerNet-v2: A dataset and benchmarks for holistic understanding of broadcast soccer videos. CVPR 2021 Workshops.

Devlin, J., et al. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. NAACL 2019.

Fernández, J., Bornn, L., & Cervone, D. (2019). Decomposing the immeasurable sport: A deep learning expected possession value framework for soccer. MIT Sloan Sports Analytics Conference.

Gilmer, J., et al. (2017). Neural message passing for quantum chemistry. ICML 2017.

Hamilton, W., Ying, R., & Leskovec, J. (2017). Inductive representation learning on large graphs. NeurIPS 2017.

He, K., et al. (2016). Deep residual learning for image recognition. CVPR 2016.

Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.

Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2), 251-257.

Kipf, T., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. ICLR 2017.

LeCun, Y., et al. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.

Li, Y., et al. (2018). Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. ICLR 2018.

Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization. Psychological Review, 65(6), 386.

Rumelhart, D., Hinton, G., & Williams, R. (1986). Learning representations by back-propagating errors. Nature, 323, 533-536.

Spearman, W. (2018). Beyond expected goals. MIT Sloan Sports Analytics Conference.

Vaswani, A., et al. (2017). Attention is all you need. NeurIPS 2017.

Veličković, P., et al. (2018). Graph attention networks. ICLR 2018.

Wu, Z., et al. (2019). Graph WaveNet for deep spatial-temporal graph modeling. IJCAI 2019.

Wu, Z., et al. (2020). A comprehensive survey on graph neural networks. IEEE TNNLS, 32(1), 4-24.

Yan, S., Xiong, Y., & Lin, D. (2018). Spatial temporal graph convolutional networks for skeleton-based action recognition. AAAI 2018.

Yeh, R., et al. (2019). Diverse generation for multi-agent sports games. CVPR 2019.

Ying, R., et al. (2019). GNNExplainer: Generating explanations for graph neural networks. NeurIPS 2019.

Yu, B., Yin, H., & Zhu, Z. (2018). Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. IJCAI 2018.