MLSE Sport Performance Lab Competition

Predicting basketball shot biomechanics using markerless motion data.

Python, Scikit-learn, Time Series, Ensemble Models, Feature Engineering

Overview

Built a machine learning pipeline to predict basketball free throw outcomes (angle, depth, left-right deviation) using 3D biomechanical time-series data from ~450 shots.

The challenge focused on modeling how a shot behaves based on body movement, not just whether it goes in. With limited data, a low frame rate, and high variability between players, the main difficulty was generalizing to participants the model had not seen.

I addressed this using:

  • release-phase feature engineering
  • motion normalization (alignment + orientation)
  • GroupKFold validation to prevent leakage

Final result: 9th place on the public leaderboard.

Key Features

  • 3D joint trajectory feature engineering
  • Release-focused kinematic features (velocity, coordination)
  • Motion alignment and normalization
  • GroupKFold + out-of-fold validation
  • Residual analysis for debugging
  • Ensemble modeling (linear + tree-based)
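
As an illustration of the alignment and release-focused features listed above, here is a minimal sketch. It is not the competition code: the joint indices, the z-up axis convention, and the choice of peak wrist speed as a proxy for the release frame are all assumptions made for the example.

```python
import numpy as np


def normalize_pose(frames, pelvis_idx=0, l_hip_idx=1, r_hip_idx=2):
    """Center a shot on the pelvis and rotate it to a common heading.

    frames: array of shape (T, J, 3) with T frames, J joints, xyz coordinates.
    The joint indices are hypothetical, and z is assumed to point up.
    """
    # Alignment: express every joint relative to the pelvis in each frame.
    centered = frames - frames[:, pelvis_idx:pelvis_idx + 1, :]
    # Orientation: rotate about z so the average hip line points along +x.
    hip_vec = (centered[:, r_hip_idx] - centered[:, l_hip_idx]).mean(axis=0)
    yaw = np.arctan2(hip_vec[1], hip_vec[0])
    c, s = np.cos(-yaw), np.sin(-yaw)
    rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return centered @ rot.T


def release_features(frames, wrist_idx, fps, window=3):
    """Summarize wrist kinematics around an estimated release frame.

    Assumption: the release frame is approximated by peak wrist speed.
    """
    # Finite-difference velocity of the wrist, in units per second.
    vel = np.gradient(frames[:, wrist_idx, :], 1.0 / fps, axis=0)
    speed = np.linalg.norm(vel, axis=1)
    release = int(np.argmax(speed))
    lo, hi = max(0, release - window), min(len(speed), release + window + 1)
    return {
        "release_frame": release,
        "peak_wrist_speed": float(speed[release]),
        "mean_wrist_speed_window": float(speed[lo:hi].mean()),
        "wrist_vz_at_release": float(vel[release, 2]),
    }
```

Calling `release_features(normalize_pose(frames), wrist_idx=..., fps=...)` on each shot would give one feature row per shot. At a low frame rate the finite-difference velocities are coarse, so summaries over a small window tend to be more stable than single-frame values.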

Approach

  • Extracted features around shot release timing
  • Standardized motion across players to reduce variance
  • Compared linear vs nonlinear models
  • Used residual + correlation analysis to build weighted ensembles
  • Combined stable models with higher-variance models to reduce MSE
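
The validation and ensembling steps above can be sketched roughly as follows. This is a simplified, single-target illustration using scikit-learn, not the actual pipeline: the model choices (`Ridge`, `RandomForestRegressor`), the two-model blend, and the grid search over one weight are assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import GroupKFold, cross_val_predict


def oof_predictions(models, X, y, groups, n_splits=5):
    """Out-of-fold predictions where no player appears in both train and test.

    groups holds one player/participant id per shot.
    """
    cv = GroupKFold(n_splits=n_splits)
    return {
        name: cross_val_predict(model, X, y, groups=groups, cv=cv)
        for name, model in models.items()
    }


def best_blend_weight(pred_a, pred_b, y, grid=np.linspace(0.0, 1.0, 21)):
    """Pick w minimizing the MSE of w * pred_a + (1 - w) * pred_b."""
    scores = [mean_squared_error(y, w * pred_a + (1 - w) * pred_b) for w in grid]
    best = int(np.argmin(scores))
    return float(grid[best]), float(scores[best])


if __name__ == "__main__":
    # Synthetic stand-in data: 120 shots from 6 hypothetical players.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 5))
    groups = np.repeat(np.arange(6), 20)
    y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=120)

    models = {
        "ridge": Ridge(alpha=1.0),  # stable linear model
        "forest": RandomForestRegressor(n_estimators=50, random_state=0),
    }
    oof = oof_predictions(models, X, y, groups, n_splits=3)
    weight, mse = best_blend_weight(oof["ridge"], oof["forest"], y)
    print(f"ridge weight={weight:.2f}, blended OOF MSE={mse:.4f}")
```

Because the blend weight is tuned on the same out-of-fold predictions it is scored on, the reported MSE is slightly optimistic. With only a few hundred shots, keeping the number of tuned weights small limits that effect.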

Final Notebook

Reflection

  • Generalization mattered more than model complexity.
  • Group-based validation was critical to avoid learning player-specific patterns.
  • Feature engineering had more impact than switching to more complex models.
  • Residual analysis helped identify model weaknesses and improve the ensembles.
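
One way to run the kind of residual analysis mentioned above is sketched below. The function name `residual_report` and the two diagnostics it returns are illustrative assumptions, not the exact analysis used in the project.

```python
import numpy as np


def residual_report(y, preds, groups):
    """Compare model residuals against each other and across players.

    y:      true targets, shape (n,)
    preds:  dict mapping model name -> out-of-fold predictions, shape (n,)
    groups: player/participant id per shot, shape (n,)
    """
    names = list(preds)
    # One residual column per model.
    resid = np.column_stack([y - preds[name] for name in names])
    # Low correlation between two models' residuals suggests that blending
    # them can cancel part of their errors.
    corr = np.corrcoef(resid, rowvar=False)
    # A mean residual far from zero for a player indicates a systematic bias
    # on that player.
    per_group = {g: resid[groups == g].mean(axis=0) for g in np.unique(groups)}
    return names, corr, per_group
```

Models whose residuals are weakly correlated are natural candidates for a weighted ensemble, while a player with a consistently large mean residual points to a feature or normalization step that does not transfer to that player.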

Limitations

  • Small dataset (~450 samples)
  • Low temporal resolution (low FPS)
  • No ball tracking (must infer from body motion)
  • High variability across players
  • Left-right prediction was especially noisy

Results

  • 9th place (public leaderboard)
  • Performance driven mainly by feature engineering and ensembling

Links