
danoff/basketball

basketball

This repository holds my basketball analytics projects. The first one predicts rookie DBPM from combine data. The draft README below was written with help from LLMs. Everything below is a DRAFT for now. If you have suggestions, please email c@danoff.org

DRAFT NBA Combine Rookie Defensive Model Data

License: BSD-3-Clause

An exploratory basketball analytics project asking whether pre-draft NBA Draft Combine measurements can help predict rookie-season defensive outcomes.

This repository centers on a 2025 model-building dataset and a 2026 prediction dataset. The 2025 data links combine measurements to rookie-season defensive statistics. The 2026 data applies the preferred model to generate pre-rookie predicted DBPM values.

Primary outcome variables explored:

  • Defensive Box Plus/Minus (DBPM)
  • Defensive Win Shares (DWS)

The current preferred model uses DBPM as the dependent variable because DBPM behaved more like a rate-style defensive impact measure, while DWS was more sensitive to cumulative role, opportunity, and playing-time context.


Single-page web app

This repo now includes a browser-only model explorer in rookie_dbpm_model_explorer.html. Open it directly in a browser or serve the repo locally:

python3 -m http.server 8000

Then visit http://localhost:8000/rookie_dbpm_model_explorer.html. The app embeds the current 2025 preferred-model complete rows and 2026 prediction rows, recalculates predictions with the preferred DBPM coefficients, supports search/filter/sort, includes a live custom-player calculator with save/delete custom rows, and can export the current board as CSV.

BMad Method project context for this app lives in _bmad-output/project-context.md, with technical research, a tech spec, a Quick Flow spec, and sprint status under _bmad-output/. The full BMad installer could not be fetched in this environment because npm registry access returned HTTP 403, so the project context and technical-spec workflow artifacts were activated manually following BMad's documented output patterns.

Research Question

Can pre-draft physical and athletic measurements help predict rookie defensive value?

The project tests whether combine variables such as wingspan, shuttle run, lane agility, and hand width contain useful information for projecting early NBA defensive translation.

Important modeling principle:

Rookie-season outcomes such as DWS and DBPM are used as dependent variables only. They should not be used as predictors when the goal is a clean pre-draft model.
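
One way to enforce this principle in code is a small guard that reports any outcome column that has slipped into a predictor list. This is a sketch only: the helper name `find_leaked` is hypothetical, and the outcome set is assumed from the Key Columns table (MP is included on the assumption that rookie minutes are also post-draft information).

```python
# Rookie-season columns that are only known after the draft (assumed from the
# Key Columns table); they should never appear on the predictor side.
OUTCOME_COLUMNS = {"MP", "DWS", "DBPM", "DWS_per_1000_MP"}

PREFERRED_PREDICTORS = [
    "Wingspan_in",
    "Lane_Agility_Time",
    "Shuttle_Run",
    "Hand_Width",
]


def find_leaked(predictors):
    """Return the sorted list of outcome columns found among the predictors."""
    return sorted(OUTCOME_COLUMNS.intersection(predictors))


leaked = find_leaked(PREFERRED_PREDICTORS)
if leaked:
    raise ValueError(f"Outcome columns used as predictors: {leaked}")
```

Running a check like this before every fit keeps the pre-draft model clean even as new columns are added to the CSVs.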


Repository Files

.
├── README.md
├── LICENSE
├── combine_rookie_defensive_model_data_v2_20260516_031444_CT.csv
├── combine_rookie_defensive_prediction_data_2026_v2_full_model_inputs_20260516_034524_CT.csv
├── rookie_dws_dataset_bibliography_v4_20260516_040733_CT.md
├── 20260516 meta ai and deepseek HTML draft.html
├── rookie_dbpm_model_explorer.html
├── index.html
├── styles.css
└── draft readme from codex

2025 model-building file

combine_rookie_defensive_model_data_v2_20260516_031444_CT.csv

This file includes 2025 combine measurements, rookie NBA defensive outcomes, agility/shuttle values, hand measurements, and model-ready flags.

2026 prediction file

combine_rookie_defensive_prediction_data_2026_v2_full_model_inputs_20260516_034524_CT.csv

This file includes 2026 combine measurements and predicted rookie DBPM values generated from the preferred 2025 model.

Bibliography

rookie_dws_dataset_bibliography_v4_20260516_040733_CT.md

This file records public sources, API notes, and player-level validation references.

Interactive tools

  • 20260516 meta ai and deepseek HTML draft.html — Approach A predictor tool (DeepSeek / Meta AI), mobile-first with CSV import/export, inline editing, localStorage persistence, and gem detection.
  • rookie_dbpm_model_explorer.html — Approach B model explorer with embedded data, live recalculation, search/filter/sort, and custom-player calculator.
  • index.html / styles.css — Landing page and stylesheet.

Other

  • draft readme from codex — Early README draft generated by OpenAI Codex.
  • LICENSE — BSD 3-Clause license.

Dataset Summary

2025 model-building dataset

Field Count
Rows 78
Original 75 invitees 75
Additional/StatLab-only rows 3
Height values 63
Wingspan values 59
Lane agility values 74
Shuttle run values 71
Hand length values 77
Hand width values 77
Rows with full preferred-model inputs 41

2026 prediction dataset

Field Count
Rows 75
Wingspan values 75
Hand width values 75
Lane agility values 71
Shuttle run values 71
Rows with full prediction inputs 71
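
The counts in these two tables can be re-derived from the CSV files. The sketch below shows one way to do it with the standard library. It runs on a hypothetical three-row sample rather than the real files, and it assumes that missing measurements are stored as empty cells, which may not match how the CSVs actually encode them.

```python
import csv
import io

PREFERRED_INPUTS = ["Wingspan_in", "Lane_Agility_Time", "Shuttle_Run", "Hand_Width"]

# Hypothetical sample standing in for the real CSV (placeholder players).
SAMPLE_CSV = """Player,Wingspan_in,Lane_Agility_Time,Shuttle_Run,Hand_Width
Player A,84.0,11.10,3.05,9.50
Player B,,10.95,2.98,9.25
Player C,81.5,11.40,,10.00
"""


def summarize(rows, columns):
    """Return non-missing counts per column and the number of complete rows."""
    counts = {col: 0 for col in columns}
    complete = 0
    for row in rows:
        present = [bool((row.get(col) or "").strip()) for col in columns]
        for col, ok in zip(columns, present):
            counts[col] += int(ok)
        complete += int(all(present))
    return counts, complete


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
counts, complete = summarize(rows, PREFERRED_INPUTS)
print(counts, complete)
```

To run it against the repository data, replace the `io.StringIO(SAMPLE_CSV)` source with an opened CSV file.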

Key Columns

Column Description
Player Player name
Roster_Status Whether the row came from the original invitee list or additional Stat Lab context
Original_75_Invitee Flag for announced 75-player invitee backbone
Additional_StatLab_Only Flag for retained additional Stat Lab rows
Height, Height_in Combine height as text and decimal inches
Wingspan, Wingspan_in Combine wingspan as text and decimal inches
Standing_Reach, Standing_Reach_in Standing reach as text and decimal inches, where available
Lane_Agility_Time Lane agility time in seconds
Shuttle_Run Shuttle run time in seconds
Hand_Length Hand length in inches
Hand_Width Hand width in inches
MP Rookie-season NBA minutes played
DWS Rookie-season Defensive Win Shares
DBPM Rookie-season Defensive Box Plus/Minus
DWS_per_1000_MP Defensive Win Shares per 1,000 minutes
Predicted_DBPM Predicted DBPM from the preferred model, for 2026 prediction data
Modeling_Ready Flag for baseline model-ready rows
Measurement_Source Measurement provenance notes
Stats_Source Season-stat provenance notes
Notes Audit notes and corrections
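
The text and decimal-inch column pairs (for example Wingspan and Wingspan_in) imply a conversion step. A minimal sketch of that conversion follows. It assumes the text form is feet, an apostrophe, inches, and an optional double quote, as in the 7' 0.00" value quoted in this README. The function name `feet_inches_to_decimal` is hypothetical, and any other format returns None rather than a guess.

```python
import re

# Assumed text format: <feet>' <inches>" with optional spaces and optional
# closing double quote, e.g. 7' 0.00" or 6'9.25".
_FEET_INCHES = re.compile(r"""^\s*(\d+)\s*'\s*(\d+(?:\.\d+)?)\s*"?\s*$""")


def feet_inches_to_decimal(text):
    """Convert a feet-and-inches string to decimal inches, or None if unparseable."""
    match = _FEET_INCHES.match(text or "")
    if match is None:
        return None
    feet, inches = int(match.group(1)), float(match.group(2))
    if inches >= 12:  # malformed: inches part should stay below one foot
        return None
    return feet * 12 + inches


print(feet_inches_to_decimal("7' 0.00\""))  # 84.0
print(feet_inches_to_decimal("7' 9\""))     # 93.0
```

The two printed values show why a single mistyped digit matters: 7' 0" is 84 inches, while 7' 9" is 93 inches.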

Data Sources

Core sources include:

  • Basketball Reference rookie and advanced stats pages
  • Basketball Reference Win Shares methodology page
  • NBA.com Draft Combine Anthropometric Stats
  • NBA/AWS Draft Combine Stat Lab
  • Hoops Rumors 2025 combine invitee announcement
  • Sporting News / Yahoo Sports combine measurement tables
  • Sports Reference college basketball player pages for validation examples
  • NBADraft.net measurement and athleticism notes
  • ESPN, CBS Sports, Floor & Ceiling, and other combine coverage

See the bibliography Markdown file for the running source list and API-attempt notes.


Important Data Notes

This dataset is a working research artifact, not a canonical database.

Known caveats:

  1. NBA Stat Lab is interactive and difficult to scrape directly in this environment.
  2. The NBA Stats API endpoint draftcombineplayeranthro was identified as likely relevant, but requests from the working environment returned HTTP 403 or timed out.
  3. Some measurements were manually transcribed from NBA Stat Lab screenshots.
  4. Some players have missing height, wingspan, standing reach, lane agility, or shuttle values.
  5. Adou Thiero's wingspan was corrected to 7' 0.00" after a likely 7' 9" source typo was identified.
  6. Three 2025 rows are retained as Additional_StatLab_Only: Lachlan Olbrich, Mackenzie Mgbako, and Yanic Konan Niederhauser.
  7. Position labels are intentionally not used as modeling features in the current version.
  8. The model is exploratory and should not be treated as a complete scouting grade.
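
Transcription errors like the wingspan typo in caveat 5 can be caught with a simple plausibility check. The sketch below flags rows whose wingspan differs from height by more than a threshold. The 10-inch threshold, the helper name, and the sample rows are all hypothetical; the rows use placeholder players, not real measurements.

```python
MAX_PLAUSIBLE_GAP_IN = 10.0  # hypothetical threshold, tune against real data


def flag_suspect_wingspans(rows, max_gap=MAX_PLAUSIBLE_GAP_IN):
    """Return players whose wingspan-minus-height gap exceeds max_gap inches."""
    suspects = []
    for row in rows:
        height, wingspan = row.get("Height_in"), row.get("Wingspan_in")
        if height is None or wingspan is None:
            continue  # cannot check rows with a missing measurement
        if abs(wingspan - height) > max_gap:
            suspects.append(row["Player"])
    return suspects


# Placeholder rows: Player B carries a 93-inch (7' 9") wingspan on purpose.
sample_rows = [
    {"Player": "Player A", "Height_in": 78.0, "Wingspan_in": 84.0},
    {"Player": "Player B", "Height_in": 78.0, "Wingspan_in": 93.0},
    {"Player": "Player C", "Height_in": None, "Wingspan_in": 82.0},
]

print(flag_suspect_wingspans(sample_rows))  # ['Player B']
```

A flagged row is only a prompt to re-check the source, since some players do have unusually long arms.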

Model Development Summary

Early models tested height, wingspan, and a WH interaction term:

WH = Height_in × Wingspan_in

Those size-only models performed weakly, especially for DWS. DBPM was more promising than DWS as a dependent variable.

The model improved when movement variables were added, especially shuttle run. Later, hand width was added and modestly improved model fit.


Preferred Model: Rookie DBPM

Model specification

DBPM = -8.895696
       + 0.065647 × Wingspan_in
       + 0.638983 × Lane_Agility_Time
       - 3.392342 × Shuttle_Run
       + 0.628937 × Hand_Width

Equivalent formula:

DBPM ~ Wingspan_in + Lane_Agility_Time + Shuttle_Run + Hand_Width
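
For a single player, the specification above reduces to one weighted sum. The following sketch applies the published coefficients to a dictionary of measurements. The function name `predict_dbpm` and the example measurements are hypothetical, and returning None for incomplete rows is an assumption about how missing inputs should be handled.

```python
INTERCEPT = -8.895696
COEFFICIENTS = {
    "Wingspan_in": 0.065647,
    "Lane_Agility_Time": 0.638983,
    "Shuttle_Run": -3.392342,
    "Hand_Width": 0.628937,
}


def predict_dbpm(player):
    """Return predicted rookie DBPM, or None if any model input is missing."""
    values = [player.get(name) for name in COEFFICIENTS]
    if any(v is None for v in values):
        return None
    return INTERCEPT + sum(c * v for c, v in zip(COEFFICIENTS.values(), values))


# Hypothetical measurements, not a real prospect.
example = {
    "Wingspan_in": 84.0,
    "Lane_Agility_Time": 11.0,
    "Shuttle_Run": 3.0,
    "Hand_Width": 9.5,
}
print(round(predict_dbpm(example), 3))  # -0.555
```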

Model fit

Metric Value
n 41
R² 0.258
Adjusted R² 0.176
AIC 158.908
RMSE 1.487
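
The adjusted R² in this table can be checked from R², the sample size, and the number of predictors, using the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1). The short sketch below reproduces the reported value, and also the adjusted value reported later for the three-predictor DWS model, assuming the 0.065 listed there is its R².

```python
def adjusted_r2(r2, n, p):
    """Adjusted R² for n observations and p predictors (intercept not counted)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


dbpm_adj = adjusted_r2(0.258, n=41, p=4)  # preferred DBPM model
dws_adj = adjusted_r2(0.065, n=41, p=3)   # simple DWS model

print(round(dbpm_adj, 3), round(dws_adj, 3))  # 0.176 -0.011
```

With only 41 rows, each added predictor costs a noticeable share of the explained variance, which is why the adjusted figure sits well below the raw one.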

Coefficients

Predictor Coefficient p-value
Intercept -8.896 0.211
Wingspan_in 0.066 0.404
Lane_Agility_Time 0.639 0.389
Shuttle_Run -3.392 0.062
Hand_Width 0.629 0.156

Interpretation

This model explains about 25.8% of the in-sample variation in rookie DBPM (R² = 0.258), with an adjusted R² of 0.176. For a small, exploratory, combine-only model, that is a meaningful signal, but no individual predictor reaches the conventional 0.05 significance level, so the model should not be treated as a standalone prediction engine.

The clearest signal is Shuttle_Run. Its coefficient is negative, which fits basketball logic: lower shuttle time means better change-of-direction speed, and faster shuttle performance is associated with higher rookie DBPM. The effect is marginally significant at p = 0.062.

Hand_Width has a positive coefficient and modestly improves model fit. It is not statistically significant, but it is directionally plausible because wider hands may help with ball disruption, rebounding control, and defensive playmaking.

Wingspan_in is positive but not statistically significant once mobility and hand width are included.

Lane_Agility_Time is not statistically significant and has a counterintuitive positive sign. It should be retained as a measurement field, but interpreted cautiously.
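
To make the coefficient sizes concrete, the sketch below computes the change in predicted DBPM for a given change in one or more inputs, holding the others fixed. The step sizes (0.10 seconds of shuttle time, half an inch of hand width) are illustrative choices, not values from the dataset, and `dbpm_change` is a hypothetical helper name.

```python
COEFFICIENTS = {
    "Wingspan_in": 0.065647,
    "Lane_Agility_Time": 0.638983,
    "Shuttle_Run": -3.392342,
    "Hand_Width": 0.628937,
}


def dbpm_change(deltas):
    """Predicted DBPM change for the given input changes, others held fixed."""
    return sum(COEFFICIENTS[name] * delta for name, delta in deltas.items())


faster_shuttle = dbpm_change({"Shuttle_Run": -0.10})  # 0.10 s quicker
wider_hands = dbpm_change({"Hand_Width": 0.50})       # half an inch wider

print(f"{faster_shuttle:+.3f} {wider_hands:+.3f}")  # +0.339 +0.314
```

Under this model, a tenth of a second in the shuttle moves the prediction about as much as half an inch of hand width. Both figures inherit the uncertainty in the underlying coefficients.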

Practical takeaway

Rookie defensive impact appears more connected to change-of-direction ability, especially shuttle performance, than to raw size alone in this 2025 combine sample.


DWS Model Results

The same combine variables were also tested against Defensive Win Shares.

A simple DWS model:

DWS ~ Wingspan_in + Lane_Agility_Time + Shuttle_Run

performed weakly:

Metric Value
n 41
R² 0.065
Adjusted R² -0.011
AIC 101.550
RMSE 0.757

None of the predictors were statistically significant. DWS appears less suitable for this initial pre-draft combine-only model, likely because it is a cumulative stat influenced by minutes, team context, role, and opportunity.
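
One way to reduce the playing-time dependence is the rate version listed in Key Columns, DWS_per_1000_MP. A minimal sketch of that calculation follows, assuming the column is simply DWS divided by minutes and scaled to 1,000. The helper name and the two example stat lines are hypothetical.

```python
def dws_per_1000_mp(dws, mp):
    """Defensive Win Shares per 1,000 minutes, or None when minutes are unusable."""
    if dws is None or mp is None or mp <= 0:
        return None
    return dws / mp * 1000


# Two hypothetical rookies with the same rate but very different totals.
low_minutes = dws_per_1000_mp(0.6, 400)
high_minutes = dws_per_1000_mp(2.4, 1600)

print(round(low_minutes, 2), round(high_minutes, 2))  # 1.5 1.5
```

The raw DWS totals differ by a factor of four here, yet the rate is identical. That gap is the role and opportunity effect described above.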


2026 Prediction Results

The preferred 2025 DBPM model was applied to the 2026 combine class.

Top predicted rookie DBPM values from the current 2026 prediction dataset:

Rank Player Predicted DBPM
1 Flory Bidunga 1.187
2 Reuben Chinyelu 1.052
3 Trevon Brazile 1.045
4 Karim Lopez 0.852
5 Tarris Reed Jr. 0.838
6 Baba Miller 0.707
7 Aaron Nkrumah 0.581
8 Zuby Ejiofor 0.580
9 Yaxel Lendeborg 0.537
10 Matthew Able 0.354
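
These point predictions carry substantial uncertainty. As a rough illustration, the sketch below attaches a band of plus or minus one RMSE (1.487, from the Model fit table) to the top three rows. This is a crude in-sample band, not a formal prediction interval, and the helper name `with_rmse_band` is hypothetical.

```python
RMSE = 1.487  # in-sample RMSE reported for the preferred DBPM model

TOP_PREDICTIONS = [
    ("Flory Bidunga", 1.187),
    ("Reuben Chinyelu", 1.052),
    ("Trevon Brazile", 1.045),
]


def with_rmse_band(predictions, rmse=RMSE):
    """Return (name, low, prediction, high) tuples using a +/- 1 RMSE band."""
    return [
        (name, round(pred - rmse, 3), pred, round(pred + rmse, 3))
        for name, pred in predictions
    ]


bands = with_rmse_band(TOP_PREDICTIONS)
for name, low, pred, high in bands:
    print(f"{name}: {pred:+.3f} (band {low:+.3f} to {high:+.3f})")
```

The band for the top-ranked player runs from about -0.300 to 2.674, which contains every other prediction in the top ten. The ordering within this list should therefore be read loosely.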

Validation example: Flory Bidunga

Flory Bidunga was the top 2026 predicted rookie DBPM player in the current model. His college profile provides a useful sanity check:

  • 2024-25 Kansas DBPM: 6.2
  • 2025-26 Kansas DBPM: 5.6
  • Career college DBPM: 5.8

This does not prove the model is correct, but it suggests that the model's top prediction aligns with a player who also showed strong college defensive impact.


Interactive Predictor Tool Approach A (DeepSeek and Meta AI)

Live demo — May 16, 2026

The app is a mobile-first, single-page tool that runs entirely in the browser:

  • shows the preferred model in the header
  • loads the 2026 prediction dataset (75 players, 71 with full inputs)
  • calculates Predicted DBPM live: DBPM = -8.896 + 0.066×Wingspan + 0.639×Lane_Agility - 3.392×Shuttle + 0.629×Hand_Width
  • searchable / sortable leaderboard (desktop table, mobile cards)
  • CSV Import — drop in your updated 2026 file
  • Inline Editor — add/edit players, live recalculation
  • Export CSV — download filtered view
  • Persistence — saves to localStorage
  • Gem detection — flags players with projection rank >30 and Predicted DBPM > median
  • dark slate UI built with Tailwind
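
The gem-detection rule in the list above can be sketched in a few lines. This is an illustration of the stated rule, not the app's actual code: the field name `Projection_Rank`, the choice to take the median over all players with a prediction, and the sample rows are assumptions.

```python
from statistics import median


def find_gems(players, rank_cutoff=30):
    """Flag players ranked outside the cutoff whose prediction beats the median."""
    scored = [p for p in players if p.get("Predicted_DBPM") is not None]
    if not scored:
        return []
    cutoff = median(p["Predicted_DBPM"] for p in scored)
    return [
        p["Player"]
        for p in scored
        if p.get("Projection_Rank") is not None
        and p["Projection_Rank"] > rank_cutoff
        and p["Predicted_DBPM"] > cutoff
    ]


# Placeholder rows, not real prospects or real projection ranks.
sample_players = [
    {"Player": "Player A", "Projection_Rank": 5, "Predicted_DBPM": 0.9},
    {"Player": "Player B", "Projection_Rank": 42, "Predicted_DBPM": 0.6},
    {"Player": "Player C", "Projection_Rank": 35, "Predicted_DBPM": -0.8},
    {"Player": "Player D", "Projection_Rank": 50, "Predicted_DBPM": 0.1},
    {"Player": "Player E", "Projection_Rank": 12, "Predicted_DBPM": -0.2},
]

print(find_gems(sample_players))  # ['Player B']
```

Player D sits exactly at the median (0.1) and is not flagged, because the rule as written uses a strict greater-than comparison.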

Interactive Predictor Tool Approach B

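Approach B is the browser-only model explorer, rookie_dbpm_model_explorer.html. The original plan for it, kept below for reference, was a mobile-friendly single-page web app.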

The app should allow someone to open the page on a phone and quickly see:

  • the preferred model,
  • the key findings,
  • the 2026 predicted DBPM leaderboard,
  • a player search/filter,
  • caveats,
  • and source/provenance notes.

Suggested app stack:

  • static HTML/CSS/JavaScript, or
  • React with a simple static build,
  • deployable through GitHub Pages, Netlify, or Vercel.

Current model formula for the app:

Predicted DBPM = -8.896
               + 0.066 × Wingspan_in
               + 0.639 × Lane_Agility_Time
               - 3.392 × Shuttle_Run
               + 0.629 × Hand_Width

Recommended app sections:

  1. Hero section
  2. Key metric cards
  3. Model formula
  4. Coefficient table
  5. 2026 prediction leaderboard
  6. Search/filter
  7. Caveats
  8. Data sources

The tool should be mobile-first and suitable for sharing by link or QR code.


LLM Collaboration Note

This project was developed through an interactive multi-LLM workflow.

Contributions included:

  • ChatGPT 5.5 in the web browser: spreadsheet/file handling, dataset consolidation, CSV creation, model fitting, API probing, documentation drafting, bibliography maintenance, and interpretation.
  • Microsoft Copilot in the browser: dataset audits, Mode B audit checks, README revision suggestions, and validation notes.
  • DeepSeek in the browser: model exploration notes, final model framing, and interactive predictor concept refinements.
  • Meta AI (Muse Spark) in the Meta AI app: built the production HTML predictor (May 16, 2026 draft), integrated both 2025 and 2026 CSVs as seed data, implemented the 4-variable DBPM formula with live recalculation, added CSV import/export, inline add-edit-delete with localStorage persistence, responsive desktop table and mobile card views, gem logic, median tracking, and stats bar; generated the QR code asset and prepared the file for GitHub Pages deployment.
  • Anthropic Claude in the web browser (alternating between Opus 4.6 and other models in claude.ai): scraped and compiled 2025 and 2024 NBA Draft Combine measurement tables from Yahoo Sports, Draft Central, and other sources into structured spreadsheets; cross-referenced combine participant lists against Basketball-Reference 2025-26 advanced stats to match minutes played and Defensive Win Shares; compared datasets produced by different LLM workflows (Claude vs. Copilot vs. ChatGPT) and identified discrepancies including 5 missing players with NBA minutes and the Adou Thiero wingspan typo; reviewed the GitHub repo and provided a structured code-review-style assessment covering repo structure, statistical methodology (sample size, p-values, VIF, cross-validation), and prioritized next steps; revised the README to reflect actual file layout and added new suggested next steps.
  • Additional LLM outputs may be added as separate notes or audits over time.

The goal is not to hide the multi-agent workflow. The goal is to make it inspectable.

Suggested notes structure:

notes/
├── chatgpt_5_5_worklog.md
├── copilot_audit.md
├── deepseek_model_notes.md
├── claude_review.md
└── future_llm_reviews.md

Other LLMs are welcome to add their own explanations, critiques, model checks, and source audits.


Quickstart

Load the processed 2025 model dataset

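import pandas as pd

# Load the 2025 model-building file from the repository root.
df = pd.read_csv("combine_rookie_defensive_model_data_v2_20260516_031444_CT.csv")

# Keep rows with rookie minutes, a DBPM outcome, and all four preferred-model inputs.
model_df = df[
    df["MP"].notna()
    & (df["MP"] > 0)
    & df["Wingspan_in"].notna()
    & df["Lane_Agility_Time"].notna()
    & df["Shuttle_Run"].notna()
    & df["Hand_Width"].notna()
    & df["DBPM"].notna()
].copy()

# The row count should be close to the n = 41 reported under "Model fit".
print(model_df.shape)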

Compute predicted DBPM

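# Apply the preferred-model coefficients to the DataFrame loaded above.
# Rows missing any of the four inputs get NaN. To score the 2026 class,
# load the 2026 prediction CSV into df first.
df["Predicted_DBPM"] = (
    -8.895696
    + 0.065647 * df["Wingspan_in"]
    + 0.638983 * df["Lane_Agility_Time"]
    - 3.392342 * df["Shuttle_Run"]
    + 0.628937 * df["Hand_Width"]
)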

Suggested Next Steps

Repository organization

  1. Reorganize files into a structured layout (e.g. data/processed/, notes/, bibliography/, webapp/).
  2. Add a .gitignore for Python environments, notebook checkpoints, and OS files.
  3. Add a requirements.txt or environment spec listing Python dependencies (e.g. pandas, statsmodels).
  4. Move manual screenshot transcriptions into structured audit files.

Model diagnostics and validation

  1. Add the F-test p-value for the overall model and VIF (variance inflation factor) diagnostics for each predictor, especially to check for multicollinearity between Lane_Agility_Time and Shuttle_Run.
  2. Run leave-one-out cross-validation on the preferred model to assess how stable the top-10 predictions are when individual observations are dropped.
  3. Add prediction intervals to the 2026 leaderboard (even ±1 RMSE bands) to convey uncertainty around individual predictions.
  4. Extend the dataset to 2024 and earlier combine classes for out-of-sample validation.
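
The leave-one-out idea in item 2 can be sketched without any third-party packages. The code below fits ordinary least squares through the normal equations, refits once per held-out row, and reports the RMSE of the held-out errors. It runs on synthetic stand-in data generated from made-up coefficients, not on the real combine rows, and every function name is hypothetical. In practice a library such as statsmodels or scikit-learn would be the more usual choice.

```python
import math
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares fit with an intercept; returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, row):
    """Apply fitted coefficients to one row of predictors."""
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))


def loocv_rmse(rows, y):
    """Refit once per held-out row and return the RMSE of the held-out errors."""
    errors = []
    for i in range(len(rows)):
        beta = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        errors.append(y[i] - predict(beta, rows[i]))
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Synthetic stand-in data (NOT the real combine rows): 41 players, four inputs
# in hypothetical ranges, outcome from made-up coefficients plus Gaussian noise.
rng = random.Random(0)
rows = [[rng.uniform(76, 90), rng.uniform(10.4, 12.0),
         rng.uniform(2.8, 3.4), rng.uniform(8.5, 10.5)] for _ in range(41)]
y = [-9.0 + 0.07 * w + 0.6 * l - 3.4 * s + 0.6 * h + rng.gauss(0, 1.5)
     for w, l, s, h in rows]

print(f"LOOCV RMSE on synthetic data: {loocv_rmse(rows, y):.3f}")
```

On the real data, a leave-one-out RMSE well above the in-sample 1.487 would suggest the fit depends heavily on a few observations.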

Data and documentation

  1. Add scripts that rebuild processed CSVs from raw/interim sources.
  2. Add a complete data dictionary.
  3. Attempt NBA Stats API access from a local machine with browser-compatible headers.
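
For item 3, a request with browser-style headers might look like the sketch below. Several details are assumptions that this README does not confirm: the base URL, the query parameter names `LeagueID` and `SeasonYear`, the season string format, and the header set. The server may still answer with HTTP 403. The block only builds the request; `fetch_anthro` is defined but not called, because it needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint location for draftcombineplayeranthro.
BASE_URL = "https://stats.nba.com/stats/draftcombineplayeranthro"

# Browser-like headers; the exact set the server requires is an assumption.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0 Safari/537.36"
    ),
    "Accept": "application/json, text/plain, */*",
    "Referer": "https://www.nba.com/",
    "Origin": "https://www.nba.com",
}


def build_anthro_request(season_year, league_id="00"):
    """Build (but do not send) a request for combine anthropometric data."""
    query = urlencode({"LeagueID": league_id, "SeasonYear": season_year})
    return Request(f"{BASE_URL}?{query}", headers=BROWSER_HEADERS)


def fetch_anthro(season_year, timeout=15):
    """Send the request and parse the JSON body; requires network access."""
    with urlopen(build_anthro_request(season_year), timeout=timeout) as response:
        return json.load(response)


request = build_anthro_request("2025-26")  # season string format is assumed
print(request.full_url)
```

Keeping the request builder separate from the network call makes it easy to inspect the URL and headers before trying the endpoint from a local machine.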

Modeling extensions

  1. Add baseline regression notebooks.
  2. Add versioned model outputs.
  3. Compare DBPM, DWS, and rate-adjusted defensive outcomes.
  4. Track hand width as a potential additional defensive signal in future combine classes.

Web app and sharing

  1. Host the mobile-friendly predictor app via GitHub Pages.
  2. Add QR-code sharing for the web app.

License

This project is licensed under the BSD 3-Clause "New" or "Revised" License.

SPDX identifier:

BSD-3-Clause

See LICENSE for the full license text.

Licensing note for data

The repository license covers code, scripts, documentation, and original project materials in this repository unless otherwise noted. The underlying basketball statistics and combine measurements were compiled from public third-party sources. Those source materials may have their own terms of use. This repository should preserve source attribution and provenance notes for all derived datasets.


Generated / Updated

README updated with:

  • ChatGPT 5.5 in the web browser
  • Microsoft Copilot in the browser
  • DeepSeek in the browser
  • Meta AI (Muse Spark) in the Meta AI app
  • Anthropic Claude (Opus 4.6 and other models) in the web browser

Date: 2026-05-16
