This repository holds my basketball analytics projects. The first one predicts DBPM from combine data. The draft README below was written with help from LLMs. Everything below is a DRAFT for now. If you have suggestions, please email c@danoff.org.
An exploratory basketball analytics project asking whether pre-draft NBA Draft Combine measurements can help predict rookie-season defensive outcomes.
This repository centers on a 2025 model-building dataset and a 2026 prediction dataset. The 2025 data links combine measurements to rookie-season defensive statistics. The 2026 data applies the preferred model to generate pre-rookie predicted DBPM values.
Primary outcome variables explored:
- Defensive Box Plus/Minus (DBPM)
- Defensive Win Shares (DWS)
The current preferred model uses DBPM as the dependent variable because DBPM behaved more like a rate-style defensive impact measure, while DWS was more sensitive to cumulative role, opportunity, and playing-time context.
This repo now includes a browser-only model explorer in rookie_dbpm_model_explorer.html. Open it directly in a browser or serve the repo locally:
`python3 -m http.server 8000`

Then visit http://localhost:8000/rookie_dbpm_model_explorer.html. The app embeds the current 2025 preferred-model complete rows and 2026 prediction rows, recalculates predictions with the preferred DBPM coefficients, supports search/filter/sort, includes a live custom-player calculator with save/delete custom rows, and can export the current board as CSV.
BMad Method project context for this app lives in _bmad-output/project-context.md, with technical research, a tech spec, a Quick Flow spec, and sprint status under _bmad-output/. The full BMad installer could not be fetched in this environment because npm registry access returned HTTP 403, so the project context and technical-spec workflow artifacts were activated manually following BMad's documented output patterns.
Can pre-draft physical and athletic measurements help predict rookie defensive value?
The project tests whether combine variables such as wingspan, shuttle run, lane agility, and hand width contain useful information for projecting early NBA defensive translation.
Important modeling principle:
Rookie-season outcomes such as DWS and DBPM are used as dependent variables only. They should not be used as predictors when the goal is a clean pre-draft model.
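One way to enforce this principle is a small guard that runs before any model is fit. The sketch below is illustrative only: the helper name `leaked_columns` is hypothetical, and treating `MP` and `DWS_per_1000_MP` as outcome columns is an assumption based on their being rookie-season values.

```python
# Columns recorded during the rookie season. Treating MP and DWS_per_1000_MP
# as outcomes is an assumption: both are only known after the draft.
OUTCOME_COLUMNS = {"MP", "DWS", "DBPM", "DWS_per_1000_MP"}

# Predictors used by the preferred model.
PREFERRED_PREDICTORS = ["Wingspan_in", "Lane_Agility_Time", "Shuttle_Run", "Hand_Width"]


def leaked_columns(predictors):
    """Return, sorted, any predictors that are rookie-season outcomes."""
    return sorted(set(predictors) & OUTCOME_COLUMNS)


print(leaked_columns(PREFERRED_PREDICTORS))           # no overlap for the preferred model
print(leaked_columns(PREFERRED_PREDICTORS + ["MP"]))  # MP would be flagged
```

A check like this could run at the top of a modeling script so that a leaked column fails loudly instead of quietly inflating model fit.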
.
├── README.md
├── LICENSE
├── combine_rookie_defensive_model_data_v2_20260516_031444_CT.csv
├── combine_rookie_defensive_prediction_data_2026_v2_full_model_inputs_20260516_034524_CT.csv
├── rookie_dws_dataset_bibliography_v4_20260516_040733_CT.md
├── 20260516 meta ai and deepseek HTML draft.html
├── rookie_dbpm_model_explorer.html
├── index.html
├── styles.css
└── draft readme from codex
combine_rookie_defensive_model_data_v2_20260516_031444_CT.csv
This file includes 2025 combine measurements, rookie NBA defensive outcomes, agility/shuttle values, hand measurements, and model-ready flags.
combine_rookie_defensive_prediction_data_2026_v2_full_model_inputs_20260516_034524_CT.csv
This file includes 2026 combine measurements and predicted rookie DBPM values generated from the preferred 2025 model.
rookie_dws_dataset_bibliography_v4_20260516_040733_CT.md
This file records public sources, API notes, and player-level validation references.
- `20260516 meta ai and deepseek HTML draft.html`: Approach A predictor tool (DeepSeek / Meta AI), mobile-first with CSV import/export, inline editing, localStorage persistence, and gem detection.
- `rookie_dbpm_model_explorer.html`: Approach B model explorer with embedded data, live recalculation, search/filter/sort, and custom-player calculator.
- `index.html` / `styles.css`: Landing page and stylesheet.
- `draft readme from codex`: Early README draft generated by OpenAI Codex.
- `LICENSE`: BSD 3-Clause license.
| Field | Count |
|---|---|
| Rows | 78 |
| Original 75 invitees | 75 |
| Additional/StatLab-only rows | 3 |
| Height values | 63 |
| Wingspan values | 59 |
| Lane agility values | 74 |
| Shuttle run values | 71 |
| Hand length values | 77 |
| Hand width values | 77 |
| Rows with full preferred-model inputs | 41 |
| Field | Count |
|---|---|
| Rows | 75 |
| Wingspan values | 75 |
| Hand width values | 75 |
| Lane agility values | 71 |
| Shuttle run values | 71 |
| Rows with full prediction inputs | 71 |
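Counts like the ones in these tables could be recomputed with a short completeness check. This is a minimal sketch using only the standard library. The sample rows are hypothetical, and it assumes missing measurements are stored as empty cells, which may not match the actual CSV files.

```python
import csv
import io

MODEL_INPUTS = ["Wingspan_in", "Lane_Agility_Time", "Shuttle_Run", "Hand_Width"]

# Hypothetical rows for illustration only; not real combine measurements.
SAMPLE_CSV = """Player,Wingspan_in,Lane_Agility_Time,Shuttle_Run,Hand_Width
Player A,84.0,11.0,3.0,9.5
Player B,81.5,,,9.0
Player C,86.25,10.8,2.9,
"""


def has_value(row, column):
    """True when the cell exists and is not blank."""
    return bool((row.get(column) or "").strip())


def completeness(rows, columns):
    """Count non-empty values per column and rows with every column present."""
    per_column = {c: sum(1 for r in rows if has_value(r, c)) for c in columns}
    full_rows = sum(1 for r in rows if all(has_value(r, c) for c in columns))
    return per_column, full_rows


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
per_column, full_rows = completeness(rows, MODEL_INPUTS)
print(per_column)
print("Rows with full model inputs:", full_rows)
```

Pointing `csv.DictReader` at one of the repository CSVs instead of `SAMPLE_CSV` should give figures comparable to the tables, provided the column names and missing-value convention match.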
| Column | Description |
|---|---|
| Player | Player name |
| Roster_Status | Whether the row came from the original invitee list or additional Stat Lab context |
| Original_75_Invitee | Flag for announced 75-player invitee backbone |
| Additional_StatLab_Only | Flag for retained additional Stat Lab rows |
| Height, Height_in | Combine height as text and decimal inches |
| Wingspan, Wingspan_in | Combine wingspan as text and decimal inches |
| Standing_Reach, Standing_Reach_in | Standing reach as text and decimal inches, where available |
| Lane_Agility_Time | Lane agility time in seconds |
| Shuttle_Run | Shuttle run time in seconds |
| Hand_Length | Hand length in inches |
| Hand_Width | Hand width in inches |
| MP | Rookie-season NBA minutes played |
| DWS | Rookie-season Defensive Win Shares |
| DBPM | Rookie-season Defensive Box Plus/Minus |
| DWS_per_1000_MP | Defensive Win Shares per 1,000 minutes |
| Predicted_DBPM | Predicted DBPM from the preferred model, for 2026 prediction data |
| Modeling_Ready | Flag for baseline model-ready rows |
| Measurement_Source | Measurement provenance notes |
| Stats_Source | Season-stat provenance notes |
| Notes | Audit notes and corrections |
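The `DWS_per_1000_MP` column is described as Defensive Win Shares per 1,000 minutes. A minimal sketch of that calculation follows. The function name and the choice to return `None` for missing or zero minutes are assumptions, not necessarily how the dataset was built.

```python
def dws_per_1000_mp(dws, mp):
    """Defensive Win Shares per 1,000 minutes; None if minutes are missing or zero."""
    if dws is None or mp is None or mp <= 0:
        return None
    return dws / mp * 1000.0


# Hypothetical rookie: 1.2 DWS in 800 minutes is about 1.5 DWS per 1,000 minutes.
print(dws_per_1000_mp(1.2, 800))
print(dws_per_1000_mp(0.4, 0))  # no minutes played, so no rate is defined
```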
Core sources include:
- Basketball Reference rookie and advanced stats pages
- Basketball Reference Win Shares methodology page
- NBA.com Draft Combine Anthropometric Stats
- NBA/AWS Draft Combine Stat Lab
- Hoops Rumors 2025 combine invitee announcement
- Sporting News / Yahoo Sports combine measurement tables
- Sports Reference college basketball player pages for validation examples
- NBADraft.net measurement and athleticism notes
- ESPN, CBS Sports, Floor & Ceiling, and other combine coverage
See the bibliography Markdown file for the running source list and API-attempt notes.
This dataset is a working research artifact, not a canonical database.
Known caveats:
- NBA Stat Lab is interactive and difficult to scrape directly in this environment.
- The NBA Stats API endpoint `draftcombineplayeranthro` was identified as likely relevant, but requests from the working environment returned HTTP 403 or timed out.
- Some measurements were manually transcribed from NBA Stat Lab screenshots.
- Some players have missing height, wingspan, standing reach, lane agility, or shuttle values.
- Adou Thiero's wingspan was corrected to `7' 0.00"` after a likely `7' 9"` source typo was identified.
- Three 2025 rows are retained as `Additional_StatLab_Only`: Lachlan Olbrich, Mackenzie Mgbako, and Yanic Konan Niederhauser.
- Position labels are intentionally not used as modeling features in the current version.
- The model is exploratory and should not be treated as a complete scouting grade.
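Because the text measurement columns feed the decimal-inch columns, a typo such as the Adou Thiero wingspan changes the model input directly. The sketch below shows one way to convert a feet-and-inches string to decimal inches. It assumes the text format matches the `7' 0.00"` style quoted above, and the function name is hypothetical.

```python
import re

# Matches strings such as 7' 0.00" or 6' 8.5": feet, apostrophe, inches, double quote.
_FEET_INCHES = re.compile(r"""^\s*(\d+)'\s*(\d+(?:\.\d+)?)"\s*$""")


def feet_inches_to_inches(text):
    """Convert a feet-and-inches string to decimal inches."""
    match = _FEET_INCHES.match(text)
    if match is None:
        raise ValueError(f"Unrecognized measurement: {text!r}")
    feet, inches = int(match.group(1)), float(match.group(2))
    return feet * 12 + inches


corrected = feet_inches_to_inches("7' 0.00\"")  # 84.0 inches
suspect = feet_inches_to_inches("7' 9\"")       # 93.0 inches
print(corrected, suspect, suspect - corrected)  # the typo adds 9 inches of wingspan
```

With the preferred model's wingspan coefficient of 0.065647, a 9-inch error would shift a predicted DBPM by roughly 0.59, which is why corrections like this one are worth recording in the `Notes` column.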
Early models tested height, wingspan, and a WH interaction term:
WH = Height_in × Wingspan_in
Those size-only models performed weakly, especially for DWS. DBPM was more promising than DWS as a dependent variable.
The model improved when movement variables were added, especially shuttle run. Later, hand width was added and modestly improved model fit.
DBPM = -8.895696
+ 0.065647 × Wingspan_in
+ 0.638983 × Lane_Agility_Time
- 3.392342 × Shuttle_Run
+ 0.628937 × Hand_Width
Equivalent formula:
DBPM ~ Wingspan_in + Lane_Agility_Time + Shuttle_Run + Hand_Width
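As a worked example, the formula can be applied to a single prospect with a few lines of Python. The coefficients are copied from the formula above. The function name and the sample measurements are hypothetical and do not describe a real player.

```python
# Coefficients of the preferred 2025 DBPM model, copied from the README formula.
INTERCEPT = -8.895696
COEFFICIENTS = {
    "Wingspan_in": 0.065647,
    "Lane_Agility_Time": 0.638983,
    "Shuttle_Run": -3.392342,
    "Hand_Width": 0.628937,
}


def predicted_dbpm(measurements):
    """Return predicted rookie DBPM, or None if any model input is missing."""
    if any(measurements.get(name) is None for name in COEFFICIENTS):
        return None
    return INTERCEPT + sum(coef * measurements[name] for name, coef in COEFFICIENTS.items())


# Hypothetical prospect; these are not real combine measurements.
example = {"Wingspan_in": 84.0, "Lane_Agility_Time": 11.0, "Shuttle_Run": 3.0, "Hand_Width": 9.5}
print(round(predicted_dbpm(example), 3))  # -0.555
```

Returning `None` for incomplete rows mirrors the way the datasets separate rows with full model inputs from rows that are missing a measurement.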
| Metric | Value |
|---|---|
| n | 41 |
| R² | 0.258 |
| Adjusted R² | 0.176 |
| AIC | 158.908 |
| RMSE | 1.487 |
| Predictor | Coefficient | p-value |
|---|---|---|
| Intercept | -8.896 | 0.211 |
| Wingspan_in | 0.066 | 0.404 |
| Lane_Agility_Time | 0.639 | 0.389 |
| Shuttle_Run | -3.392 | 0.062 |
| Hand_Width | 0.629 | 0.156 |
This model explains about 25.8% of the variation in rookie DBPM, with an adjusted R² of 17.6%. For a small exploratory combine-only model, that is a meaningful signal, but not enough to treat the model as a standalone prediction engine.
The clearest signal is Shuttle_Run. Its coefficient is negative, which fits basketball logic: lower shuttle time means better change-of-direction speed, and faster shuttle performance is associated with higher rookie DBPM. The effect is marginally significant at p = 0.062.
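To put the shuttle coefficient in concrete terms, the short sketch below computes the change in predicted DBPM for a given change in shuttle time, holding the other inputs fixed. The 0.10-second improvement is an arbitrary illustration, not a claim about typical differences between prospects.

```python
SHUTTLE_COEF = -3.392342  # predicted DBPM change per extra second of shuttle time


def dbpm_change_from_shuttle(delta_seconds):
    """Change in predicted DBPM for a change in shuttle time, other inputs held fixed."""
    return SHUTTLE_COEF * delta_seconds


# A shuttle run 0.10 seconds faster (delta = -0.10) raises predicted DBPM by about 0.339.
print(round(dbpm_change_from_shuttle(-0.10), 3))  # 0.339
```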
Hand_Width has a positive coefficient and modestly improves model fit. It is not statistically significant, but it is directionally plausible because wider hands may help with ball disruption, rebounding control, and defensive playmaking.
Wingspan_in is positive but not statistically significant once mobility and hand width are included.
Lane_Agility_Time is not statistically significant and has a counterintuitive positive sign. It should be retained as a measurement field, but interpreted cautiously.
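One possible explanation for an unexpected sign is overlap between the two movement tests, which the next-steps list also proposes checking. The sketch below computes a Pearson correlation and the pairwise quantity 1 / (1 - r²). This is only a simplified stand-in: a full VIF regresses each predictor on all the others, so the pairwise value is a lower bound. The times are hypothetical, not values from the dataset.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pairwise_vif(r):
    """1 / (1 - r^2): the VIF when a model contains only these two predictors."""
    if abs(r) >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - r * r)


# Hypothetical times in seconds; replace with the real columns to run the check.
lane_agility = [10.6, 10.9, 11.1, 11.4, 11.8]
shuttle_run = [2.8, 2.9, 3.1, 3.0, 3.2]

r = pearson_r(lane_agility, shuttle_run)
vif = pairwise_vif(r)
print(f"r = {r:.3f}, pairwise VIF lower bound = {vif:.2f}")
```

If the real columns turn out to be strongly correlated, the individual lane agility and shuttle coefficients would be hard to separate, which would be consistent with interpreting the lane agility sign cautiously.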
Rookie defensive impact appears more connected to change-of-direction ability, especially shuttle performance, than to raw size alone in this 2025 combine sample.
The same combine variables were also tested against Defensive Win Shares.
A simple DWS model:
DWS ~ Wingspan_in + Lane_Agility_Time + Shuttle_Run
performed weakly:
| Metric | Value |
|---|---|
| n | 41 |
| R² | 0.065 |
| Adjusted R² | -0.011 |
| AIC | 101.550 |
| RMSE | 0.757 |
None of the predictors were statistically significant. DWS appears less suitable for this initial pre-draft combine-only model, likely because it is a cumulative stat influenced by minutes, team context, role, and opportunity.
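The adjusted R² values reported for both models can be reproduced from R², the sample size, and the number of predictors using the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1). The sketch below applies it to the figures in the two fit tables. The function name is illustrative.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared for a model with n observations and p predictors (plus intercept)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Preferred DBPM model: n = 41, four predictors, R² = 0.258.
print(round(adjusted_r2(0.258, 41, 4), 3))  # 0.176

# Simple DWS model: n = 41, three predictors, R² = 0.065.
print(round(adjusted_r2(0.065, 41, 3), 3))  # -0.011
```

The negative adjusted R² for the DWS model means that, after the penalty for three predictors, the model fits no better than predicting the sample mean.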
The preferred 2025 DBPM model was applied to the 2026 combine class.
Top predicted rookie DBPM values from the current 2026 prediction dataset:
| Rank | Player | Predicted DBPM |
|---|---|---|
| 1 | Flory Bidunga | 1.187 |
| 2 | Reuben Chinyelu | 1.052 |
| 3 | Trevon Brazile | 1.045 |
| 4 | Karim Lopez | 0.852 |
| 5 | Tarris Reed Jr. | 0.838 |
| 6 | Baba Miller | 0.707 |
| 7 | Aaron Nkrumah | 0.581 |
| 8 | Zuby Ejiofor | 0.580 |
| 9 | Yaxel Lendeborg | 0.537 |
| 10 | Matthew Able | 0.354 |
Flory Bidunga was the top 2026 predicted rookie DBPM player in the current model. His college profile provides a useful sanity check:
- 2024-25 Kansas DBPM: 6.2
- 2025-26 Kansas DBPM: 5.6
- Career college DBPM: 5.8
This does not prove the model is correct, but it suggests that the model's top prediction aligns with a player who also showed strong college defensive impact.
Live demo — May 16, 2026
- GitHub view: 20260516 meta ai and deepseek HTML draft.html
- Raw HTML: https://raw.githubusercontent.com/danoff/basketball/main/20260516%20meta%20ai%20and%20deepseek%20HTML%20draft.html
- GitHub Pages (after enabling Pages): https://danoff.github.io/basketball/20260516%20meta%20ai%20and%20deepseek%20HTML%20draft.html
The app is a mobile-first, single-page tool that runs entirely in the browser:
- shows the preferred model in the header
- loads the 2026 prediction dataset (75 players, 71 with full inputs)
- calculates Predicted DBPM live: `DBPM = -8.896 + 0.066×Wingspan + 0.639×Lane_Agility - 3.392×Shuttle + 0.629×Hand_Width`
- searchable / sortable leaderboard (desktop table, mobile cards)
- CSV Import — drop in your updated 2026 file
- Inline Editor — add/edit players, live recalculation
- Export CSV — download filtered view
- Persistence — saves to localStorage
- Gem detection — flags players with projection rank >30 and Predicted DBPM > median
- dark slate UI built with Tailwind
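The gem rule in the feature list can be sketched in a few lines. This is not the app's actual code: the field names are hypothetical, the sample players are invented, and it assumes that "projection rank" is an external draft ranking and that the median is taken over all players with a prediction.

```python
from statistics import median


def flag_gems(players, rank_cutoff=30):
    """Return names of players ranked below the cutoff whose Predicted DBPM beats the median."""
    med = median(p["predicted_dbpm"] for p in players)
    return [
        p["name"]
        for p in players
        if p["projection_rank"] > rank_cutoff and p["predicted_dbpm"] > med
    ]


# Hypothetical board; names, ranks, and predictions are invented for illustration.
players = [
    {"name": "Player A", "projection_rank": 5, "predicted_dbpm": 0.9},
    {"name": "Player B", "projection_rank": 42, "predicted_dbpm": 0.6},
    {"name": "Player C", "projection_rank": 55, "predicted_dbpm": -0.4},
    {"name": "Player D", "projection_rank": 18, "predicted_dbpm": -0.1},
    {"name": "Player E", "projection_rank": 33, "predicted_dbpm": 0.1},
]

print(flag_gems(players))  # ['Player B']
```

Player E sits exactly at the median of 0.1 and is not flagged because the rule uses a strict comparison.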
A mobile-friendly single-page web app is planned for this project.
The app should allow someone to open the page on a phone and quickly see:
- the preferred model,
- the key findings,
- the 2026 predicted DBPM leaderboard,
- a player search/filter,
- caveats,
- and source/provenance notes.
Suggested app stack:
- static HTML/CSS/JavaScript, or
- React with a simple static build,
- deployable through GitHub Pages, Netlify, or Vercel.
Current model formula for the app:
Predicted DBPM = -8.896
+ 0.066 × Wingspan_in
+ 0.639 × Lane_Agility_Time
- 3.392 × Shuttle_Run
+ 0.629 × Hand_Width
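The app formula uses coefficients rounded to three decimals, while the fitted model carries six. The sketch below estimates how much that rounding can move a single prediction. The input measurements are hypothetical and the helper name is illustrative.

```python
# (intercept, wingspan, lane agility, shuttle, hand width)
FULL = (-8.895696, 0.065647, 0.638983, -3.392342, 0.628937)
ROUNDED = (-8.896, 0.066, 0.639, -3.392, 0.629)


def predict(coefs, wingspan_in, lane_agility, shuttle, hand_width):
    """Linear prediction from an (intercept, four slopes) coefficient tuple."""
    b0, b_wing, b_lane, b_shuttle, b_hand = coefs
    return (
        b0
        + b_wing * wingspan_in
        + b_lane * lane_agility
        + b_shuttle * shuttle
        + b_hand * hand_width
    )


# Hypothetical prospect; not real combine measurements.
inputs = (84.0, 11.0, 3.0, 9.5)
gap = predict(ROUNDED, *inputs) - predict(FULL, *inputs)
print(round(gap, 3))  # 0.031
```

Most of the gap comes from the wingspan term, because its rounding error is multiplied by a value near 84. A gap of this size is larger than some differences on the 2026 leaderboard (for example 0.581 versus 0.580), so an app that needs to reproduce the dataset's ordering exactly may want the full-precision coefficients.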
Recommended app sections:
- Hero section
- Key metric cards
- Model formula
- Coefficient table
- 2026 prediction leaderboard
- Search/filter
- Caveats
- Data sources
The tool should be mobile-first and suitable for sharing by link or QR code.
This project was developed through an interactive multi-LLM workflow.
Contributions included:
- ChatGPT 5.5 in the web browser: spreadsheet/file handling, dataset consolidation, CSV creation, model fitting, API probing, documentation drafting, bibliography maintenance, and interpretation.
- Microsoft Copilot in the browser: dataset audits, Mode B audit checks, README revision suggestions, and validation notes.
- DeepSeek in the browser: model exploration notes, final model framing, and interactive predictor concept refinements.
- Meta AI (Muse Spark) in the Meta AI app: built the production HTML predictor (May 16, 2026 draft), integrated both 2025 and 2026 CSVs as seed data, implemented the 4-variable DBPM formula with live recalculation, added CSV import/export, inline add-edit-delete with localStorage persistence, responsive desktop table and mobile card views, gem logic, median tracking, and stats bar; generated the QR code asset and prepared the file for GitHub Pages deployment.
- Anthropic Claude in the web browser (alternating between Opus 4.6 and other models in claude.ai): scraped and compiled 2025 and 2024 NBA Draft Combine measurement tables from Yahoo Sports, Draft Central, and other sources into structured spreadsheets; cross-referenced combine participant lists against Basketball-Reference 2025-26 advanced stats to match minutes played and Defensive Win Shares; compared datasets produced by different LLM workflows (Claude vs. Copilot vs. ChatGPT) and identified discrepancies including 5 missing players with NBA minutes and the Adou Thiero wingspan typo; reviewed the GitHub repo and provided a structured code-review-style assessment covering repo structure, statistical methodology (sample size, p-values, VIF, cross-validation), and prioritized next steps; revised the README to reflect actual file layout and added new suggested next steps.
- Additional LLM outputs may be added as separate notes or audits over time.
The goal is not to hide the multi-agent workflow. The goal is to make it inspectable.
Suggested notes structure:
notes/
├── chatgpt_5_5_worklog.md
├── copilot_audit.md
├── deepseek_model_notes.md
├── claude_review.md
└── future_llm_reviews.md
Other LLMs are welcome to add their own explanations, critiques, model checks, and source audits.
    import pandas as pd

    # Load the 2025 model-building dataset.
    df = pd.read_csv("combine_rookie_defensive_model_data_v2_20260516_031444_CT.csv")

    # Keep rows with NBA minutes, all four preferred-model inputs, and a DBPM outcome.
    model_df = df[
        df["MP"].notna()
        & (df["MP"] > 0)
        & df["Wingspan_in"].notna()
        & df["Lane_Agility_Time"].notna()
        & df["Shuttle_Run"].notna()
        & df["Hand_Width"].notna()
        & df["DBPM"].notna()
    ].copy()
    print(model_df.shape)

    # Apply the preferred-model coefficients; rows with a missing input get NaN.
    df["Predicted_DBPM"] = (
        -8.895696
        + 0.065647 * df["Wingspan_in"]
        + 0.638983 * df["Lane_Agility_Time"]
        - 3.392342 * df["Shuttle_Run"]
        + 0.628937 * df["Hand_Width"]
    )

- Reorganize files into a structured layout (e.g. `data/processed/`, `notes/`, `bibliography/`, `webapp/`).
- Add a `.gitignore` for Python environments, notebook checkpoints, and OS files.
- Add a `requirements.txt` or environment spec listing Python dependencies (e.g. pandas, statsmodels).
- Move manual screenshot transcriptions into structured audit files.
- Add the F-test p-value for the overall model and VIF (variance inflation factor) diagnostics for each predictor, especially to check for multicollinearity between Lane_Agility_Time and Shuttle_Run.
- Run leave-one-out cross-validation on the preferred model to assess how stable the top-10 predictions are when individual observations are dropped.
- Add prediction intervals to the 2026 leaderboard (even ±1 RMSE bands) to convey uncertainty around individual predictions.
- Extend the dataset to 2024 and earlier combine classes for out-of-sample validation.
- Add scripts that rebuild processed CSVs from raw/interim sources.
- Add a complete data dictionary.
- Attempt NBA Stats API access from a local machine with browser-compatible headers.
- Add baseline regression notebooks.
- Add versioned model outputs.
- Compare DBPM, DWS, and rate-adjusted defensive outcomes.
- Track hand width as a potential additional defensive signal in future combine classes.
- Host the mobile-friendly predictor app via GitHub Pages.
- Add QR-code sharing for the web app.
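The ±1 RMSE bands suggested in the list above could be sketched as follows, using the reported RMSE of 1.487 and the top three 2026 predictions from the leaderboard. This is a rough uncertainty band, not a formal prediction interval, and the function name is illustrative.

```python
RMSE = 1.487  # in-sample RMSE of the preferred DBPM model

# Top three 2026 predictions, copied from the leaderboard.
TOP_PREDICTIONS = [
    ("Flory Bidunga", 1.187),
    ("Reuben Chinyelu", 1.052),
    ("Trevon Brazile", 1.045),
]


def rmse_band(predicted, rmse=RMSE):
    """Return (low, high) as predicted -/+ one RMSE, rounded to three decimals."""
    return (round(predicted - rmse, 3), round(predicted + rmse, 3))


for name, predicted in TOP_PREDICTIONS:
    low, high = rmse_band(predicted)
    print(f"{name}: {predicted:.3f} (band {low:.3f} to {high:.3f})")
```

The bands for the top three players overlap almost entirely, which suggests the ordering at the top of the leaderboard should be read loosely.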
This project is licensed under the BSD 3-Clause "New" or "Revised" License.
SPDX identifier:
BSD-3-Clause
See LICENSE for the full license text.
The repository license covers code, scripts, documentation, and original project materials in this repository unless otherwise noted. The underlying basketball statistics and combine measurements were compiled from public third-party sources. Those source materials may have their own terms of use. This repository should preserve source attribution and provenance notes for all derived datasets.
README updated with:
- ChatGPT 5.5 in the web browser
- Microsoft Copilot in the browser
- DeepSeek in the browser
- Meta AI (Muse Spark) in the Meta AI app
- Anthropic Claude (Opus 4.6 and other models) in the web browser
Date: 2026-05-16