Predictive Model Methodology

Full transparency on how PropsBot Golf Intelligence builds matchup win probabilities, scoring props, and confidence scores. Everything below is the actual pipeline — inputs, methods, parameters, and live backtest performance.

What we model

We generate probabilistic forecasts for every PGA Tour event. Some outputs are directly observed (odds, scores, lines); most are derived from a Monte Carlo simulation over player skill distributions.

Derived outputs

Matchup win probability — 2-ball and 3-ball head-to-head probabilities (with push/dead-heat handling)
Scoring over/unders — round and tournament totals, birdies, bogeys, eagles
Make-cut probability — simulated through Friday using live-round variance
Finishing position — top-5, top-10, top-20 probabilities from the full sim
Outright winner — win-the-tournament probability from 10k runs

Observed inputs we surface

Live odds from DraftKings, FanDuel, BetMGM, Bovada (BDL + The Odds API)
Actual strokes-gained splits, live leaderboard, weather observations

Inputs

Season SG — strokes-gained total & categories from DataGolf and PGA Tour season aggregates
Live tournament SG — round-by-round strokes-gained from BallDontLie (BDL)
Course fit scores — per-venue fit model mapping player strengths (SG:OTT, SG:APP, SG:ARG, SG:PUTT, driving distance, accuracy) to course archetype
Recent form — rolling SG trend windows at L5 and L10 events, hot/cold flags
Weather — Open-Meteo hourly wind forecasts aligned to tee times; used to widen variance and apply directional penalties
Book odds — BallDontLie (BDL) + The Odds API, consensus and best-price per market

Methods

(a) Monte Carlo matchup simulation

We run 10,000 simulated tournaments per event. Each player's per-round score is drawn from a Normal distribution centered on their blended SG projection and widened by per-player variance estimated from history. Scores are integer-rounded (golf is a discrete game) and all players in a group share a round shock term so matchup correlation is preserved on bad-weather days.
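As a rough illustration of the draw described above, the sketch below simulates one round of a 2-ball with Normal scores, a shared round shock, and integer rounding. The function name, the `shock_std` value, and the single-round scope are assumptions made for the example, not the production code.

```python
import random

def simulate_matchup(mean_a, std_a, mean_b, std_b,
                     shock_std=1.0, n_sims=10_000, seed=0):
    """Estimate (P(A wins), P(tie), P(B wins)) for one round.

    Means are expected scores relative to par (lower is better).
    shock_std is a hypothetical size for the shared round shock.
    """
    rng = random.Random(seed)
    a_wins = ties = b_wins = 0
    for _ in range(n_sims):
        shock = rng.gauss(0.0, shock_std)  # shared by both players in the group
        score_a = round(mean_a + shock + rng.gauss(0.0, std_a))  # integer score
        score_b = round(mean_b + shock + rng.gauss(0.0, std_b))
        if score_a < score_b:
            a_wins += 1
        elif score_a > score_b:
            b_wins += 1
        else:
            ties += 1
    return a_wins / n_sims, ties / n_sims, b_wins / n_sims
```

Because scores are rounded to integers, ties come out with non-trivial probability, which is why the tie mass is tracked separately rather than folded into either win probability.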

(b) Bayesian confidence score

Each projection blends a season prior (long-run SG baseline) with a recent-course-type likelihood (how the player has fared at similar venues). The posterior width determines the confidence score — tight posterior = high confidence, wide posterior = low confidence.

(c) Dead-heat EV (3-ball) & push EV (2-ball)

3-ball markets settle with dead-heat rules on ties, so our EV calculation weights a tied outcome at 1/N payout. 2-ball markets typically push on ties (stake returned) — we account for the tie probability explicitly from the sim instead of rolling it into win probability.
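A minimal sketch of the two settlement rules, assuming decimal odds and a one-unit stake. The function names and the shape of `p_tie_by_n` are illustrative, not taken from the site's code.

```python
def ev_two_ball(p_win, p_tie, decimal_odds):
    """EV per unit stake when a tie pushes (stake returned, zero profit)."""
    p_lose = 1.0 - p_win - p_tie
    return p_win * (decimal_odds - 1.0) - p_lose

def ev_three_ball(p_win_outright, p_tie_by_n, decimal_odds):
    """EV per unit stake under dead-heat rules.

    p_tie_by_n maps N (players tied for the low score, including ours)
    to the probability of that tie, e.g. {2: 0.08, 3: 0.01}.
    """
    p_lose = 1.0 - p_win_outright - sum(p_tie_by_n.values())
    ev = p_win_outright * (decimal_odds - 1.0) - p_lose
    for n_tied, p in p_tie_by_n.items():
        # A dead heat returns 1/N of the full payout on the unit stake.
        ev += p * (decimal_odds / n_tied - 1.0)
    return ev
```

For example, at decimal odds of 3.0 with a 30% outright win probability and a 10% chance of a two-way tie, `ev_three_ball(0.3, {2: 0.1}, 3.0)` gives an EV of about +0.05 units.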

(d) Anomaly detection

For every market, we compute a z-score of the book's implied probability against the model's probability across the field. High-|z| entries surface as Edge Finder picks; low-|z| entries are labeled "book-aligned."
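One plausible reading of this step, sketched below, is to z-score the gap between model and book probability across the field. The gap definition and the function name are assumptions; the text does not specify the exact statistic or the cutoff between "high" and "low" |z|.

```python
import statistics

def market_z_scores(model_probs, book_probs):
    """Z-score of (model prob - book implied prob) across the field.

    Both inputs map player name -> probability for the same market.
    """
    gaps = {name: model_probs[name] - book_probs[name] for name in model_probs}
    mu = statistics.mean(list(gaps.values()))
    sd = statistics.pstdev(list(gaps.values())) or 1.0  # guard a flat field
    return {name: (gap - mu) / sd for name, gap in gaps.items()}
```

Entries with the largest |z| would be the candidates for Edge Finder picks under this reading.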

(e) Vig removal before EV

Every edge percentage on this site is computed against a de-vigged fair implied probability, not the raw book line. For outright winner markets we sum implied probabilities across the full field (the overround is typically 1.20-1.30 on a PGA event, meaning the raw implied probabilities sum to 120-130%, which works out to a hold of roughly 17-23%), then divide each player's implied probability by that overround. Without this correction every "+5% edge" we displayed would have been roughly break-even after vig. The current winner-market overround is published live in /golf-data.json under bookVigInfo.winner.overround.
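The normalization can be sketched as follows. The helper names are illustrative, and defining edge as model probability minus fair probability, in percentage points, is an assumption about the site's convention.

```python
def american_to_implied(odds):
    """Raw implied probability from American odds (vig still included)."""
    if odds > 0:
        return 100.0 / (odds + 100.0)
    return -odds / (-odds + 100.0)

def devig_field(implied):
    """Divide each implied probability by the field overround.

    implied maps player name -> raw implied probability.
    Returns (fair probabilities, overround).
    """
    overround = sum(implied.values())
    fair = {name: p / overround for name, p in implied.items()}
    return fair, overround

def edge_pct(model_prob, fair_prob):
    """Edge in percentage points against the de-vigged line (assumed convention)."""
    return 100.0 * (model_prob - fair_prob)
```

With raw implied probabilities of 0.60, 0.40 and 0.25, the overround is 1.25 and the first player's fair probability drops from 0.60 to 0.48.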

(f) Learned course fit

Course fit is a sample-size-weighted blend of (1) a hand-curated trait prior — does the course reward power, accuracy, scrambling, putting? — and (2) a learned signal computed from each player's historical strokes-gained residual vs the field at the same course over the last 5 years. The blend uses Bayesian shrinkage with k=2: a player with n=2 prior appearances at this course gets 50% weight on the learned signal, n=5 gets 71%, players with no prior data fall back fully to the trait prior. The residuals are derived from BDL's player_round_stats endpoint with round_number=-1 for tournament totals. The per-player learned-vs-prior breakdown is exposed under player.learnedCourseFit in the JSON.
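The quoted weights follow from w = n / (n + k). A small sketch, with hypothetical function names:

```python
def learned_weight(n_appearances, k=2.0):
    """Shrinkage weight on the learned signal: n / (n + k)."""
    if n_appearances <= 0:
        return 0.0
    return n_appearances / (n_appearances + k)

def blended_course_fit(prior_fit, learned_fit, n_appearances, k=2.0):
    """Blend the trait prior with the learned residual signal."""
    if learned_fit is None:
        return prior_fit  # no course history: fall back fully to the prior
    w = learned_weight(n_appearances, k)
    return w * learned_fit + (1.0 - w) * prior_fit
```

Here `learned_weight(2)` is 0.50 and `learned_weight(5)` is about 0.71, matching the figures above.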

(g) Isotonic calibration of confScore

The raw confScore is a 12-weight composite signal whose 0-100 range has no inherent probability interpretation. We fit pool-adjacent-violators isotonic regression against historical make-cut outcomes from the history/ snapshot archive, producing a monotonic lookup table that maps raw scores to calibrated make-cut probabilities. The calibration is persisted in model_params.json under calibration.table and applied at scrape time, so every player carries a confScoreCalibratedMakeCutProb field. In the most recent training run, the calibrated Brier score (0.196) beat both the no-information baseline (0.249) and the raw score/100 baseline (0.220) on the same data — measurable predictive skill.
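For readers unfamiliar with pool-adjacent-violators, here is a compact sketch of the fit, the lookup, and the Brier comparison. The table layout (upper score bound, probability) is an assumption for the example and may differ from what calibration.table actually stores.

```python
from collections import defaultdict

def fit_isotonic(scores, outcomes):
    """Pool-adjacent-violators fit of outcomes (0/1) on raw scores.

    Returns a monotone table of (highest raw score in block, probability).
    """
    totals = defaultdict(lambda: [0.0, 0])
    for s, y in zip(scores, outcomes):  # pool identical raw scores first
        totals[s][0] += y
        totals[s][1] += 1
    blocks = []  # each block: [sum of outcomes, count, highest raw score]
    for s in sorted(totals):
        blocks.append([totals[s][0], totals[s][1], s])
        # Merge while the previous block's mean exceeds the newest block's mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            y_sum, count, top = blocks.pop()
            blocks[-1][0] += y_sum
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, y_sum / count) for y_sum, count, top in blocks]

def lookup(table, score):
    """Map a raw score to its calibrated probability (step function)."""
    for upper, prob in table:
        if score <= upper:
            return prob
    return table[-1][1]

def brier(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)
```

Comparing `brier` on the calibrated lookups against `brier` on raw score/100 over the same rows is the kind of check the figures above report.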

(h) Monte Carlo cut-line prediction

The cut line is simulated, not pulled from history. Every player's 36-hole total is drawn from the same per-round Normal distribution used in the matchup model, and the cut score is the 65th-best total (PGA standard cut size) averaged over 4,000 sims. When Round 1 scores are in, we hold R1 fixed and simulate only R2 — this sharply tightens the prediction Friday morning. The output includes per-player make-cut probabilities, which power the "Cut Bubble" filter on the scorecard view.
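A simplified sketch of the cut simulation, including the case where Round 1 is already in. The player dict layout (`mean`, `std`, optional `r1`) is hypothetical, and treating ties at the cut score as making the cut is an assumption.

```python
import random

def simulate_cut(players, n_sims=4000, cut_rank=65, seed=0):
    """Simulate 36-hole totals and the resulting cut line.

    players: list of dicts with 'mean' and 'std' (per-round score to par)
    and an optional 'r1' holding the actual Round 1 score.
    Returns (mean cut score, per-player make-cut probs, all simulated cuts).
    """
    rng = random.Random(seed)
    cut_scores = []
    made = [0] * len(players)
    for _ in range(n_sims):
        totals = []
        for p in players:
            r1 = p.get("r1")
            if r1 is None:  # Round 1 not played yet: simulate it
                r1 = round(rng.gauss(p["mean"], p["std"]))
            r2 = round(rng.gauss(p["mean"], p["std"]))
            totals.append(r1 + r2)
        ranked = sorted(totals)
        cut = ranked[min(cut_rank, len(ranked)) - 1]  # 65th-best total
        cut_scores.append(cut)
        for i, total in enumerate(totals):
            if total <= cut:  # ties at the cut score make the cut (assumed)
                made[i] += 1
    mean_cut = sum(cut_scores) / n_sims
    return mean_cut, [m / n_sims for m in made], cut_scores
```

Supplying `r1` for every player removes one of the two random rounds from each total, which is what tightens the Friday-morning prediction.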

(i) Field-strength normalization

Comparing strokes-gained residuals across events is unfair without adjusting for field strength: beating a major field by +0.5 SG is more impressive than the same +0.5 at an opposite-field event. We compute a field-strength multiplier per event from the OWGRs of the top quartile of entrants — strong fields (majors) score above 1.0, opposite-field events below. The multiplier is applied to each player's learned course-fit residual before averaging, so historical residuals are weighted by how hard they were earned. The current event's multiplier is exposed at /golf-data.json under fieldStrength.

(j) Per-player tournament position probabilities

A separate 4-round Monte Carlo (4,000 sims) produces per-player probabilities for win, top-5, top-10, top-20, and make-cut. These are real model probabilities — no derivation from win odds, no heuristic scaling. They feed the scorecard's edge calculations for every market where BDL also ships a real book line (winner, top-5, make-cut). Top-10 and Top-20 don't exist as native BDL futures markets, so we expose our model probability but don't display edge — the comparison line would be our own derivation. Per-player probabilities are attached to each player object as modelTop5Prob, modelTop10Prob, modelTop20Prob, modelMakeCutProb.

(k) Per-hole Monte Carlo (round-score / birdie / bogey / eagle props)

For each player we run a separate per-hole simulation (1,500 sims) that draws each hole's outcome from a 5-way categorical distribution — eagle, birdie, par, bogey, double-plus. The base distribution per hole comes from BDL tournament_course_stats historical scoring counts (when available), adjusted multiplicatively by each player's sgTotal. The skill multipliers are tuned so a +1 SG player picks up roughly 0.5 strokes per round (matching the empirical DataGolf relationship): 25% more birdies, 20% fewer bogeys, 40% more eagles, 35% fewer doubles.

Outputs per player: round-score distribution (P(score ≤ k) for the relevant range), and per-round distributions for birdies / bogeys / eagles / double+ counts (P(count ≥ k)). These are stored in perHoleProps, and summary fields (modelExpectedRoundScore, modelExpectedBirdies, etc.) are attached to each player object.

When BDL's /odds/player_props ships over/under lines for these markets, we compute model fair probability from the distribution, de-vig against the over/under pair, and surface an edge percentage per side — pricing the previously "unmodeled" markets (birdies o/u, bogeys o/u, round-score o/u). Edges live at pricedPlayerProps in the JSON.
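The sketch below shows one way the skill adjustment and the over/under de-vig could fit together. Raising each multiplier to the power of sgTotal is an assumption (the text only gives the values at +1 SG), and the function names are illustrative. Holes are drawn independently here for simplicity.

```python
import random

# Per +1 SG multipliers quoted in the text; par is left unscaled.
SKILL_MULT = {"eagle": 1.40, "birdie": 1.25, "par": 1.00,
              "bogey": 0.80, "double": 0.65}

def adjust_hole_dist(base, sg_total):
    """Scale a 5-way hole distribution by skill, then renormalize.

    Assumes multipliers compound as mult ** sg_total.
    """
    raw = {k: base[k] * SKILL_MULT[k] ** sg_total for k in base}
    total = sum(raw.values())
    return {k: v / total for k, v in raw.items()}

def prob_birdies_at_least(hole_dists, k, n_sims=1500, seed=0):
    """Monte Carlo estimate of P(birdie count >= k) over the given holes."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        birdies = 0
        for dist in hole_dists:
            outcome = rng.choices(list(dist), weights=list(dist.values()))[0]
            birdies += outcome == "birdie"
        hits += birdies >= k
    return hits / n_sims

def devig_two_way(p_over_raw, p_under_raw):
    """Normalize an over/under pair so the two sides sum to 1."""
    total = p_over_raw + p_under_raw
    return p_over_raw / total, p_under_raw / total
```

The edge on the over side would then be the model's P(count ≥ k) minus the de-vigged over probability.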

(l) Cut-line confidence interval

The cut Monte Carlo already produces a distribution of cut scores across sims — we now surface the 5th and 95th percentiles as a 90% CI. A tight CI (±1 stroke) reads "high confidence"; a wide one (±4) tells you the cut is genuinely uncertain. Exposed at cutPrediction.predictedCutCI90.
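Given the list of simulated cut scores, the interval is just two percentiles. A minimal sketch using the standard library (the function name is hypothetical):

```python
import statistics

def cut_ci90(cut_scores):
    """Return (5th percentile, 95th percentile, half-width) of simulated cuts."""
    # n=20 yields cut points at 5%, 10%, ..., 95%.
    q = statistics.quantiles(cut_scores, n=20, method="inclusive")
    lo, hi = q[0], q[-1]
    return lo, hi, (hi - lo) / 2.0
```

The half-width is the number to compare against the ±1 and ±4 stroke reference points mentioned above.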

(m) Bayesian skill update mid-tournament

After each round, the actual tournament SG observation is conjugate-Gaussian updated against the season-SG prior. The prior is Normal, centered on season SG with std 0.5; each round's SG is treated as a Normal observation with noise std 1.5. With those values the posterior weight on the observed tournament average after n rounds is n/(n+9): ~10% after 1 round and ~31% after 4 rounds, regardless of how far the observation sits from the season baseline. Exposed as sgTotalUpdated + sgTotalUpdatedStd, and every downstream predictor (matchup, cut, position, per-hole) consumes the updated value when present via effective_sg(player).
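The update is the standard Normal-Normal conjugate formula. A sketch with the stated standard deviations as defaults (the function name is illustrative):

```python
def update_skill(season_sg, round_sgs, prior_std=0.5, obs_std=1.5):
    """Posterior mean and std of skill after observing per-round SG values."""
    n = len(round_sgs)
    if n == 0:
        return season_sg, prior_std
    prior_prec = 1.0 / prior_std ** 2   # precision of the season prior
    obs_prec = n / obs_std ** 2         # combined precision of n rounds
    post_var = 1.0 / (prior_prec + obs_prec)
    obs_mean = sum(round_sgs) / n
    post_mean = post_var * (prior_prec * season_sg + obs_prec * obs_mean)
    return post_mean, post_var ** 0.5
```

With prior std 0.5 and observation std 1.5, the weight on the observed mean is n / (n + 9): a player with a season SG of 0.0 who gains 1.0 in one round moves to about 0.10, and to about 0.31 after four such rounds.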

(n) Matchup decomposition

Predicted matchup edge is decomposed into per-factor contributions in strokes: skill (SG), course fit, recent form, weather. Stored under player.contrib on each matchup output and visualized in the Compare drawer so users see exactly where the edge comes from rather than a black-box win probability.

(o) Gaussian copula for cross-hole correlation

The per-hole simulator now uses a full Gaussian copula rather than a single round-wide momentum factor. The structure:

Global factor M ~ N(0, 1) shared across all 18 holes — captures round-wide hot/cold streaks. Loaded by √ρ_global.
Local AR(1) chain L_i = ρ_local · L_{i-1} + √(1-ρ²_local) · z_i with iid z_i ~ N(0, 1) — captures adjacent-hole correlation that decays with hole distance.
Latent u_i = √ρ_global · M + √(1-ρ_global) · L_i
Uniform q_i = Φ(u_i) (standard normal CDF)
Inverse-CDF map: q_i is mapped to a hole outcome via the cumulative of the player's skill-adjusted categorical distribution for that hole. This preserves each hole's marginal distribution exactly while imposing the desired joint correlation.

Defaults: ρ_global = 0.01, ρ_local = 0.06, calibrated empirically so total round-score std lands ~3.5 strokes on a par-72 — within the PGA empirical range (2.9-3.5). Both parameters live in model_params.json and are tunable by the weekly backtest job.
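The construction above can be sketched in a few lines. The outcome ordering (best to worst along the uniform), the stroke values per category, and the function names are assumptions for the example.

```python
import math
import random

# Categories ordered best to worst, with strokes relative to par.
# Treating "double-plus" as exactly +2 is a simplification.
OUTCOMES = [("eagle", -2), ("birdie", -1), ("par", 0), ("bogey", 1), ("double", 2)]

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def simulate_round(hole_dists, rho_global=0.01, rho_local=0.06, rng=random):
    """Draw one round of per-hole scores (to par) from the copula.

    hole_dists: one dict per hole mapping category name -> probability.
    """
    m = rng.gauss(0.0, 1.0)      # global factor shared by all holes
    local = rng.gauss(0.0, 1.0)  # stationary start of the AR(1) chain
    scores = []
    for i, dist in enumerate(hole_dists):
        if i > 0:
            z = rng.gauss(0.0, 1.0)
            local = rho_local * local + math.sqrt(1.0 - rho_local ** 2) * z
        u = math.sqrt(rho_global) * m + math.sqrt(1.0 - rho_global) * local
        q = normal_cdf(u)
        cum = 0.0
        score = OUTCOMES[-1][1]  # fallback if rounding leaves q above cum
        for name, strokes in OUTCOMES:  # inverse-CDF map onto the categories
            cum += dist[name]
            if q <= cum:
                score = strokes
                break
        scores.append(score)
    return scores
```

Because u_i has unit variance by construction, q_i is uniform and each hole keeps its own marginal distribution. Summing `scores[:9]` or `scores[9:]` across many draws gives the front-9 and back-9 distributions.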

What this unlocks: the per-hole simulator now emits a true joint distribution over per-hole scores, not just round totals. The pricing engine uses it to price (a) front-9 over/unders via frontNine.pLte, (b) back-9 over/unders via backNine.pLte, (c) single-hole over/unders like "Player X on Hole 5 over 3.5" via holes[N].pLte. None of these were priceable with the old shared-momentum approximation because the joint structure wasn't accessible — only round-total marginals were.

(s) Historical calibration backfill

The live confScore calibration trains on whatever history/ snapshots we've archived locally (~3 events at any given time). We backfilled the training set with 2024 + 2025 completed PGA events from BDL: ~50 events × ~150 players each. For each historical event we pull /tournament_results for outcomes and /player_season_stats for the contemporaneous season SG, building a proxy "confScore-equivalent" via 50 + 15·sgTotal (matches the empirical confScore distribution). The isotonic table is refit on the combined live + backfill corpus, extending cleanly into the 80-95 score range where the live-only data previously plateaued. Generated by scripts/backfill_calibration.py; consumed by scripts/calibrate.py.

(t) Per-player empirical round variance

Previously every player used a global baseStd = 2.85 in the matchup Monte Carlo. Real golfers have meaningfully different round-to-round volatility — Scheffler is consistent (~2.4), Wyndham Clark is wild (~3.5). We now pull 1-2 seasons of /player_round_results from BDL, compute the empirical std of par-relative scores per player, and attach it as scoreStd. Predictors consume this when available, falling back to the global base only for players with fewer than 8 historical rounds. This refines matchup edge, cut-line tightness, and Sharpe-adjusted edge rankings.

(u) Per-course SG category weights (learned)

Different courses reward different SG categories: Augusta loads heavily onto SG: Approach, Pebble onto Putting, Whistling Straits onto Driving. The hand-curated trait dictionaries encode this as priors, but we can do better: we regress historical finish position onto the four SG categories at each course (5 years × 5 events per course, ridge-regularized least squares with λ=0.5) to get per-course coefficients {ott, app, arg, putt}.

The coefficients are negative when better SG predicts a better finish (which is what we expect); the magnitudes show which SG categories matter most at the venue. They're used to refine the effective-skill computation: a player strong in SG: Approach gets an upward skill bump at Augusta beyond what their overall sgTotal would suggest. Applied via effective_sg(player, course_weights); capped at ±0.5 strokes/round to prevent runaway overlay from noisy small-sample fits. Output surfaced at courseSgWeights[course_key] with r2 for transparency.
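A dependency-free sketch of the ridge fit and the cap. It assumes the SG columns and finish positions are already centered (no intercept term), and it does not attempt to reproduce how the coefficients are turned into the skill bump inside effective_sg. Function names are hypothetical.

```python
def ridge_fit(X, y, lam=0.5):
    """Solve (X^T X + lam * I) beta = X^T y by Gaussian elimination.

    X: rows of [ott, app, arg, putt]; y: finish positions (centered).
    """
    p = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for col in range(p):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    beta = [0.0] * p
    for i in reversed(range(p)):  # back substitution
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, p))
        beta[i] = (b[i] - tail) / A[i][i]
    return beta

def cap_overlay(bump, cap=0.5):
    """Clamp the per-round skill overlay to +/- cap strokes."""
    return max(-cap, min(cap, bump))
```

A more negative coefficient means that category does more to improve finish position at that course; the ridge penalty shrinks every coefficient toward zero, which matters most for small-sample venues.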

(p) Closing-line value proxy

For every player with meaningful model edge (|edge| ≥ 2), we look back 24 hours in odds_history.json (already snapshotted at every cron run) and compute whether the implied probability moved in the model's predicted direction. The aggregate "% CLV-positive picks" is the sharp's metric for predictive skill — stronger signal than win/loss because line movement is incrementally repriced by sharp money, not by random outcomes. Sharp models target >55%. Exposed at clvSummary and per-player clvLineMoveBp + clvPositive. Zero new API calls — entirely uses existing snapshots.
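A sketch of the per-pick flag and the aggregate. It assumes edge is expressed as model minus book probability (so a positive edge predicts the implied probability will rise), that the threshold of 2 is in the same units as the displayed edge, and that the function names are illustrative.

```python
def line_move_bp(implied_then, implied_now):
    """Change in implied probability over the lookback, in basis points."""
    return (implied_now - implied_then) * 10_000

def clv_positive(model_edge, implied_then, implied_now, min_edge=2.0):
    """True/False if the line moved with/against the model, None if edge is small."""
    if abs(model_edge) < min_edge:
        return None
    move = implied_now - implied_then
    return move > 0 if model_edge > 0 else move < 0

def clv_summary(picks):
    """Share of qualifying picks where the line moved in the model's direction.

    picks: iterable of (model_edge, implied_then, implied_now) tuples.
    """
    flags = [clv_positive(*pick) for pick in picks]
    flags = [f for f in flags if f is not None]
    return sum(flags) / len(flags) if flags else None
```

The summary value is the figure to compare against the >55% target mentioned above.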

(q) Player similarity (kNN on SG profiles)

For each player, the top-5 most similar peers are computed by Euclidean distance on the z-score-normalized vector of (sgOtt, sgApp, sgArg, sgPutt, drivingDistance, scoringAvg). Useful for rookies/first-timers who lack course history — their similar peers may have. Attached to each player as similarPlayers: [{name, distance}, ...].
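A minimal sketch of the lookup, assuming each player is a dict keyed by the six feature names listed above. Using the population standard deviation for the z-scores is an assumption.

```python
import math
import statistics

FEATURES = ["sgOtt", "sgApp", "sgArg", "sgPutt", "drivingDistance", "scoringAvg"]

def similar_players(players, target, k=5):
    """Return the k nearest peers by Euclidean distance on z-scored features.

    players: dict mapping name -> dict of feature values.
    """
    names = list(players)
    z = {name: [] for name in names}
    for feature in FEATURES:
        values = [players[name][feature] for name in names]
        mu = statistics.mean(values)
        sd = statistics.pstdev(values) or 1.0  # guard a constant feature
        for name in names:
            z[name].append((players[name][feature] - mu) / sd)
    peers = [
        {"name": name, "distance": math.dist(z[target], z[name])}
        for name in names if name != target
    ]
    return sorted(peers, key=lambda peer: peer["distance"])[:k]
```

Z-scoring first keeps driving distance (measured in yards) from dominating the SG columns (measured in strokes).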

(r) Sharpe-style edge sorting

The scorecard view's leaderboard adds an "Edge ÷ σ" sort that divides each player's de-vigged edge by the outcome std √(p·(1-p)), boosted by posterior skill uncertainty. Surfaces confident edges over noisy ones — a small edge on a tight prediction can be better than a larger edge on a coin-flip. Pure frontend computation against the model probabilities already in the JSON.
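The core of the ratio is shown below in Python for consistency with the other sketches; the site computes it in the frontend. The additional adjustment for posterior skill uncertainty is not specified in the text, so it is left out here.

```python
import math

def edge_over_sigma(edge, p):
    """De-vigged edge divided by the Bernoulli outcome std sqrt(p * (1 - p))."""
    sigma = math.sqrt(p * (1.0 - p))
    return edge / sigma if sigma > 0 else 0.0

def sort_by_edge_over_sigma(rows):
    """Sort (name, edge, model_prob) rows from best to worst ratio."""
    return sorted(rows, key=lambda r: edge_over_sigma(r[1], r[2]), reverse=True)
```

For example, a 3-point edge at p = 0.05 (ratio about 0.14) ranks above a 4-point edge at p = 0.50 (ratio 0.08).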

Tunable parameters

These are the actual values driving the current model, fetched live from /model_params.json. They are overwritten weekly by the backtest job when calibration drifts.

Performance (last backtest)

Live backtest output from /backtest-report.json. Runs every Monday at 2AM ET against all completed tournament weeks in history. Brier score below 0.25 means the model beats the 50/50 coin-flip benchmark on matchup markets.

What we don't do (honest accounting)

No shot-level data — we work with round-level strokes gained, not per-shot tracking. Shot dispersion, ball flight, and Trackman data are outside our scope.
No sub-60-second live scoring granularity — ESPN leaderboard pulls every 60s; we don't model intra-hole swings.
No DFS ownership projections — this is a betting model, not a DFS tool.
Top-10 and Top-20 markets: model side is now real, book side is still estimated. The BDL /futures endpoint only ships book lines for tournament_winner, top_5_finish, and make_cut. We now derive real Monte Carlo probabilities for top-10 and top-20 finish positions (exposed as modelTop10Prob / modelTop20Prob) but the comparison odds are our own estimate, so we still don't display an edge percentage for these markets.
Multi-hole and single-hole props are now modeled. The Gaussian copula (see method (o)) produces a true joint distribution over per-hole scores, so we can price front-9 over/unders, back-9 over/unders, and single-hole markets like "Hole 12 over 3.5" — all surfaced in pricedPlayerProps when BDL ships matching lines. The remaining gap is BDL-side: /odds/player_props rarely ships these markets for golf today; we'll consume them when they appear.
Field-strength normalization is applied to historical residuals, not raw season SG. Each historical event's residual is scaled by an OWGR-derived strength multiplier before being averaged into the learned course-fit signal. The season-aggregate sgTotal coming from DataGolf is consumed as-is — DataGolf claims to already strength-adjust their season totals; we trust that for now.
No injury/WD prediction — we react to field-change news, we don't forecast it.
Course fit prior is hand-curated — the trait dictionaries (power/accuracy/scramble/putting weights per course) are designed, not learned. The learned course-fit residual mitigates this by anchoring fit in historical outcomes for players with prior appearances, but rookies still lean on the trait prior.
Only DraftKings and FanDuel for outright lines — these are the only PGA vendors in BDL. We supplement with The Odds API for matchups (BetMGM, Bovada, BetRivers) but the de-vig overround is computed from the DK/FD field.
Backtest has small sample — out-of-sample test set is roughly 30-50 matchup pairs depending on history depth. The OOS Brier currently beats the uniform baseline by ~20% but the confidence interval is wide. We re-run the backtest weekly and only adjust model parameters when calibration drift is confirmed across two consecutive runs.

Update cadence

Scraper (weekly-scrape.yml)

Mon 1AM ET — weekly reset, next tournament detection, field + early odds
Tue 1AM ET — BallDontLie catch-up, previous tournament data settled
Wed (5 windows) — 5AM, 11AM, 2PM, 7PM, 10PM ET (opening lines → sharp action)
Thu–Sun, 7AM–7PM ET hourly — tournament-week live updates (odds, confidence, props, analytics)

Backtest (weekly-backtest.yml)

Mon 2AM ET — re-fit tunable parameters against the full history, overwrite model_params.json if calibration drifted