{"baseline_gate":{"baseline":"predict_yesterday (forecast[cc,t] = label[cc,t-1])","comparison":{"v1_lift_over_predict_yesterday_pp":-79.0438,"v2_lift_over_predict_yesterday_pp":-51.8691,"v2_minus_v1_lift_pp":27.1747},"gates":{"A_lift_ge_threshold":false,"B_no_temporal_regression":true,"C_legacy_delta":0.20993532159540074,"C_no_catastrophic_legacy_regression":true},"generated_at":"2026-05-22T12:09:34.919368+00:00","legacy_regression_floor":-0.1,"lift_gate_pp":8,"promote":false,"reason":"HONEST NEGATIVE: v2 trails predict-yesterday by 51.87pp F1 (lift -51.87pp; gate requires >= 8.0pp). After real momentum + acceleration + volatility + event-anticipation + cross-country leading-indicator feature engineering, the 7-day censorship target remains persistence-dominated: what is blocked stays blocked. The forecast's value is calibration + explanation, not lift.","schema":"voidly-forecast-v2-momentum-baseline-gate/v1","v1":{"stratified":{"auc":0.9546571995724539,"brier":0.024374562088325717,"f1":0.6413043478260869,"n":2303,"pos_rate":0.052105948762483714,"precision":0.921875,"recall":0.49166666666666664,"threshold":0.42000000000000004},"temporal_holdout":{"holdout_date_range":["2026-03-23","2026-05-21"],"holdout_days":60,"lift_auc_pp":-36.7633,"lift_f1_pp":-79.0438,"model":{"auc":0.58900920694706,"brier":0.14816522666227108,"f1":0.13186813186813187,"n":1260,"pos_rate":0.15714285714285714,"precision":0.24,"recall":0.09090909090909091,"threshold":0.05},"n_holdout":1260,"n_pos_holdout":198,"predict_yesterday":{"auc":0.9566426981681219,"brier":0.024603174603174603,"f1":0.9223057644110275,"n":1260,"pos_rate":0.15714285714285714,"precision":0.9154228855721394,"recall":0.9292929292929293,"threshold":0.05},"tag":"v1","transition_only":{"model":{"auc":0.32352941176470584,"brier":0.5086887620017835,"f1":0,"n":31,"pos_rate":0.45161290322580644,"precision":0,"recall":0,"threshold":0.5},"model_auc_on_transitions":0.32352941176470584,"n_transition_rows":31,"note":"Persistence is wrong on every 
transition row by definition. Model AUC on transition rows is the honest test of whether the forecast has skill where it actually matters.","persistence":{"auc":0,"brier":1,"f1":0,"n":31,"pos_rate":0.45161290322580644,"precision":0,"recall":0,"threshold":0.5},"transition_pos_rate":0.45161290322580644}}},"v2":{"loco_auc_median":0.7105554307770374,"loco_f1_mean":0.278832902542672,"loco_f1_median":0.3294573643410853,"stratified":{"auc":0.9893189799969462,"brier":0.01243044068264709,"f1":0.8512396694214877,"n":2303,"pos_rate":0.052105948762483714,"precision":0.8442622950819673,"recall":0.8583333333333333,"threshold":0.23000000000000004},"temporal_holdout":{"holdout_date_range":["2026-03-24","2026-05-22"],"holdout_days":60,"lift_auc_pp":-27.1569,"lift_f1_pp":-51.8691,"model":{"auc":0.6850734273050656,"brier":0.13645653541121952,"f1":0.4036144578313253,"n":1260,"pos_rate":0.15714285714285714,"precision":0.5,"recall":0.3383838383838384,"threshold":0.05},"n_holdout":1260,"n_pos_holdout":198,"predict_yesterday":{"auc":0.9566426981681219,"brier":0.024603174603174603,"f1":0.9223057644110275,"n":1260,"pos_rate":0.15714285714285714,"precision":0.9154228855721394,"recall":0.9292929292929293,"threshold":0.05},"tag":"v2_momentum","transition_only":{"model":{"auc":0.3277310924369748,"brier":0.51263981511153,"f1":0.10526315789473684,"n":31,"pos_rate":0.45161290322580644,"precision":0.2,"recall":0.07142857142857142,"threshold":0.25000000000000006},"model_auc_on_transitions":0.3277310924369748,"n_transition_rows":31,"note":"Persistence is wrong on every transition row by definition. 
Model AUC on transition rows is the honest test of whether the forecast has skill where it actually matters.","persistence":{"auc":0,"brier":1,"f1":0,"n":31,"pos_rate":0.45161290322580644,"precision":0,"recall":0,"threshold":0.5},"transition_pos_rate":0.45161290322580644}}}},"evaluation_rule":"forward-temporal split ONLY (train on past, test on strictly-future 60-day window); compared against a persistence baseline (predict tomorrow = today's label). Never a shuffled split.","finding_url":"/atlas/findings/forecast-v2-momentum-vs-persistence-2026-05","honest_caveats":["The production v1 'AUC 0.954' comes from a SHUFFLED train_test_split that leaks the target's day-to-day autocorrelation. Under a forward-temporal split v1's real AUC is ~0.59.","target_7day is a sliding 7-day window — adjacent days share 6 of 7 lookahead days, so the label is ~98.9% autocorrelated day-to-day. That makes persistence a ~0.92 F1 baseline BY CONSTRUCTION; it is not evidence forecasting is solved.","On the transition rows where the label actually moves, v2's AUC is below 0.5 — the forecast has no skill on the days that matter (shutdown onset / block lift).","v2's value, like v1's, is calibration + explanation (SHAP drivers, conformal intervals), not predictive lift."],"model":"forecast-v2-momentum","production_model":"forecast v1 (unchanged)","status":"NOT PROMOTED — honest negative result","summary":{"feature_family_importance":{"contagion_chain":0.06350120529532433,"cross_country_leading":0.025261081755161285,"event_anticipation":0.15009485743939877,"v1_base":0.274764571338892,"volatility":0.022364437580108643},"generated_at":"2026-05-22T12:09:34.929392+00:00","loco_auc_median":0.7105554307770374,"loco_f1_mean":0.278832902542672,"loco_f1_median":0.3294573643410853,"model_version":"v2_momentum","n_features":69,"n_new_features":30,"promote":false,"reason":"HONEST NEGATIVE: v2 trails predict-yesterday by 51.87pp F1 (lift -51.87pp; gate requires >= 8.0pp). 
After real momentum + acceleration + volatility + event-anticipation + cross-country leading-indicator feature engineering, the 7-day censorship target remains persistence-dominated: what is blocked stays blocked. The forecast's value is calibration + explanation, not lift.","schema":"voidly-forecast-v2-momentum-summary/v1","stratified_auc":0.9893189799969462,"stratified_f1":0.8512396694214877,"temporal_holdout_auc":0.6850734273050656,"temporal_holdout_f1":0.4036144578313253,"v1_lift_over_predict_yesterday_pp":-79.0438,"v2_lift_over_predict_yesterday_pp":-51.8691},"onset_skill":{"schema":"voidly-forecast-onset-skill/v1","model":"forecast-v1 (XGBoost + isotonic) — production 7-day shutdown forecast","headline":"The 7-day forecast is a current-regime risk signal, NOT a shutdown-onset predictor.","plain_english":"The forecast reliably reflects 'this country is currently in a censored regime', but has essentially zero skill at predicting a NEW shutdown before it happens.","label":{"name":"target_7day","construction":"7-day sliding window — 1 if any censorship/mixed incident in [T+1, T+7]","autocorrelation_day_to_day":0.989,"note":"Adjacent days share 6 of 7 look-ahead days, so the label barely moves day-to-day. High AUC on this label rewards persistence, not forecasting skill."},"metrics":{"v1_stratified_auc_leaky":0.954,"v1_stratified_auc_note":"Reported by /v1/forecast/model/info. Comes from a shuffled train_test_split that scatters adjacent autocorrelated days across folds — inflated, not deployment-representative.","v1_forward_temporal_auc":0.589,"v1_forward_temporal_f1":0.132,"v1_forward_temporal_note":"Honest split: train on the past, test only on the strictly-future 60-day window. Near chance.","predict_yesterday_baseline_auc":0.957,"predict_yesterday_baseline_f1":0.922,"predict_yesterday_baseline_note":"Trivial baseline forecast[t] = label[t-1]. 
Beats the model because the sliding window overlaps by construction — high score here is an autocorrelation artifact, not skill.","transition_row_auc":0.328,"transition_row_f1":0.105,"transition_row_note":"Restricted to rows where the label actually changes (a shutdown begins or ends). AUC below 0.5 means the model is worse than a coin flip on the days that matter."},"correct_use":"Use the forecast to see which countries are entrenched in a sustained blocking regime today, and to read SHAP drivers + the 90% conformal interval. Do NOT treat it as an early-warning siren for a new shutdown.","variants_share_this":"The multi-horizon (1d/7d/30d) and per-region forecasts use the same target_Nday sliding-window label and inherit the same autocorrelation inflation. The hourly forecast uses a proper temporal holdout but its sliding K-hour label is still autocorrelated.","finding_url":"/atlas/findings/forecast-v2-momentum-vs-persistence-2026-05"},"_voidly_pay":{"claim_did":"https://voidly.ai/pay/claim","for_builders":"https://voidly.ai/pay/for-builders","marketplace":"https://api.voidly.ai/v1/pay/marketplace","live_demo":"https://huggingface.co/spaces/emperor-mew/voidly-pay","universal_proxy":{"url":"https://api.voidly.ai/v1/pay/proxy","example":"https://api.voidly.ai/v1/pay/proxy?u=<https-url>&to=did:voidly:<your-did>&price=0.01","page":"https://voidly.ai/pay/proxy","note":"Paywall any public HTTPS URL with one query param. No SDK install required."},"install":{"typescript":"npm install @voidly/pay","python":"pip install voidly-pay","langchain":"pip install voidly-pay-langchain","vercel_ai":"npm install @voidly/pay-vercel-ai","mcp":"npx @voidly/pay-mcp","cli":"npm install -g @voidly/pay-cli"},"note":"Voidly Pay is the open agent-to-agent payment rail. Claim a DID + 10 starter credits in 60s, or paywall any URL via the universal proxy with one query param."}}