Calibration governance tooling + WBS-7.1/7.2 evidence-gap visibility

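Adds a threshold-calibration governance pipeline that runs calibration backlog → prioritization → review report → approval list → decision draft, and resolves at the tool level the two stale-number problems found in the 2026-06-21 critical review.

- registry_health(): aggregates the distribution of the roughly 190 thresholds by source (SPEC_DERIVED/EXPERT_PRIOR/PROVISIONAL/CALIBRATED) automatically on every run, so no manual grep is needed
- live_t5_status(): reads the T+5 match rate from Temp/prediction_accuracy_harness_v2.json so it is always the latest value, instead of using a hardcoded literal (35.86)
- spec/calibration_registry.yaml: fixes a silent-ignore bug caused by the duplicate id SEMI_CLUSTER_CAP_RISK_OFF (one entry split off as SEMI_CLUSTER_CAP_RISK_OFF_MWA)
- spec/27_bch_calibration_runbook.yaml: adds a current_status_2026_06_21 block (single source of truth); the existing 05-30 snapshot is marked "historical, do not cite as current"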
2026-06-21 20:07:32 +09:00
parent f99f9821d2
commit ee4d1fdab8
8 changed files with 855 additions and 42 deletions
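The two registry problems named in the commit message (manual source counting and the duplicate id that dict-based lookup silently dropped) can be illustrated with a minimal Python sketch. This is not the code in tools/validate_calibration_registry_v1.py: the function names `source_distribution` and `duplicate_ids` are hypothetical, and the sample entries are made up apart from the ids, values and owner formulas quoted in this commit.

```python
from collections import Counter

# Hypothetical sample entries; only the field names, the two
# SEMI_CLUSTER_CAP_RISK_OFF values and the owner formulas come from the commit.
thresholds = [
    {"id": "SEMI_CLUSTER_CAP_RISK_OFF", "value": 20.0, "source": "EXPERT_PRIOR",
     "owner_formula": "SEMICONDUCTOR_CLUSTER_GATE_V1"},
    {"id": "SEMI_CLUSTER_CAP_RISK_OFF", "value": 25.0, "source": "EXPERT_PRIOR",
     "owner_formula": "MARKET_WEIGHT_AWARE_CLUSTER_GATE_V1"},
    {"id": "OVERHANG_PRESSURE_V1_FALLBACK_MULT", "value": 1.5,
     "source": "EXPERT_PRIOR", "owner_formula": "OVERHANG_PRESSURE_V1"},
]


def source_distribution(entries):
    """Count entries per `source` label (entries without one count as UNKNOWN)."""
    return Counter(entry.get("source", "UNKNOWN") for entry in entries)


def duplicate_ids(entries):
    """Return, sorted, every id that appears on more than one entry."""
    counts = Counter(entry["id"] for entry in entries)
    return sorted(entry_id for entry_id, n in counts.items() if n > 1)


# A dict keyed by id keeps only the last entry for a repeated id, which is
# how one of the two colliding thresholds can vanish without any error.
by_id = {entry["id"]: entry for entry in thresholds}

print(source_distribution(thresholds))
print(duplicate_ids(thresholds))
print(len(thresholds), "entries in the list,", len(by_id), "entries after dict lookup")
```

Which of the two colliding entries survives depends on list order; in this sketch the later one wins, which may not match the order in the real registry. Running a duplicate check before building any id-keyed dict turns the silent drop into an explicit failure.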
@@ -451,7 +451,30 @@ reject_conditions:
  - "treating a threshold with sample_n < 30 as 'calibration complete'"

 # ════════════════════════════════════════════════════════════════════════════
-# Current achievement status (2026-05-30)
+# Current achievement status (re-verified 2026-06-21, WBS-7.2)
+# ════════════════════════════════════════════════════════════════════════════
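+# Note: the current_status_2026_05_30 block below is a static snapshot as of that date.
+# It was never updated afterwards, yet docs/ROADMAP_WBS.md and others cited it as the
+# "current state", which mixed T+5 figures from different points in time (54.76%/35.86%).
+# Temp/honest_performance_guard_v1.json (generated 2026-06-14) and
+# Temp/prediction_accuracy_harness_v2.json (generated 2026-06-21, 7 days newer)
+# were re-checked directly, with the results below. Treat this block as the single source of truth.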
+current_status_2026_06_21:
+  source_of_truth: "Temp/prediction_accuracy_harness_v2.json (as_of_date=2026-06-21, most recent)"
+  t1_match_rate_pct: 52.94   # sample=68, decisive_sample=53, rate_decisive=67.92
+  t5_match_rate_pct: null    # sample=0: INSUFFICIENT_SAMPLES. The 35.86% in honest_performance_guard_v1.json (2026-06-14)
+                             # is a 7-day-old snapshot and is no longer valid now that the sample has dropped to 0.
+  t5_sample_regression_note: >
+    cases_analyzed fell from 141 (as of 2026-05-30) to t5_sample=0 (2026-06-21).
+    evaluation_methodology changed to ACTIVE_PASSIVE_SPLIT_V1_INCONCLUSIVE_EXCLUDED, which
+    presumably excluded the inconclusive/replay samples; the root cause needs a separate investigation (remaining WBS-7.2 item).
+  calibration_registry_total_thresholds: 190  # counted directly from spec/calibration_registry.yaml (the 70 in the old document is stale)
+  calibration_registry_expert_prior_count: 59
+  calibration_registry_calibrated_count: 0
+  rule: "Whenever this document is cited, always state the as_of_date alongside it, and never cite the 5/30 snapshot below as 'current'."
+
+# ════════════════════════════════════════════════════════════════════════════
+# Past achievement status (2026-05-30, historical snapshot; do not cite as "current")
 # ════════════════════════════════════════════════════════════════════════════
 current_status_2026_05_30:
  phase_1_bch: COMPLETE
@@ -489,3 +512,17 @@ current_status_2026_05_30:
  cases_analyzed: 141
  miss5_count: 51
  next_milestone: "review ALEG_V2_GATE1_BLOCK_PCT calibration once cases_analyzed=30 is reached"
+  automation_entrypoints:
+    gitea_schedule: ".gitea/workflows/calibration_backlog.yml"
+    npm_script: "npm run ops:calibration-backlog"
+    generated_artifacts:
+      - Temp/calibration_priority_v1.json
+      - Temp/calibration_change_ledger_v4.json
+      - Temp/calibration_review_report_v1.json
+      - Temp/calibration_review_report_v1.md
+      - Temp/calibration_approval_list_v1.json
+      - Temp/calibration_approval_list_v1.md
+      - Temp/calibration_registry_v1.json
+    promotion_rules:
+      provisional: "sample_n >= 10 AND direction confirmed AND change_ledger entry exists"
+      calibrated: "sample_n >= 30 AND backtest_doc exists AND validator overclaimed_count == 0"
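The two promotion_rules strings above read as simple conjunctions, so they can be sketched as a small Python check. This is a minimal illustration under stated assumptions, not the validator's actual logic: the `Evidence` dataclass, its field names and the `eligible_source` function are hypothetical, and the sketch assumes each rule is evaluated independently exactly as written, with the higher tier checked first.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Evidence:
    """Hypothetical bundle of the facts the two promotion rules refer to."""
    sample_n: int
    direction_confirmed: bool = False
    has_change_ledger_entry: bool = False
    has_backtest_doc: bool = False
    overclaimed_count: int = 0


def eligible_source(ev: Evidence) -> Optional[str]:
    """Return the highest source label the evidence qualifies for, else None."""
    # calibrated: sample_n >= 30 AND backtest_doc exists AND overclaimed_count == 0
    if ev.sample_n >= 30 and ev.has_backtest_doc and ev.overclaimed_count == 0:
        return "CALIBRATED"
    # provisional: sample_n >= 10 AND direction confirmed AND change_ledger entry exists
    if ev.sample_n >= 10 and ev.direction_confirmed and ev.has_change_ledger_entry:
        return "PROVISIONAL"
    return None


# A threshold with no live samples, such as the EXPERT_PRIOR entries in this
# commit, qualifies for neither promotion.
print(eligible_source(Evidence(sample_n=0)))
print(eligible_source(Evidence(sample_n=12, direction_confirmed=True,
                               has_change_ledger_entry=True)))
```

Returning `None` rather than a default label keeps "not eligible for promotion" distinct from any existing source value.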
@@ -1,3 +1,7 @@
+has_code_implementation: true
+code_path:
+  - "tools/build_calibration_priority_v1.py"
+  - "tools/validate_calibration_registry_v1.py"
 thresholds:
 - id: ALEG_V2_GATE1_BLOCK_PCT
  value: 3.0
@@ -913,7 +917,7 @@ thresholds:
  notes: 'Event-shock defense: fixed at 20%. When a KOSPI weight is provided, max(20, weight×0.60).'
  live_sample_requirement: 30
  sunset_date: '2026-09-30'
- id: SEMI_CLUSTER_CAP_RISK_OFF
+- id: SEMI_CLUSTER_CAP_RISK_OFF_MWA
  value: 25.0
  unit: pct
  source: EXPERT_PRIOR
@@ -921,7 +925,12 @@ thresholds:
  last_calibrated: null
  owner_formula: MARKET_WEIGHT_AWARE_CLUSTER_GATE_V1
  gs_location: gas_data_feed.gs:3858
-  notes: 'Down market: 25%. When a KOSPI weight is provided, max(25, weight×0.80).'
+  notes: >
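+    Down market: 25%. When a KOSPI weight is provided, max(25, weight×0.80).
+    WBS-7.1 (2026-06-21): the id was originally SEMI_CLUSTER_CAP_RISK_OFF, but it
+    collided with the same-named entry (value=20.0) owned by SEMICONDUCTOR_CLUSTER_GATE_V1, so
+    dict-based lookups silently ignored one of the two. After confirming zero external references,
+    only this entry (owned by MARKET_WEIGHT_AWARE_CLUSTER_GATE_V1) was split off with the _MWA suffix.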
  live_sample_requirement: 30
  sunset_date: '2026-09-30'
 - id: SEMI_CLUSTER_CAP_NEUTRAL
@@ -1803,6 +1812,22 @@ thresholds:
  gs_location: gas_data_feed.gs:2164
  notes: Base take-profit score used in profit-lock computation. Migrated from GAS SP constant to registry (P5-T01 wave2).

+- id: OVERHANG_PRESSURE_V1_FALLBACK_MULT
+  value: 1.5
+  unit: multiplier_of_avg_volume_5d
+  source: EXPERT_PRIOR
+  sample_n: 0
+  last_calibrated: null
+  owner_formula: OVERHANG_PRESSURE_V1
+  py_location: spec/13_formula_registry.yaml:OVERHANG_PRESSURE_V1.derived_flags.selling_acceleration.without_20d_fallback
+  notes: >
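+    WBS-7.5 (2026-06-21): when frg_20d_sh is absent, the selling_acceleration fallback changes
+    from "frg_5d_sh < -500000" (absolute share count, provisional) to "frg_5d_sh < -1.5 * avg_volume_5d"
+    (proportional to the stock's average volume). The 1.5 multiplier reuses as-is the acceleration
+    threshold (frg_20d_sh/4 × 1.5) that the same formula already applies in the with_20d branch;
+    it is not a newly estimated value. Because it has not been validated on live-trading samples, it is
+    registered as EXPERT_PRIOR; promotion to CALIBRATED will be reviewed once sample_n≥30 is secured.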
+
 calibration_policy:
  honest_disclosure_required: true
  overclaimed_calibration_definition: 'source=CALIBRATED with sample_n < 30 → OVERCLAIMED_CALIBRATION.