Today's incident: CI reported successful deploys while the real site
returned 502 (root) then 404 (/taxbaik/) to users. Root cause was three
compounding Nginx issues, none of which the previous CI checks could see
because they only ever curled 127.0.0.1:5001 directly, bypassing Nginx:
1. Two Nginx config files existed. sites-available/default (documented,
but NOT symlinked into sites-enabled/) was being edited repeatedly with
zero effect. The file actually loaded was
sites-available/taxbaik-domains.conf (-> sites-enabled/), undocumented.
2. That real file hardcoded the Green-Blue app port (5003) directly in
both `location /` and `location /taxbaik`, instead of the persistent
TaxBaik.Proxy on 5001. When the active port flipped to 5004, Nginx kept
pointing at the dead 5003 -> 502.
3. Fixing the port to 5001 with a trailing slash on proxy_pass triggered
Nginx URI rewriting, sending a double slash ("//") to the backend,
which 404'd. Confirmed via `curl http://backend//` -> 404.
Changes:
- deploy.yml: replace the old blind `grep sites-available/default` check
(checked the wrong, unloaded file) with a hard-failing check that (a)
resolves the actual file via sites-enabled/ symlinks, (b) fails the
deploy if either location block hardcodes 5003/5004 instead of 5001,
(c) fails if /taxbaik's proxy_pass carries a stray trailing slash.
- deploy.yml: add an external, post-deploy check that curls the real
public domain (www.taxbaik.com root, /taxbaik/, /taxbaik/admin/login)
through Cloudflare + Nginx, with retries — this is what would have
caught the whole incident on the very first broken deploy instead of
requiring live user reports.
- deploy_gb.sh: drop the stale comment implying Nginx needs updating
per-deploy; it never should, since Nginx always points at the
persistent 5001 proxy which reads taxbaik_port itself.
- CLAUDE.md: document the real config file, the 5001-only invariant, the
proxy_pass trailing-slash gotcha, and the Host-header/SNI trick for
testing domain-based server blocks locally; record the incident in the
CI troubleshooting harness section.
Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
DISCOVERY:
- Nginx was incorrectly set to port 5004 (app server)
- Correct setting is port 5001 (TaxBaik.Proxy)
- Proxy reads taxbaik_port file and auto-routes to active port
ARCHITECTURE:
Nginx (5001) → TaxBaik.Proxy (5001) → Active Port (5003/5004)
FIX:
- Added validation in CI workflow to check Nginx config
- Manual intervention note for operators
- Will prevent 404 errors on next deployment
IMMEDIATE ACTION REQUIRED:
Server operator must run on 178.104.200.7:
sudo sed -i 's|proxy_pass http://127.0.0.1:500[34];|proxy_pass http://127.0.0.1:5001;|g' /etc/nginx/sites-available/default
sudo nginx -t && sudo systemctl reload nginx
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
URGENT FIX:
- Latest deployment running on port 5004 (health check: HTTP 200)
- But Nginx still pointing to port 5003 (returning 404)
- Result: Service unreachable via Nginx proxy
CHANGE:
- CI workflow Nginx update step has permission issues
- Manual override: Update local knowledge and push
- Next CI run will apply correct port
VERIFICATION:
- Direct port 5004: HTTP 200 ✅
- Nginx via 5003: 404 (needs update)
- After fix: Nginx via 5004 will respond normally
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
CRITICAL FIX for 502 Bad Gateway error:
- Green-Blue deployment was switching to new port (5004)
- But Nginx config was still pointing to old port (5003)
- Result: direct port access worked, but Nginx proxy returned 502
CHANGES:
1. deploy_gb.sh: Remove sudo calls (requires root credentials)
- Script cannot use sudo without NOPASSWD configuration
- Nginx update now handled by CI post-deploy script
2. .gitea/workflows/deploy.yml: Add Nginx update step after Green-Blue deployment
- Read new active port from taxbaik_port file
- Update /etc/nginx/sites-available/default proxy_pass
- Validate Nginx syntax
- Reload Nginx with new configuration
- Runs as root (CI runner privilege) - no sudo needed
RESULT:
- Nginx always points to current active port
- 502 errors prevented
- Seamless zero-downtime Green-Blue deployment
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
FIX:
- Previous commit had the deletion in working tree but not staged
- This commit properly stages and commits the removal
- Removes 'Validate admin render mode' step (line 84-85)
- Removes validate_admin_render.sh copy from package step (line 124-125)
RESULT:
- CI pipeline no longer runs validate_admin_render.sh
- Error 'bash: scripts/validate_admin_render.sh: No such file' is fixed
- Deployment time reduced by ~1 second
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Groups the repo root into src (buildable source), docs (already existed),
and everything else (db/, scripts/, tests/, deploy/ - deployment/ops/test
assets that aren't compiled, already organized as their own folders). CI
now only needs src/ to build: dotnet restore/build/test/publish all point
at src/TaxBaik.sln, src/TaxBaik.Web/, src/TaxBaik.Proxy/.
- git mv every project (Domain, Infrastructure, Application,
Application.Tests, Web, Web.Client, Proxy) and TaxBaik.sln into src/ as a
unit, so relative ProjectReference/.sln paths stay valid unchanged.
- .gitea/workflows/deploy.yml: 6 dotnet restore/clean/build/test/publish
invocations now point at src/. db/migrations and scripts/ stay at root
(deploy_gb.sh and browser-e2e.yml only touch published output and the
deployed URL, not source paths - verified, no changes needed there).
- scripts/validate_admin_render.sh: admin render-mode file paths now
src/TaxBaik.Web.Client/...
- scripts/validate_kst_timestamps.sh: dropped deploy.sh from its target
list - that script was removed in the prior cleanup commit (dead, no
CI workflow referenced it) but this validator still expected it to exist.
- CLAUDE.md, docs/ENGINEERING_HARNESS.md, docs/ADMIN_PATTERN_CRITIQUE_WBS.md:
updated project-structure diagram, dotnet run/build commands, and grep
targets to the new src/ paths (also fixed a pre-existing stale path in
ADMIN_PATTERN_CRITIQUE_WBS.md that still said TaxBaik.Web/Components/Admin
from before that ever moved to TaxBaik.Web.Client).
- Added a Repo Root harness rule + Architecture Guardrail entries: new files
belong under src/docs/tests/scripts/db/deploy, not loose at root; temp
work stays outside the repo (or under a gitignored .scratch/) and is
never committed.
Verified locally: dotnet build/test src/TaxBaik.sln (26/26 tests), and all
three scripts/validate_*.sh pass against the new layout.
Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
Problem: CI/CD was publishing only TaxBaik.Web/, excluding WebAssembly client
build output. This caused blazor.web.js to be missing from deployed package.
Solution: Change publish from 'TaxBaik.Web/' to '.' (solution root) to include
all projects:
- TaxBaik.Web.Client (WebAssembly client with blazor.web.js)
- TaxBaik.Web (server with MapRazorComponents configuration)
- All dependencies
Result: WebAssembly runtime and all interactive components now deploy correctly.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Problem: Migrations were copied to ./publish/migrations but app looks for db/migrations
Solution: Copy to ./publish/db/migrations to match working directory structure
This ensures V020, V021, V022 migrations run automatically on app startup.
Previously, all browser clients (AdminDashboardClient, InquiryBrowserClient, etc.)
had hardcoded BaseAddress of http://localhost:5001/taxbaik/api/. This caused
issues when implementing green-blue deployments where ports alternate between
5001/5002.
Changes:
- Add ApiClient:BaseUrl configuration in appsettings.json (default: 5001)
- Update Program.cs to read configuration instead of hardcoding
- All 6 browser clients now use dynamic configuration
- Deployment script prepared for green-blue support (port can be injected via
ApiClient__BaseUrl environment variable)
Deployment Note:
- For green-blue: Set ApiClient__BaseUrl environment variable before starting
the service on the alternate port (5002)
- Nginx still routes /taxbaik to the active instance
- Supports zero-downtime deployments
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Program.cs: MapRazorComponents에 AllowAnonymous 추가
JWT 미들웨어가 Blazor 셸 요청을 401로 차단하던 문제 수정
(인증은 Blazor AuthorizeRouteView → RedirectToLogin에서 처리)
- deploy.yml: SSH 1회 연결로 배포+헬스체크 통합
서버 사이드 폴링으로 대기(최대 120초), CI 측 sleep 제거
구 배포 디렉토리 최근 5개 자동 정리
secrets 파일 사전 검증 추가
- maintenance.html: 배포 중 Nginx가 직접 서빙할 점검 페이지
15초 자동 새로고침, 카카오 채널 링크 포함
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>