Today's incident: CI reported successful deploys while the real site
returned 502 (root) then 404 (/taxbaik/) to users. Root cause was three
compounding Nginx issues, none of which the previous CI checks could see
because they only ever curled 127.0.0.1:5001 directly, bypassing Nginx:
1. Two Nginx config files existed. sites-available/default (documented,
but NOT symlinked into sites-enabled/) was being edited repeatedly with
zero effect. The file actually loaded was
sites-available/taxbaik-domains.conf (-> sites-enabled/), undocumented.
2. That real file hardcoded the Green-Blue app port (5003) directly in
both `location /` and `location /taxbaik`, instead of the persistent
TaxBaik.Proxy on 5001. When the active port flipped to 5004, Nginx kept
pointing at the dead 5003 -> 502.
3. Fixing the port to 5001 with a trailing slash on proxy_pass triggered
Nginx URI rewriting, sending a double slash ("//") to the backend,
which 404'd. Confirmed via `curl http://backend//` -> 404.
Changes:
- deploy.yml: replace the old blind `grep sites-available/default` check
(checked the wrong, unloaded file) with a hard-failing check that (a)
resolves the actual file via sites-enabled/ symlinks, (b) fails the
deploy if either location block hardcodes 5003/5004 instead of 5001,
(c) fails if /taxbaik's proxy_pass carries a stray trailing slash.
- deploy.yml: add an external, post-deploy check that curls the real
public domain (www.taxbaik.com root, /taxbaik/, /taxbaik/admin/login)
through Cloudflare + Nginx, with retries — this is what would have
caught the whole incident on the very first broken deploy instead of
requiring live user reports.
- deploy_gb.sh: drop the stale comment implying Nginx needs updating
per-deploy; it never should, since Nginx always points at the
persistent 5001 proxy which reads taxbaik_port itself.
- CLAUDE.md: document the real config file, the 5001-only invariant, the
proxy_pass trailing-slash gotcha, and the Host-header/SNI trick for
testing domain-based server blocks locally; record the incident in the
CI troubleshooting harness section.
Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
CRITICAL FIX for 502 Bad Gateway error:
- Green-Blue deployment was switching to new port (5004)
- But Nginx config was still pointing to old port (5003)
- Result: direct port access worked, but Nginx proxy returned 502
CHANGES:
1. deploy_gb.sh: Remove sudo calls (requires root credentials)
- Script cannot use sudo without NOPASSWD configuration
- Nginx update now handled by CI post-deploy script
2. .gitea/workflows/deploy.yml: Add Nginx update step after Green-Blue deployment
- Read new active port from taxbaik_port file
- Update /etc/nginx/sites-available/default proxy_pass
- Validate Nginx syntax
- Reload Nginx with new configuration
- Runs as root (CI runner privilege) - no sudo needed
RESULT:
- Nginx always points to current active port
- 502 errors prevented
- Seamless zero-downtime Green-Blue deployment
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
PROBLEM: Dashboard page was stuck loading forever
ROOT CAUSE:
- AdminDashboardClient requires ApiClient:BaseUrl configuration
- deploy_gb.sh was missing ApiClient__BaseUrl environment variable
- HttpClient had no BaseAddress, causing all API calls to fail
SOLUTION:
- Remove timeout bandaid from App.razor
- Add ApiClient__BaseUrl to deploy_gb.sh environment variables
- API requests will now properly route to http://127.0.0.1:${TARGET_PORT}/taxbaik/api/
EXPECTED RESULT:
- Dashboard API calls succeed
- Dashboard page loads normally
- Blog management page becomes clickable
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Root cause analysis:
- App.razor referenced blazor.web.js (legacy filename)
- ASP.NET Core 10 publish outputs blazor.webassembly.js (standard)
- Build/publish mismatch caused 'SyntaxError: Invalid or unexpected token'
Solution (proper fix, not workaround):
- App.razor: change script src to blazor.webassembly.js
- Remove deploy_gb.sh file-copy workaround
- Program.cs: remove unnecessary comment
Result: Single source of truth - blazor.webassembly.js is the standard ASP.NET Core 10
filename. No file duplication, no symlinks, no publish-time workarounds needed.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Problem: ASP.NET Core static files middleware may not handle symlinks correctly,
causing blazor.web.js to be served as a 21-byte stub instead of the full 60KB file.
This caused 'SyntaxError: Invalid or unexpected token' in browser.
Solution: Replace symlink with actual file copy in deploy_gb.sh:
- cp blazor.webassembly.js blazor.web.js (+ .gz and .br variants)
This ensures both filenames are actual files that the static files middleware
can properly serve.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
Problem: ASP.NET Core 10 WASM runtime file is named blazor.webassembly.js but
Blazor pages still reference blazor.web.js, causing 404 Not Found errors and
complete failure to load admin UI.
Solution: In deploy_gb.sh, create symlink before starting the app:
ln -s blazor.webassembly.js blazor.web.js
This allows both filenames to work, ensuring backward compatibility.
Result: WASM runtime loads correctly in deployed environments.
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Add ConnectionStrings__Default env var to deploy_gb.sh for production deployment
- Add DOTNET_PRINT_TELEMETRY_MESSAGE=false to suppress telemetry
- Update E2E tests to support env vars (E2E_BASE_URL, E2E_ADMIN_USERNAME, E2E_ADMIN_PASSWORD)
- Fixes 'Missing connection string' error on new deployments
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>