Run 549's wait-for-health logged two HTTP 502s before its third
attempt succeeded — nginx → docker forwarding hit the new container
during the ~4s window between \`docker run -d\` returning and
Node.js inside finishing its boot. The retry loop covered it but the
log was noisy and a slower boot could blow past the 30×2s budget.
Added a post-run readiness probe inside swap: poll
http://127.0.0.1:${PORT}/ on the host (docker container is published
to 127.0.0.1, runner uses host network mode) until it answers 2xx,
up to 30 attempts × 1s. Skipped under --dry-run so the tests/ci/
shell tests still pass without touching the network.
Net effect: wait-for-url against the public URL now succeeds first
attempt, and the run aborts cleanly if the SSR doesn't come up at
all instead of looking healthy because nginx happens to keep a
warmed connection.
Two design pivots discovered during Phase B prerequisites:
Routing: Replace static-route + NAT plan with persistent ssh -L tunnel
from pve-201 to webzavod (deployment/systemd/flights-tim-tunnel.service).
nginx proxies /api/ and /map/api/ to https://127.0.0.1:8443 with SNI/Host
overrides so cert validation still targets the real hostname. No webzavod
kernel changes (no ip_forward/MASQUERADE), no /etc/hosts pin needed.
Workflow B: Drop Jenkins trigger/poll automation (operator lacks Jenkins
job-configure access and user API token access). release.yml now stops
after MR merge with a Telegram message containing the Jenkins job URL.
release-verify.yml (new, workflow_dispatch only) runs the customer-URL
e2e suite once the operator has triggered Jenkins manually and it has
completed.
Other:
- SSR loopback port 8081 -> 3002 (8081 was taken by openwebui on pve-201)
- notify-telegram.sh skips cleanly when TG secrets unset (was: hard-fail)
- README + spec addendum cover the new prereqs and removed steps