pve-201 Deployment Runbook
This is the bootstrap procedure for hosting https://ui-dashboard.gnerim.ru/ on pve-201, plus rehearsal recipes for the CI/CD pipeline failure paths. The full design rationale lives in docs/superpowers/specs/2026-04-25-cicd-pipeline-design.md.
One-time setup
1. SSH tunnel pve-201 → webzavod (TIM API access)
The customer WAF on flights.test.aeroflot.ru only accepts requests from corp-VPN egress IPs. nginx proxies /api/ and /map/api/ to https://127.0.0.1:8443; that listener is forwarded over SSH to webzavod, which terminates the corp VPN on ppp0. A systemd unit keeps the tunnel up.
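For orientation, the forward described above can be sketched as a single ssh invocation. This is only an illustration: the contents of flights-tim-tunnel.service are not reproduced in this runbook, so the keepalive and fail-fast options below are assumptions, and `tunnel_cmd` is a hypothetical helper name. The bind address, target host and webzavod address come from this section.

```shell
# Sketch only: the real flights-tim-tunnel.service may use different options.
# Prints the ssh command that forwards 127.0.0.1:8443 to the customer host via webzavod.
tunnel_cmd() {
  local_bind="127.0.0.1:8443"              # listener that nginx proxies to
  target="flights.test.aeroflot.ru:443"    # must match permitopen on webzavod
  jump="${1:-gnezim@192.168.88.58}"        # webzavod login (override as first argument)
  # -N: forward only, run no remote command.
  # ExitOnForwardFailure: exit if the local port cannot be bound.
  # ServerAlive*: detect a dead link so the service manager can restart ssh.
  printf 'ssh -N -o ExitOnForwardFailure=yes -o ServerAliveInterval=30 -o ServerAliveCountMax=3 -L %s:%s %s\n' \
    "$local_bind" "$target" "$jump"
}

# Usage: print the command, then compare it with ExecStart in the unit file.
#   tunnel_cmd
```

Because the key on webzavod is pinned to a forced command that exits immediately, only a forward-only session (`-N`) is useful with it.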
On webzavod (192.168.88.58), append the pve-201 public key to ~gnezim/.ssh/authorized_keys, with permitopen restricting it to a single host:port. This is a one-time step; read the key from ~gnezim/.ssh/id_rsa.pub on pve-201 first:
command="exit 1",no-pty,no-X11-forwarding,no-agent-forwarding,no-user-rc,permitopen="flights.test.aeroflot.ru:443" ssh-rsa AAAA…== pve-201-flights-tim-tunnel
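To avoid hand-editing mistakes, the entry above can be assembled from the public key with a small helper. This is a minimal sketch; `restricted_entry` is a hypothetical name, and the option string is copied from the line above.

```shell
# Sketch: prefix a public key with the restriction options used in this runbook.
restricted_entry() {
  pubkey="$1"   # full contents of id_rsa.pub: "<type> <base64> [comment]"
  opts='command="exit 1",no-pty,no-X11-forwarding,no-agent-forwarding,no-user-rc,permitopen="flights.test.aeroflot.ru:443"'
  printf '%s %s\n' "$opts" "$pubkey"
}

# Usage (on pve-201), then paste the output into authorized_keys on webzavod:
#   restricted_entry "$(cat ~gnezim/.ssh/id_rsa.pub)"
```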
On pve-201 — install + enable the systemd unit:
cd /path/to/Aeroflot.Flights.Web
sudo cp deployment/systemd/flights-tim-tunnel.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now flights-tim-tunnel.service
sudo systemctl status flights-tim-tunnel.service --no-pager
Smoke test:
ss -ltn | grep ':8443\b' # expect: a 127.0.0.1:8443 LISTEN line
curl -k --resolve flights.test.aeroflot.ru:8443:127.0.0.1 \
-o /dev/null -w 'swagger: %{http_code}\n' \
https://flights.test.aeroflot.ru:8443/swagger/index.html # expect 401
curl -k --resolve flights.test.aeroflot.ru:8443:127.0.0.1 \
-o /dev/null -w 'api/health: %{http_code}\n' \
https://flights.test.aeroflot.ru:8443/api/health # expect 200
If swagger returns 200 with HTML body instead of 401, the tunnel is bypassed and the request egressed directly — fix the listener / SSH unit before proceeding.
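The interpretation of the two status codes can be written down as a small decision function, which is handy if the smoke test is ever scripted. This is a sketch under the expectations stated above (401 for swagger, 200 for api/health); `classify_tunnel` and its output labels are hypothetical.

```shell
# Sketch: map the two smoke-test HTTP codes to a verdict.
# Arguments: <swagger_code> <health_code>
classify_tunnel() {
  swagger="$1"
  health="$2"
  if [ "$swagger" = "200" ]; then
    # Swagger answered without auth: the request did not take the expected path.
    echo "bypassed"
    return 1
  elif [ "$swagger" = "401" ] && [ "$health" = "200" ]; then
    echo "ok"
    return 0
  else
    # Anything else (000, 502, ...) needs a look at the listener and the unit.
    echo "unexpected"
    return 2
  fi
}

# Usage with codes captured via curl -w '%{http_code}':
#   classify_tunnel "$swagger_code" "$health_code"
```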
2. nginx vhost
cd /path/to/Aeroflot.Flights.Web
sudo cp deployment/nginx/ui-dashboard.gnerim.ru.conf /etc/nginx/sites-available/
sudo ln -sf /etc/nginx/sites-available/ui-dashboard.gnerim.ru.conf /etc/nginx/sites-enabled/
sudo mkdir -p /etc/nginx/htpasswd
sudo nginx -t
sudo systemctl reload nginx
The htpasswd file is created by scripts/ci/install-htpasswd.sh on first deploy.
3. Gitea runner setup
The runner must be in the docker group (so it can talk to the Docker socket without sudo) and reach all upstream services:
sudo usermod -aG docker <runner-user> # then restart the runner service so the new group membership takes effect
docker ps # must work without sudo for the runner user
Reachability checks the runner must pass:
curl -fsS https://git.gnerim.ru/ # Gitea
curl -fsSI https://teamscore.gitlab.yandexcloud.net/ # GitLab
The customer Jenkins URL and the customer site (flights-ui.devwebzavod.ru) are NOT reachable from the runner directly — Workflow B does not call them. Customer-side e2e (Workflow C, release-verify) only runs after the operator has manually triggered the Jenkins build, and it reaches the customer URL the same way the upstream API is reached: direct egress where possible, or through additional tunnels added on demand.
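The reachability checks above can be bundled into one loop that reports every URL and fails if any of them is unreachable. This is a minimal sketch; `check_reachable` is a hypothetical helper and the 10-second timeout is an assumption.

```shell
# Sketch: probe each URL with curl and report ok/FAIL per line.
# Returns non-zero if at least one URL failed.
check_reachable() {
  failed=0
  for url in "$@"; do
    if curl -fsS -o /dev/null --max-time 10 "$url"; then
      echo "ok $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}

# Usage, run as the runner user:
#   check_reachable https://git.gnerim.ru/ https://teamscore.gitlab.yandexcloud.net/
```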
4. GitLab Personal Access Token
GitLab → User Settings → Access Tokens → create with scopes api and write_repository. Store as Gitea Actions secret GITLAB_PAT.
5. Allow self-approve on GitLab project
GitLab → flights-front project → Settings → Merge requests → Approval rules → uncheck "Prevent approval by author" (skip if you can already approve your own MRs in the GitLab UI).
Verify by running (locally, after the PAT is in place):
GITLAB_PAT=<pat> ./scripts/ci/check-gitlab-project.sh
It prints the numeric project ID (store as GITLAB_PROJECT_ID secret) and confirms self-approve is allowed.
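If the script is unavailable, the same information can be looked up by hand through the GitLab REST API. The sketch below only builds the project URL; the internals of check-gitlab-project.sh are not shown in this runbook, so this is not a description of that script. `some-group/flights-front` is a placeholder project path and `project_api_url` is a hypothetical helper.

```shell
# Sketch: build the GitLab API URL for a project addressed by its path.
# GitLab accepts a URL-encoded "namespace/project" path in place of the numeric ID.
urlencode_path() {
  printf '%s' "$1" | sed 's|/|%2F|g'
}

project_api_url() {
  base="${GITLAB_URL:-https://teamscore.gitlab.yandexcloud.net}"
  printf '%s/api/v4/projects/%s\n' "$base" "$(urlencode_path "$1")"
}

# Usage (placeholder path; replace with the real namespace):
#   curl -fsS --header "PRIVATE-TOKEN: $GITLAB_PAT" "$(project_api_url some-group/flights-front)"
# The numeric project ID is the "id" field of the returned JSON.
# Appending /approvals to the URL should return the approval settings; there,
# "merge_requests_author_approval": true is expected to mean self-approve is
# allowed (verify against your GitLab version).
```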
6. Telegram bot (optional)
Use an existing bot or create one via @BotFather. Get the chat_id by sending the bot a message and then querying https://api.telegram.org/bot<TOKEN>/getUpdates. Store the values as TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID.
If either secret is unset, all notify-telegram.sh calls in the workflows skip cleanly with no error — the pipeline runs end-to-end without Telegram configured.
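The skip-when-unconfigured behaviour can be pictured as follows. This is a sketch of the pattern, not the contents of notify-telegram.sh, which are not shown in this runbook; the function name and the 10-second timeout are assumptions. The sendMessage call is the standard Telegram Bot API method.

```shell
# Sketch: send a Telegram message, or skip quietly when the bot is not configured.
notify_telegram() {
  text="$1"
  if [ -z "${TELEGRAM_BOT_TOKEN:-}" ] || [ -z "${TELEGRAM_CHAT_ID:-}" ]; then
    # Missing configuration is not an error: report on stderr and succeed.
    echo "telegram: not configured, skipping" >&2
    return 0
  fi
  curl -fsS --max-time 10 \
    --data-urlencode "chat_id=${TELEGRAM_CHAT_ID}" \
    --data-urlencode "text=${text}" \
    "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/sendMessage" >/dev/null
}

# Usage:
#   notify_telegram "ci-deploy finished"
```

Returning 0 on the skip path is what lets a workflow step call the notifier unconditionally.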
7. Gitea Actions secrets summary
Repo → Settings → Actions → Secrets — set all of:
| Secret | Required | Purpose |
|---|---|---|
| BASIC_AUTH_USER, BASIC_AUTH_PASS | yes | nginx htpasswd for ui-dashboard.gnerim.ru |
| MAP_TILE_URL | optional | Default /map/api/tile/{z}/{x}/{y}.jpeg |
| API_BASE_URL | optional | Default /api |
| GITLAB_PAT, GITLAB_PROJECT_ID | yes (release only) | GitLab MR API |
| TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID | optional | Notifications |
| GITHUB_TOKEN | auto | Provided by Gitea Actions; no manual setup required |
Jenkins is triggered manually after the release workflow merges to GitLab; no Jenkins secret is required.
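A workflow step can fail early when a required secret is missing instead of failing halfway through a deploy. The sketch below assumes the secrets are exposed to the step as environment variables of the same name; `missing_secrets` and its `deploy`/`release` mode argument are hypothetical.

```shell
# Sketch: list required secrets that are unset or empty.
# Argument: "deploy" (default) or "release"; release also needs the GitLab secrets.
missing_secrets() {
  mode="${1:-deploy}"
  required="BASIC_AUTH_USER BASIC_AUTH_PASS"
  if [ "$mode" = "release" ]; then
    required="$required GITLAB_PAT GITLAB_PROJECT_ID"
  fi
  missing=""
  for name in $required; do
    eval "value=\${$name:-}"            # indirect lookup of the variable called $name
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  printf '%s\n' "$missing"
  [ -z "$missing" ]                     # exit status: 0 only when nothing is missing
}

# Usage:
#   missing_secrets release || exit 1
```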
Verifying failure paths
Run at least the rollback and "release blocked" rehearsals once before declaring the pipeline production-grade.
A: e2e fail → rollback
Push a commit that adds console.error('rehearsal') somewhere that runs on every page (e.g. src/routes/layout.tsx). Workflow A runs, e2e fails on the console-gate, rollback to :previous triggers. Verify:
- Telegram message: ❌ ci-deploy FAILED at step "Run Playwright e2e" — rolled back to <prev-sha>
- https://ui-dashboard.gnerim.ru/ still serves the previous version (check the page or docker inspect flights-web).
Revert the rehearsal commit when done.
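For reference while rehearsing, a rollback to :previous can be pictured as the sketch below. The actual rollback step in Workflow A is not reproduced in this runbook, so treat this as an assumption about its shape; the container name, port mapping and restart policy are taken from the manual recovery commands elsewhere in this document, and `rollback` is a hypothetical function name.

```shell
# Sketch: restart flights-web from the :previous image, or fail if it is missing.
rollback() {
  name="flights-web"
  if ! docker image inspect "$name:previous" >/dev/null 2>&1; then
    echo "rollback failed: no $name:previous image" >&2
    return 1
  fi
  # The container may already be gone; ignore errors from stop/rm.
  docker stop "$name" >/dev/null 2>&1 || true
  docker rm "$name" >/dev/null 2>&1 || true
  docker run -d --name "$name" --restart unless-stopped \
    -p 127.0.0.1:3002:8080 "$name:previous" >/dev/null
}

# Usage:
#   rollback || echo "site is down, recover manually"
```

The early return when :previous is missing corresponds to the "rollback itself fails" case rehearsed next.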
A: rollback itself fails
ssh pve-201 'docker rmi flights-web:previous'
Then push a commit that fails e2e. Rollback step finds no :previous and bails. Verify:
- Telegram message: 🔥 ci-deploy ROLLBACK FAILED — site is DOWN
- https://ui-dashboard.gnerim.ru/ returns 502.
- Manual recovery:
ssh pve-201 'docker stop flights-web 2>/dev/null; docker rm flights-web 2>/dev/null; docker run -d --name flights-web --restart unless-stopped -p 127.0.0.1:3002:8080 flights-web:<known-good-sha>'
B: blocked on A not green
Trigger Workflow B (manual or tag) for a SHA that has no green Workflow A run. Verify:
- Telegram message: ⚠️ release blocked — workflow ci-deploy is not green for <sha>
- B exits early; nothing changes in GitLab.
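The gate itself reduces to one comparison once the state of Workflow A for the SHA is known. How Workflow B actually obtains that state is not described in this runbook; one possibility (an assumption) is the `state` field of the Gitea combined commit status for the SHA. `release_gate` is a hypothetical name.

```shell
# Sketch: allow the release only when the ci-deploy state for the SHA is "success".
# Arguments: <state> <sha>
release_gate() {
  state="$1"
  sha="$2"
  if [ "$state" = "success" ]; then
    return 0
  fi
  echo "release blocked: workflow ci-deploy is not green for $sha (state: ${state:-none})" >&2
  return 1
}

# Usage:
#   release_gate "$state" "$sha" || exit 1   # exit before touching GitLab
```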
Manual recovery scenarios
Workflow B succeeded but Jenkins build failed
GitLab is at the new commit; customer site is stale. Recovery:
- Open Jenkins UI → check the failing build's console log
- Fix the issue (in this repo if it's our bug, in customer's infra otherwise)
- Push fix → Workflow A → Workflow B → trigger Jenkins again
Container running but nginx returns 502
Check the bind:
ssh pve-201
docker ps --filter name=flights-web
curl -v http://127.0.0.1:3002/ # should return 200 (or whatever the SSR root returns)
sudo nginx -t && sudo systemctl reload nginx
If the container died, the unless-stopped restart policy should bring it back. If it does not:
docker logs flights-web --tail 200
docker stop flights-web 2>/dev/null; docker rm flights-web 2>/dev/null
docker run -d --name flights-web --restart unless-stopped -p 127.0.0.1:3002:8080 flights-web:current
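After recreating the container it can take a moment before the app answers on 127.0.0.1:3002, so a short polling loop avoids reloading nginx too early. This is a minimal sketch; `wait_for_upstream`, the retry count and the delay are assumptions.

```shell
# Sketch: poll the container's local port until it answers or the retries run out.
# Arguments: [url] [tries] [delay_seconds]
wait_for_upstream() {
  url="${1:-http://127.0.0.1:3002/}"
  tries="${2:-10}"
  delay="${3:-1}"
  i=1
  while [ "$i" -le "$tries" ]; do
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "up after $i attempt(s)"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "still down after $tries attempt(s)" >&2
  return 1
}

# Usage:
#   wait_for_upstream && sudo systemctl reload nginx
```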
TIM tunnel is down (502 on /api/* but / works)
sudo systemctl status flights-tim-tunnel.service --no-pager
sudo journalctl -u flights-tim-tunnel.service -n 50 --no-pager
sudo systemctl restart flights-tim-tunnel.service
ss -ltn | grep ':8443\b' # confirm listener is back
If the tunnel won't come up, verify that the pve-201 SSH key is still authorised on webzavod and that webzavod's ppp0 is up (ssh webzavod 'ip -br addr show ppp0').