Run 546 surfaced the second half of the cache-poisoning bug. /api/health
(which goes through the /api/ location, not /api/dictionary/) showed
`x-cache-status: STALE` text/html — meaning nginx had cached the WAF
HTML block page as a 200 entry, then served it via proxy_cache_use_stale
when the upstream returned 403 on a fresh fetch. The browser saw
text/html for an endpoint that should be JSON, console-gate flagged the
fail, and 5+ specs broke despite /api/dictionary/* being healthy.
Fix is the same one-liner already applied to /api/dictionary/: require
$no_cache_html (set in flights-api-cache.conf based on upstream's
Content-Type) so HTML responses are never stored. Future WAF spasms
return 403 directly to the client instead of dispensing months-old
poisoned HTML.
Run 544 failed because the /api/dictionary/* nginx cache had been
poisoned with the upstream WAF's HTML block page (HTTP 200 + text/html,
"Доступ к сайту временно ограничен"). The previous pre-warm step only
checked %{http_code}, so the WAF response looked valid and got cached
for the full 6h TTL — every subsequent SSR render then resolved city
names via that HTML, breadcrumbs showed raw IATA codes, and 7 schedule
e2e specs failed.
Three changes that together close this hole:
1. ci-deploy pre-warm: two-step warm with body validation. Step 1 is
a cache-bust query (?_=ns timestamp) that proves upstream is healthy
independent of nginx cache. Step 2 fetches the canonical URL and
validates the response is JSON (starts with [/{ and is >1KB). If
the canonical body is HTML, retry once with `Cache-Control:
no-cache` to force a fresh upstream fetch (works once the matching
nginx config below is deployed); if still HTML, fail loudly with a
manual-purge instruction so the operator can rm the cache files.
2. nginx /api/dictionary/ location: add `proxy_cache_bypass
$http_cache_control` so the CI workflow can force-refresh on demand,
and `proxy_no_cache $no_cache_html` so HTML responses are never
stored in the first place.
3. flights-api-cache.conf: add `map $upstream_http_content_type
$no_cache_html` that flips to "1" when upstream returns text/html.
Drives the `proxy_no_cache` filter above.
Note: the nginx changes only take effect after setup-pve201.sh is
re-run on pve-201. Until then, any cache poisoning still stays poisoned
until the 6h TTL expires (or manual purge).
The smoke test was getting 403 from the upstream WAF (rate-limit on
webzavod's egress IP). 403 doesn't indicate a tunnel/routing problem
— it confirms the egress IP IS the WAF-recognized one and is being
throttled. Don't abort the rest of setup over a transient throttle;
the only response that should hard-fail is HTTP 200 with HTML body
(WAF interstitial), which means the tunnel was bypassed.
Adds a workflow step that fetches the four dictionary endpoints
(world_regions, countries, cities, airports — see api.ts) before
playwright runs. With the longer 6h TTL on /api/dictionary, every
e2e spec hits cache for the same 4 URLs that drive most of the
data-driven tests (breadcrumb city names, etc).
2s sleeps between warm-up calls keep the cold-cache pass under the
WAF rate-limit window.
(A) Add proxy_cache zone for ui-dashboard.gnerim.ru. /api/ caches 200 for
1m, /map/api/ for 24h. proxy_cache_use_stale serves cached content during
upstream errors (incl. 403 from WAF rate limit). proxy_cache_lock collapses
concurrent fetches for the same URI. Cache zone declared in conf.d/ (must
be in http{} context).
(B) Playwright workers=2, retries=2 in CI. Cuts the parallel burst that
trips the WAF before nginx cache warms up; retries handle the residual
flake.
setup-pve201.sh now installs the conf.d cache file and pre-creates the
cache dir with nginx-user ownership.
Two design pivots discovered during Phase B prerequisites:
Routing: Replace static-route + NAT plan with persistent ssh -L tunnel
from pve-201 to webzavod (deployment/systemd/flights-tim-tunnel.service).
nginx proxies /api/ and /map/api/ to https://127.0.0.1:8443 with SNI/Host
overrides so cert validation still targets the real hostname. No webzavod
kernel changes (no ip_forward/MASQUERADE), no /etc/hosts pin needed.
Workflow B: Drop Jenkins trigger/poll automation (operator lacks Jenkins
job-configure access and user API token access). release.yml now stops
after MR merge with a Telegram message containing the Jenkins job URL.
release-verify.yml (new, workflow_dispatch only) runs the customer-URL
e2e suite once the operator has triggered Jenkins manually and it has
completed.
Other:
- SSR loopback port 8081 -> 3002 (8081 was taken by openwebui on pve-201)
- notify-telegram.sh skips cleanly when TG secrets unset (was: hard-fail)
- README + spec addendum cover the new prereqs and removed steps
Deployed build had MAP_TILE_URL truncated to 'https://.../tile/{z' —
Leaflet then URL-encoded it to '%7Bz' and fetched garbage tile paths.
Root cause: build-docker.sh used
: "\${MAP_TILE_URL:=https://flights.test.aeroflot.ru/map/api/tile/{z}/{x}/{y}.jpeg}"
and bash parameter expansion terminates the default value at the
FIRST unescaped '}', leaving '{z' and discarding the rest. The env
passed to `pnpm build:standalone` was already truncated, so every
downstream step (base64 encode → HTML inject → client decode) faithfully
carried the broken value through.
Fix by moving the defaults to Dockerfile's ARG lines — ARG defaults
are plain strings, not shell-parsed — and simplify build-docker.sh to
only forward MAP_TILE_URL / API_BASE_URL as --build-arg when the
caller explicitly sets them. Quote the k8s env values for defensive
YAML hygiene as well.
Two gaps blocked http://flights-ui.devwebzavod.ru/ru/flights-map:
1. The inline <script>window.__ENV__=...</script> was written with the
Leaflet tile template ('/map/api/tile/{z}/{x}/{y}.jpeg') embedded
directly. Rspack's html-plugin pre-processes the children string and
ate the '{z}' placeholder, truncating the injected JS literal to
'/map/api/tile/{z' — MAP_TILE_URL on the client ended up broken and
getEnv() fell back to the default.
Escape every '{'/'}' inside the stringified value as '\u007B'/'\u007D'.
JS decodes the Unicode escapes back to '{}' at parse time; the html
plugin's template engine sees no placeholders to eat. Object-literal
braces outside the string stay raw (Unicode escapes aren't valid in
operator positions in JS source).
2. API_BASE_URL was still hard-defaulting to 'http://localhost:8080/api',
so every dictionary fetch on the deployed cluster died with
ERR_CONNECTION_REFUSED. Thread API_BASE_URL through the same
PUBLIC_ENV_KEYS/html.tags path as MAP_TILE_URL, add matching Docker
ARG/ENV, and forward it in deployment/build-docker.sh + k8s manifest.
The devwebzavod default for both is https://flights.test.aeroflot.ru
— where the real Aeroflot ingress terminates /map/api/** and /api/**.
Prod keeps overriding with same-origin URLs.
http://flights-ui.devwebzavod.ru/ru/flights-map was still hitting the
same-origin tile path after adding the k8s env: Modern.js renders the
<Suspense> fallback on the server (i18n isn't preloaded), so the route
component that reads getEnv() never actually runs during SSR. The page
hydrates client-side, where process.env is Rspack's empty stub and
MAP_TILE_URL is never set — getEnv() falls back to the default.
Move the value into window.__ENV__ instead:
- modern.config.ts: inline a <script> into html.tags that sets
window.__ENV__ = { MAP_TILE_URL: <value> } at SSR-server startup.
The snippet is authored once at server boot, so the HTML template
baked into dist/standalone/html/main/index.html always carries the
pod's tile URL.
- src/env/index.ts: merge window.__ENV__ on top of process.env so the
browser prefers the injected value (process.env only has NODE_ENV
after Rspack's polyfill).
- Dockerfile.react: accept MAP_TILE_URL as a build ARG and expose it
as ENV before `pnpm build:standalone`, so Modern.js picks it up when
building the HTML template. k8s env still flows into the Node SSR
process; the build-arg path guarantees correctness even when the
runtime env is stripped.
- deployment/build-docker.sh: forward MAP_TILE_URL through as a
build-arg (default keeps the same-origin path). CI on the
devwebzavod cluster can export MAP_TILE_URL=https://flights.test.aeroflot.ru/map/api/tile/{z}/{x}/{y}.jpeg
before running build-docker.sh and the resulting image will serve
tiles from the upstream the real Aeroflot ingress terminates.
The flights-front deploy repo ships k8s manifests at deployment/k8s/,
a sibling of Aeroflot.Flights.Front/. Previously the sync script only
copied the app source, so any env change landed on the k8s side had
to be hand-edited in the deploy repo and was never reflected back.
- Bring deployment/k8s/flights-ui.yaml into this repo (with the new
MAP_TILE_URL env pointing at flights.test.aeroflot.ru) so the
cluster config lives next to the code that reads it.
- sync-to-flights-front.sh resolves the deploy-repo root from the
target path and mirrors this repo's deployment/ directory there,
mkdir'ing and copying contents without wiping unrelated files.
- Bump step numbering (1/6..6/6) and the summary now lists the synced
deployment files in addition to the app files.