flights_web/docs/agent-memory/README.md

# Agent Memory And Prompt Evolution

This directory is the shared, reviewed memory layer for Pi-driven work on Aeroflot Flights Web.

The design follows a three-layer pattern:

- Raw observations: local, append-only session notes and error/fix snippets. Keep private runtime files under `.agent-memory/raw/`.
- Compiled memory: reviewed, structured Markdown under `docs/agent-memory/`.
- Schema and workflows: `.pi/teams/agents/`, `.pi/teams/workflows/`, `.pi/teams/teams/`, `.pi/prompts/`, and this README define how memory is captured, queried, and used to improve prompts.

Do not store secrets, API keys, customer data, credentials, or full private transcripts. Prefer short, sanitized lessons with enough evidence to reproduce the issue.
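As one way to back up the rule above, a capture step could run a best-effort redaction pass before anything is written. The sketch below is illustrative only: `SECRET_PATTERNS` and `redactSecrets` are hypothetical names, not part of the extension, and the patterns cover only a few obvious shapes. It does not replace human review.

```typescript
// Hypothetical patterns: key/value style secrets and HTTP bearer tokens.
const SECRET_PATTERNS: RegExp[] = [
  /\b(api[_-]?key|token|password|secret)\b\s*[:=]\s*\S+/gi,
  /\bBearer\s+[A-Za-z0-9._~+\/-]+=*/g,
];

// Replace every match of every pattern with a fixed placeholder.
function redactSecrets(text: string): string {
  return SECRET_PATTERNS.reduce(
    (out, pattern) => out.replace(pattern, "[REDACTED]"),
    text,
  );
}
```

A lesson such as `retry with api_key=abc123 failed` would be stored as `retry with [REDACTED] failed`.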

## Daily Log Format

Daily entries live in `docs/agent-memory/daily/YYYY-MM-DD.md` when they are safe to share with the project.

```markdown
# Daily Agent Memory: YYYY-MM-DD

## Sessions

### Session HH:MM - short-title

**Context:** One sentence about the work.

**Manual Prompts Worth Preserving:**
- Prompt or prompt pattern that improved results.

**Errors And Fixes:**
- Symptom:
- Cause:
- Fix:
- Evidence:

**Decisions Made:**
- Decision and rationale.

**Lessons Learned:**
- Stable lesson, not a one-off accident.

**Prompt/Agent Candidates:**
- Candidate update:
- Target file:
- Confidence:
```
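The path and session heading conventions above can be generated rather than typed by hand. This is a minimal sketch, assuming dates and times are taken in UTC (the README does not specify a timezone); `dailyLogPath` and `sessionHeading` are hypothetical helper names.

```typescript
// Hypothetical helper: builds docs/agent-memory/daily/YYYY-MM-DD.md.
// Assumption: the day is taken in UTC via toISOString().
function dailyLogPath(date: Date): string {
  const day = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return `docs/agent-memory/daily/${day}.md`;
}

// Hypothetical helper: builds the "### Session HH:MM - short-title" heading.
function sessionHeading(date: Date, shortTitle: string): string {
  const hhmm = date.toISOString().slice(11, 16); // "HH:MM" in UTC
  return `### Session ${hhmm} - ${shortTitle}`;
}
```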

## Compiled Knowledge

- `index.md` is the catalog. Read it first.
- `concepts/` contains stable lessons, preferences, project conventions, and recurring gotchas.
- `connections/` links multiple concepts or workflows.
- `qa/` stores useful answers that should compound into future work.
- `prompt-evolution/` stores proposed prompt and workflow changes before they are applied.
- `prompt-change-log.md` records accepted prompt, agent, and workflow changes.

## Automation

The project-local Pi extension lives at `.pi/extensions/agent-memory.ts`.

The extension automatically:

- injects the reviewed memory index and prompt change log into each agent turn through `before_agent_start`
- records private raw metrics and session snapshots under `.agent-memory/raw/`
- creates a reviewable pending memory candidate under `.agent-memory/review/pending/` after each completed agent turn
- records provider header latency, agent duration, turn duration, tool counts, prompt submission gaps, and compaction/shutdown events
- detects likely agent questions, starts a blocked-on-user wait timer, and stops it on your next interactive prompt
- exposes memory and timing commands

Raw metrics and review candidates are intentionally gitignored. They are evidence for later compilation, not reviewed memory.

The extension cannot infer true keystroke-level active typing time from Pi's current public extension events. It records explicit active work blocks plus automatic gap metrics:

- `activeWorkMs`: time between `/prompt-start` and `/prompt-stop`, excluding `/prompt-pause` to `/prompt-resume`
- `pauseInclusiveGapMs`: time between previous agent completion and next prompt submission
- `idleExcludedMs`: the portion of that gap above the 5-minute idle cap
- `blockedOnUserMs`: waiting time between a detected agent question and your next prompt; this is not counted as active user effort
- `activeAnswerMs`: explicit time between `/answer-start` and `/answer-stop` while you are composing an answer to an agent question
- `agentDurationMs`: prompt submission to `agent_end`
- `turnDurationMs`: one LLM/tool turn duration
- `headerLatencyMs`: provider request to HTTP response headers
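The arithmetic behind the gap and active-work metrics can be sketched as follows. This is not the extension's code: the function names and the `PauseInterval` type are hypothetical, timestamps are assumed to be epoch milliseconds, pauses are assumed not to overlap, and `idleExcludedMs` is read here as the part of `pauseInclusiveGapMs` that exceeds the 5-minute cap.

```typescript
const IDLE_CAP_MS = 5 * 60 * 1000; // 5-minute idle cap from the list above

interface GapMetrics {
  pauseInclusiveGapMs: number;
  idleExcludedMs: number;
}

// Gap between the previous agent completion and the next prompt submission.
function gapMetrics(previousAgentEndMs: number, nextPromptSubmitMs: number): GapMetrics {
  const pauseInclusiveGapMs = Math.max(0, nextPromptSubmitMs - previousAgentEndMs);
  // Assumption: only the time beyond the cap counts as excluded idle time.
  const idleExcludedMs = Math.max(0, pauseInclusiveGapMs - IDLE_CAP_MS);
  return { pauseInclusiveGapMs, idleExcludedMs };
}

interface PauseInterval {
  pausedAtMs: number;
  resumedAtMs: number;
}

// Time between /prompt-start and /prompt-stop, minus paused intervals.
function activeWorkMs(startMs: number, stopMs: number, pauses: PauseInterval[]): number {
  const pausedMs = pauses.reduce((sum, p) => {
    // Clamp each pause to the work block so stray intervals cannot over-subtract.
    const from = Math.max(p.pausedAtMs, startMs);
    const to = Math.min(p.resumedAtMs, stopMs);
    return sum + Math.max(0, to - from);
  }, 0);
  return Math.max(0, stopMs - startMs - pausedMs);
}
```

With these definitions, a 7-minute gap yields `pauseInclusiveGapMs` of 420000 and `idleExcludedMs` of 120000.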

Use `npx @ccusage/pi@latest session` and LiteLLM metrics for token usage, cost, and provider-side inference timing.

## Commands

Inside Pi:

```text
/memory-status
/memory-capture [note]
/memory-compile [goal]
/memory-review
/memory-show [filename-fragment]
/memory-approve [filename-fragment]
/memory-discard [filename-fragment]
/memory-clear
/prompt-start [label]
/prompt-pause
/prompt-resume
/prompt-stop
/answer-start [label]
/answer-stop
/blocked-status
/time-report
/pi-remember <lesson-or-error-fix>
/pi-memory <question>
/pi-evolve <evidence-or-goal>
```

`/memory-compile` captures a private snapshot and then asks Pi to run `/pi-evolve` against the latest evidence.

After each agent turn, inspect pending candidates:

```text
/memory-review
/memory-show
```

Approve the latest or matching candidate:

```text
/memory-approve
/memory-approve 20260429T120000Z
```

Approval moves the candidate to `.agent-memory/review/approved/` and launches `/pi-evolve` so reviewed memory and prompt changes can be proposed. Discard candidates with `/memory-discard` or clear all pending candidates with `/memory-clear`.
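The "latest or matching" selection can be pictured with a small sketch. It assumes candidate filenames start with a sortable UTC timestamp such as `20260429T120000Z`, so that plain string sorting orders them by time; `selectCandidate` and the example filenames are hypothetical, not the extension's actual API.

```typescript
// Hypothetical selector: with no fragment, return the latest pending candidate;
// with a fragment, return the latest candidate whose name contains it.
function selectCandidate(pending: string[], fragment?: string): string | undefined {
  const matches = fragment
    ? pending.filter((name) => name.includes(fragment))
    : pending;
  // Assumption: timestamp-prefixed names sort chronologically as strings.
  return [...matches].sort().pop();
}

const pendingExample = ["20260429T110000Z-a.md", "20260429T120000Z-b.md"];
const latest = selectCandidate(pendingExample);
const matched = selectCandidate(pendingExample, "20260429T110000Z");
```

Returning `undefined` when nothing matches lets the caller report "no candidate found" instead of approving the wrong file.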

Use `/prompt-start` when you begin an active prompting/planning block and `/prompt-stop` when you finish. Use `/prompt-pause` before leaving for an extended wait and `/prompt-resume` when you return.

When the agent asks a question and you need time to compose the answer, use `/answer-start` before you start thinking/writing and `/answer-stop` after you send the answer. The extension also starts a blocked-on-user wait timer automatically when the final assistant message looks like a direct question. That wait time stops on your next interactive prompt and is reported separately from active user work time.
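The README does not say how "looks like a direct question" is decided. One simple heuristic, shown purely as an assumption, is to check whether the last non-empty line of the final assistant message ends with a question mark; `looksLikeDirectQuestion` is a hypothetical name.

```typescript
// Hypothetical heuristic: a message counts as a direct question when its
// last non-empty line ends with "?".
function looksLikeDirectQuestion(finalAssistantMessage: string): boolean {
  const lines = finalAssistantMessage
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const lastLine = lines[lines.length - 1] ?? "";
  return lastLine.endsWith("?");
}
```

A heuristic like this will misclassify rhetorical questions and questions followed by a closing remark, which is one reason to keep `/answer-start` and `/answer-stop` as explicit controls.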

`/blocked-status` shows whether the automatic blocked-on-user timer is currently running. `/time-report` writes a private report under `.agent-memory/reports/`.

## Guardrails

- Memory can suggest prompt changes; it must not silently rewrite prompts.
- Prompt changes require a critic/reviewer pass and `/team-validate`.
- Commit prompt changes on a feature branch.
- Prefer small, testable prompt changes over broad rewrites.
- If a lesson is only true for one feature, store it with that scope.
- If evidence is weak, classify it as `hypothesis`, not `rule`.