clawsec/wiki/exploitability-scoring.md
davida-ps 073e771b73 Exploitability Context for CVE Advisories (#89)
2026-03-01 18:43:24 +02:00


# Exploitability Scoring Methodology
## Overview
ClawSec's exploitability scoring system provides context-aware vulnerability assessment specifically designed for AI agent deployments (OpenClaw/NanoClaw). Unlike generic CVSS scores that treat all environments equally, our scoring considers the unique attack surface and usage patterns of AI agents to reduce alert fatigue and prioritize actionable threats.
## Scoring Levels
| Level | Severity | Meaning |
|---|---|---|
| `high` | Critical/High | Exploitable in typical agent deployments, immediate attention required |
| `medium` | Medium | May be exploitable depending on configuration, warrants investigation |
| `low` | Low | Limited exploitability in agent context, low priority |
| `unknown` | Unknown | Insufficient data to assess exploitability |
## Scoring Factors
### 1. CVSS Base Score (Baseline)
The analysis starts with the CVSS base score as a foundation:
- **CVSS ≥ 9.0**: Critical severity → initial score `high`
- **CVSS 7.0-8.9**: High severity → initial score `high`
- **CVSS 4.0-6.9**: Medium severity → initial score `medium`
- **CVSS 1.0-3.9**: Low severity → initial score `low`
- **No CVSS**: → initial score `unknown`
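
The baseline mapping above can be sketched as a small function. This is a minimal illustration, not the analyzer's actual code: the function name `baseline_from_cvss` is hypothetical, and the handling of scores below 1.0 is an assumption, since the list above does not cover that range.

```python
from typing import Optional


def baseline_from_cvss(cvss_score: Optional[float]) -> str:
    """Map a CVSS base score to an initial exploitability level."""
    if cvss_score is None:
        return "unknown"
    if cvss_score >= 7.0:
        # Covers both Critical (>= 9.0) and High (7.0-8.9); both start at "high".
        return "high"
    if cvss_score >= 4.0:
        return "medium"
    if cvss_score >= 1.0:
        return "low"
    # Assumption: scores below 1.0 are not listed above, so treat them as unknown.
    return "unknown"


for sample in (9.8, 7.3, 6.5, 2.0, None):
    print(sample, "->", baseline_from_cvss(sample))
```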
### 2. Attack Vector Analysis (CVSS Metrics)
The analyzer parses CVSS v2, v3.0, and v3.1 vectors to assess:
#### Network Accessibility
- **AV:N** (Network): Remotely exploitable over network
- **AV:A** (Adjacent): Requires local network access
- **AV:L** (Local): Requires local system access
- **AV:P** (Physical): Requires physical access

**Impact on agents**: Network-accessible vulnerabilities are elevated because agents typically run as network services or make external API calls.
#### Authentication Requirements
- **PR:N** (v3) / **Au:N** (v2, None): No authentication required → elevates score
- **PR:L** (v3) / **Au:S** (v2, Single): Low privileges required
- **PR:H** (v3) / **Au:M** (v2, Multiple): High privileges required → reduces score

**Impact on agents**: Unauthenticated exploits are critical for publicly exposed agent APIs.
#### User Interaction
- **UI:N**: No user interaction required → elevates score
- **UI:R**: Requires user interaction → reduces score

**Impact on agents**: Agents often operate autonomously, so vulnerabilities requiring user interaction are less critical.
#### Attack Complexity
- **AC:L**: Low complexity → elevates score
- **AC:M / AC:H**: Medium/High complexity → neutral or reduces score

**Impact on agents**: Low-complexity exploits are more likely to be automated and used in mass attacks.
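
To make the metric handling above concrete, here is a minimal sketch of turning a CVSS vector string into the four attack-vector fields used in feed entries. The function names are hypothetical, and two behaviors are assumptions rather than documented analyzer behavior: a missing authentication metric is treated as "no authentication", and v2 vectors (which have no UI metric) are treated as "no user interaction".

```python
from typing import Dict


def parse_cvss_vector(vector: str) -> Dict[str, str]:
    """Split a CVSS v2/v3 vector into a metric -> value mapping."""
    metrics: Dict[str, str] = {}
    for part in vector.strip().strip("()").split("/"):
        key, sep, value = part.partition(":")
        if sep and key != "CVSS":  # skip the "CVSS:3.1" version prefix
            metrics[key] = value
    return metrics


def analyze_attack_vector(vector: str) -> Dict[str, object]:
    """Derive the attack-vector fields from a vector string."""
    m = parse_cvss_vector(vector)
    # v3 uses PR (privileges required); v2 uses Au (authentication).
    auth = m.get("PR", m.get("Au"))
    complexity = {"L": "low", "M": "medium", "H": "high"}.get(m.get("AC", ""), "unknown")
    return {
        "is_network_accessible": m.get("AV") == "N",
        # Assumption: a missing auth metric counts as "no authentication".
        "requires_authentication": auth is not None and auth != "N",
        # Assumption: v2 has no UI metric, so it counts as "no interaction".
        "requires_user_interaction": m.get("UI") == "R",
        "complexity": complexity,
    }


print(analyze_attack_vector("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))
```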
### 3. Vulnerability Type (Deployment Context)
ClawSec adjusts scores based on how vulnerability types affect AI agent deployments:
#### High-Risk Types in Agent Context
**Remote Code Execution (RCE)**
```
Score: Always HIGH
Rationale: RCE is critical in agent deployments
```
AI agents execute arbitrary code as part of their function. RCE vulnerabilities allow attackers to hijack agent execution flow, exfiltrate credentials, or pivot to other systems.

**Server-Side Request Forgery (SSRF)**
```
Score: Elevated to HIGH if CVSS ≥ 6.0
Rationale: SSRF affects agents making external requests
```
Agents frequently call external APIs, access internal services, and fetch remote resources. SSRF allows attackers to:
- Access internal cloud metadata services (AWS IMDSv1, GCP metadata)
- Pivot to internal networks
- Exfiltrate data through DNS tunneling

**Path Traversal / Directory Traversal**
```
Score: Elevated to HIGH if CVSS ≥ 6.0
Rationale: Path traversal affects agents with file access
```
Agents read files, execute scripts, and manage codebases. Path traversal enables:
- Reading sensitive configuration files (.env, credentials)
- Accessing SSH keys, API tokens
- Overwriting critical system files

**Command Injection**
```
Score: Always HIGH
Rationale: Command injection is critical in agent deployments
```
Similar to RCE, agents often execute shell commands to interact with systems. Command injection allows full system compromise.
#### Medium-Risk Types
**Prototype Pollution (Node.js)**
```
Score: Elevated from LOW to MEDIUM
Rationale: Prototype pollution can escalate in Node.js agents
```
Many agent frameworks run on Node.js. Prototype pollution can lead to:
- Bypass of authentication checks
- Privilege escalation
- Denial of service

**SQL Injection / NoSQL Injection**
```
Score: Elevated to HIGH if network-accessible and unauthenticated
Rationale: Injection affects agents with database access
```
Agents that store conversation history, user data, or tool results in databases are vulnerable to injection attacks.
#### Lower-Risk Types
**Cross-Site Scripting (XSS)**
```
Score: Reduced to MEDIUM if not network-accessible
Rationale: XSS has limited impact in headless agents
```
Agents typically don't render HTML in browsers, reducing XSS impact. However, XSS in agent management UIs or chat interfaces remains a concern.
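
The type-based rules in this section can be summarized in a single function. This is a sketch under stated assumptions, not the analyzer's implementation: `adjust_for_type` is a hypothetical name, the `command_injection`, `sql_injection`, and `nosql_injection` type identifiers are assumed, and the XSS branch follows the rule exactly as written above (the real analyzer may apply it differently).

```python
from typing import Optional, Tuple

# Rationale strings are taken from the rule blocks above.
ALWAYS_HIGH = {
    "remote_code_execution": "RCE is critical in agent deployments",
    "command_injection": "Command injection is critical in agent deployments",
}
HIGH_IF_CVSS_GE_6 = {
    "server_side_request_forgery": "SSRF affects agents making external requests",
    "path_traversal": "Path traversal affects agents with file access",
}
INJECTION_TYPES = {"sql_injection", "nosql_injection"}  # hypothetical identifiers


def adjust_for_type(
    score: str,
    vuln_type: str,
    cvss_score: Optional[float],
    network_accessible: bool,
    requires_auth: bool,
) -> Tuple[str, Optional[str]]:
    """Return (adjusted_score, rationale); rationale is None when no rule fires."""
    if vuln_type in ALWAYS_HIGH:
        return "high", ALWAYS_HIGH[vuln_type]
    if vuln_type in HIGH_IF_CVSS_GE_6 and cvss_score is not None and cvss_score >= 6.0:
        return "high", HIGH_IF_CVSS_GE_6[vuln_type]
    if vuln_type in INJECTION_TYPES and network_accessible and not requires_auth:
        return "high", "Injection affects agents with database access"
    if vuln_type == "prototype_pollution" and score == "low":
        return "medium", "Prototype pollution can escalate in Node.js agents"
    if vuln_type == "cross_site_scripting" and score == "high" and not network_accessible:
        return "medium", "XSS has limited impact in headless agents"
    return score, None


print(adjust_for_type("medium", "path_traversal", 6.5, True, True))
```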
### 4. Exploit Availability
When `--check-exploits` is enabled, the analyzer checks reference URLs for public exploits:

**Exploit Indicators:**
- exploit-db.com / exploit-database.com
- packetstormsecurity.com
- github.com/exploit, github.com/poc
- metasploit framework modules
- URLs containing "/exploit", "/poc", "/proof-of-concept"

**Score Elevation:**
- `low` → `medium` (exploit available)
- `medium` → `high` (exploit available)
- `unknown` → `medium` (exploit available + CVSS > 0)

**Rationale**: Public exploits lower the skill barrier for attackers and increase the likelihood of automated exploitation.
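
The indicator matching and score elevation described above might look like the following. This is a sketch only: it assumes plain substring matching on the reference URLs with no network requests, and the names `EXPLOIT_INDICATORS`, `count_exploit_sources`, and `elevate_for_exploits` are hypothetical rather than taken from the analyzer.

```python
from typing import Iterable, Optional, Tuple

# Substrings derived from the indicator list above (matching strategy is assumed).
EXPLOIT_INDICATORS = (
    "exploit-db.com",
    "exploit-database.com",
    "packetstormsecurity.com",
    "github.com/exploit",
    "github.com/poc",
    "metasploit",
    "/exploit",
    "/poc",
    "/proof-of-concept",
)


def count_exploit_sources(references: Iterable[str]) -> int:
    """Count reference URLs that contain at least one exploit indicator."""
    return sum(
        1
        for url in references
        if any(marker in url.lower() for marker in EXPLOIT_INDICATORS)
    )


def elevate_for_exploits(
    score: str, references: Iterable[str], cvss_score: Optional[float] = None
) -> Tuple[str, int]:
    """Return (possibly elevated score, number of exploit sources found)."""
    count = count_exploit_sources(references)
    if count == 0:
        return score, 0
    if score == "low":
        return "medium", count
    if score == "medium":
        return "high", count
    if score == "unknown" and cvss_score is not None and cvss_score > 0:
        return "medium", count
    return score, count  # "high" stays "high"


refs = [
    "https://exploit-db.com/exploits/51234",
    "https://nvd.nist.gov/vuln/detail/CVE-2024-34567",
]
print(elevate_for_exploits("medium", refs, 6.5))
```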
## Scoring Algorithm
The analyzer follows this decision tree:
```
1. Parse CVSS score → set baseline (high/medium/low/unknown)
2. Parse CVSS vector → analyze attack characteristics
3. Adjust for attack vector:
   - Network-accessible + no auth + no UI → elevate to HIGH
   - Local-only access → reduce HIGH to MEDIUM
4. Adjust for vulnerability type:
   - Check against agent-specific risk categories
   - Elevate or reduce score based on deployment context
5. Check for public exploits (if enabled):
   - Elevate score if exploits detected
6. Generate rationale explaining the final score
```
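
Steps 3 and 6 of the decision tree can be sketched as follows. This is illustrative only: the function names are hypothetical, the input dictionary mirrors the `attack_vector_analysis` fields shown in feed entries, and treating every non-network vector as "local-only" is an assumption. The rationale wording imitates the sample outputs in the Examples section.

```python
from typing import Dict, List, Tuple


def adjust_for_attack_vector(score: str, analysis: Dict[str, object]) -> Tuple[str, List[str]]:
    """Step 3: adjust the baseline using the parsed attack characteristics."""
    network = bool(analysis["is_network_accessible"])
    needs_auth = bool(analysis["requires_authentication"])
    needs_ui = bool(analysis["requires_user_interaction"])
    if network and not needs_auth and not needs_ui:
        return "high", ["remotely exploitable without authentication"]
    if network:
        return score, ["network accessible"]
    # Assumption: anything that is not network-accessible is treated as local-only.
    return ("medium" if score == "high" else score), ["requires local access"]


def cvss_label(cvss_score: float) -> str:
    """Severity word used at the start of the rationale string."""
    if cvss_score >= 9.0:
        return "Critical"
    if cvss_score >= 7.0:
        return "High"
    if cvss_score >= 4.0:
        return "Medium"
    return "Low"


def build_rationale(cvss_score: float, notes: List[str]) -> str:
    """Step 6: join the CVSS summary and the collected notes with '; '."""
    return "; ".join([f"{cvss_label(cvss_score)} CVSS score ({cvss_score})"] + notes)


# A local privilege escalation with CVSS 8.8 (baseline "high").
analysis = {
    "is_network_accessible": False,
    "requires_authentication": True,
    "requires_user_interaction": False,
    "complexity": "low",
}
score, notes = adjust_for_attack_vector("high", analysis)
print(score, "|", build_rationale(8.8, notes))
```

Notes from the later steps (vulnerability type, exploit availability) would be appended to the same list before the rationale is built, which keeps the final string in evaluation order.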
## Examples
### Example 1: Critical RCE (High Exploitability)
```json
{
  "cve_id": "CVE-2024-12345",
  "cvss_score": 9.8,
  "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H",
  "type": "remote_code_execution",
  "description": "Unauthenticated RCE in Express.js framework"
}
```
**Analysis Output:**
```json
{
  "exploitability_score": "high",
  "exploitability_rationale": "Critical CVSS score (9.8); remotely exploitable without authentication; RCE is critical in agent deployments"
}
```
**Why HIGH**: Critical CVSS + network accessible + no auth + RCE type.
### Example 2: SSRF in Agent API (High Exploitability)
```json
{
  "cve_id": "CVE-2024-23456",
  "cvss_score": 7.3,
  "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:L",
  "type": "server_side_request_forgery",
  "description": "SSRF in webhook handler allows internal network access"
}
```
**Analysis Output:**
```json
{
  "exploitability_score": "high",
  "exploitability_rationale": "High CVSS score (7.3); remotely exploitable without authentication; SSRF affects agents making external requests"
}
```
**Why HIGH**: SSRF is critical for agents that make API calls (most do). Network-accessible without authentication elevates risk.
### Example 3: Path Traversal with Public Exploit (High Exploitability)
```json
{
  "cve_id": "CVE-2024-34567",
  "cvss_score": 6.5,
  "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N",
  "type": "path_traversal",
  "references": [
    "https://exploit-db.com/exploits/51234",
    "https://nvd.nist.gov/vuln/detail/CVE-2024-34567"
  ]
}
```
**Analysis Output (with --check-exploits):**
```json
{
  "exploitability_score": "high",
  "exploitability_rationale": "Medium CVSS score (6.5); network accessible; path traversal affects agents with file access; public exploit available (1 source)"
}
```
**Why HIGH**: Path traversal + agent file access + public exploit elevates medium CVSS to high exploitability.
### Example 4: XSS in Agent UI (Medium Exploitability)
```json
{
  "cve_id": "CVE-2024-45678",
  "cvss_score": 7.1,
  "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:C/C:L/I:L/A:L",
  "type": "cross_site_scripting",
  "description": "Stored XSS in agent management dashboard"
}
```
**Analysis Output:**
```json
{
  "exploitability_score": "medium",
  "exploitability_rationale": "High CVSS score (7.1); network accessible; XSS has limited impact in headless agents"
}
```
**Why MEDIUM**: Despite the high CVSS score, XSS is less critical in agent deployments because agents usually run headless. The vector also requires user interaction (UI:R).
### Example 5: Local Privilege Escalation (Medium Exploitability)
```json
{
  "cve_id": "CVE-2024-56789",
  "cvss_score": 8.8,
  "cvss_vector": "CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:C/C:H/I:H/A:H",
  "type": "privilege_escalation",
  "description": "Local privilege escalation via symbolic link attack"
}
```
**Analysis Output:**
```json
{
  "exploitability_score": "medium",
  "exploitability_rationale": "High CVSS score (8.8); requires local access"
}
```
**Why MEDIUM**: Despite high CVSS, requires local access. Agents typically run in containerized/sandboxed environments where local escalation has limited impact.
### Example 6: Prototype Pollution with Exploit (High Exploitability)
```json
{
  "cve_id": "CVE-2024-67890",
  "cvss_score": 5.3,
  "cvss_vector": "CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:L/A:N",
  "type": "prototype_pollution",
  "description": "Prototype pollution in lodash merge function",
  "references": [
    "https://github.com/exploit/prototype-pollution-poc",
    "https://snyk.io/vuln/SNYK-JS-LODASH-1234567"
  ]
}
```
**Analysis Output (with --check-exploits):**
```json
{
  "exploitability_score": "high",
  "exploitability_rationale": "Medium CVSS score (5.3); remotely exploitable without authentication; prototype pollution can escalate in Node.js agents; public exploit available (1 source)"
}
```
**Why HIGH**: Prototype pollution in Node.js agents + public exploit + network-accessible without auth = high risk despite moderate CVSS.
## Usage in ClawSec Workflows
### Automated Scoring (NVD Feed)
The `poll-nvd-cves.yml` workflow automatically scores new CVEs:
```bash
# Workflow step
python utils/analyze_exploitability.py --json --check-exploits < cve-data.json
```
Advisories in `advisories/feed.json` can include:
```json
{
  "id": "CVE-2024-12345",
  "severity": "high",
  "exploitability_score": "high",
  "exploitability_rationale": "Critical CVSS score (9.8); remotely exploitable without authentication; RCE is critical in agent deployments",
  "attack_vector_analysis": {
    "is_network_accessible": true,
    "requires_authentication": false,
    "requires_user_interaction": false,
    "complexity": "low"
  }
}
```
### Manual Analysis
Security researchers can analyze CVEs manually:
```bash
# Basic analysis
echo '{"cve_id":"CVE-2024-12345","cvss_score":7.3,"type":"ssrf"}' | \
  python utils/analyze_exploitability.py --json

# With exploit detection
echo '{"cve_id":"CVE-2024-12345","cvss_score":7.3,"references":["https://exploit-db.com/exploits/51234"]}' | \
  python utils/analyze_exploitability.py --json --check-exploits
```
### Filtering by Exploitability
Users can filter advisories by exploitability score:
```bash
# Get only high-exploitability advisories
curl -s https://clawsec.prompt.security/feed.json | \
  jq '.advisories[] | select(.exploitability_score == "high")'

# Prioritize by exploitability and severity
curl -s https://clawsec.prompt.security/feed.json | \
  jq '[.advisories[] | select(.exploitability_score == "high" and .severity == "critical")] | sort_by(.cvss_score) | reverse'
```
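
For consumers who prefer to process the feed in code, the same prioritization can be sketched in Python. This example works on an in-memory list shaped like the `advisories` array the `jq` filters read; the `RANK` ordering and the `prioritize` helper are hypothetical, and entries without an `exploitability_score` are assumed to rank as `unknown`.

```python
from typing import Dict, List

# Hypothetical ordering of the four exploitability levels.
RANK = {"high": 3, "medium": 2, "low": 1, "unknown": 0}


def prioritize(advisories: List[Dict]) -> List[Dict]:
    """Sort by exploitability level first, then by CVSS score, highest first."""
    return sorted(
        advisories,
        key=lambda a: (
            RANK.get(a.get("exploitability_score", "unknown"), 0),
            a.get("cvss_score") or 0.0,
        ),
        reverse=True,
    )


# Sample entries using the illustrative CVE IDs from the Examples section.
feed = {
    "advisories": [
        {"id": "CVE-2024-56789", "cvss_score": 8.8, "exploitability_score": "medium"},
        {"id": "CVE-2024-23456", "cvss_score": 7.3, "exploitability_score": "high"},
        {"id": "CVE-2024-12345", "cvss_score": 9.8, "exploitability_score": "high"},
    ]
}
for advisory in prioritize(feed["advisories"]):
    print(advisory["id"], advisory["exploitability_score"], advisory["cvss_score"])
```

Sorting on the exploitability rank before the CVSS score keeps a medium-exploitability advisory with a high CVSS score below any high-exploitability one, which matches the intent of the scoring levels.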
## Backfilling Existing Advisories (Historical Maintenance)
`scripts/backfill-exploitability.sh` is retained as a historical maintainer utility for one-off repository maintenance. It is not the primary path for normal advisory generation.

Preferred paths:
1. CI canonical path: run the NVD workflow with init/reset to rebuild advisories from NVD and sign artifacts in pipeline.
2. Local developer path: run `./scripts/populate-local-feed.sh --force` to repopulate local feeds with exploitability context.

Use backfill only when explicitly repairing legacy feed content that already exists in-repo.
## Community Contributions
Community members can submit exploitability assessments:
1. **Report via GitHub Issue**: Use the advisory template to report CVEs with exploitability context
2. **Automated Analysis**: The `community-advisory.yml` workflow automatically scores community-reported CVEs
3. **Manual Review**: Maintainers review and approve exploitability assessments
4. **Feed Update**: Approved advisories are added to the feed with exploitability scores
## Limitations and Future Work
### Current Limitations
1. **Static Analysis**: Scoring is based on CVE metadata, not dynamic runtime analysis
2. **No Version Detection**: Doesn't check if specific versions are vulnerable
3. **Binary Classification**: Doesn't consider partial mitigations or defense-in-depth
4. **Limited Context**: Doesn't know exact agent configuration or deployed tools
### Future Enhancements
1. **EPSS Integration**: Incorporate EPSS (Exploit Prediction Scoring System) probability scores
2. **KEV Matching**: Cross-reference with CISA KEV (Known Exploited Vulnerabilities) catalog
3. **Agent Profiling**: Consider deployed agent capabilities and exposed APIs
4. **Mitigation Detection**: Check for WAF rules, sandboxing, or other compensating controls
5. **ML-Based Scoring**: Use machine learning to predict exploitability based on historical data
## References
- **CVSS v3.1 Specification**: [https://www.first.org/cvss/v3.1/specification-document](https://www.first.org/cvss/v3.1/specification-document)
- **CVSS v2 Guide**: [https://www.first.org/cvss/v2/guide](https://www.first.org/cvss/v2/guide)
- **EPSS**: [https://www.first.org/epss/](https://www.first.org/epss/)
- **CISA KEV**: [https://www.cisa.gov/known-exploited-vulnerabilities-catalog](https://www.cisa.gov/known-exploited-vulnerabilities-catalog)
- **NVD API**: [https://nvd.nist.gov/developers/vulnerabilities](https://nvd.nist.gov/developers/vulnerabilities)
## Contributing
To improve the exploitability scoring methodology:
1. **Submit Test Cases**: Add test cases to `utils/analyze_exploitability.py`
2. **Report False Positives/Negatives**: Open GitHub issues with CVE examples
3. **Propose Scoring Adjustments**: Submit PRs with rationale and examples
4. **Share Agent Context**: Contribute agent-specific vulnerability patterns

See [CONTRIBUTING.md](../CONTRIBUTING.md) for detailed contribution guidelines.
---
**Maintained by**: [Prompt Security](https://prompt.security)
**License**: AGPL-3.0-or-later
**Last Updated**: 2026-03-01