Playbook: LLM Data Poisoning Response
ID: PB-52
Severity: Critical | Category: AI/ML Security
MITRE ATLAS: AML.T0020 (Poison Training Data) | MITRE ATT&CK: T1565 (Data Manipulation)
Trigger: Model accuracy degradation, unexpected outputs, training data integrity alert, third-party data compromise
Data Poisoning IR Flow
graph LR
Detect["🚨 Detect"] --> Analyze["🔍 Analyze"]
Analyze --> Contain["🔒 Contain"]
Contain --> Eradicate["🗑️ Eradicate"]
Eradicate --> Recover["♻️ Recover"]
Recover --> Lessons["📝 Lessons"]
style Detect fill:#e74c3c,color:#fff
style Analyze fill:#f39c12,color:#fff
style Contain fill:#e67e22,color:#fff
style Eradicate fill:#27ae60,color:#fff
style Recover fill:#2980b9,color:#fff
style Lessons fill:#8e44ad,color:#fff
1. Analysis (Triage)
1.1 Initial Assessment
| Check | How | Tool |
|---|---|---|
| Model performance change | Compare metrics (accuracy, F1) against baseline | MLOps monitoring |
| Training data integrity | Audit recent training data additions | Data pipeline logs |
| Third-party data sources | Check for compromised external data feeds | Vendor notifications |
| Fine-tuning logs | Review recent fine-tuning/RLHF sessions | Training platform |
| RAG knowledge base | Scan for injected/modified documents | Document versioning |
| Embedding/vector integrity | Review recent embedding rebuilds and vector store changes | Vector DB audit logs |
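The first check, comparing metrics against a baseline, could be sketched as follows. This is a minimal illustration, not a prescribed method: the `compare_to_baseline` helper, the `MetricCheck` record, and the `max_drop=0.02` threshold are all assumptions chosen for the example, and the metric values would come from whatever MLOps monitoring is in place.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricCheck:
    name: str
    baseline: float
    current: float
    drop: float      # baseline minus current; positive means the metric got worse
    flagged: bool


def compare_to_baseline(baseline, current, max_drop=0.02):
    """Flag each metric whose absolute drop from baseline exceeds max_drop.

    max_drop=0.02 is an illustrative threshold, not a recommended value.
    Metrics missing from `current` are flagged too, on the assumption that
    a gap in monitoring should be triaged rather than ignored.
    """
    results = []
    for name, base_value in sorted(baseline.items()):
        if name not in current:
            nan = float("nan")
            results.append(MetricCheck(name, base_value, nan, nan, True))
            continue
        drop = base_value - current[name]
        results.append(
            MetricCheck(name, base_value, current[name], drop, drop > max_drop)
        )
    return results


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    checks = compare_to_baseline(
        baseline={"accuracy": 0.91, "f1": 0.88},
        current={"accuracy": 0.84, "f1": 0.875},
    )
    for check in checks:
        print(check.name, "FLAG" if check.flagged else "ok")
```

A fixed absolute threshold is the simplest possible rule; teams that track metric variance over time may prefer a threshold derived from that history.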
1.2 Poisoning Type Classification
| Type | Description | Severity |
|---|---|---|
| Training data poisoning | Malicious samples injected into training set | Critical |
| RAG knowledge poisoning | Corrupted documents in retrieval pipeline | High |
| RLHF manipulation | Biased human feedback during alignment | Critical |
| Fine-tuning backdoor | Trigger phrase activates hidden behavior | Critical |
| Label flipping | Incorrect labels on training examples | High |
1.3 Scope Assessment
2. Containment
| # | Action | Tool | Done |
|---|---|---|---|
| 1 | Roll back to last known-good model checkpoint | MLOps platform | ☐ |
| 2 | Halt all ongoing training/fine-tuning jobs | Training orchestrator | ☐ |
| 3 | Quarantine suspected training data sources | Data pipeline | ☐ |
| 4 | Enable enhanced output monitoring | LLM monitoring | ☐ |
| 5 | Notify downstream consumers of potential data quality issues | Communication channels | ☐ |
| 6 | Freeze automated or third-party data ingestion until provenance is verified | Data pipeline / ETL | ☐ |
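For file-based training data, the quarantine step (action 3) might look like the sketch below. It assumes suspect data lives as files on a filesystem the responder can write to; `quarantine_files` and `sha256_of` are hypothetical helpers, and a real pipeline would also need to revoke the training jobs' read access to the original location.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file in 1 MiB chunks so large datasets do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def quarantine_files(suspect_paths, quarantine_dir):
    """Move suspect files into quarantine_dir and return {original_path: sha256}.

    The hash is computed before the move so the manifest describes each file
    exactly as it was found, which keeps it usable as evidence later.
    """
    quarantine_dir = Path(quarantine_dir)
    quarantine_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for raw in suspect_paths:
        source = Path(raw)
        file_hash = sha256_of(source)
        # Prefix with a hash fragment so same-named files from
        # different sources do not overwrite each other.
        target = quarantine_dir / f"{file_hash[:12]}_{source.name}"
        shutil.move(str(source), str(target))
        manifest[str(source)] = file_hash
    return manifest
```

Moving rather than copying is a deliberate choice here: it ensures a still-running job cannot keep reading the suspect file from its original path.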
2.2 If RAG Knowledge Base Was Poisoned
| # | Action | Done |
|---|---|---|
| 1 | Disable affected document collections from retrieval | ☐ |
| 2 | Audit all documents added in the suspect time window | ☐ |
| 3 | Restore documents from known-good backup | ☐ |
| 4 | Re-index clean knowledge base | ☐ |
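The audit and restore steps above can be approximated with two small helpers. This is a sketch under assumptions: documents are represented as dicts with hypothetical `id` and `added_at` fields, and content hashes for both the live knowledge base and the known-good backup are already available as `{doc_id: hash}` maps. Field names and data access will differ per document store.

```python
from datetime import datetime, timezone


def documents_in_window(documents, start, end):
    """Return documents whose 'added_at' falls within [start, end], oldest first."""
    hits = [doc for doc in documents if start <= doc["added_at"] <= end]
    return sorted(hits, key=lambda doc: doc["added_at"])


def diff_against_backup(current, backup):
    """Compare {doc_id: content_hash} maps and report what changed since backup."""
    current_ids, backup_ids = set(current), set(backup)
    return {
        "added": sorted(current_ids - backup_ids),
        "modified": sorted(
            doc_id
            for doc_id in current_ids & backup_ids
            if current[doc_id] != backup[doc_id]
        ),
        "removed": sorted(backup_ids - current_ids),
    }


if __name__ == "__main__":
    # Made-up window and documents purely for illustration.
    window_start = datetime(2024, 1, 10, tzinfo=timezone.utc)
    window_end = datetime(2024, 1, 12, tzinfo=timezone.utc)
    docs = [
        {"id": "policy-faq", "added_at": datetime(2024, 1, 9, tzinfo=timezone.utc)},
        {"id": "pricing-note", "added_at": datetime(2024, 1, 11, tzinfo=timezone.utc)},
    ]
    for doc in documents_in_window(docs, window_start, window_end):
        print("review:", doc["id"])
```

Documents reported as `added` or `modified` are the candidates for manual review; anything in `removed` may indicate deletion of legitimate content and is worth restoring from the backup as well.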
3. IoC Collection
| Type | Value | Source |
|---|---|---|
| Compromised data source | | Data pipeline logs |
| Suspect training samples | | Training dataset audit |
| Modified documents | | Version control diff |
| Model checkpoint before/after | | MLOps registry |
| Affected topics/domains | | Output analysis |
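If the IoC table needs to be shared with other tooling, one option is to capture each row as a structured record. The sketch below mirrors the three columns above; the `Indicator` class, the `to_json_lines` helper, and the JSON Lines output format are assumptions for illustration, and the value shown is a placeholder rather than a real indicator.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Indicator:
    ioc_type: str   # "Type" column, e.g. "Compromised data source"
    value: str      # "Value" column, filled in during the investigation
    source: str     # "Source" column, where the indicator was observed


def to_json_lines(indicators):
    """Serialize indicators as JSON Lines, one record per line, keys sorted."""
    return "\n".join(json.dumps(asdict(item), sort_keys=True) for item in indicators)


if __name__ == "__main__":
    # Placeholder value only; replace with findings from the investigation.
    rows = [Indicator("Compromised data source", "example-feed", "Data pipeline logs")]
    print(to_json_lines(rows))
```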
4. Escalation Criteria
| Condition | Escalate To |
|---|---|
| Production model serving poisoned outputs | CTO + AI Team Lead |
| Customer-facing decisions affected | Legal + Product |
| Third-party data provider compromised | Vendor Management + Procurement |
| Regulatory compliance at risk (financial/medical AI) | Compliance + Legal |
| Deliberate targeted attack confirmed | CISO + IR Team |
5. Decision Matrix
| Condition | Decision | Owner | SLA |
|---|---|---|---|
| Performance anomaly explained by benign model drift or expected data refresh | Keep service running, monitor, and document | SOC Analyst + AI Team | Same business day |
| Suspect poisoned records or documents, but no production impact confirmed | Freeze ingestion and continue scoped investigation | Security Engineer + SOC Manager | 30 minutes |
| Production outputs are poisoned or model integrity is untrusted | Roll back to known-good state and contain immediately | IR Engineer + AI Team Lead | Immediate |
| Customer impact, regulated decisions, or third-party provider compromise confirmed | Notify legal, compliance, procurement, and executives as needed | SOC Manager + CISO | Per incident policy |
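The matrix above can be encoded as a small lookup so that tooling and responders apply it consistently. The sketch below is one possible reading, not part of the playbook: the `decide` function and its three boolean inputs are hypothetical, and it assumes the first three rows are mutually exclusive (most severe wins) while the notification row stacks on top of whichever of them applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    action: str
    owner: str
    sla: str


MONITOR = Decision(
    "Keep service running, monitor, and document",
    "SOC Analyst + AI Team",
    "Same business day",
)
FREEZE = Decision(
    "Freeze ingestion and continue scoped investigation",
    "Security Engineer + SOC Manager",
    "30 minutes",
)
ROLLBACK = Decision(
    "Roll back to known-good state and contain immediately",
    "IR Engineer + AI Team Lead",
    "Immediate",
)
NOTIFY = Decision(
    "Notify legal, compliance, procurement, and executives as needed",
    "SOC Manager + CISO",
    "Per incident policy",
)


def decide(production_impact, poisoning_suspected, external_impact):
    """Return the applicable decisions, containment decision first.

    Assumed precedence: confirmed production impact overrides suspicion,
    and suspicion overrides the benign-drift default.
    """
    if production_impact:
        decisions = [ROLLBACK]
    elif poisoning_suspected:
        decisions = [FREEZE]
    else:
        decisions = [MONITOR]
    if external_impact:
        # Customer, regulatory, or third-party impact adds notification
        # on top of the containment decision rather than replacing it.
        decisions.append(NOTIFY)
    return decisions
```

For example, `decide(production_impact=True, poisoning_suspected=True, external_impact=True)` yields the rollback decision followed by the notification decision.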
6. Recovery
7. Post-Incident
Detection Rules (Sigma)
References