Playbook PB-35: Suspicious Data Collection
Severity: High | Category: Collection | MITRE: T1560, T1119, T1115, T1074, T1213, T1005
Decision Flow
graph TD
Alert["🚨 Data Collection Alert"] --> Auth{"👤 Authorized?"}
Auth -->|Normal role| Close["✅ Close"]
Auth -->|Unusual| Volume{"📊 Volume?"}
Volume -->|Normal| Monitor["👁️ Monitor 48h"]
Volume -->|Excessive| Escalate["🔴 Escalate Tier 2"]
Escalate --> Contain["🔌 Isolate + Investigate"]
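
The decision flow above can also be read as a small function. The following is a minimal sketch, not a product integration: `CollectionAlert`, its two boolean fields, and `Outcome` are hypothetical names introduced for illustration, and both fields stand for analyst judgments rather than values supplied by any specific tool.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    CLOSE = "close"
    MONITOR_48H = "monitor for 48 hours"
    ESCALATE_TIER2 = "escalate to Tier 2, then isolate and investigate"


@dataclass
class CollectionAlert:
    # Hypothetical fields: both are analyst judgments, not product fields.
    access_matches_role: bool  # "Authorized?" answered as "Normal role"
    volume_excessive: bool     # "Volume?" answered as "Excessive"


def triage(alert: CollectionAlert) -> Outcome:
    """Walk the decision flow from top to bottom."""
    if alert.access_matches_role:
        return Outcome.CLOSE
    if not alert.volume_excessive:
        return Outcome.MONITOR_48H
    return Outcome.ESCALATE_TIER2
```

Encoding the flow this way keeps the order of the questions explicit: volume is only considered once the access has been judged unusual for the role.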
Data Staging Process
graph LR
Collect["📁 Collect Files"] --> Archive["📦 Archive"]
Archive --> Stage["📂 Staging Dir"]
Stage --> Method{"📡 Exfil Method?"}
Method -->|Web| Cloud["☁️ Cloud Upload"]
Method -->|USB| USB["💾 Removable"]
Method -->|Email| Mail["📧 Email"]
style Collect fill:#3498db,color:#fff
style Stage fill:#f39c12,color:#fff
style Cloud fill:#e74c3c,color:#fff
UEBA Detection Sequence
sequenceDiagram
participant User
participant DLP
participant UEBA
participant SOC
User->>DLP: Access 500+ files
DLP->>UEBA: Volume anomaly
UEBA->>UEBA: Compare baseline
UEBA->>SOC: 🚨 Risk score exceeded
SOC->>SOC: Review data type + user role
SOC->>DLP: Block further access
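
The "Compare baseline" step in the sequence above is not specified further in this playbook. One common way to approximate it is a z-score against the user's own history, sketched below. The function names, the use of daily file-access counts, and the threshold of 3 standard deviations are all assumptions made for illustration; a real UEBA product will use its own scoring model.

```python
from statistics import mean, stdev


def volume_risk_score(baseline_counts: list[int], observed: int) -> float:
    """Return how many standard deviations `observed` sits above the baseline mean."""
    if len(baseline_counts) < 2:
        raise ValueError("need at least two baseline observations")
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)
    if sigma == 0:
        # Perfectly flat baseline: any increase is treated as maximally unusual.
        return float("inf") if observed > mu else 0.0
    return max(0.0, (observed - mu) / sigma)


def exceeds_risk_threshold(baseline_counts: list[int], observed: int,
                           threshold: float = 3.0) -> bool:
    """Assumed rule: alert the SOC when the score reaches `threshold`."""
    return volume_risk_score(baseline_counts, observed) >= threshold
```

For example, with a hypothetical baseline of `[40, 55, 48, 60, 52]` files per day, an observation of 500 files would exceed the threshold, while 55 files would not.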
Description
Before exfiltrating data, an attacker first gathers it from within the environment. Typical collection activity includes staging files in temporary directories, bulk access to SharePoint/OneDrive, archiving data with compression tools, clipboard capture, and automated collection scripts. Collection is a precursor to exfiltration and indicates that the attacker already has access to the data they are targeting.
Detection Sources
| Source | Alert Examples |
|---|---|
| EDR | Suspicious archiver usage (7z, rar, zip from temp dirs), data staging |
| SIEM | Bulk file access, unusual SharePoint/OneDrive downloads |
| DLP | Sensitive data in archives, clipboard monitoring alerts |
| Cloud | Mass document downloads from Teams/SharePoint/Google Drive |
| Network | Large internal data transfers, staging to file shares |
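
As an illustration of the SIEM "bulk file access" alert, the sketch below counts distinct files per user inside a sliding time window. The defaults of 100 files and 10 minutes follow the Bulk Access example in the Key Indicators table of this playbook. The class name and event shape are hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class BulkAccessDetector:
    """Flags a user who touches many distinct files within a short window."""

    def __init__(self, max_files: int = 100,
                 window: timedelta = timedelta(minutes=10)):
        self.max_files = max_files
        self.window = window
        self._events: dict[str, deque] = {}  # user -> deque of (time, path)

    def record(self, user: str, path: str, when: datetime) -> bool:
        """Record one access; return True once the threshold is reached."""
        events = self._events.setdefault(user, deque())
        events.append((when, path))
        # Drop accesses that have fallen out of the window.
        cutoff = when - self.window
        while events and events[0][0] < cutoff:
            events.popleft()
        distinct_paths = {p for _, p in events}
        return len(distinct_paths) >= self.max_files
```

Counting distinct paths rather than raw events avoids flagging a user who repeatedly opens the same handful of files.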
Triage Checklist
| # | Step | Action |
|---|---|---|
| 1 | Identify the user | Who is collecting data? Is the access consistent with an authorized role? |
| 2 | Check data type | What files/data are being accessed? Sensitive? Classified? |
| 3 | Volume | How much data? Unusual volume for this user/role? |
| 4 | Staging | Are files being copied to temp directories, USB, or cloud? |
| 5 | Tools | Compression tools used (7z, rar, WinRAR, tar)? Are the archives password-protected? |
| 6 | Timeline | Does this correlate with discovery or credential access alerts? |
Response Actions
Tier 1
- Document the user, files accessed, volume, and destination
- Check if user's role justifies the data access pattern
- If unauthorized volume/type → Escalate to Tier 2
Tier 2
- Review full file access history for the user (past 30 days)
- Check for preceding compromise indicators
- Investigate staging locations (temp dirs, file shares, cloud storage)
- If confirmed malicious:
  - Isolate the source host
  - Revoke cloud/file share access
  - Preserve staging artifacts for forensics
  - Block egress for the source IP
Tier 3
- Full forensic analysis of staging directories
- Analyze archives for content classification
- Assess data breach scope — what data was collected?
- Coordinate with Legal if PII/sensitive data involved
- Update DLP rules to detect similar patterns
Containment
| Action | Method | Approval |
|---|---|---|
| Isolate source host | EDR network isolation | SOC Lead |
| Revoke access | SharePoint/OneDrive/IAM | SOC Lead |
| Block egress | Firewall rules for source IP | SOC Lead + Network |
| Preserve evidence | Forensic image of staging location | IR Lead |
| Disable auto-forward rules | Exchange admin / Google admin | SOC Lead |
Eradication
| # | Action | Done |
|---|---|---|
| 1 | Remove staging archives and temp files | ☐ |
| 2 | Delete archiving tools not part of baseline | ☐ |
| 3 | Remove unauthorized scripts (PowerShell, Python) | ☐ |
| 4 | Clear scheduled tasks created for data collection | ☐ |
| 5 | Revoke OAuth tokens used for cloud access | ☐ |
| 6 | Remove malicious email forwarding rules | ☐ |
IoC Collection
| Type | Value | Source |
|---|---|---|
| Source host / IP | | EDR / SIEM |
| User account | | AD / IAM logs |
| Archive file name | | File system / DLP |
| Archive hash (SHA256) | | Forensic analysis |
| Staging directory path | | EDR telemetry |
| Destination (upload URL / email) | | Proxy / email logs |
| Volume of data collected (MB/GB) | | DLP / Cloud audit |
| Tools used | | Process logs |
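
The "Archive hash (SHA256)" value in the table above can be computed with Python's standard `hashlib` module. The sketch below reads the file in chunks so that large archives do not have to fit in memory. The function name and chunk size are illustrative choices, and it is assumed that hashing is done on a forensic copy rather than on the original evidence.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The returned hex string can be pasted directly into the Value column and compared against EDR or DLP records of the same file.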
Escalation Criteria
| Condition | Escalate To |
|---|---|
| PII / customer data in staging archive | Legal + DPO (PDPA) |
| > 1 GB of data archived/staged | SOC Lead + IR |
| Source code or trade secrets identified | CISO + Legal |
| Collection followed by confirmed exfiltration | Tier 3 + CISO |
| Multiple departments' data accessed | SOC Manager + Data Owners |
| Executive/C-level data targeted | CISO + CEO |
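
Because several conditions can hold at once, it may help to see the table above as a set of independent rules whose results are merged. The sketch below is one way to express that. The `Findings` fields are hypothetical names for what the investigation has established, and "1 GB" is assumed to mean 10^9 bytes, since the playbook does not say whether GB or GiB is intended.

```python
from dataclasses import dataclass

# Assumption: "1 GB" is read as 10**9 bytes.
ONE_GB = 10 ** 9


@dataclass
class Findings:
    """Hypothetical summary of what the investigation has established."""
    pii_in_archive: bool = False
    staged_bytes: int = 0
    source_code_or_trade_secrets: bool = False
    exfiltration_confirmed: bool = False
    departments_accessed: int = 1
    executive_data_targeted: bool = False


def escalation_targets(findings: Findings) -> list[str]:
    """Return every party to notify, in table order, without duplicates."""
    rules = [
        (findings.pii_in_archive, ["Legal", "DPO"]),
        (findings.staged_bytes > ONE_GB, ["SOC Lead", "IR"]),
        (findings.source_code_or_trade_secrets, ["CISO", "Legal"]),
        (findings.exfiltration_confirmed, ["Tier 3", "CISO"]),
        (findings.departments_accessed > 1, ["SOC Manager", "Data Owners"]),
        (findings.executive_data_targeted, ["CISO", "CEO"]),
    ]
    targets: list[str] = []
    for condition, parties in rules:
        if condition:
            targets.extend(p for p in parties if p not in targets)
    return targets
```

For instance, an incident with PII in a 2 GB staging archive would yield Legal, DPO, SOC Lead, and IR under these assumptions.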
Recovery
Post-Incident
Key Indicators
| Indicator | Example |
|---|---|
| Archiver Tools | 7z a -p, rar a -hp, tar -czf, Compress-Archive (PowerShell) |
| Staging Paths | C:\Temp\, C:\Users\Public\, /tmp/, %APPDATA%\Temp\ |
| Bulk Access | 100+ files accessed in < 10 min, SharePoint bulk download |
| Cloud | OneDrive sync of entire department folder, Google Takeout |
| Clipboard | Clipboard capture tools, screen recording software |
| Email | Mass email forwarding rules, auto-forward to external addresses |
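
The Archiver Tools and Staging Paths rows above lend themselves to simple command-line matching. The sketch below checks a process command line against those examples. The pattern names and the `match_indicators` function are hypothetical, and the regular expressions are deliberately narrow illustrations of the listed examples rather than production detection rules. Note that environment variables such as `%APPDATA%` usually appear expanded in process telemetry, so that entry would need adapting to the expanded path.

```python
import re

# Patterns mirror the Archiver Tools examples; they are illustrative only.
ARCHIVER_PATTERNS = {
    "7z password archive": re.compile(
        r"\b7z(?:\.exe)?\s+a\b.*\s-p", re.IGNORECASE),
    "rar encrypted archive": re.compile(
        r"\b(?:win)?rar(?:\.exe)?\s+a\b.*\s-hp", re.IGNORECASE),
    "tar gzip archive": re.compile(
        r"\btar(?:\.exe)?\s+-czf\b", re.IGNORECASE),
    "PowerShell Compress-Archive": re.compile(
        r"\bCompress-Archive\b", re.IGNORECASE),
}

# Lowercased copies of the Staging Paths examples.
STAGING_PATHS = (
    "c:\\temp\\",
    "c:\\users\\public\\",
    "/tmp/",
    "%appdata%\\temp\\",  # literal form; expanded paths need their own entry
)


def match_indicators(command_line: str) -> list[str]:
    """Return the names of all indicators found in one command line."""
    hits = [name for name, pattern in ARCHIVER_PATTERNS.items()
            if pattern.search(command_line)]
    lowered = command_line.lower()
    hits += [f"staging path {path}" for path in STAGING_PATHS
             if path in lowered]
    return hits
```

A command line that matches both an archiver pattern and a staging path is a stronger signal than either match alone, which is why the function returns all hits instead of stopping at the first.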
Data Classification Impact
| Data Type | Risk | Notification |
|---|---|---|
| PII (General) | High | DPO within 72 hours |
| PII (Sensitive) | Critical | DPO immediately, PDPC notification |
| Financial | High | CFO, Legal |
| Trade Secrets | Critical | CEO, Legal |
| Source Code | High | CTO, Engineering Lead |
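
The notification column above can be turned into a concrete deadline once the data type is known. The sketch below is a lookup under stated assumptions: the clock is assumed to start when the data type is confirmed, "immediately" is modeled as a zero-length allowance, and rows without a stated time limit get no deadline. The table does not say when the PDPC notification is due, so the sketch does not assign it a separate deadline. `CLASSIFICATION` and `notification_plan` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# data type -> (risk, parties to notify, time allowed before notifying)
# Assumption: "immediately" is modeled as timedelta(0); None means the
# table states no time limit.
CLASSIFICATION = {
    "PII (General)": ("High", ["DPO"], timedelta(hours=72)),
    "PII (Sensitive)": ("Critical", ["DPO", "PDPC"], timedelta(0)),
    "Financial": ("High", ["CFO", "Legal"], None),
    "Trade Secrets": ("Critical", ["CEO", "Legal"], None),
    "Source Code": ("High", ["CTO", "Engineering Lead"], None),
}


def notification_plan(data_type: str, confirmed_at: datetime) -> dict:
    """Look up risk, recipients, and (where stated) the notification deadline."""
    risk, notify, allowed = CLASSIFICATION[data_type]
    deadline = None if allowed is None else confirmed_at + allowed
    return {"risk": risk, "notify": notify, "deadline": deadline}


if __name__ == "__main__":
    confirmed = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(notification_plan("PII (General)", confirmed))
```

An unknown data type raises `KeyError`, which is intentional here: an unclassified data type should be resolved by an analyst, not silently defaulted.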
DLP Architecture
graph LR
Endpoint["💻 Endpoint DLP"] --> SIEM["📊 SIEM"]
Network["🌐 Network DLP"] --> SIEM
Cloud["☁️ Cloud DLP"] --> SIEM
Email["📧 Email DLP"] --> SIEM
SIEM --> SOC["🎯 SOC Alert"]
style Endpoint fill:#3498db,color:#fff
style Cloud fill:#27ae60,color:#fff
style SOC fill:#e74c3c,color:#fff
Insider Data Theft Indicators
graph TD
UEBA["🔍 UEBA"] --> Type{"📋 Indicator?"}
Type --> Resign["📝 Upcoming resignation"]
Type --> Hours["🕐 Off-hours access"]
Type --> Volume["📊 Excessive downloads"]
Type --> USB["💾 USB data copy"]
Resign --> Risk["⚠️ High Risk"]
Hours --> Risk
Volume --> Risk
USB --> Risk
style Risk fill:#e74c3c,color:#fff
Detection Rules (Sigma)
References