All examples for Observability Cost Optimizer

Datadog Log Cost Reduction

Optimize Datadog log costs from $400/month to $140/month through exclusion filters, agent-level filtering, and retention optimization with copy-paste configurations.

Input

What you provide to the skill

Our Datadog logs cost $400/month on 200GB. How can we cut this in half?

Output

What the skill generates for you

Log Cost Reduction Strategy for Datadog (200GB, $400/month)

Current Situation Analysis

  • Log ingestion: 200GB/month
  • Cost: $400/month (~$2.00/GB blended rate)
  • Target: $200/month (100GB or less)
  • Required reduction: 50%

Based on Datadog’s 2026 pricing model:

  • Ingest cost: $0.10 per GB (collect, process, archive)
  • Index cost: $1.70 per million log events (15-day retention)
  • Your $2.00/GB blended rate indicates you’re paying both ingestion AND indexing costs
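These rates can be sanity-checked against the blended figure with a quick back-of-envelope sketch. The ~1KB average event size (so 1GB ≈ 1 million events) is an assumption, not a Datadog figure:

```python
# Back-of-envelope check of the ~$2.00/GB blended rate.
# Assumption (not from Datadog): average event size ~1KB, so 1GB ~= 1M events.

INGEST_PER_GB = 0.10       # $ per GB ingested (collect, process, archive)
INDEX_PER_M_EVENTS = 1.70  # $ per million indexed events (15-day retention)
EVENTS_PER_GB_M = 1.0      # millions of events per GB at ~1KB/event

def blended_rate(indexed_fraction):
    """Approximate $/GB when `indexed_fraction` of ingested logs is also indexed."""
    return INGEST_PER_GB + INDEX_PER_M_EVENTS * EVENTS_PER_GB_M * indexed_fraction

print(blended_rate(1.0))   # ~1.80 $/GB when everything is indexed,
                           # in the ballpark of the observed $2.00/GB
```

Indexing everything lands near the observed blended rate, which is why the tactics below focus on indexing less, not just ingesting less.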

Cost Audit Findings

Based on typical Datadog usage patterns, your 200GB likely breaks down as:

| Log Source | Est. Volume | % of Total | Usually Needed? |
|---|---|---|---|
| Application logs (INFO level) | 60GB | 30% | Partially |
| Web server access logs (200 OK) | 70GB | 35% | Only errors |
| Debug/trace logs | 30GB | 15% | Rarely |
| Health check requests | 20GB | 10% | No |
| Infrastructure logs | 20GB | 10% | Yes |

Optimization Strategy (Prioritized by Impact)

Tactic 1: Exclusion Filters (40-50% reduction, 30 min setup)

Add these filters in Datadog → Logs → Configuration → Indexes:

Filter 1: Drop debug/trace logs

Query: @level:debug OR @level:trace
Action: Exclusion filter with 100% sampling
Estimated reduction: 30GB (15%)

Filter 2: Drop health check spam

Query: @http.url_details.path:/health* OR @http.url_details.path:/ping OR @http.url_details.path:/metrics
Action: Exclusion filter with 100% sampling
Estimated reduction: 20GB (10%)

Filter 3: Sample successful requests (keep 10%)

Query: @http.status_code:[200 TO 299] NOT @duration:>1000
Action: Exclusion filter with 90% sampling (keeps 10%)
Estimated reduction: 63GB (32%)

Filter 4: Drop verbose third-party library logs

Query: @logger_name:boto3 OR @logger_name:aws-sdk OR @logger_name:urllib3
Action: Exclusion filter with 95% sampling
Estimated reduction: 10GB (5%)

Total reduction from filters: 123GB → 77GB remaining (61% reduction)
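The arithmetic behind that total, using the audit estimates above, can be sketched as:

```python
# Estimated monthly volume removed by each exclusion filter (GB).
filter_reduction_gb = {
    "drop debug/trace": 30,
    "drop health checks": 20,
    "sample 2xx requests (keep 10%)": 63,   # 90% of 70GB of access logs
    "drop verbose third-party libs": 10,    # 95% of ~10.5GB
}

total_gb = 200
removed_gb = sum(filter_reduction_gb.values())
remaining_gb = total_gb - removed_gb
print(removed_gb, remaining_gb)  # 123 GB removed, 77 GB remaining (~61%)
```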

Tactic 2: Agent-Level Filtering (Additional 10-15% reduction, 1 hour)

Filter logs before they reach Datadog to avoid ingestion costs entirely. Configure your Datadog Agent with log_processing_rules:

Example configuration (datadog.yaml or service config):

logs:
  - type: file
    path: /var/log/application/*.log
    service: your-service
    source: python
    log_processing_rules:
      # Exclude health checks completely
      - type: exclude_at_match
        name: exclude_healthchecks
        pattern: /health|/ping|/metrics
      # Exclude debug and trace logs
      - type: exclude_at_match
        name: exclude_debug
        pattern: '"level":"debug"|"level":"trace"'
      # Exclude successful requests with fast response times (<1000ms)
      - type: exclude_at_match
        name: exclude_fast_success
        pattern: '"status":200.*"duration":[0-9]{1,3}\b'

Expected reduction: 10-15GB additional savings (77GB → 62-67GB)
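Since exclude_at_match rules are plain regexes, they can be smoke-tested locally before shipping to the Agent. A sketch with made-up sample log lines, using patterns equivalent to those in the config above:

```python
import re

# Regexes equivalent to the log_processing_rules patterns above.
patterns = {
    "exclude_healthchecks": r"/health|/ping|/metrics",
    "exclude_debug": r'"level":"debug"|"level":"trace"',
    "exclude_fast_success": r'"status":200.*"duration":[0-9]{1,3}\b',
}

def is_excluded(line):
    """True if any exclusion rule would drop this log line."""
    return any(re.search(p, line) for p in patterns.values())

samples = [
    '{"status":200,"path":"/health","duration":2}',         # dropped: health check
    '{"level":"debug","msg":"cache miss"}',                 # dropped: debug
    '{"status":200,"path":"/api/orders","duration":85}',    # dropped: fast 2xx
    '{"status":200,"path":"/api/orders","duration":1450}',  # kept: slow request
    '{"status":500,"path":"/api/orders","duration":12}',    # kept: error
]

for line in samples:
    print(is_excluded(line), line)
```

Note that the fast-success pattern only matches 1-3 digit durations, so slow requests (1000ms and up) are kept, which preserves the performance signal.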

Tactic 3: Use Archives + Rehydration (Cost optimization without data loss)

Keep all logs but only index what you need for active searching:

  1. Enable archiving to S3/GCS/Azure (included with $0.10/GB ingestion)
  2. Index only critical logs (errors, warnings, slow requests)
  3. Rehydrate on-demand when investigating issues ($0.10/GB scan)
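To gauge what on-demand rehydration would actually cost during an incident, a rough sketch (assumes the $0.10/GB scan rate above and evenly spread daily volume):

```python
# Rough rehydration cost for investigating one day of logs.
REHYDRATE_PER_GB = 0.10  # $ per GB scanned on rehydration
monthly_gb = 200
daily_gb = monthly_gb / 30

one_day_cost = round(daily_gb * REHYDRATE_PER_GB, 2)
print(one_day_cost)  # ~$0.67 to rehydrate a full day of logs
```

At well under a dollar per incident-day, occasional rehydration is far cheaper than indexing everything up front.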

Configuration:

  • Go to Logs → Configuration → Archives
  • Set up archive to your cloud storage (S3/GCS/Azure)
  • Use exclusion filters to prevent indexing low-value logs
  • Logs remain queryable via Live Tail and can be rehydrated when needed

Cost impact: Save on indexing costs ($1.70 per million events) while keeping full archive

Cost Projection After Optimization

| Optimization Phase | Log Volume | Monthly Cost | Savings |
|---|---|---|---|
| Current state | 200GB | $400 | - |
| After Tactic 1 (exclusion filters) | 77GB | $154 | $246 (62%) |
| After Tactic 2 (agent filtering) | 65GB | $130 | $270 (68%) |
| After Tactic 3 (index optimization) | 200GB ingested, 50GB indexed | $95* | $305 (76%) |

*Tactic 3 calculation: (200GB × $0.10/GB ingest = $20) + (50GB indexed ≈ 44 million events at ~1.1KB/event, × $1.70/million ≈ $75) = ~$95/month
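The projection can be reproduced with the same assumptions ($0.10/GB ingest, $1.70 per million indexed events, ~44M events in 50GB at ~1.1KB/event):

```python
INGEST_PER_GB = 0.10       # $ per GB ingested
INDEX_PER_M_EVENTS = 1.70  # $ per million indexed events
FLAT_RATE_PER_GB = 2.00    # current blended ingest+index rate

def flat_cost(gb):
    """Monthly cost while everything ingested is also indexed."""
    return gb * FLAT_RATE_PER_GB

def split_cost(ingested_gb, indexed_events_m):
    """Monthly cost once ingestion and indexing are billed separately."""
    return ingested_gb * INGEST_PER_GB + indexed_events_m * INDEX_PER_M_EVENTS

print(flat_cost(200))       # current state: $400
print(flat_cost(77))        # after Tactic 1: $154
print(flat_cost(65))        # after Tactic 2: $130
print(split_cost(200, 44))  # after Tactic 3: ~$95 (44M events ~= 50GB indexed)
```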

Recommended approach: Combine Tactic 1 + Tactic 2 + Tactic 3

  • Result: $95-130/month (67-76% reduction)
  • Annual savings: $3,240-3,660

Implementation Plan

Week 1: Add Exclusion Filters (30 minutes)

  1. Go to Datadog → Logs → Configuration → Indexes
  2. For each index, click Add exclusion filter
  3. Add the 4 exclusion filters from Tactic 1 above
  4. Monitor daily log volume: Logs → Usage
  5. Verify critical logs still arriving: Check key dashboards and monitors
  6. Expected result: 200GB → 77GB

Week 2: Configure Agent-Level Filtering (1 hour)

  1. Identify your Datadog Agent configuration location
    • Docker: Update container environment or datadog.yaml
    • Kubernetes: Update ConfigMap or pod annotations
    • Host-based: Edit /etc/datadog-agent/datadog.yaml
  2. Add log_processing_rules from Tactic 2 above
  3. Restart Datadog Agent
  4. Monitor for 3-5 days to ensure no critical logs are dropped
  5. Expected result: 77GB → 65GB

Week 3: Enable Archiving + Optimize Indexing (1 hour)

  1. Set up Logs → Configuration → Archives
    • Choose your cloud storage (S3/GCS/Azure)
    • Configure bucket and credentials
  2. Review exclusion filters to maximize non-indexed but archived logs
  3. Test rehydration with a sample query
  4. Expected result: Full archive, reduced indexing costs

Monitoring Your Progress

Create a Datadog dashboard to track optimization:

Metric 1: Daily log ingestion volume

sum:datadog.estimated_usage.logs.ingested_bytes{*} by {service}

Metric 2: Estimated monthly cost

sum:datadog.estimated_usage.logs.ingested_bytes{*}.rollup(sum, 2592000) / 1e9 * 2
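Metric 2's formula is just bytes → GB → dollars; the same conversion in code (the $2/GB blended rate is this account's observed rate, not a universal constant, and should be lowered as the tactics take effect):

```python
def est_monthly_cost(ingested_bytes_30d, blended_rate_per_gb=2.0):
    """Mirror of the dashboard formula: bytes over a 30-day (2,592,000s)
    rollup, divided by 1e9 to get GB, times the blended $/GB rate."""
    return ingested_bytes_30d / 1e9 * blended_rate_per_gb

print(est_monthly_cost(200e9))  # 200GB of ingestion -> $400
```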

Metric 3: Top log sources (identify volume hogs)

Top list: datadog.estimated_usage.logs.ingested_bytes by service

Metric 4: Indexed vs ingested logs

sum:datadog.estimated_usage.logs.ingested_events{*}
sum:datadog.estimated_usage.logs.indexed_events{*}

Rollback Procedures

If you lose critical visibility:

Rollback exclusion filters:

  1. Go to Logs → Configuration → Indexes
  2. Toggle filter to disabled (don’t delete immediately)
  3. Wait 5 minutes for logs to flow
  4. Re-evaluate filter query to be more specific

Rollback agent-level filtering:

# Remove or comment out log_processing_rules
# Restart Datadog Agent
sudo systemctl restart datadog-agent

What NOT to filter out:

  • Error logs (4xx, 5xx responses)
  • Security events (auth failures, suspicious activity)
  • Business-critical events (purchases, signups, payments)
  • Slow performance indicators (requests >1s duration)
  • Exceptions and stack traces

What You Gain

  • $3,240-3,660/year savings (67-76% cost reduction)
  • Faster log queries (less indexed data = faster search)
  • Same critical visibility (errors, security, business events preserved)
  • Full log history (via archives for compliance/forensics)
  • Cleaner dashboards (less noise, easier to spot real issues)

What You Lose

  • Full request tracing (sampled 10% instead of 100% for successful requests)
  • Debug logs in production (use Live Tail when needed, or rehydrate from archives)
  • Immediate searchability of all logs (non-indexed logs require rehydration)

Alternative: Migrate to Grafana Cloud

If Datadog remains expensive after optimization, consider Grafana Cloud:

| Feature | Datadog (Optimized) | Grafana Cloud Free | Grafana Cloud Pro |
|---|---|---|---|
| Logs (70GB/month) | $140 | $0 (50GB free) | (70GB - 50GB included) × $0.50/GB = $10 |
| Metrics | Included | $0 (free tier) | $0 (included) |
| APM traces | Separate cost | $0 (50GB free) | $0.50/GB |
| Retention | 15 days | 14 days | 30 days |
| Total monthly | $140 | $0-10 | $10-35 |

Migration effort: 1-2 days for basic setup
Annual savings: $1,560-1,680 vs optimized Datadog
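The savings figures follow directly from the comparison table. A sketch of the arithmetic (the 50GB Pro allowance and $0.50/GB overage are assumptions worth verifying against current Grafana Cloud pricing):

```python
# Annual log-cost comparison, figures from the table above.
datadog_optimized_monthly = 140.0  # Tactics 1+2, ~70GB at ~$2/GB
grafana_free_monthly = 0.0         # within the assumed 50GB free tier
# Assumed Pro allowance: 50GB included, $0.50/GB overage on 70GB/month.
grafana_pro_monthly = max(0, 70 - 50) * 0.50

savings_vs_pro = (datadog_optimized_monthly - grafana_pro_monthly) * 12
savings_vs_free = (datadog_optimized_monthly - grafana_free_monthly) * 12
print(savings_vs_pro, savings_vs_free)  # $1,560 and $1,680 per year
```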