New Relic Full Stack Cost Optimization

Reduce New Relic costs from $1200/month to $600-650/month through log filtering, APM right-sizing, metrics cardinality fixes, and user seat optimization with migration alternatives.

Input

What you provide to the skill

We’re spending $1200/month on New Relic (APM, logs, infrastructure). 15 hosts, 8-person team. Can we reduce this without losing critical visibility?

Output

What the skill generates for you

New Relic Cost Optimization ($1,200/month, 15 hosts, 8-person team)

Current Situation Analysis

Monthly cost: $1,200/month ($14,400/year)
Team size: 8 people
Infrastructure: 15 hosts
Services: APM, logs, infrastructure monitoring
Cost per host: ~$80/host/month (well above typical infrastructure monitoring costs)

Based on New Relic’s usage-based pricing model ($0.30/GB for data ingest + user seat costs), your estimated breakdown is likely:

Component	Est. Monthly Cost	% of Total	Est. Volume/Users
Log ingestion	$400-500	33-42%	~1,500 GB/month at $0.30/GB
APM (traces/spans)	$400-500	33-42%	15 hosts with full APM enabled
Infrastructure monitoring	$150-200	13-17%	15 hosts + custom metrics
User seats	$100-200	8-17%	2-4 full platform users at $99-549/user
Total	$1,200	100%	-
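The arithmetic behind these estimates can be checked in a few lines. This is a minimal sketch that uses only the figures quoted in this section ($0.30/GB, ~1,500 GB/month, 15 hosts); the variable names are illustrative, and your own bill will differ.

```python
# Figures taken from the analysis above; adjust them to match your own bill.
MONTHLY_BILL = 1200.0        # USD per month
HOSTS = 15
INGEST_PRICE_PER_GB = 0.30   # USD per GB, as quoted above
LOG_VOLUME_GB = 1500         # estimated monthly log ingest

cost_per_host = MONTHLY_BILL / HOSTS
log_cost = LOG_VOLUME_GB * INGEST_PRICE_PER_GB
log_share = log_cost / MONTHLY_BILL

print(f"Cost per host: ${cost_per_host:.0f}/month")
print(f"Log ingest:    ${log_cost:.0f}/month ({log_share:.1%} of the bill)")
```

The log figure lands in the middle of the $400-500 range in the table, which is why log filtering is the first tactic below.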

Phase 1: Immediate Quick Wins (Week 1, 3-4 hours effort)

Tactic 1: Log Filtering and Sampling (40-60% log cost reduction)

Add drop filters in New Relic to eliminate low-value logs:

# In New Relic: Logs → Data management → Drop filters
# The rules below are pseudocode; express each WHERE clause as the filter's query

# Filter 1: Drop debug/trace logs (save ~30-40% log volume)
WHERE level IN ('DEBUG', 'TRACE')
ACTION: Drop

# Filter 2: Drop health check noise (save ~10-15% log volume)
WHERE request.uri IN ('/health', '/healthz', '/ping', '/metrics', '/ready', '/live')
ACTION: Drop

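# Filter 3: Sample successful requests (keep 5%, save ~20-25% log volume)
# Note: a drop filter removes every matching record; if no sampling action is
# available, apply the 5% sample at the log forwarder or in the application
WHERE http.statusCode >= 200 AND http.statusCode < 300 AND duration < 1000
ACTION: Sample at 5%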

# Filter 4: Drop verbose cloud provider SDK logs (save ~5-10% log volume)
WHERE logger.name LIKE '%boto3%' OR logger.name LIKE '%aws-sdk%' OR logger.name LIKE '%azure-sdk%'
ACTION: Drop or Sample at 10%

Expected log reduction: 1,500 GB → 600-700 GB (53-60% reduction)
Estimated savings: $240-270/month
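Before enabling the filters, it may help to replay them against a sample of exported logs to see how much they would remove. The sketch below mirrors the four rules in Python; the attribute names follow the pseudocode above, while `keep_log` and `estimate_reduction` are hypothetical helpers, not part of any New Relic API. It counts records rather than bytes, so treat the result as a rough guide only.

```python
import random

HEALTH_PATHS = {"/health", "/healthz", "/ping", "/metrics", "/ready", "/live"}
NOISY_LOGGERS = ("boto3", "aws-sdk", "azure-sdk")


def keep_log(record, sample_rate=0.05, rng=random.random):
    """Return True if a log record (a dict) would survive the four filters."""
    # Filter 1: drop debug/trace logs
    if record.get("level") in ("DEBUG", "TRACE"):
        return False
    # Filter 2: drop health check noise
    if record.get("request.uri") in HEALTH_PATHS:
        return False
    # Filter 4: drop verbose cloud provider SDK logs
    logger_name = record.get("logger.name", "")
    if any(name in logger_name for name in NOISY_LOGGERS):
        return False
    # Filter 3: keep only a sample of fast, successful requests
    status = record.get("http.statusCode")
    duration = record.get("duration")
    if status is not None and duration is not None:
        if 200 <= status < 300 and duration < 1000:
            return rng() < sample_rate
    return True


def estimate_reduction(records, **kwargs):
    """Fraction of records (by count, not bytes) that the filters remove."""
    records = list(records)
    if not records:
        return 0.0
    kept = sum(1 for record in records if keep_log(record, **kwargs))
    return 1 - kept / len(records)
```

Running `estimate_reduction` over a day of exported records gives a sanity check on the 40-60% target before any data is actually dropped.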

Tactic 2: Right-Size APM Coverage (30-50% APM cost reduction)

Audit which hosts actually need full APM:

# In New Relic: APM & Services → Service Map

# Identify and disable APM on:
1. Development environments (save ~$50-80/month)
2. Staging/test environments (save ~$50-80/month)
3. Internal tools/admin services (save ~$30-50/month)
4. Database replicas (monitor primary only, save ~$20-40/month)

# Keep full APM only on:
- Production application servers
- Critical API services
- Customer-facing web servers

Implementation:

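# Disable the APM agent on non-production hosts.
# APM agents run inside the application process, so turn them off through the
# application's environment (the exact variable depends on the language agent):
NEW_RELIC_ENABLED=false  # For dev/staging

# Optional: also stop the infrastructure agent (host metrics, not APM)
# on hosts that need no monitoring at all:
sudo systemctl stop newrelic-infra
sudo systemctl disable newrelic-infra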

Expected APM reduction: 15 hosts → 7-9 production hosts
Estimated savings: $150-200/month
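For Python services, one way to keep APM on production only is to gate agent startup on the deployment environment. The sketch below assumes a hypothetical `APP_ENV` variable and a `newrelic.ini` config file; `apm_enabled` and `init_apm` are illustrative helpers. The only agent call used is `newrelic.agent.initialize`.

```python
import os

# Hypothetical variable naming the deployment environment; use whatever
# your deploy tooling already sets.
ENV_VAR = "APP_ENV"
APM_ENVIRONMENTS = {"production"}


def apm_enabled(environ=os.environ):
    """Decide whether this process should report to APM."""
    return environ.get(ENV_VAR, "development").lower() in APM_ENVIRONMENTS


def init_apm(config_file="newrelic.ini", environ=os.environ):
    """Initialize the New Relic Python agent only where APM is wanted."""
    if not apm_enabled(environ):
        return False
    # Imported lazily so dev/staging hosts do not need the agent installed.
    import newrelic.agent
    newrelic.agent.initialize(config_file)
    return True
```

Calling `init_apm()` at the top of the application entry point keeps the decision in one place, so rolling back is a one-line change to `APM_ENVIRONMENTS`.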

Tactic 3: Reduce High-Cardinality Metrics (15-30% metrics cost reduction)

Identify and fix expensive custom metrics:

# In New Relic: Metrics explorer → Sort by cardinality

# Common high-cardinality culprits:
BAD:  http.requests{user_id:*, session_id:*, request_id:*}
GOOD: http.requests{endpoint:/api/users, method:GET, status:200}

BAD:  cache.operations{key:*}  # Millions of unique keys
GOOD: cache.operations{operation:get, cache_name:redis-main}

BAD:  background.job{job_id:*}
GOOD: background.job{job_type:email_worker, queue:default}

Code fix example (Python):

import newrelic.agent  # user_id and user come from the surrounding request code

# Before: Creates millions of unique metric combinations
newrelic.agent.record_custom_metric(
    f'Custom/User/{user_id}/requests', 1
)

# After: Aggregate by user tier instead
newrelic.agent.record_custom_metric(
    f'Custom/UserTier/{user.tier}/requests', 1
)

Expected savings: $50-80/month
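To see why the rename matters, count the distinct metric names each scheme produces. This is a small sketch over made-up data (1,000 users across 3 tiers); `series_count` and `top_prefixes` are hypothetical helpers for inspecting a list of metric names you have exported, not New Relic functions.

```python
from collections import Counter


def series_count(metric_names):
    """Number of distinct metric names, i.e. distinct time series."""
    return len(set(metric_names))


def top_prefixes(metric_names, depth=2, limit=5):
    """Rank name prefixes (first `depth` path segments) by unique series."""
    unique_by_prefix = Counter(
        "/".join(name.split("/")[:depth]) for name in set(metric_names)
    )
    return unique_by_prefix.most_common(limit)


# Hypothetical traffic: 1,000 users spread across 3 tiers.
tiers = ["free", "pro", "enterprise"]
per_user = [f"Custom/User/{uid}/requests" for uid in range(1000)]
per_tier = [f"Custom/UserTier/{tiers[uid % 3]}/requests" for uid in range(1000)]

print(series_count(per_user))   # one series per user
print(series_count(per_tier))   # one series per tier
print(top_prefixes(per_user + per_tier))
```

The per-user scheme grows with the user base, while the per-tier scheme stays fixed at the number of tiers.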

Phase 2: Application-Level Changes (Week 2, 4-6 hours effort)

Tactic 4: Reduce Log Verbosity at Source (20-40% additional log reduction)

Update application logging configuration:

Environment variables (fastest approach):

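# Production
# LOG_LEVEL and LOG_SAMPLE_RATE are application-defined names; your logging
# setup must read them. NEW_RELIC_LOG_LEVEL controls the agent's own log.
LOG_LEVEL=WARN              # Instead of INFO or DEBUG
LOG_SAMPLE_RATE=0.05        # Sample INFO logs at 5%
NEW_RELIC_LOG_LEVEL=info    # Reduce agent verbosity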

# Staging
LOG_LEVEL=INFO
LOG_SAMPLE_RATE=0.2

# Development
LOG_LEVEL=DEBUG
LOG_SAMPLE_RATE=1.0
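These variables only take effect if the application reads them. The sketch below shows one possible interpretation using Python's standard `logging` module: records at or above `LOG_LEVEL` always pass, INFO records below it pass at `LOG_SAMPLE_RATE`, and DEBUG records below it are dropped. `ThresholdSampler` and `build_handler` are hypothetical names, and the semantics are an assumption rather than a standard.

```python
import logging
import os
import random


class ThresholdSampler(logging.Filter):
    """Keep records at or above `threshold`; sample INFO records below it."""

    def __init__(self, threshold, sample_rate, rng=random.random):
        super().__init__()
        self.threshold = threshold
        self.sample_rate = sample_rate
        self.rng = rng

    def filter(self, record):
        if record.levelno >= self.threshold:
            return True
        if record.levelno >= logging.INFO:
            return self.rng() < self.sample_rate
        return False  # DEBUG and lower are dropped when below the threshold


def build_handler(environ=os.environ):
    """Create a stream handler configured from LOG_LEVEL and LOG_SAMPLE_RATE."""
    threshold = getattr(logging, environ.get("LOG_LEVEL", "INFO").upper(), None)
    if not isinstance(threshold, int):
        threshold = logging.INFO  # fall back on unknown level names
    sample_rate = float(environ.get("LOG_SAMPLE_RATE", "1.0"))
    handler = logging.StreamHandler()
    handler.addFilter(ThresholdSampler(threshold, sample_rate))
    return handler
```

To use it, set the root logger to `logging.DEBUG` and attach `build_handler()` so that the filter, not the logger level, decides what is emitted.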

Tactic 5: Optimize User Seat Allocation

Review your user assignments in New Relic:

# In New Relic: Account → User management

# Audit current allocation:
- Full Platform Users: 2-4 users at $99-549/user
- Core Users: Typically $49/user
- Basic Users: Free (view-only)

# Optimization:
- Keep only 1-2 Full Platform Users (senior engineers who configure monitoring)
- Downgrade to Core Users for most developers (can view APM, create basic queries)
- Use Basic Users for PMs, support staff, managers (dashboard viewing only)

Expected savings: $50-200/month (depending on current allocation)
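A quick calculation makes the seat trade-off concrete. This sketch uses the per-seat prices quoted above (the low end of the Full Platform range); the before and after allocations are hypothetical, and real prices depend on your plan.

```python
# Per-seat monthly prices quoted above; check them against your own contract.
SEAT_PRICES = {"full": 99, "core": 49, "basic": 0}


def seat_cost(allocation, prices=SEAT_PRICES):
    """Monthly seat cost for an allocation such as {"full": 2, "core": 1}."""
    return sum(prices[tier] * count for tier, count in allocation.items())


# Hypothetical allocations for an 8-person team.
before = {"full": 2, "core": 0, "basic": 6}
after = {"full": 1, "core": 1, "basic": 6}

print(f"Before: ${seat_cost(before)}/month")
print(f"After:  ${seat_cost(after)}/month")
print(f"Saved:  ${seat_cost(before) - seat_cost(after)}/month")
```

Each Full Platform seat moved to Core saves the price difference between the two tiers, so the total depends mainly on how many full seats you start with.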

Cost Projection After Optimization

Optimization Phase	Est. Monthly Cost	Savings This Phase	Cumulative Savings (%)
Current state	$1,200	-	-
After Phase 1 (Week 1)	$750-800	$400-450	33-38%
After Phase 2 (Week 2)	$600-650	$150-200	46-50%

Optimized annual cost: $7,200-7,800 (down from $14,400)
Total annual savings: $6,600-7,200 (46-50% reduction)
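The annual figures follow directly from the monthly estimates. The sketch below reproduces them so you can substitute your own numbers; `project` is an illustrative helper.

```python
def project(current_monthly, optimized_monthly):
    """Return (annual cost, annual savings, savings fraction) for one estimate."""
    annual_cost = optimized_monthly * 12
    annual_savings = (current_monthly - optimized_monthly) * 12
    savings_fraction = (current_monthly - optimized_monthly) / current_monthly
    return annual_cost, annual_savings, savings_fraction


for monthly in (600, 650):  # low and high Phase 2 estimates from the table
    cost, savings, fraction = project(1200, monthly)
    print(f"${monthly}/month: ${cost:,}/year, saves ${savings:,}/year ({fraction:.0%})")
```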

Implementation Checklist

Week 1: Quick Wins (3-4 hours)

Add 4 log drop filters in New Relic Data management
Audit APM hosts, disable on dev/staging (save $150-200)
Identify top 5 high-cardinality metrics
Verify changes via New Relic Usage dashboard
Expected result: $1,200 → $750-800/month

Week 2: Deeper Changes (4-6 hours)

Update LOG_LEVEL=WARN in production
Deploy metric cardinality fixes
Optimize user seat allocation
Monitor for 5-7 days
Expected result: $750-800 → $600-650/month

Week 3: Validation (1 hour)

Check key dashboards for missing data
Verify alerts still firing correctly
Review New Relic bill estimate
Document changes for team

Rollback Procedures

If you lose critical visibility:

Re-enable logs:
- Disable drop filter in Data management
- Restore LOG_LEVEL=INFO temporarily
Re-enable APM:
- Remove NEW_RELIC_ENABLED=false and restart the application; restart newrelic-infra if you stopped it
- Add service to APM configuration
Restore metrics:
- Revert metric tag changes
- Redeploy application

What You Gain

$6,600-7,200/year savings (46-50% cost reduction)
Maintained visibility for critical production issues
Faster log queries (less indexed data)
Cleaner metrics (lower cardinality = better query performance)

What You Lose

Full dev/staging APM (use production APM for troubleshooting)
Verbose debug logs (use Live Tail when needed)
Per-user metrics (aggregate by cohort/feature instead)

Alternative: Migrate to Grafana Cloud

If $600-650/month still feels expensive for an 8-person team:

Grafana Cloud Free Tier includes:

10K metrics series (vs. limited free tier)
50GB logs/month
50GB traces/month
3 users
14-day retention

For your scale:

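Free tier ($0/month) only if log volume stays under 50GB/month and 3 users are enough; the 600-700 GB/month projected after Tactic 1 and an 8-person team exceed both limits
Paid tier: ~$50-100/month estimate (vs. $600-650 optimized New Relic); re-check it against your actual log volume

Migration effort: 1-2 days
Annual savings: $6,000-7,200 vs. optimized New Relic, if the paid-tier estimate holds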
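Before planning a migration, it is worth checking your expected usage against the free-tier limits listed above. This is a minimal sketch: the limits come from that list, `over_limit` is a hypothetical helper, and the usage figures combine this example's numbers (post-optimization log volume, team size) with placeholder values for metrics and traces.

```python
# Free-tier limits as listed above.
FREE_TIER = {"metric_series": 10_000, "logs_gb": 50, "traces_gb": 50, "users": 3}


def over_limit(usage, limits=FREE_TIER):
    """Return the usage dimensions that exceed the given limits."""
    return {
        key: value
        for key, value in usage.items()
        if value > limits.get(key, float("inf"))
    }


# Log volume and team size come from this example; the metric and trace
# figures are placeholders to replace with your own measurements.
usage = {"metric_series": 8_000, "logs_gb": 600, "traces_gb": 40, "users": 8}
print(over_limit(usage))
```

Any dimension returned by `over_limit` needs either further reduction or a paid plan, so price those dimensions first.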

About This Skill

Reduce monitoring costs (Datadog, New Relic) by 30-70% with optimization strategies, configuration changes, and migration paths to cheaper alternatives.
