Reading the AWS Cost Explorer CSV: a field-by-field guide

What every column in an AWS Cost Explorer CSV export actually means, the schema gotchas that bite first-time readers, and how to slice the file for the answers you usually want.


The AWS Cost Explorer CSV export is the most useful and the most confusing artifact in AWS billing. Useful because it contains the full granularity AWS will give you outside of the Cost and Usage Report (CUR); confusing because the column names are not always what you expect, and the same logical data shows up under different headers depending on which dimensions you grouped by in the UI. This post walks the schema column by column and notes the half-dozen things that catch most people on first read.

How the file gets generated

Cost Explorer exports follow the dimensions you chose in the UI at export time. Group by service and the CSV has a service column; group by usage type and the column changes accordingly. The one constant is the time series — the file always carries one row per group-by value per time bucket (typically day or month), so a 30-day daily export with 50 services produces ~1,500 rows. The date-and-cost pair is always there; everything else depends on what you filtered and grouped by.

This matters more than it sounds. Two exports that look like they ask the same question — "what was my monthly spend last quarter?" — can have different schemas if one grouped by Linked Account and the other by Service. Tools that parse the CSV need to be tolerant of the variation.
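One way to tolerate that schema variation is to probe for whichever group-by column is actually present before doing anything else. A minimal sketch with pandas, assuming a hypothetical list of column names (real exports may spell them differently, so treat `GROUP_COLUMNS` as a starting point):

```python
import io
import pandas as pd

# Candidate group-by headers; extend this list as you meet new exports.
GROUP_COLUMNS = ["Service", "Linked account", "Usage Type", "Operation"]

def detect_group_column(df: pd.DataFrame):
    """Return the first known group-by column present, or None."""
    for col in GROUP_COLUMNS:
        if col in df.columns:
            return col
    return None

# Tiny invented export for illustration.
csv_text = """Service,Date,Cost
Amazon Simple Storage Service,2025-05-01,12.34
Amazon Elastic Compute Cloud - Compute,2025-05-01,56.78
"""
df = pd.read_csv(io.StringIO(csv_text))
print(detect_group_column(df))  # Service
```

A parser built this way degrades gracefully: if none of the known columns appear, it can fall back to treating the file as a pure date/cost series rather than crashing on a missing header.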

Column-by-column reference

Service

The AWS service name as it appears in the bill — "Amazon Elastic Compute Cloud - Compute", "Amazon Simple Storage Service", etc. These are long strings, not the short service codes you see in the API (AmazonEC2, AmazonS3). Cost Explorer is consistent within an export but the names change occasionally as AWS renames services internally; "Amazon RDS Service" became "Amazon Relational Database Service" at one point, and old reports still carry the previous name.

Linked account

The 12-digit AWS account ID that incurred the charge, when you are on a Consolidated Billing org. The display name from AWS Organizations does not appear here — only the numeric ID. If you grouped by Linked Account, this is the row identifier; otherwise it may be aggregated out entirely.

Usage Type

The single most informative column. AWS encodes region and resource-specific detail into a compact dash- and colon-separated string here: USE1-BoxUsage:m5.xlarge, APN1-DataTransfer-Out-Bytes, EUC1-NatGateway-Bytes. The leading prefix is an AWS region code abbreviated to an airline-style identifier (USE1 = us-east-1, APN1 = ap-northeast-1, EUC1 = eu-central-1). The portion after the prefix encodes what was charged. For most diagnostic work, sorting by Usage Type tells you what is happening faster than the Service column.
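Splitting the region prefix off mechanically makes the column much easier to aggregate. A sketch, assuming a hand-maintained prefix table (the real set of prefixes is larger than shown; AWS does not publish a single authoritative list):

```python
# Partial region-prefix table covering the examples above (assumption:
# extend for the regions you actually use).
REGION_PREFIXES = {
    "USE1": "us-east-1",
    "USE2": "us-east-2",
    "USW2": "us-west-2",
    "EUC1": "eu-central-1",
    "APN1": "ap-northeast-1",
}

def split_usage_type(usage_type: str):
    """Split a Usage Type into (region, remainder). Usage types with no
    recognized region prefix (e.g. some global services) return region ''."""
    prefix, sep, rest = usage_type.partition("-")
    if sep and prefix in REGION_PREFIXES:
        return REGION_PREFIXES[prefix], rest
    return "", usage_type

print(split_usage_type("USE1-BoxUsage:m5.xlarge"))   # ('us-east-1', 'BoxUsage:m5.xlarge')
print(split_usage_type("APN1-DataTransfer-Out-Bytes"))
```

Grouping on the remainder then lets you compare the same workload across regions, which the raw column hides.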

Operation

The API operation that produced the charge, when AWS can attribute it that way. For S3 this is invaluable — GetObject/PutObject/ListBucket charges accumulate at different rates. For EC2 the column is often RunInstances for everything, which is less useful. The column is present in every export but populated for only ~60% of usage types.

Resource ID

Only populated when you explicitly enable resource-level data in Cost Explorer (it costs extra: $0.01 per 1,000 resource lookups). When present it carries the ARN: arn:aws:ec2:us-east-1:123:instance/i-abc, arn:aws:s3:::mybucket/path/. The biggest single privacy consideration in the CSV — these ARNs let anyone reading the file identify your accounts, buckets, and instances. Strip this column before sharing or pasting into any third-party tool.
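Stripping the column is a one-liner worth automating before any file leaves your machine. A minimal sketch with pandas, using an invented two-column file for illustration (`errors="ignore"` keeps it safe on exports that never had the column):

```python
import io
import pandas as pd

# Invented sample; real exports have many more columns.
csv_text = """Service,Resource ID,Cost
Amazon Simple Storage Service,arn:aws:s3:::mybucket,1.0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Drop the privacy-sensitive column; no-op if it is absent.
safe = df.drop(columns=["Resource ID"], errors="ignore")
print(list(safe.columns))  # ['Service', 'Cost']
```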

Tag columns

For each AWS tag you have enabled as a cost-allocation tag, Cost Explorer adds a column prefixed aws: (system tags) or user: (user-defined). A tag named Environment with values prod, dev, staging gives you a user:Environment column. Untagged rows have an empty value here — that empty value is also a tag value, just an implicit one. "Untagged" is a useful filter to find resources that escaped the tagging policy.
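Finding the untagged rows programmatically is a short filter. A sketch assuming a `user:Environment` cost-allocation tag (the tag name and the tiny inline file are illustrative):

```python
import io
import pandas as pd

# Invented sample export with one untagged row.
csv_text = """Service,user:Environment,Cost
Amazon Simple Storage Service,prod,10.0
Amazon Simple Storage Service,,4.5
AWS Lambda,dev,2.0
"""
df = pd.read_csv(io.StringIO(csv_text))

# Empty tag cells arrive as NaN (or sometimes ''), so check for both.
col = df["user:Environment"]
untagged = df[col.isna() | (col == "")]
print(untagged["Cost"].sum())  # 4.5
```

Summing the untagged bucket per service is a quick way to measure how much spend your tagging policy currently cannot attribute.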

UsageStartDate / UsageEndDate

The window the row covers, in UTC. For daily reports the dates are typically 2025-05-01 00:00:00 UTC to 2025-05-02 00:00:00 UTC — note the end is exclusive, not inclusive. Off-by-one errors here are the leading cause of "my totals do not match" support tickets.
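Because the end timestamp is exclusive, window arithmetic needs no "+1 day" correction; the length of the window is simply end minus start. A minimal sketch:

```python
from datetime import datetime

# Exclusive end: a daily row runs from midnight to the next midnight.
start = datetime(2025, 5, 1)
end = datetime(2025, 5, 2)          # exclusive, NOT the last covered moment
days_covered = (end - start).days
print(days_covered)                  # 1 - one daily row covers exactly one day
```

Treating the end as inclusive and adding a day is exactly the off-by-one that makes monthly totals come out a day heavy.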

UsageQuantity and Unit

How much was used. The Unit column tells you what to multiply: GB-Mo for storage, Hrs for compute, Requests for API calls. Combining quantity and unit gives you the actual usage figure even when the cost column has been adjusted for a discount or credit.

Cost columns

There are usually two: Cost (sometimes called UnblendedCost) is what the charge actually was, and AmortizedCost reflects the smoothed value for Reserved Instances and Savings Plans purchased up front. For accrual-style reporting use amortized; for cash-flow reporting use unblended. The difference is often 10–20% of the total, which trips up anyone who forgets to pick one basis and stick with it.

pricing/term and pricing/unit

When present these tell you whether the line was billed at on-demand, reserved, or spot rates, and at what per-unit price. Mostly useful for diagnosing the "trial expired" cause of bill spikes (covered separately — see #7).

Three patterns that surprise readers

Negative costs are real. Credits, RI savings adjustments, and corrections produce rows with negative numbers in the Cost column. Sum everything and the credits silently net against usage, understating gross spend; decide up front whether credits belong in your total and apply that choice consistently.

The total in the file may not match the total in the AWS bill. Cost Explorer rounds rows independently and the rounding can leave a sub-dollar gap. For exact reconciliation you need the Cost and Usage Report (CUR), a row-level export delivered to an S3 bucket rather than downloaded from the console.

Account-level reports drop the Service column. If you grouped by Linked Account only, every service rolls up to a single row per account per day. That is what you want for chargeback but useless for diagnosing a spike — re-export grouped by Service.

Working with the file

For one-off exploration, Excel or Google Sheets works fine if the export is under 50,000 rows. Beyond that, a Python pandas notebook or DuckDB CLI is more pleasant. Either way the first three useful queries are: group by Service and sort by Cost descending; filter to the top service and group by Usage Type; filter to the top usage type and group by Linked Account or tag. Those three drills resolve most spikes.
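The three drills translate directly into pandas. A sketch on a tiny invented export (column names assume a Service/Usage Type/Linked account grouping; adjust to whatever your export actually carries):

```python
import io
import pandas as pd

# Invented four-row export for illustration.
csv_text = """Service,Usage Type,Linked account,Cost
Amazon Simple Storage Service,USE1-TimedStorage-ByteHrs,111111111111,40.0
Amazon Simple Storage Service,USE1-Requests-Tier1,111111111111,5.0
Amazon Elastic Compute Cloud - Compute,USE1-BoxUsage:m5.xlarge,222222222222,90.0
Amazon Elastic Compute Cloud - Compute,USE1-BoxUsage:m5.xlarge,111111111111,60.0
"""
df = pd.read_csv(io.StringIO(csv_text))

# 1. Group by Service, sort by cost descending.
by_service = df.groupby("Service")["Cost"].sum().sort_values(ascending=False)
top_service = by_service.index[0]

# 2. Filter to the top service, group by Usage Type.
by_usage = df[df["Service"] == top_service].groupby("Usage Type")["Cost"].sum()
top_usage = by_usage.idxmax()

# 3. Filter to the top usage type, group by Linked account.
by_account = df[df["Usage Type"] == top_usage].groupby("Linked account")["Cost"].sum()
print(top_service, top_usage)
print(by_account)
```

On the sample data the drill correctly surfaces EC2 compute, then the m5.xlarge box-usage line, then the per-account split of that line.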

For a quicker pass without writing SQL, the AWS Billing Analyzer takes the CSV directly, applies the field interpretations above, and surfaces the top contributors at each level. Files stay in your browser — relevant because the Resource ID column is in there.