Base64 is not encryption: what it actually does
Base64 is a binary-to-text transport encoding, not a security measure. How it works, where it belongs, and why treating it as secrecy is a real mistake.
Base64 is a way to represent arbitrary binary data using only 64 printable ASCII characters. That is the entire job. It exists because many systems — email, URLs, JSON, XML attributes — were designed to carry text and choke on raw bytes. Base64 is defined in RFC 4648, the same document that specifies Base32 and Base16, and it has no relationship to confidentiality whatsoever. The single most common mistake made with it is assuming otherwise.
How the encoding works
Base64 reads the input three bytes at a time. Three bytes is 24 bits, and 24 divides evenly into four groups of 6 bits. Each 6-bit group indexes into a 64-character alphabet, producing four output characters. So every 3 input bytes become 4 output characters — a fixed 4:3 ratio.
Input bytes: 0x4D 0x61 0x6E ("Man")
Binary: 01001101 01100001 01101110
Regrouped: 010011 010110 000101 101110
Index: 19 22 5 46
Output chars: T W F u ("TWFu")
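The regrouping above can be reproduced in a few lines of Python. This is a minimal sketch that only handles a full 3-byte group; the helper name encode_group is made up for illustration, and the standard library's base64 module is used only as a cross-check.

```python
import base64

# Standard Base64 alphabet from RFC 4648, indexed 0-63.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_bytes: bytes) -> str:
    """Encode exactly three bytes into four Base64 characters (illustrative helper)."""
    if len(three_bytes) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    # Pack the three bytes into one 24-bit integer.
    bits = int.from_bytes(three_bytes, "big")
    # Slice it into four 6-bit indices, most significant group first.
    indices = [(bits >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


print(encode_group(b"Man"))                 # TWFu
print(base64.b64encode(b"Man").decode())    # TWFu, the library agrees
```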
The standard alphabet is A–Z, a–z, 0–9, then + and / for
values 62 and 63. The = character is reserved for padding and is
never a data value.
Padding handles inputs whose length is not a multiple of 3. If the
final group has only 1 byte (8 bits), it is padded to 12 bits, emitted
as 2 characters, and followed by ==. If it has 2 bytes (16 bits), it
becomes 3 characters followed by a single =. The padding keeps the
output length a multiple of 4, which lets streaming decoders work in
fixed-size blocks.
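A quick way to see the three padding cases is to encode inputs of length 1, 2, and 3 with Python's standard base64 module. The sample inputs are arbitrary and the helper name encode_text is an assumption for this sketch.

```python
import base64


def encode_text(data: bytes) -> str:
    """Standard Base64 with padding, returned as a str (illustrative helper)."""
    return base64.b64encode(data).decode("ascii")


for sample in (b"M", b"Ma", b"Man"):
    encoded = encode_text(sample)
    # The output length is always a multiple of 4, whatever the input length.
    print(f"{len(sample)} byte(s) -> {encoded} ({len(encoded)} chars)")

# 1 byte(s) -> TQ== (4 chars)
# 2 byte(s) -> TWE= (4 chars)
# 3 byte(s) -> TWFu (4 chars)
```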
The cost is size. Four characters for every three bytes is a ~33% expansion. A 1 MB binary becomes roughly 1.33 MB of Base64 text before any line-wrapping overhead, and about 1.37 MB once MIME-style line breaks are added. That overhead is the price of passing through a text-only channel, and it is why you do not Base64 things that do not need it.
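The arithmetic is easy to check. The sketch below computes the padded output length for a given input size, optionally adding a CRLF after every line as MIME does with its 76-character line limit. The function name and its line_length parameter are assumptions made for this example.

```python
def base64_length(n_bytes: int, line_length: int = 0) -> int:
    """Padded Base64 length for n_bytes of input.

    If line_length > 0, add two bytes (CRLF) per output line, which
    approximates MIME-style wrapping. Illustrative helper only.
    """
    chars = 4 * ((n_bytes + 2) // 3)  # every started 3-byte group costs 4 chars
    if line_length > 0:
        lines = (chars + line_length - 1) // line_length
        chars += 2 * lines
    return chars


one_mb = 1_000_000
print(base64_length(one_mb) / one_mb)        # about 1.33, unwrapped
print(base64_length(one_mb, 76) / one_mb)    # about 1.37, wrapped at 76 chars
```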
Where it actually shows up
Base64 is everywhere bytes have to travel through text:
- MIME email. The original motivation. SMTP was a 7-bit protocol; attachments are 8-bit binary. Content-Transfer-Encoding: base64 is how a PDF survives a mail relay.
- data: URIs. A small image or font embedded directly in CSS or HTML, e.g. data:image/png;base64,iVBORw0KGgo..., avoiding a separate HTTP request.
- Binary in JSON/XML. JSON has no byte type. A field that needs to carry raw bytes (a thumbnail, a certificate, a signature) carries a Base64 string.
- JWT segments. Each of the three dot-separated parts of a JWT is Base64url of JSON (header and payload) or of the raw signature bytes.
- HTTP Basic auth. The Authorization: Basic header is base64(username:password). Note: encoded, not protected; more on that below.
In every one of these cases the goal is transport, not secrecy. The encoding is reversible by design and by anyone.
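Two of these cases fit in a short Python sketch: carrying bytes in a JSON field and building a data: URI. The eight bytes used here are just the PNG file signature, not a complete image, and the field name thumbnail is hypothetical.

```python
import base64
import json

# The 8-byte PNG signature, standing in for real image bytes.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# JSON has no byte type, so the bytes travel as a Base64 string.
document = json.dumps({"thumbnail": base64.b64encode(raw).decode("ascii")})

# The receiver reverses the step and gets the identical bytes back.
restored = base64.b64decode(json.loads(document)["thumbnail"])
print(restored == raw)  # True

# The same encoded text, prefixed with a media type, forms a data: URI.
data_uri = "data:image/png;base64," + base64.b64encode(raw).decode("ascii")
print(data_uri)  # data:image/png;base64,iVBORw0KGgo=
```

Nothing in either step involves a key: whoever receives the JSON document or the URI can recover the bytes.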
The base64url variant
The standard + and / characters are hostile to URLs. + is
interpreted as a space in application/x-www-form-urlencoded query
strings, and / is a path separator. Putting standard Base64 in a URL
means it then has to be percent-encoded again, which is exactly the
kind of double-encoding that produces hard-to-debug bugs.
base64url, also defined in RFC 4648, swaps the two offending
characters: - (minus) replaces +, and _ (underscore) replaces
/. Both are URL-safe. Padding is usually omitted entirely, because
the decoder can recover the length from the input modulo 4, and =
itself needs escaping in some contexts.
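In Python, the standard library covers the alphabet swap but not the optional padding, so unpadded base64url usually needs two small helpers. The names b64url_encode and b64url_decode below are assumptions for this sketch, not library functions.

```python
import base64


def b64url_encode(data: bytes) -> str:
    """base64url without trailing '=' padding (illustrative helper)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url, restoring the padding implied by len(text) % 4."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


sample = bytes([0xFB, 0xFF])  # chosen so that values 62 and 63 both appear
print(base64.b64encode(sample).decode())  # +/8=
print(b64url_encode(sample))              # -_8
print(b64url_decode("-_8") == sample)     # True
```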
This is why JWTs use base64url: the token is meant to travel in URLs,
headers, and cookies. If you decode a JWT and see - or _ where you
expected + or /, you are looking at base64url. A strict decoder set to
the standard alphabet will reject it, and a lenient one may silently
return the wrong bytes.
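A hedged sketch of inspecting a JWT in Python follows. It builds a throwaway token with made-up claims and a placeholder in the signature position, then reads the header and payload back. It does not verify anything; the helper names are assumptions for this example.

```python
import base64
import binascii
import json


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # Restore the padding that JWTs omit before handing off to the library.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def peek_jwt(token: str) -> tuple[dict, dict]:
    """Return (header, payload) WITHOUT verifying the signature."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))


def strict_standard_accepts(text: str) -> bool:
    """True if a strict standard-alphabet decoder accepts the text."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except binascii.Error:
        return False


# Illustrative token: invented claims, placeholder bytes instead of a signature.
demo_token = ".".join([
    b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8")),
    b64url_encode(json.dumps({"sub": "demo-user"}).encode("utf-8")),
    b64url_encode(b"placeholder, not a real signature"),
])

print(peek_jwt(demo_token))
print(strict_standard_accepts("+/8="))  # True: standard alphabet
print(strict_standard_accepts("-_8="))  # False: base64url characters
```

Reading the payload this way is fine for debugging. Trusting it requires checking the signature with a proper JWT library.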
Base64 is not encryption
This is the part that matters. Base64 provides no confidentiality, no
integrity, and no authentication. It uses no key. Anyone holding the
encoded string can recover the exact original bytes with a single
function call and zero secret material. Running echo dG9wLXNlY3JldA== | base64 -d prints top-secret on any machine that has a base64 utility.
Calling Base64 "encoding for safety" or shipping a credential as Base64 "so it's not in plaintext" is a genuine security defect, and it appears in real code reviews and breach post-mortems regularly. HTTP Basic auth is the canonical illustration: the password is Base64, which is why Basic auth is only acceptable over TLS — the TLS layer provides the confidentiality, and Base64 provides none of it. Strip the TLS and the credential is readable on the wire.
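To make the Basic auth point concrete, here is a minimal Python sketch that builds the header value and then recovers the credentials from it with no secret of any kind. The function names are assumptions for this example, and the credentials are the familiar sample pair used in the HTTP authentication RFCs.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for HTTP Basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def read_basic_auth(header_value: str) -> tuple[str, str]:
    """Recover (username, password) from the header value. No key needed."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic auth header")
    # Split on the first colon only; the password may itself contain colons.
    username, _, password = base64.b64decode(token).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("Aladdin", "open sesame")
print(header)                    # Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
print(read_basic_auth(header))   # ('Aladdin', 'open sesame')
```

Anyone who can observe the header can run the second function, which is why the confidentiality has to come from TLS.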
The distinction is precise:
| Property | Base64 | Encryption | Hashing |
|---|---|---|---|
| Reversible | Yes, trivially | Yes, with the key | No |
| Requires a key | No | Yes | No |
| Provides confidentiality | No | Yes | N/A |
| Purpose | Transport | Secrecy | Fingerprint / verification |
If you need confidentiality, you need a cipher with a key you actually keep secret — AES-GCM, ChaCha20-Poly1305, or an envelope from a KMS. That ciphertext is still raw bytes, so you may well Base64 it afterward to move it through a text channel. The two operations stack in that order; one never substitutes for the other. The broader map of which tool does what is worth internalizing — the difference between hashing, encryption, and encoding is one of the most consistently confused topics in this space.
Base64 is also not obfuscation in any meaningful sense. It does not slow down an attacker; recognizing and decoding it is automatic. A secret that is "hidden" in Base64 is a secret in plaintext with two extra steps that anyone's tooling performs for free.
When to reach for it, and when not
Use Base64 when binary data must pass through a channel that only tolerates text, and you have measured or accepted the ~33% size cost:
- Embedding a small asset in a data: URI to save a request.
- Putting bytes (a cert, a signature, a blob) into a JSON or XML field.
- Encoding a value for a URL or header. Reach for base64url there, not the standard alphabet.
Avoid it when:
- The channel already handles binary. Don't Base64 a file you are POSTing as multipart/form-data or streaming over a binary protocol; you are paying 33% for nothing.
- The asset is large. A multi-megabyte image inlined as a data: URI bloats your HTML, can't be cached separately, and blocks the parser. Serve it as a normal resource.
- You are trying to protect something. Reach for a cipher instead.
For the related problem of getting text safely into a URL path or query string, see URL percent-encoding: a different transport encoding, with a different alphabet and different escaping rules, built specifically for URLs.
When you need to encode or decode a value by hand — inspecting a JWT
payload, decoding a data: URI, or checking what a Basic auth header
actually contains — our
Base64 encoder/decoder handles both the standard and
base64url alphabets in the browser, with nothing sent to a server. It
is the right tool for moving bytes through text. It is not, and was
never meant to be, a way to keep them secret.