How to use
Toggle Encode or Decode at the top and paste your text. Encode replaces the five reserved HTML characters (`&`, `<`, `>`, `"`, `'`) with their entity forms (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&#39;`). The optional "Encode all non-ASCII" checkbox additionally rewrites every code point outside printable ASCII as a numeric entity (`&#xxxx;`), useful for plain-ASCII templating systems or legacy email pipelines. Decode runs the browser's own HTML parser, so it handles every named entity (`&copy;`, `&hellip;`), decimal numeric (`&#169;`), and hex numeric (`&#xA9;`) form correctly.
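The Encode behaviour described above can be approximated in a few lines. This is a minimal sketch, not the tool's actual source: the names `RESERVED_ENTITIES` and `encodeHtml` and the `encodeNonAscii` option are hypothetical, and leaving tabs and line breaks untouched is an assumption.

```javascript
// Hypothetical names; a sketch of the encoding rules, not the tool's real code.
const RESERVED_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

function encodeHtml(text, { encodeNonAscii = false } = {}) {
  // One pass over the input, so an "&" produced by an earlier
  // replacement is never escaped a second time.
  let out = text.replace(/[&<>"']/g, (ch) => RESERVED_ENTITIES[ch]);

  if (encodeNonAscii) {
    // The "u" flag makes the regex iterate by code point, so a character
    // outside the BMP becomes one entity instead of two surrogate halves.
    // Assumption: tab, CR, and LF are left as they are.
    out = out.replace(/[^\x20-\x7E\t\r\n]/gu, (ch) =>
      `&#x${ch.codePointAt(0).toString(16).toUpperCase()};`
    );
  }
  return out;
}

console.log(encodeHtml(`<a href="x">Tom & Jerry's</a>`));
console.log(encodeHtml("caf\u00E9 \u00A9", { encodeNonAscii: true }));
```

Doing the reserved-character replacement in a single regex pass matters: chaining separate `replace` calls with `&` handled last would turn `&lt;` into `&amp;lt;`.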
Reach for this when escaping or unescaping HTML by hand — pasting user-submitted content into a static page, debugging why a `<` shows literally instead of starting a tag, or inspecting a saved HTML snippet that was already escaped once. Modern frameworks like React, Vue, and Svelte auto-escape interpolated values, so most application code does not need manual encoding. The tool runs entirely in the browser; nothing is uploaded.
FAQ
Why does the encoder use `&#39;` for apostrophe instead of `&apos;`?
`&apos;` is part of XML and HTML5 but not HTML 4. Older browsers (and some legacy parsers still embedded in scrapers and feed readers) display `&apos;` literally instead of decoding it. The numeric form `&#39;` works everywhere, so it is the safer default for content that may flow through unknown rendering paths.
Is HTML entity encoding enough to prevent XSS?
Only in the HTML body context. Each context (HTML body, HTML attribute, JavaScript string, CSS value, URL) needs its own encoding scheme: escaping `<` does nothing if the value lands inside an `onclick=` handler or a `<style>` block. The OWASP XSS Prevention Cheat Sheet lists the rules for the common contexts. For any input that crosses contexts, prefer a templating system that auto-escapes per slot (React, Mustache, Go `html/template`) rather than encoding by hand.
HTML entities vs URL percent-encoding — when do I use which?
Different layers, different jobs. HTML entities (`&amp;`) escape characters that the HTML parser would treat as markup, and are used inside the document body or attribute values. Percent-encoding (`%26`) escapes characters that would break URL syntax, and is used inside `href`, `src`, or form-submitted query strings. A URL inside an `<a href>` attribute can need both at once: in `https://x.com/?a=1&amp;b=2`, the `&amp;` keeps the HTML parser happy and the URL still has a literal `&` separator after decoding, whereas an `&` that belongs inside a parameter value would be percent-encoded as `%26` first.
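The layering can be made concrete with a small sketch: percent-encode each URL component first, then entity-encode the finished URL when it is written into the attribute. The helper names `escapeForHtml` and `buildLink` are hypothetical, and the example assumes the attribute value is always wrapped in double quotes.

```javascript
// Hypothetical helpers illustrating the order of the two encodings.
function escapeForHtml(text) {
  const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return text.replace(/[&<>"']/g, (ch) => map[ch]);
}

function buildLink(base, params, label) {
  // Layer 1: URL syntax. encodeURIComponent percent-encodes characters
  // such as "&" and "=" that would otherwise change the query structure.
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  const url = `${base}?${query}`;

  // Layer 2: HTML syntax. The "&" separators become "&amp;" so the HTML
  // parser does not try to read "&b" as the start of an entity.
  return `<a href="${escapeForHtml(url)}">${escapeForHtml(label)}</a>`;
}

console.log(buildLink("https://x.com/", { a: "1", b: "R&D" }, "Q&A"));
// <a href="https://x.com/?a=1&amp;b=R%26D">Q&amp;A</a>
```

When the browser parses the attribute it turns `&amp;` back into `&`, and the server then percent-decodes `R%26D` to `R&D`, so each layer undoes only its own encoding.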
Why does decoding seem to handle bizarre entities I have never seen?
The decoder uses the browser's own HTML parser, which knows the full HTML5 named entity set: 2,231 names covering Greek letters, mathematical operators, dingbats, and even obscure entries like `&Aogon;` (Ą, A with ogonek). Anything the browser would render on a normal page also decodes here. Numeric forms (`&#NNN;` and `&#xHH;`) cover every code point up to U+10FFFF.
Is `&nbsp;` the same as a regular space?
No. `&nbsp;` is U+00A0 (NO-BREAK SPACE) and prevents line breaks at that position. It renders the same width as a normal space in most fonts but counts as a different code point: `"a b".split(" ").length` is 2, while `"a\u00A0b".split(" ").length` is 1. Pasting a decoded `&nbsp;` (a raw U+00A0 character) into a YAML file, a CSV column, or a SQL query is a classic source of "invisible" parse errors.
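The difference is easy to check in a console. The sketch below also shows one way to locate and normalize stray no-break spaces before pasting text elsewhere; `findNoBreakSpaces` and `normalizeSpaces` are hypothetical helper names.

```javascript
const NBSP = "\u00A0";

// Return the index of every U+00A0 in the text, to show where the
// "invisible" characters are hiding.
function findNoBreakSpaces(text) {
  const positions = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] === NBSP) positions.push(i);
  }
  return positions;
}

// Replace every no-break space with an ordinary U+0020 space.
function normalizeSpaces(text) {
  return text.replace(/\u00A0/g, " ");
}

console.log("a b".split(" ").length);                    // 2
console.log("a\u00A0b".split(" ").length);               // 1
console.log(findNoBreakSpaces("a\u00A0b"));              // [ 1 ]
console.log(normalizeSpaces("a\u00A0b").split(" ").length); // 2
```

Note that the regex class `\s` in JavaScript does match U+00A0, so `"a\u00A0b".split(/\s/)` yields two parts even though splitting on a literal space does not.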
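Will decoding `&lt;script&gt;` produce a working script tag?
In a text editor or this tool, decoding gives you the literal string `<script>`. Inserting that string into a live page with `innerHTML` would create a script element, but it would not execute: the HTML specification marks scripts created through `innerHTML` as non-executing. Running it takes a deliberate step, such as building the element with `document.createElement("script")` and appending it to the document, which is exactly why the spec blocks the shortcut: easy decoding should not equal easy execution. Other decoded markup is not covered by that rule (an `<img onerror=...>` inserted through `innerHTML` still runs its handler), so treat decoded output as untrusted.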
Related concepts
HTML entities are textual references that the parser replaces with characters before rendering. Three forms exist. **Named entities** (`&amp;`, `&copy;`, `&hellip;`) are mnemonic shortcuts; HTML5 defines 2,231 of them, including most common Greek letters, mathematical operators, and dingbats. **Decimal numeric entities** (`&#169;`) reference a Unicode code point in base 10. **Hex numeric entities** (`&#xA9;`) do the same in base 16. Characters that have no named entity must use one of the two numeric forms. The numeric forms cover all code points up to U+10FFFF, so any character a browser can render is expressible.
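To illustrate how the two numeric forms map to code points, here is a simplified decoder for numeric entities only. It is a sketch under stated assumptions, not what a browser does: a real HTML parser also resolves named entities, tolerates some missing semicolons, and substitutes replacement characters for invalid references, whereas this hypothetical `decodeNumericEntities` simply leaves anything it cannot map untouched.

```javascript
// Simplified sketch: decodes &#NNN; and &#xHH; only.
function decodeNumericEntities(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    // Exactly one of the two capture groups is defined per match.
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);

    // Assumption: references to NUL, surrogates, or values beyond U+10FFFF
    // are left as written rather than replaced with U+FFFD.
    const isSurrogate = codePoint >= 0xd800 && codePoint <= 0xdfff;
    if (codePoint === 0 || codePoint > 0x10ffff || isSurrogate) return match;

    return String.fromCodePoint(codePoint);
  });
}

console.log(decodeNumericEntities("&#169; and &#xA9; are both \u00A9"));
console.log(decodeNumericEntities("&#x1F600;")); // a single emoji character
```

Decimal `169` and hex `A9` are the same number, which is why `&#169;` and `&#xA9;` decode to the same character.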
Five characters carry special meaning in HTML syntax and need escaping when they appear as literal text: `&` (starts an entity), `<` (starts a tag), `>` (closes a tag), `"` and `'` (delimit attribute values). The first three are reserved in any HTML context; quote characters are only ambiguous inside attribute values, but escaping them is a safe default. Trying to render a Markdown comparison table with `<` and `>` in code samples without escaping fails for exactly this reason.
The broader topic is **contextual output encoding**, which is a security pattern rather than a syntax detail. Each output context demands its own encoding: HTML body uses entities, HTML attributes use the same entities plus quoting, JavaScript strings use backslash escapes (`\x3C` for `<`), URL components use percent-encoding, CSS values use backslash hex (`\3C`). OWASP's XSS Prevention Cheat Sheet enumerates the rules. The practical guidance: choose a templating system that escapes per slot (React JSX, Mustache `{{ }}`, Go `html/template`) so the developer never decides which encoder to call. Liquid, by contrast, escapes only when the `escape` filter is applied.
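The sketch below shows the same input passing through one encoder per context. It is an illustration of the idea, not a vetted security library: the `encoders` table and its function names are hypothetical, and the JavaScript and CSS encoders take the conservative approach of escaping every character that is not an ASCII letter or digit.

```javascript
// Hypothetical per-context encoders; use a maintained library in production.
const encoders = {
  // HTML body and quoted attribute values: entity-encode the reserved five.
  html(value) {
    const map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
    return value.replace(/[&<>"']/g, (ch) => map[ch]);
  },

  // JavaScript string literal: \xHH for code units below 0x100, \uHHHH otherwise.
  jsString(value) {
    return value.replace(/[^A-Za-z0-9]/g, (ch) => {
      const code = ch.charCodeAt(0);
      return code < 0x100
        ? "\\x" + code.toString(16).toUpperCase().padStart(2, "0")
        : "\\u" + code.toString(16).toUpperCase().padStart(4, "0");
    });
  },

  // CSS value: backslash, hex code point, then a space that ends the escape.
  cssValue(value) {
    return value.replace(/[^A-Za-z0-9]/gu, (ch) =>
      "\\" + ch.codePointAt(0).toString(16).toUpperCase() + " "
    );
  },

  // URL component: standard percent-encoding.
  urlComponent(value) {
    return encodeURIComponent(value);
  },
};

const input = "</script>";
for (const [context, encode] of Object.entries(encoders)) {
  console.log(context.padEnd(12), encode(input));
}
```

The same seven-character input produces four different outputs, which is the point: no single encoder is correct for every slot, so the choice is better left to a template engine that knows where each value lands.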