HTML Entity Encoder/Decoder
Escape or unescape HTML entities like `&amp;`, `&lt;`, `&gt;`
How to Use the HTML Entity Encoder/Decoder Online
Convert text to and from HTML entities to safely embed user-supplied strings inside markup.
- Paste your text or HTML-encoded string into the input field.
- Pick options: leave "Prefer named entities" on for human-readable output, or turn it off if you need pure numeric references.
- Enable "Escape all non-ASCII characters" if your target system mangles UTF-8 (older email systems, some XML toolchains).
- Click "Encode" to convert raw text into safe HTML, or "Decode" to read back an already-encoded string.
- Copy the output to your clipboard using the Copy button.
- Use "Clear" to reset both fields and start fresh.
About HTML Entity Encoder/Decoder
HTML entities are a way to refer to a character without using the character itself. They exist because a few characters (primarily `<`, `>`, and `&`) have structural meaning inside HTML and XML. If you want to display the literal text `a < b` on a web page, you cannot just write it: the browser may treat the `<` as the start of a tag. Instead you write `a &lt; b`, and the browser replaces `&lt;` with a less-than sign during rendering. The same trick lets you put double quotes inside an attribute value (`&quot;`), embed ampersands without breaking query-string-style URLs (`&amp;`), or display arbitrary Unicode characters using a numeric reference like `&#8364;` for the euro sign.
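As a rough illustration of the escaping described above, here is a minimal sketch in TypeScript. The names `BASIC_ESCAPES` and `escapeHtml` are hypothetical and are not taken from this tool's source; the sketch only assumes the five-character default set and the numeric form for the apostrophe.

```typescript
// Hypothetical lookup table: the five markup-significant characters and
// the references used to represent them.
const BASIC_ESCAPES = new Map<string, string>([
  ["&", "&amp;"],
  ["<", "&lt;"],
  [">", "&gt;"],
  ['"', "&quot;"],
  ["'", "&#39;"],
]);

// Replaces each significant character in a single pass, so the "&" inside
// an emitted reference is never escaped a second time.
function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => BASIC_ESCAPES.get(ch) ?? ch);
}

console.log(escapeHtml("a < b")); // a &lt; b
console.log(escapeHtml(`<a href="x">Tom & Jerry's</a>`));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```

Doing the replacement in one pass matters: escaping `&` in a separate step after the others would turn `&lt;` into `&amp;lt;`.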
There are two flavors of references. **Named entities** like `&amp;`, `&copy;`, and `&mdash;` are short mnemonics defined by the HTML spec. They are easy to read but the set is fixed: there are only about 2,200 of them. **Numeric references** like `&#38;` (decimal) and `&#x26;` (hexadecimal) can address any Unicode code point at all, including emoji and CJK characters. Numeric references are universally supported by parsers, which is why this tool falls back to them whenever a character does not have a well-known name.
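To make the two numeric forms concrete, the following sketch builds both for a given character. `numericRefs` is a hypothetical helper written for this example, not a function from the tool.

```typescript
// Hypothetical helper: returns the decimal and hexadecimal numeric
// references for the first code point of `ch`.
function numericRefs(ch: string): { decimal: string; hex: string } {
  const cp = ch.codePointAt(0);
  if (cp === undefined) throw new Error("expected a non-empty string");
  return {
    decimal: `&#${cp};`,
    hex: `&#x${cp.toString(16).toUpperCase()};`,
  };
}

const amp = numericRefs("&");
console.log(amp.decimal, amp.hex); // &#38; &#x26;

const euro = numericRefs("\u20AC"); // the euro sign
console.log(euro.decimal, euro.hex); // &#8364; &#x20AC;
```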
Three real-world uses keep coming up. First, **rendering user input safely**: if you have a comment box and want to display whatever someone typed without giving them control over your HTML, encode the input before inserting it. Second, **emailing HTML through old gateways**: some mailers will strip or replace non-ASCII bytes; "Escape all non-ASCII characters" converts every code point above 127 into a numeric reference that survives even the worst transcoders. Third, **debugging scraped HTML**: when you paste markup from "view source" and see `&#8216;` everywhere, decoding shows you the actual smart quotes the page is using.
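The "Escape all non-ASCII characters" behavior can be sketched as follows. This is an assumption about how such an option might work, not the tool's actual code, and `escapeNonAscii` is a hypothetical name.

```typescript
// Hypothetical sketch: turns every code point above 127 into a decimal
// numeric reference and leaves ASCII untouched.
function escapeNonAscii(text: string): string {
  let out = "";
  // for...of walks the string by code point, not by UTF-16 code unit.
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    out += cp > 127 ? `&#${cp};` : ch;
  }
  return out;
}

console.log(escapeNonAscii("Price: 5 \u20AC")); // Price: 5 &#8364;
console.log(escapeNonAscii("\u201Chi\u201D")); // &#8220;hi&#8221;
```

The output contains only ASCII bytes, which is what lets it pass through a gateway that would otherwise replace or strip the original characters.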
A common mistake worth highlighting: this is *not* sanitization. Encoding turns characters into entities so they display as text — it cannot decide which tags should survive in a piece of user-supplied HTML. If you are accepting markup from users (rich text editor input, for example), pair this tool's logic with a sanitizer such as DOMPurify before publishing. For the much more common case of "I have a string and I want to drop it into HTML as plain text," encoding is exactly what you need.
The encoder and decoder run entirely in your browser using a tiny, dependency-free function (no server, no third-party library). The named-entity table is bundled with the page and the numeric path uses `String.fromCodePoint`, which round-trips even astral characters like emoji losslessly.
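A numeric decoding path built on `String.fromCodePoint` might look like the sketch below. The function name `decodeNumericRefs` and the choice to leave invalid values untouched are assumptions made for illustration; a spec-compliant parser has extra rules (for example, replacing some invalid values with U+FFFD) that this sketch does not implement.

```typescript
// Hypothetical sketch of a numeric-reference decoder. Accepts both the
// decimal form (&#123;) and the hexadecimal form (&#x7B;).
function decodeNumericRefs(text: string): string {
  const pattern = /&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g;
  return text.replace(pattern, (match: string, hex?: string, dec?: string) => {
    const cp = hex !== undefined ? parseInt(hex, 16) : parseInt(dec ?? "", 10);
    // Assumption: values outside Unicode, or lone surrogates, pass through
    // unchanged instead of throwing a RangeError or emitting broken text.
    if (cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff)) return match;
    return String.fromCodePoint(cp);
  });
}

console.log(decodeNumericRefs("&#x7B; and &#123;")); // { and {
console.log(decodeNumericRefs("&#128512;") === "\u{1F600}"); // true
```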
Frequently Asked Questions
Which characters does encoding actually escape?
By default the tool escapes the five XML-significant characters: `<`, `>`, `&`, `"`, and `'`. Those are the only characters that can change how a browser parses HTML or how a server parses XML/JSON-in-attributes. Turn on "Escape all non-ASCII characters" to also convert every character above code point 127 (accented letters, emoji, CJK, etc.) to a numeric reference, which is useful when a payload has to survive email or pass through a system that mangles UTF-8.
Why is `'` shown as `&#39;` instead of `&apos;`?
`&apos;` comes from XML. It was left out of HTML 4 and only became part of HTML with HTML5. Many parsers and email clients still do not understand it, so the safe, universally supported representation is the numeric reference `&#39;`. The decoder still accepts `&apos;` on input.
What is the difference between `&#x7B;` and `&#123;`?
Nothing — both refer to the same character (`{`, code point 123). `&#xNN;` is the hexadecimal form and `&#NN;` is the decimal form. The HTML spec treats them identically. Decoding accepts either; encoding emits the decimal form because it is more compact for the ranges where this tool falls back to numeric refs.
Will this safely sanitize untrusted HTML?
No — encoding is for *display* (showing user input as text). It is not the same as sanitizing untrusted markup. If you are accepting HTML from users and want to keep some tags but strip dangerous ones, use a sanitizer like DOMPurify. Use this tool when you want to take a piece of user input and render it verbatim inside HTML.
How are emoji like 😀 encoded?
Many emoji, including 😀 (U+1F600), are single Unicode code points above 0xFFFF (in the "astral planes"). With "Escape all non-ASCII" enabled, 😀 becomes `&#128512;`, a single numeric reference rather than two surrogate halves. The decoder reverses this with `String.fromCodePoint`, so the round-trip is lossless.
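The difference between surrogate halves and a single code point can be seen directly in JavaScript strings. The two encoder functions below are hypothetical and exist only to contrast the pitfall with the approach described in the answer above.

```typescript
const grin = "\u{1F600}"; // the grinning-face emoji

console.log(grin.length); // 2 (two UTF-16 code units)
console.log(grin.charCodeAt(0).toString(16)); // d83d (high surrogate)
console.log(grin.charCodeAt(1).toString(16)); // de00 (low surrogate)
console.log(grin.codePointAt(0)); // 128512

// Pitfall: encoding per UTF-16 code unit emits two surrogate references,
// which are not valid character references.
function encodePerCodeUnit(text: string): string {
  let out = "";
  for (let i = 0; i < text.length; i++) out += `&#${text.charCodeAt(i)};`;
  return out;
}

// Encoding per code point emits one reference per character.
function encodePerCodePoint(text: string): string {
  return Array.from(text, (ch) => `&#${ch.codePointAt(0)};`).join("");
}

console.log(encodePerCodeUnit(grin)); // &#55357;&#56832;
console.log(encodePerCodePoint(grin)); // &#128512;
```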
Are there named entities the decoder does not know?
Yes. The HTML5 spec defines around 2,200 named entities; this tool ships with the ~35 most common (the five XML ones plus copy, reg, trade, hellip, mdash, smart quotes, currency symbols, math symbols, etc.). Anything else passes through unchanged. You can always fall back to numeric references, which are universally supported.
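The pass-through behavior for unknown names can be sketched like this. The table here is deliberately tiny and hypothetical; it is not the tool's bundled table, and `decodeNamed` is an illustrative name.

```typescript
// Hypothetical, deliberately small table of named entities.
const NAMED = new Map<string, string>([
  ["amp", "&"],
  ["lt", "<"],
  ["gt", ">"],
  ["quot", '"'],
  ["apos", "'"],
  ["copy", "\u00A9"],
  ["hellip", "\u2026"],
]);

// Decodes names found in the table; anything else is returned unchanged.
function decodeNamed(text: string): string {
  return text.replace(
    /&([A-Za-z][A-Za-z0-9]*);/g,
    (match: string, name: string) => NAMED.get(name) ?? match
  );
}

console.log(decodeNamed("&copy; 2024 &amp; &bogus;")); // © 2024 & &bogus;
console.log(decodeNamed("&amp;lt;")); // &lt; (decoded once, not twice)
```

Using a `Map` rather than a plain object avoids accidental matches on inherited property names such as `constructor`.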
Is anything sent to a server?
No. Encoding and decoding run in your browser via plain JavaScript string operations. No network requests, no logs, no telemetry. You can confirm in your browser DevTools.