Unicode Encoder

Convert text into Unicode escape sequences in multiple formats

How Unicode Encoding Works

Unicode is the global standard for character representation, designed to encompass every character from every language, historical script, and emoji set in existence. However, computers do not store "characters"—they store bits. Unicode Encoding (most commonly UTF-8) is the mathematical system that transforms a Unicode Code Point into a sequence of bytes that can be saved to a disk or transmitted over a network.
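To make the code-point-to-bytes transformation concrete, here is a minimal Python sketch (this tool itself runs in the browser; Python is used here purely for illustration):

```python
# A string is a sequence of code points; encoding turns it into bytes.
text = "A🚀"
encoded = text.encode("utf-8")

# "A" (U+0041) needs a single byte; the rocket (U+1F680) needs four.
print(list(encoded))  # [65, 240, 159, 154, 128]
```

Note that the string has two characters but the encoded form is five bytes, which is exactly the variable-width behavior described below.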

The encoding engine follows the strict algorithmic rules of the Unicode Consortium:

  1. Code Point Identification: Every character has a unique number (e.g., the letter A is U+0041, the rocket 🚀 is U+1F680).
  2. Bit-Distribution Analysis: The encoder determines how many bytes are needed. UTF-8 is "Variable-Width," meaning it uses 1 byte for standard English characters and up to 4 bytes for complex emojis or rare scripts.
  3. Prefix Application: To help decoders stay in sync, each byte in a multi-byte sequence is given a specific binary prefix (e.g., 1110xxxx for the start of a 3-byte character).
  4. Bit Insertion: The bits from the Code Point are distributed into the available slots in the byte sequence.
  5. Hexadecimal Representation: For developers, these bytes are often displayed as hexadecimal strings (e.g., F0 9F 9A 80 for the rocket emoji).
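The five steps above can be sketched as a small UTF-8 encoder. This is an illustrative re-implementation of the standard algorithm, not the tool's actual source code (it also omits the surrogate-range check a production encoder would need):

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point using the UTF-8 bit-distribution rules."""
    if cp < 0x80:
        # 1 byte: 0xxxxxxx (plain ASCII)
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

# The rocket, U+1F680, becomes the four-byte sequence F0 9F 9A 80.
print(utf8_encode(0x1F680).hex(" ").upper())
```

Each leading byte's prefix tells a decoder how many continuation bytes (each prefixed `10`) follow, which is what keeps decoders in sync mid-stream.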

The History of Unicode and the Unicode Consortium

Before Unicode, the world was a fragmented mess of "Code Pages." If you sent a document from a computer in Japan to one in Germany, the characters would often be corrupted because both machines used different numbers for their symbols.

The Unicode Consortium was founded in 1991 by engineers from Xerox and Apple, including Joe Becker, Lee Collins, and Mark Davis. Their goal was to create a single, universal character set. The introduction of UTF-8 by Ken Thompson and Rob Pike at Bell Labs in 1992 was the turning point, as it remained backward-compatible with the old ASCII system while supporting the entire Unicode range. Today, Unicode is the foundation of the Modern Web.

Technical Comparison: UTF-8 vs. UTF-16 vs. UTF-32

Choosing the right encoding depends on the data type and the target environment.

| Feature | UTF-8 (The Web Standard) | UTF-16 (Windows/Java) | UTF-32 (Memory-Internal) |
|---|---|---|---|
| Byte Size | 1 to 4 bytes (variable) | 2 or 4 bytes (variable) | 4 bytes (fixed) |
| ASCII-Compatible? | Yes | No | No |
| Byte Order Mark? | Optional (discouraged) | Conventional (LE or BE) | Conventional (LE or BE) |
| Efficiency | High for Western text | High for Asian scripts | Low (wasteful) |
| Common Use | HTML / Linux / JSON | Java / Windows API | High-speed indexing |
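The size trade-offs in the table are easy to verify in Python (the `-le` codec variants are used here so no BOM bytes inflate the counts):

```python
samples = {"English": "Hello", "Japanese": "こんにちは", "Emoji": "🚀🌍"}

for name, s in samples.items():
    # Compare the encoded byte length of the same text in each encoding.
    print(name,
          len(s.encode("utf-8")),
          len(s.encode("utf-16-le")),
          len(s.encode("utf-32-le")))
```

English text is smallest in UTF-8 (1 byte per character), Japanese kana are smaller in UTF-16 (2 bytes vs. 3 in UTF-8), and emoji cost 4 bytes everywhere, which illustrates why "the right encoding" depends on the data.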

By using a dedicated Unicode Encoder, you ensure your data follows ISO/IEC 10646 standards, preventing the dreaded "Mojibake" (garbled text) across different systems.

Security Considerations: Confusable Characters and Homographs

Unicode's vast complexity creates unique security challenges:

  • Homograph Attacks: Attackers can use characters that look identical to others (e.g., a Cyrillic а vs. a Latin a) to create deceptive URLs or usernames. This is a common tactic in Phishing.
  • Normalization Issues: The same symbol can sometimes be represented in two ways (e.g., é as one character or e + an accent). Failing to Normalize text before comparison can lead to security bypasses in authentication systems.
  • Client-Side Privacy: To maintain the highest Data Privacy standards, all encoding happens locally in your browser. Your sensitive text, passwords, or private messages are never sent to our servers.
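Both the homograph and normalization pitfalls above can be demonstrated with Python's standard `unicodedata` module:

```python
import unicodedata

# Normalization: "é" as one precomposed character vs. "e" + combining accent.
decomposed = "e\u0301"   # e + COMBINING ACUTE ACCENT
composed = "\u00E9"      # LATIN SMALL LETTER E WITH ACUTE

# Naive comparison fails even though both render identically...
print(decomposed == composed)  # False
# ...but comparing NFC-normalized forms succeeds.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True

# Homographs: Cyrillic "а" (U+0430) looks identical to Latin "a" (U+0061).
print(unicodedata.name("a"))       # LATIN SMALL LETTER A
print(unicodedata.name("\u0430"))  # CYRILLIC SMALL LETTER A
```

This is why authentication and URL-handling code should normalize (and, ideally, screen for mixed scripts) before comparing user-supplied strings.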

Frequently Asked Questions

What is a Code Point?

A Code Point is essentially the "address" of a character in the Unicode table. It is written as U+ followed by four to six hexadecimal digits (e.g., U+0041 for A, U+1F680 for 🚀).
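In Python, `ord()` returns a character's code point, and standard string formatting produces the conventional U+ notation:

```python
for ch in "A🚀":
    # :04X pads to at least four uppercase hex digits, matching U+XXXX style.
    print(f"U+{ord(ch):04X}")  # U+0041, then U+1F680
```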
