Quickly identify text encoding types and fix gibberish text.

URL to JSON Parser
Parse URL strings into structured JSON to quickly extract key information like protocols, parameters, and paths.

Code Compare
Professionally compare differences between two texts or code snippets. Highlights additions, deletions, and modifications to assist with code review, document merging, and version control.

JSON Formatter
Process JSON data online: format, minify, and validate to boost your development and debugging efficiency.

PYC Decompiler
Restore Python bytecode .pyc files into readable source code for easy code auditing and learning. Supports mainstream versions.

JSON to TypeScript Converter
Automatically convert JSON data into TypeScript interfaces or type aliases for frontend data modeling and API integration.
When you open a text file or webpage and see gibberish (mojibake), it is usually because the system decoded the bytes using the wrong encoding. This tool identifies common encodings such as UTF-8 and GBK by analyzing the text's byte patterns. A character encoding is a set of rules that maps characters to numbers that computers can store; different encodings can interpret the exact same byte sequence in completely different ways.
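As a small illustration of how the same bytes read differently under two encodings, the following Python sketch encodes "你好" as UTF-8 and then decodes those bytes as GBK. It uses only the standard library codecs; the variable names are illustrative and not part of this tool.

```python
# Encode the text as UTF-8, then deliberately decode it with the wrong codec.
text = "你好"
raw = text.encode("utf-8")      # b'\xe4\xbd\xa0\xe5\xa5\xbd'

garbled = raw.decode("gbk")     # same bytes, interpreted as GBK
print(garbled)                  # 浣犲ソ

# No bytes were lost, so reversing the mistake recovers the text.
recovered = garbled.encode("gbk").decode("utf-8")
print(recovered)                # 你好
```

In this case the mojibake is reversible because every byte survived the wrong decoding step; that is not true in general.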
Why are there multiple possible encodings in the results?
This happens because different encodings can overlap in how they interpret certain byte sequences. The tool displays all possible encodings sorted by confidence score.
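The overlap can be seen with a minimal sketch that simply tries a few codecs in strict mode and keeps the ones that decode without error. This is an assumption-laden illustration, not the detection method this tool uses: `candidate_encodings` and its default candidate list are hypothetical, and a real detector would also score each candidate to produce a confidence ranking.

```python
def candidate_encodings(data, candidates=("utf-8", "gbk", "big5", "shift_jis")):
    """Return the candidate encodings that decode `data` without errors."""
    matches = []
    for name in candidates:
        try:
            data.decode(name)  # strict mode raises on invalid byte sequences
        except UnicodeDecodeError:
            continue
        matches.append(name)
    return matches


utf8_sample = "你好".encode("utf-8")
gbk_sample = "你好".encode("gbk")

# The UTF-8 bytes happen to be valid GBK as well, so both names are returned.
utf8_matches = candidate_encodings(utf8_sample)
# The GBK bytes are not valid UTF-8, so "utf-8" is ruled out.
gbk_matches = candidate_encodings(gbk_sample)

print(utf8_matches)
print(gbk_matches)
```

Successful decoding only shows that a byte sequence is legal in an encoding, not that the result is meaningful text, which is why several candidates can remain.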
How do I fix the "锟斤拷" (replacement character) mojibake?
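This pattern typically starts with GBK-encoded bytes being decoded as UTF-8. The bytes that are not valid UTF-8 are replaced with U+FFFD (the replacement character), and when the resulting UTF-8 bytes (EF BF BD) are later displayed as GBK, each pair of replacement characters appears as "锟斤拷". Because the replacement step discards the original bytes, text that already shows "锟斤拷" usually cannot be recovered. Go back to the original file, use this tool to confirm its actual encoding, then reopen the file using the correct encoding.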
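To see where the three characters come from, this Python sketch reproduces them with the standard library codecs: invalid bytes become U+FFFD when decoded as UTF-8 with replacement, and the UTF-8 bytes of two replacement characters read as GBK give "锟斤拷". The variable names are illustrative.

```python
# Step 1: GBK bytes decoded as UTF-8 with replacement lose their content.
gbk_bytes = "你好".encode("gbk")                       # C4 E3 BA C3
lossy = gbk_bytes.decode("utf-8", errors="replace")   # contains U+FFFD

# Step 2: two replacement characters, stored as UTF-8, are six bytes.
as_utf8 = "\ufffd\ufffd".encode("utf-8")              # EF BF BD EF BF BD

# Step 3: reading those six bytes as GBK pairs them up as EFBF BDEF BFBD.
misread = as_utf8.decode("gbk")
print(misread)                                        # 锟斤拷
```

Once step 1 has happened and the result has been saved, the original bytes are gone, so the repair has to start from an undamaged copy of the file.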
Detection accuracy may be lower for short texts (under 50 characters). Binary files are not text, so no text encoding can be detected for them. For mixed-encoding text, only the primary encoding can be identified.
We highly recommend always using UTF-8 encoding in development. A typical example: the Chinese characters "你好" (Hello) take up 2 bytes per character in GBK (0xC4E3 0xBAC3) and 3 bytes per character in UTF-8 (0xE4BDA0 0xE5A5BD). You can often make a preliminary guess about the encoding type based on these byte length differences.
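The byte values quoted above can be checked with a short Python sketch using the built-in codecs (`bytes.hex` with a separator needs Python 3.8 or later; the `encoded` dictionary is just an illustrative name):

```python
text = "你好"

# Encode the same two characters with each codec and compare the results.
encoded = {name: text.encode(name) for name in ("gbk", "utf-8")}

for name, data in encoded.items():
    per_char = len(data) // len(text)
    print(f"{name}: {len(data)} bytes, {per_char} per character, {data.hex(' ').upper()}")
# gbk: 4 bytes, 2 per character, C4 E3 BA C3
# utf-8: 6 bytes, 3 per character, E4 BD A0 E5 A5 BD
```

The per-character division only works here because both characters have the same width in each encoding; ASCII characters take 1 byte in both GBK and UTF-8.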