Encodings
Definitions
- Symbol encoding
- defines how symbols (glyphs) map to numbers, i.e. code points.
(e.g. Unicode)
- Character encoding
- defines how those numbers (each denoting a character) are encoded as bytes for storage or transmission, and how the bytes are decoded back.
(e.g. UTF-8, UTF-16, …)
Many legacy encodings exist (e.g. cp1251, …) that blend the two concepts: a single table defines both the symbol encoding and the character encoding.
Different encode types
- URL encode
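- (a URL must be written with ASCII characters only; any other byte, and any reserved character, is written as `%` followed by its two-digit hex value)
`Hello World` -> `Hello%20%57%6f%72%6c%64` (ordinary ASCII characters may be encoded too, but do not have to be)
` ` -> `+` or `%20` (`+` only in query strings and form data)
non-ASCII characters: `ü` -> `%C3%BC` (hex of the UTF-8 bytes)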
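As a minimal sketch, the same conversions can be reproduced with Python's standard `urllib.parse` module. The variable names are illustrative only.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Space becomes %20 with quote() and + with quote_plus() (form style).
path_style = quote("Hello World")        # 'Hello%20World'
form_style = quote_plus("Hello World")   # 'Hello+World'

# Non-ASCII characters are encoded as the hex of their UTF-8 bytes.
non_ascii = quote("ü")                   # '%C3%BC'

# Over-encoded ASCII letters decode back to the same text.
decoded = unquote("Hello%20%57%6f%72%6c%64")   # 'Hello World'
decoded_form = unquote_plus("Hello+World")     # 'Hello World'

print(path_style, form_style, non_ascii, decoded, decoded_form)
```

Note that `unquote` leaves `+` untouched; only `unquote_plus` turns it back into a space, which mirrors the query-string-only meaning of `+`.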
- HTML entities
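- Any symbol can be encoded in decimal (`&#123;` = `{`) or in hex (`&#x123;` = `ģ`).
Encoded symbols are not interpreted by the browser as special (markup) characters.

| Symbol | Description | Entity name | Entity number |
|---|---|---|---|
| ' ' | non-breaking space | `&nbsp;` | `&#160;` |
| < | less than | `&lt;` | `&#60;` |
| > | greater than | `&gt;` | `&#62;` |
| & | ampersand | `&amp;` | `&#38;` |
| ¢ | cent | `&cent;` | `&#162;` |
| £ | pound | `&pound;` | `&#163;` |
| ¥ | yen | `&yen;` | `&#165;` |
| € | euro | `&euro;` | `&#8364;` |
| © | copyright | `&copy;` | `&#169;` |
| ® | registered trademark | `&reg;` | `&#174;` |
| etc. | | | |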
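A short sketch with Python's standard `html` module shows both directions. The helper `to_decimal_entities` is a hypothetical name introduced here for illustration; it is not part of any library.

```python
import html

# Escaping: markup characters become entities, so a browser shows them
# as text instead of interpreting them as tags.
escaped = html.escape("<b>Tom & Jerry</b>")
# '&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;'

# Unescaping handles named, decimal and hex entities alike.
named = html.unescape("&euro; &copy; &lt;")   # '€ © <'
numeric = html.unescape("&#123; &#x123;")     # '{ ģ'


def to_decimal_entities(text: str) -> str:
    """Encode every character as a decimal entity (hypothetical helper)."""
    return "".join(f"&#{ord(c)};" for c in text)


all_encoded = to_decimal_entities("<€")        # '&#60;&#8364;'

print(escaped, named, numeric, all_encoded)
```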
Encoding tricks
- Encodings latin1, gbk and character escaping
  In latin1 the string `%BF%27` = `¿'`
  After escaping the symbol `%27` = `'` with `%5C` = `\`, the string becomes `%BF%5C%27`
  In gbk encoding the string `%BF%5C%27` = `縗'`
  If MySQL `SET NAMES gbk;` was set, then this encoding trick helps to bypass the `mysql_real_escape_string` PHP function.
  Similar tricks can be done with the following encodings: `big5`, `cp932`, `gb2312`, `gbk` and `sjis`.
- `\x90` - the x86 assembler NOP opcode
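The byte-level effect of the latin1/gbk trick can be sketched in Python. The `replace` call below is only a stand-in for a charset-unaware escaping routine; it is an assumption for illustration, not a reimplementation of `mysql_real_escape_string`.

```python
# Attacker-supplied bytes: 0xBF followed by a single quote (0x27).
payload = b"\xbf\x27"

# Read as latin1, these are two separate characters.
as_latin1 = payload.decode("latin1")        # "¿'"

# Stand-in for an escaper that works on single bytes:
# it prefixes the quote with a backslash (0x5C).
escaped = payload.replace(b"'", b"\\'")     # b'\xbf\x5c\x27'

# Read as gbk, 0xBF 0x5C fuse into ONE multibyte character,
# so the backslash disappears and the quote is unescaped again.
as_gbk = escaped.decode("gbk")

print(as_latin1, escaped, as_gbk)
print("backslash survived:", "\\" in as_gbk)   # False
```

The escaped string decodes to two characters under gbk: one CJK ideograph followed by a bare `'`, which is what lets the quote terminate an SQL string literal.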
Special characters
Special Unicode symbols:
- Unicode replacement character (U+FFFD) - “\ufffd”
- RTLO / RLO, Right-To-Left Override (U+202E) - “\u202e”
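Both symbols can be inspected with Python's `unicodedata` module. The file name in the last example is hypothetical and only illustrates how the override character hides inside a string.

```python
import unicodedata

REPLACEMENT = "\ufffd"
RLO = "\u202e"

replacement_name = unicodedata.name(REPLACEMENT)  # 'REPLACEMENT CHARACTER'
rlo_name = unicodedata.name(RLO)                  # 'RIGHT-TO-LEFT OVERRIDE'

# U+FFFD is what a lenient decoder substitutes for undecodable bytes.
decoded = b"abc\xff".decode("utf-8", errors="replace")   # 'abc\ufffd'

# U+202E is invisible; text after it is typically displayed reversed,
# so this hypothetical name may be shown with a ".pdf"-looking ending.
spoofed = "invoice" + RLO + "fdp.exe"

print(replacement_name, rlo_name)
print(repr(decoded), repr(spoofed), len(spoofed))
```

Printing with `repr` is the design choice here: it exposes the escape sequences, whereas a plain `print` of `spoofed` would depend on how the terminal handles bidirectional text.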