© Adrian Sutton

It looks like a pretty simple string and all. It should be encoded as:

%C2%A9%20Adrian%20Sutton

assuming UTF-8 character encoding (and I literally mean assuming, since there’s no possible way to know for sure). If, however, you were to use the JavaScript escape() function, you could get any one of:

%u00A9+Adrian+Sutton
%C2%A9%20Adrian+Sutton
%u00A9%20Adrian%20Sutton

It’s impossible to tell if the + sign in the first two is an encoded space or an actual plus sign (there’s no requirement for + to be escaped in URIs, so many implementations leave it as is). Then you have to deal with the rather odd %u00A9 syntax, which seems to be half URI escaping, half HTML entity, and finally you get to worry about which character set was in use.
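To check what a particular engine does with the string, a small sketch like the one below can help. It assumes an ECMAScript-conforming runtime (a current browser console or Node.js); the variable names are made up for illustration, and the outputs in the comments are what the standard specifies, so older or non-conforming implementations may well produce something else.

```javascript
// The sample string from the post: U+00A9, a space, then the name.
const text = "\u00A9 Adrian Sutton";

// encodeURIComponent always goes through UTF-8 and writes a space as %20.
const utf8Encoded = encodeURIComponent(text);

// The legacy escape() works on UTF-16 code units instead of UTF-8 bytes:
// units below 256 become %XX, anything larger becomes %uXXXX.
const legacyEncoded = escape(text);
const legacyEuro = escape("\u20AC"); // euro sign, a code unit above 255

// escape() also leaves a literal plus sign untouched.
const legacyPlus = escape("a+b");

console.log(utf8Encoded);   // %C2%A9%20Adrian%20Sutton
console.log(legacyEncoded); // %A9%20Adrian%20Sutton
console.log(legacyEuro);    // %u20AC
console.log(legacyPlus);    // a+b
```

Even on a single conforming engine, the two built-in functions disagree about the same character, which is the heart of the problem.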
For the record, here’s what your browser makes of it:
Jeroen Wenting says:
which is of course something else again :-)
%A9%20Adrian%20Sutton
Eugen Konkov says:
>there’s no requirement for + to be escaped in URIs
You are not right
http://www.ietf.org/rfc/rfc2396.txt
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
           "$" | ","
you MUST encode the ‘+’ sign!
Adrian Sutton says:
Eugen,
It’s entirely academic – quite a few implementations *don’t* escape + in URLs so you have no idea if it’s a + or an incorrectly encoded space. The majority of the time, + means space in a URL but there are exceptions.
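A rough sketch of that ambiguity, assuming a runtime that provides the standard URLSearchParams class (current browsers and Node.js do). The sample value and variable names are invented for illustration:

```javascript
// A raw value as it might arrive in a URL. Is the "+" a plus or a space?
const raw = "1+1%3D2";

// Plain URI decoding leaves "+" alone, so this reads as a literal plus.
const asUriComponent = decodeURIComponent(raw);

// Form-style (application/x-www-form-urlencoded) decoding turns "+" into a space.
const asFormData = new URLSearchParams("q=" + raw).get("q");

// A real plus sign only survives both decoders if the sender wrote it as %2B.
const safePlus = encodeURIComponent("1+1=2");

console.log(asUriComponent); // 1+1=2
console.log(asFormData);     // 1 1=2
console.log(safePlus);       // 1%2B1%3D2
```

The same bytes decode to two different strings depending on which convention the receiver assumes, and nothing in the URL itself says which one the sender meant.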