Http.urlencode adding an extra character

I believe I've found a bug in the http.urlencode() function. When it encodes the degree symbol (ALT+0176), which I expected to become %b0, an extra %c2 shows up in front of it.

local message = "99°F";
server.log("Raw Message: " + message);
local body = http.urlencode({Body = message});
server.log("URLEncoded: " + body);

Raw Message: 99°F
URLEncoded: Body=99%c2%b0F

That's not a bug: it is the UTF-8 encoding of the degree symbol (U+00B0), which takes two bytes, C2 B0. See here:

http://www.utf8-chartable.de/
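As a quick cross-check outside the imp, here is a minimal sketch in Python (not Squirrel, purely so it can be run anywhere). It shows that the degree sign takes two bytes in UTF-8 and one in ISO-8859-1, which is why the percent-encoded form has two escapes. Python prints uppercase hex digits where the agent log above uses lowercase; the two are equivalent.

```python
from urllib.parse import quote, urlencode

message = "99°F"

# The degree sign is U+00B0. UTF-8 needs two bytes for it; ISO-8859-1 needs one.
utf8_bytes = "°".encode("utf-8")          # b'\xc2\xb0'
latin1_bytes = "°".encode("iso-8859-1")   # b'\xb0'

# Percent-encoding works on bytes, so the result depends on the charset chosen.
utf8_quoted = quote(message, encoding="utf-8")         # '99%C2%B0F'
latin1_quoted = quote(message, encoding="iso-8859-1")  # '99%B0F'

# Form-style encoding, comparable to the Body=... string in the log above.
body = urlencode({"Body": message})                    # 'Body=99%C2%B0F'

print(utf8_bytes, latin1_bytes)
print(utf8_quoted, latin1_quoted)
print(body)
```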

Interesting. I’m using the agent to send data to a PHP script that dumps data into a MySQL database. This PHP script is in use by another device too, which is working fine. Data from the imp has an extra character in SQL, so I’m trying to figure out where it is coming from.

1=82.2 Â°F
I'll keep looking.

Edit: Looks like this is what happens when UTF-8 encoded data is read as ISO-8859-1. For more details, see the Stack Overflow question: HTML encoding issues - "Â" character showing up instead of " "
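The effect is easy to reproduce. The following is only an illustrative Python sketch, with a made-up sample value, not the actual PHP/MySQL path: it takes the UTF-8 bytes and decodes them as ISO-8859-1, which yields the stray "Â", then reverses the mistake.

```python
# Sample value, chosen for illustration only.
original = "82.2 °F"

# What a UTF-8 sender puts on the wire.
wire_bytes = original.encode("utf-8")

# What a reader sees if it wrongly assumes the bytes are ISO-8859-1:
# 0xC2 becomes "Â" and 0xB0 becomes "°".
misread = wire_bytes.decode("iso-8859-1")

# The damage is reversible as long as the bytes were not altered further.
repaired = misread.encode("iso-8859-1").decode("utf-8")

print(misread)
print(repaired)
```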

I’ve had the same problem, but pasting the degree sign from another source fixed it. I suppose it’s like when you copy a " from a PDF file and some code editors don’t recognise it.

Looks like your “other device” uses ISO-8859-1 encoding. The imp uses UTF-8 encoding. Your PHP script will have to convert one way or the other, so that the database sees only one consistent encoding.

Peter

This should help: http://php.net/manual/en/function.utf8-decode.php
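To show what "convert one way or the other" amounts to, here is a rough Python sketch of both directions. The function names `to_latin1` and `to_utf8` are hypothetical, and the first only approximates what PHP's utf8_decode() does. The real fix would live in the PHP script and depends on which encoding the database is meant to hold.

```python
def to_latin1(utf8_payload: bytes) -> bytes:
    """Convert UTF-8 bytes to ISO-8859-1 (hypothetical helper).

    Characters with no ISO-8859-1 equivalent are replaced with '?'.
    """
    return utf8_payload.decode("utf-8").encode("iso-8859-1", errors="replace")


def to_utf8(latin1_payload: bytes) -> bytes:
    """Convert ISO-8859-1 bytes to UTF-8 (hypothetical helper)."""
    return latin1_payload.decode("iso-8859-1").encode("utf-8")


# The same reading as sent by a UTF-8 device and by an ISO-8859-1 device.
from_imp = "82.2 °F".encode("utf-8")
from_other_device = "82.2 °F".encode("iso-8859-1")

# After conversion, both payloads are byte-for-byte identical.
print(to_latin1(from_imp) == from_other_device)
print(to_utf8(from_other_device) == from_imp)
```

Whichever direction is chosen, the conversion should be applied only to the source that differs from the database's encoding. Converting data that is already in the target encoding would corrupt it.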