0xC2 inserted into string, found on agent side

My new issue arises when I send a binary string containing raw bytes. The string is wrapped in JSON by an HTML web page, and when the imp005 agent receives and decodes the JSON, a 0xC2 character appears to have been inserted. My JavaScript contains code like this (inspired by examples on the imp website):

		function sendRawData(settings){
			$.ajax({
				async: true,
				type: "POST",
				url : agentURL + '/raw',
				data: JSON.stringify(settings),
				dataType: "json",
				success : function(response) {
					//if ('locale' in response) {
						//$('.locale-status span').text(response.locale);
					//}
					alert(response.d);
				}
			});
		}

My agent code, on the other hand, looks like this:

api.post("/raw", function(context) {
    try {
        local data = http.jsondecode(context.req.rawbody);
        local len, idx;

        if ("bytes" in data) {
            //server.log("data received is: " + data);
            foreach (key, value in data) {
                if (key == "bytes") continue;

                server.log(key + ": " + data[key]);
                device.send(key, data[key]);
            }
            device.send("raw", data.bytes);
            context.send(200, "Okay");
        } else {
            context.send(500, "Request JSON error receiving raw bytes");
        }
    } catch (err) {
        context.send(500, "Bad data posted: " + err);
    }
});

I have since found that this is related to UTF-8 encoding (in JSON?). I’ve observed that a 0xC2 is inserted whenever a byte has a value greater than 0x7F. I’m not sure where it gets inserted, on the web page side or on the agent side, or how I can remove or prevent it. I checked some similar problems here in the forum and I’m reading this:

https://discourse.electricimp.com/t/receiving-a-json-data/6138
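
For reference, an extra lead byte like this is what UTF-8 produces for any character above U+007F. A likely explanation, assuming the raw bytes are held one per character in a JavaScript string, is that the POST body is serialised as UTF-8 on the way out of the browser, so each character from U+0080 to U+00FF becomes two bytes. The sketch below shows the effect with the standard `TextEncoder`; the helper name `utf8Hex` is made up for illustration.

```javascript
// TextEncoder always produces UTF-8, which is also how a string body
// is normally serialised for an HTTP request.
const encoder = new TextEncoder();

// Hypothetical helper: UTF-8 bytes of a single character, as hex.
function utf8Hex(codePoint) {
  const bytes = encoder.encode(String.fromCharCode(codePoint));
  return Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join(" ");
}

console.log(utf8Hex(0x7f)); // 7f     (ASCII, unchanged)
console.log(utf8Hex(0x80)); // c2 80  (lead byte 0xC2 added)
console.log(utf8Hex(0xbf)); // c2 bf
console.log(utf8Hex(0xc0)); // c3 80  (lead byte 0xC3, second byte changed)
console.log(utf8Hex(0xff)); // c3 bf
```

Note that characters from U+00C0 upwards get a 0xC3 lead byte and a different second byte, so the inserted prefix is not always 0xC2.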

As someone coming from a firmware background, I’m quite new to these high-level web software abstractions, as well as to Electric Imp. I’m not sure how to do the Base64 encoding on both sides (web and agent?), if that is what it takes to eliminate the 0xC2 insertions completely. Another option I can think of is to walk the string byte by byte, look for 0xC2, check whether the following byte is 0x80 or greater, and if so remove the 0xC2. But I don’t think this latter solution is very clean.

So how do I keep this 0xC2 problem from happening? I can’t limit the bytes I’m sending to 0x7F and below, since they come from the application layer itself and I have no control over what raw bytes it sends to me (the HTML page).

I think this has your answer: Squirrel Data Serialization | Dev Center

Because of the way such data is encoded for transmission, string data sent to the device (or from device to agent) has to be ‘safe’, ie. encoded in Ascii or UTF-8 and contains no embedded NULs.

This does not apply to blobs, so you probably want:

local b = blob(data.bytes.len());
b.writestring(data.bytes);
device.send("raw", b);

PS. Base64 on the agent is done with http.base64encode() and http.base64decode().

@smittytone thanks for the Base64 reply. I have to intercept the 0xC2 at the agent. Is the sequence below correct?

[HTML SIDE] Base64 encode serial data → JSON encode →
[AGENT SIDE] http.jsondecode() → http.base64decode() → blob containing original data

Still checking which JS method to use for Base64 encoding. btoa seems inappropriate, as its input should be a string, which in my case it’s not (it’s serial data).

Not quite. If I’ve read your original post correctly, the 0xC2 is turning up at the device. Just send the binary data from the agent to the device as a blob, not as a string (see my first post above). I don’t think you need to bother with base64 unless the data is getting (separately) mangled at the server end, but I don’t think that’s the case.

[HTML SIDE] → data as string → [AGENT] → data as blob [DEVICE]

The pointer to the base64 docs was more an FYI.

Hi @smittytone, I’d like to clarify that the 0xC2 is first observed at the agent, not at the device, whenever I send the binary data from the HTML page to the agent.

Ah, I misunderstood. Then yes, your sequence is correct. This may help: javascript - Convert blob to base64 - Stack Overflow
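
The browser half of that sequence could be sketched roughly as follows, assuming the serial data is available as a `Uint8Array`. `btoa` does accept such data if each byte is first mapped to one character with a code of 0xFF or below. The helper name `bytesToBase64` is made up for illustration, and the `bytes` key simply mirrors the field the agent code reads.

```javascript
// Hypothetical helper: Base64-encode raw bytes so the result is pure ASCII.
function bytesToBase64(bytes) {
  // Build a "binary string": one character per byte, code 0x00..0xFF.
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  // btoa is available in browsers; fall back to Buffer where it is not.
  return typeof btoa === "function"
    ? btoa(binary)
    : Buffer.from(binary, "binary").toString("base64");
}

// Example payload containing bytes above 0x7F.
const raw = new Uint8Array([0x01, 0x7f, 0x80, 0xc2, 0xff]);

// The JSON body now contains only ASCII, so UTF-8 encoding leaves it unchanged.
const body = JSON.stringify({ bytes: bytesToBase64(raw) });
console.log(body);
```

The resulting `body` string would be passed as `data` in the `$.ajax` call, and on the agent the decoded `bytes` field would then go through `http.base64decode()` as in the sequence above.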

Thanks. I’ll be trying this.

Is this encode/decode pair also a must for image data, if I’m moving image data from the HTML page to the agent, in order to avoid the 0xC2?

Like, with an HTTP POST? These are binary clean, but it’s not unusual for other things in the path to mangle data (for example, I don’t know how you’re calling the agent API or with what tool; UTF-8 may strike again there too).

Hence using base64 is likely to reduce headaches, yes.
