Did I screw-up my remote imp006 into needing a manual power cycle to reconnect to network?

@terrencebarr Unfortunately, the rm.send() call is throwing device-side serializer errors every time and strangely is not working. The errors I get are:

|2020-09-02T14:30:24.484 +00:00|[Device]|ERROR: _nexti returned an invalid idx|
| — | — | — |
|2020-09-02T14:30:24.505 +00:00|[Device]|ERROR: in _write …imp#serializer.class.nut#1.0.0:164|
|2020-09-02T14:30:24.535 +00:00|[Device]|ERROR: from _serialize …imp#serializer.class.nut#1.0.0:132|
|2020-09-02T14:30:24.535 +00:00|[Device]|ERROR: from _serialize …imp#serializer.class.nut#1.0.0:132|
|2020-09-02T14:30:24.567 +00:00|[Device]|ERROR: from serialize …imp#serializer.class.nut#1.0.0:26|
|2020-09-02T14:30:24.584 +00:00|[Device]|ERROR: from write …ashlogger.device.lib.nut#2.2.0:143|
|2020-09-02T14:30:24.591 +00:00|[Device]|ERROR: from _persistMessage …messenger.device.lib.nut#0.2.0:286|
|2020-09-02T14:30:24.600 +00:00|[Device]|ERROR: from _onSendFail …messenger.device.lib.nut#0.2.0:232|
|2020-09-02T14:30:24.657 +00:00|[Device]|ERROR: from _processQueues …messenger.device.lib.nut#0.2.0:258|

The initialization code is:

cm <- ConnectionManager({"stayConnected": true, "connectTimeout": 600.0});
imp.setsendbuffersize(8096);
local sfl = SPIFlashLogger()
rm  <- ReplayMessenger(sfl, cm, {"resendLimit": 900});

The send code is:

rm.send("0", dBlob, RM_IMPORTANCE_HIGH)

The full code is:

#require "Serializer.class.nut:1.0.0"
#require "SPIFlashLogger.device.lib.nut:2.2.0"
#require "ConnectionManager.lib.nut:3.1.1"
#require "Messenger.lib.nut:0.2.0"
#require "ReplayMessenger.device.lib.nut:0.2.0"
cm <- ConnectionManager({"stayConnected": true, "connectTimeout": 600.0});
    imp.setsendbuffersize(8096);
    local sfl = SPIFlashLogger()
    rm  <- ReplayMessenger(sfl, cm, {"resendLimit": 900});

imp.setpowersave(false);
    uart <- hardware.uartXEFGH;
    lastPacket <- 0
    server.log("start");
    uart.setrxfifosize(150*20)

function processpacket(dBlob) {
    dBlob.seek(0, 'b')
    local tHeader = dBlob.readn(105) 
    if(tHeader > 1 && tHeader != 9999) {
        if(lastPacket == 0 || tHeader != lastPacket) {
                lastPacket = tHeader
                rm.send("0", dBlob, RM_IMPORTANCE_HIGH)
                //agent.send("RM_DATA", dBlob)
            }
    }
}

imp.net.getcellinfo(function(cellInfo) {server.log(cellInfo) });

server.log("Initialized")
buffer <- blob();
rxtimeout <- null;

function rx() {
    buffer.writeblob(uart.readblob());
    if (rxtimeout != null) imp.cancelwakeup(rxtimeout);
    rxtimeout = imp.wakeup(0.050, sendbuffer);
} 
function sendbuffer() {
    rxtimeout = null;
    if (buffer.len() >= 150) processpacket(buffer);
    buffer.resize(0)
}
uart.configure(115200, 8, PARITY_NONE, 1, NO_TX | NO_CTSRTS, rx);

Any ideas why ReplayMessender.send() is throwing exceptions?

Thank you for your help.

The Serializer is throwing an exception at

|2020-09-02T14:30:24.505 +00:00|[Device]|ERROR: in _write …imp#serializer.class.nut#1.0.0:164|
which is

without deep investigation on my side I suspect the serializer code is iterating through your payload blob and tripping over special character. Not sure what’s contained in your payload but the Squirrel serialization rules are here:

What I would do is to replace the Serializer library at the top of your code (#require "Serializer.class.nut:1.0.0") by manually coping the source code of the Serializer class from

so that you can add some debugging statements into the _write() function. This should help you figure out at which point of the blob things go wrong.

Hope that helps. Best,
– Terrence

I suspect this is a lifetime issue; I think the blob is being passed by reference, and when it wants to serialize it, the blob no longer exists (or has been resized to zero bytes long, which is what my code does - it’s a global in this case).

Try changing this:

                rm.send("0", dBlob, RM_IMPORTANCE_HIGH)

to

                rm.send("0", dBlob.readblob(), RM_IMPORTANCE_HIGH)

…this will create a new blob with the same content and pass that to the replaymessenger, hence decoupling it from the global at the time you want to queue it.

1 Like

Thanks @hugo! I will try this. I was going to base64 the blob to circumvent this but was disappointed in that solution due to the extra data usage. I will try .readblob(). Thanks for your great support.

After many frustrations, I finally got ReplayMessenger to temporarily work for 24 hours. I was so happy and it was working just great. This library was going to go a huge way to attracting me over to imp006 from Particle (I know it’s not really fair to judge hardware based on available software, but the fact that I have limited time and resources make that a reality).

The ReplayMessenger library has since started throwing endless crashes and errors, preventing all uploading of my packets while it is used.

The same code that was running stably for about 24 hours and loyally reuploading the missing packets after disconnect randomly started generating these errors, which would continuously reset the device:
2020-09-07T16:10:49.718 +00:00 [Status] Device connected
2020-09-07T16:10:50.397 +00:00 [Device] start
2020-09-07T16:10:50.398 +00:00 [Device] Initialized
2020-09-07T16:10:50.429 +00:00 [Device] 2,0
“CAT-M1”,-74,-108,93,-16
“CAT-M1”,“310410”,“LTE BAND 12”,5110
2020-09-07T16:10:51.167 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599494884
2020-09-07T16:10:51.599 +00:00 [Device] Received ack for message temp with id: 2160
2020-09-07T16:10:54.179 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458294
2020-09-07T16:10:54.450 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458354
2020-09-07T16:10:54.725 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458414
2020-09-07T16:10:56.331 +00:00 [Device] ERROR: Exception reading logger object address 0x5b809 with length 50022: CRC err: 0xca != 0x7f
2020-09-07T16:10:56.334 +00:00 [Device] ERROR: the index ‘p’ does not exist
2020-09-07T16:10:56.335 +00:00 [Device] ERROR: in _messageFromFlash …messenger.device.lib.nut#0.2.0:527
2020-09-07T16:10:56.340 +00:00 [Device] ERROR: from unknown …messenger.device.lib.nut#0.2.0:335
2020-09-07T16:11:26.789 +00:00 [Status] Device disconnected
2020-09-07T16:11:27.484 +00:00 [Status] Device connected
2020-09-07T16:11:28.157 +00:00 [Device] start
2020-09-07T16:11:28.158 +00:00 [Device] Initialized
2020-09-07T16:11:28.177 +00:00 [Device] 2,0
“CAT-M1”,-75,-110,55,-19
“CAT-M1”,“310410”,“LTE BAND 12”,5110
2020-09-07T16:11:31.160 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599494944
2020-09-07T16:11:31.867 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458294
2020-09-07T16:11:31.632 +00:00 [Device] Received ack for message temp with id: 2160
2020-09-07T16:11:32.181 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458354
2020-09-07T16:11:32.449 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458414
2020-09-07T16:11:34.079 +00:00 [Device] ERROR: Exception reading logger object address 0x5b809 with length 50022: CRC err: 0xca != 0x7f
2020-09-07T16:11:34.085 +00:00 [Device] ERROR: the index ‘p’ does not exist
2020-09-07T16:11:34.097 +00:00 [Device] ERROR: in _messageFromFlash …messenger.device.lib.nut#0.2.0:527
2020-09-07T16:11:34.098 +00:00 [Device] ERROR: from unknown …messenger.device.lib.nut#0.2.0:335
2020-09-07T16:12:04.229 +00:00 [Status] Device disconnected
2020-09-07T16:12:04.762 +00:00 [Status] Device connected
2020-09-07T16:12:05.437 +00:00 [Device] start
2020-09-07T16:12:05.438 +00:00 [Device] Initialized
2020-09-07T16:12:05.448 +00:00 [Device] 2,0
“CAT-M1”,-75,-110,92,-19
“CAT-M1”,“310410”,“LTE BAND 12”,5110
2020-09-07T16:12:05.926 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599495005
2020-09-07T16:12:06.318 +00:00 [Device] Received ack for message temp with id: 2160
2020-09-07T16:12:09.777 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458294
2020-09-07T16:12:10.060 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458354
2020-09-07T16:12:10.331 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458414
2020-09-07T16:12:11.920 +00:00 [Device] ERROR: Exception reading logger object address 0x5b809 with length 50022: CRC err: 0xca != 0x7f
2020-09-07T16:12:11.920 +00:00 [Device] ERROR: the index ‘p’ does not exist
2020-09-07T16:12:11.927 +00:00 [Device] ERROR: in _messageFromFlash …messenger.device.lib.nut#0.2.0:527
2020-09-07T16:12:11.928 +00:00 [Device] ERROR: from unknown …messenger.device.lib.nut#0.2.0:335
2020-09-07T16:12:42.071 +00:00 [Status] Device disconnected
2020-09-07T16:12:42.760 +00:00 [Status] Device connected
2020-09-07T16:12:43.447 +00:00 [Device] start
2020-09-07T16:12:43.447 +00:00 [Device] Initialized
2020-09-07T16:12:43.456 +00:00 [Device] 2,0
“CAT-M1”,-75,-110,87,-19
“CAT-M1”,“310410”,“LTE BAND 12”,5110
2020-09-07T16:12:46.350 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599495005
2020-09-07T16:12:46.728 +00:00 [Device] Received ack for message temp with id: 2160
2020-09-07T16:12:47.159 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458294
2020-09-07T16:12:47.507 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458354
2020-09-07T16:12:47.773 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458414
2020-09-07T16:12:49.369 +00:00 [Device] ERROR: Exception reading logger object address 0x5b809 with length 50022: CRC err: 0xca != 0x7f
2020-09-07T16:12:49.370 +00:00 [Device] ERROR: the index ‘p’ does not exist
2020-09-07T16:12:49.376 +00:00 [Device] ERROR: in _messageFromFlash …messenger.device.lib.nut#0.2.0:527
2020-09-07T16:12:49.377 +00:00 [Device] ERROR: from unknown …messenger.device.lib.nut#0.2.0:335
2020-09-07T16:13:19.429 +00:00 [Status] Device disconnected
2020-09-07T16:13:20.122 +00:00 [Status] Device connected
2020-09-07T16:13:20.808 +00:00 [Device] start
2020-09-07T16:13:20.817 +00:00 [Device] Initialized
2020-09-07T16:13:20.825 +00:00 [Device] 2,0
“CAT-M1”,-75,-110,75,-19
“CAT-M1”,“310410”,“LTE BAND 12”,5110
2020-09-07T16:13:28.045 +00:00 [Device] ERROR: Exception reading logger object address 0x5b809 with length 50022: CRC err: 0xca != 0x7f
2020-09-07T16:13:28.045 +00:00 [Device] ERROR: the index ‘p’ does not exist
2020-09-07T16:13:28.045 +00:00 [Device] ERROR: in _messageFromFlash …messenger.device.lib.nut#0.2.0:527
2020-09-07T16:13:28.045 +00:00 [Device] ERROR: from unknown …messenger.device.lib.nut#0.2.0:335
2020-09-07T16:13:28.378 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458294
2020-09-07T16:13:28.650 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458354
2020-09-07T16:13:28.924 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458414
2020-09-07T16:13:57.990 +00:00 [Status] Device disconnected
2020-09-07T16:13:58.606 +00:00 [Status] Device connected
2020-09-07T16:13:59.286 +00:00 [Device] start
2020-09-07T16:13:59.287 +00:00 [Device] Initialized
2020-09-07T16:13:59.297 +00:00 [Device] 2,0
“CAT-M1”,-75,-110,69,-19
“CAT-M1”,“310410”,“LTE BAND 12”,5110
2020-09-07T16:14:06.328 +00:00 [Device] ERROR: Exception reading logger object address 0x5b809 with length 50022: CRC err: 0xca != 0x7f
2020-09-07T16:14:06.333 +00:00 [Device] ERROR: the index ‘p’ does not exist
2020-09-07T16:14:06.336 +00:00 [Device] ERROR: in _messageFromFlash …messenger.device.lib.nut#0.2.0:527
2020-09-07T16:14:06.341 +00:00 [Device] ERROR: from unknown …messenger.device.lib.nut#0.2.0:335
2020-09-07T16:14:12.519 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458294
2020-09-07T16:14:12.792 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458354
2020-09-07T16:14:13.063 +00:00 [Agent] Packet from 600a0c2a69e60474 sent. Timestamp: 1599458414
2020-09-07T16:14:36.477 +00:00 [Status] Device disconnected
… and so on.

It spent a lot of paid bandwidth trying to reupload the same packets which never went through because of the error.

Any idea @terrencebarr which ReplayMessenger was hitting these seemingly-SPI-flash-related memory access issues? (You have been tremendously helpful so far and for that I thank you)

What is the write durability of the SPI flash? If there’s any sort of limitation on frequent reading and writing, it’s a no-go for me. Each 60 seconds my code needs to persist a 150-byte message to memory and free it only upon receiving ACK of receipt, precisely what ReplayMessager is supposed to do and was doing well.

My full device code was as follows:

#require "Serializer.class.nut:1.0.0"
#require "SPIFlashLogger.device.lib.nut:2.2.0"
#require "ConnectionManager.lib.nut:3.1.1"
#require "Messenger.lib.nut:0.2.0"
#require "ReplayMessenger.device.lib.nut:0.2.0"

cm <- ConnectionManager({
          "stayConnected": true,
          "connectTimeout": 600.0,
          "retryOnTimeout" : true
        });
imp.setsendbuffersize(8096);

local sfl = SPIFlashLogger()
rm  <- ReplayMessenger(sfl, cm, {"resendLimit": 50});
rm.confirmResend(function(message) {
    // Resend all messages until they are acknowledged
    return true;
});
function onAck(message, ackData) {
    local payload = message.payload;
    local id = payload.id;
    local name = payload.name;
    local msgData = payload.data;
    local metadata = message.metadata;
    server.log("Received ack for message " + name + " with id: " + id);
}
function onUnknownMsg(payload, customAck) {
    local data = payload.data;
    server.error("Received unknown message: " + payload.name);
    if (typeof data == "table" || typeof data == "array") {
        server.log(http.jsonencode(data));
    } else {
        server.log(data);
    }
}
rm.onAck(onAck.bindenv(this));
rm.defaultOn(onUnknownMsg.bindenv(this));
        
imp.setpowersave(false);
uart <- hardware.uartXEFGH;

lastPacket <- 0 

server.log("start");

uart.setrxfifosize(150*20)

enum MSG_NAMES {
    TEMP = "temp" //This is copied and pasted from an example due to frustations I was having getting the messages to connect to the handlers on agent-side; in reality I'm sending far more than just some "temperature" data
}

function processpacket(dBlob) {
    dBlob.seek(0, 'b')
    local tHeader = dBlob.readn(105) 
    if(tHeader > 1 && tHeader != 9999) {
        if(lastPacket == 0 || tHeader != lastPacket) {
                lastPacket = tHeader
                dBlob.seek(0, 'b')
                rm.send(MSG_NAMES.TEMP, dBlob.readblob(150), RM_IMPORTANCE_HIGH)
                //agent.send("RM_DATA", (dBlob.readblob(150)))
            }
    }
}

imp.net.getcellinfo(function(cellInfo) {
    server.log(cellInfo)
});
 
server.log("Initialized")
buffer <- blob();
rxtimeout <- null;

function rx() {
    buffer.writeblob(uart.readblob());
    if (rxtimeout != null) imp.cancelwakeup(rxtimeout);
    rxtimeout = imp.wakeup(0.050, sendbuffer);
} 

function sendbuffer() {
    rxtimeout = null;
    if (buffer.len() >= 150) processpacket(buffer);
    buffer.resize(0)
} 

uart.configure(115200, 8, PARITY_NONE, 1, NO_TX | NO_CTSRTS, rx);

Hi Paul,

I will look into this shortly. But two quick comments:

Pretty much all flash has limitations on the number of lifetime write operations, typically on the order of 100,000 operations. Software deals with that using a concept of ‘wear-leveling’ where write operations are spread out over the entire flash (rather than repeatedly re-writing the same locations again and again). SPIFlashLogger does that as well which extends the typical life of the flash to many years, depending on the usage patterns by the application. I think I have yet to see a customer hit the limit on flash write operations - I’d be very surprised if you hit that with your testing already.

Which leaves two other reasons for the behavior your seeing:

  • A hardware issue with the board or the flash (despite multiple levels of testing, hardware flakiness can never be completely ruled out)
  • A software problem in the SPIFlashLogger or underlying libraries/code

I need to take a deeper look at this tomorrow or Friday.

Best,
– Terrence

Thank you @terrencebarr! I hope there is some sort of “gotcha” that you can easily find, and that it is not an elusive low-level issue with the SPI flash memory.

Paul,

So it looks like one object in the flash got corrupted, and the ReplayMessenger keeps tripping over that. This causes a device restart, the application re-runs, tries again, etc. (note that repeated restarts due to application crashes is something the application can handle intelligently in order not to burn through a lot of battery or data volume unnecessarily - happy to explain more).

Why did the flash object get corrupted? It could be an actual fault in the flash (possible, but unlikely), or - more likely - the application was interrupted while it was writing the object to flash, e.g. due to a power loss or reset. The SPIFlashLogger object storage is not atomic and happens in two steps, first the metadata, then the object itself. If the 2nd step is interrupted then it will leave a corrupt object behind.

To resolve the situation, you can add code into your application that combs through the stored objects and clears out any corrupted objects before starting ReplayMessenger. The code would be something like this:

function validateMessages(onDone) {
    local onData = function(payload, address, next) {
        try {
            local id = payload[RM_COMPRESSED_MSG_PAYLOAD]["id"];
        } catch (err) {
            logger.error("Corrupted message detected: " + err);
            spiFL.erase(address);
        }
 
        next();
    }.bindenv(this);
 
    spiFL.read(onData, onDone, -1);
}

Note this is not ideal if the number of stored objects is large it will take significant time to iterate over all of them. There are smarter ways to clean the store that can be considered, but as a quick resolution this should work.

I will file an enhancement request for ReplayMessenger, i.e. to more gracefully handle corrupt objects in flash.

Let me know if that helps!

Best,
– Terrence

1 Like

Filed: https://github.com/electricimp/Messenger_ReplayMessenger/issues/13

1 Like

Thank you @terrencebarr

This topic was automatically closed after 60 days. New replies are no longer allowed.