How to best debug a memory leak

What is the general strategy for detecting a memory leak in Squirrel? I tried traversing the table returned by getroottable() and summing the size of every object that takes up memory. However, while imp.getmemoryfree() returns steadily lower values (eventually ending in an "out of memory" restart), the sum of all object sizes remains the same…

What else can I try?
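For anyone following along, here is a minimal sketch of the kind of check described above, not the exact code from this thread. The names countReachable and logMemory are made up for illustration, and it counts values rather than measuring bytes. It only walks tables and arrays; as far as I know, variables captured by closures cannot be enumerated from script, so anything held only by a closure will not show up in this count.

```squirrel
// Hypothetical helper: counts values reachable from a container.
// Only tables and arrays are traversed; every other value counts as 1.
// The visited table guards against cycles and shared references.
function countReachable(obj, visited) {
    local t = typeof obj;
    if (t != "table" && t != "array") return 1;
    if (obj in visited) return 0;
    visited[obj] <- true;

    local n = 1;
    foreach (key, value in obj) {
        n += countReachable(value, visited);
    }
    return n;
}

// Hypothetical logger: reports free memory next to the reachable count
// once a minute (the interval is an arbitrary choice).
function logMemory() {
    local reachable = countReachable(getroottable(), {});
    server.log(format("free: %d bytes, reachable values: %d", imp.getmemoryfree(), reachable));
    imp.wakeup(60, logMemory);
}

logMemory();
```

If free memory keeps falling while the reachable count stays flat, the leaked objects are probably held somewhere this walk cannot see, such as timers or closure scopes.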

Are you using Promises or any kind of local variable function scoping with an in-line declared function?

The code snippet below is several hacks stacked on top of each other. It uses Message Manager (with all of its retries, acknowledgements, etc.) as a protocol layer on top of the new UDP APIs and some code that I’ve developed. We wanted a promise-based master/slave architecture for peer-to-peer local network imp interactions.

We faced a similar challenge tracking down a memory leak and went down the same path of using a recursive getSize() method to isolate it. When that failed (i.e. nothing in the Squirrel root table was growing unbounded), we began looking through the commit history and, through some experimentation, were able to isolate the leak to this particular function. I could articulate my theories here, but I’m sure @peter can provide a more useful response as to why this code caused the problems it did. In any event, the fix was a simple one-liner that clears my reference to the imp.wakeup timer handle; the memory leak disappeared after this.

function udpTx(key, data=null){
    return Promise(function(resolve, reject){
        local ackOnNextTickTimer;

        g_UDP_MM.send(
            key,
            data,
            {

                "onAck": function(msg){
                    // Message Manager calls acknowledgement handlers before reply handlers
                    // We delay our resolve until the next tick so that onReply can have a chance to cancel our timer
                    ackOnNextTickTimer = imp.wakeup(0.0, function(){
                        ackOnNextTickTimer = null; //TODO: THIS IS VERY IMPORTANT - without it, we will crash out of memory because the garbage collector will hold onto ackOnNextTickTimer and the anonymous function with the resolve inside...
                        resolve({
                            "responseType": "ACK",
                            "msg": msg
                        })

                    })
                },

                "onReply": function(msg, reply){
                    imp.cancelwakeup(ackOnNextTickTimer)
                    ackOnNextTickTimer = null; //TODO: THIS IS VERY IMPORTANT - without it, we will crash out of memory because the garbage collector will hold onto ackOnNextTickTimer and the anonymous function with the resolve inside...

                    resolve({
                        "responseType": "REPLY",
                        "msg": msg,
                        "reply": reply
                    })
                },

                "onFail": function(msg, error, retry){
                    // server.error("ON FAIL")
                    // server.error(error)
                    // PrettyPrinter.print(msg)

                    ackOnNextTickTimer = null; //TODO: THIS IS VERY IMPORTANT - without it, we will crash out of memory because the garbage collector will hold onto ackOnNextTickTimer and the anonymous function with the resolve inside...

                    //if we register the onFail, then the onus is on us to actually call retry, manage number of retries, etc...
                    if(msg.tries < g_UDP_MM._maxAutoRetries){
                        return retry()
                    }

                    // MessageManager handling of timeouts / failures is a bit goofy - Let's provide something useful upstream
                    if(error == "User called fail"){
                        error = "TIMEOUT"
                    }

                    reject({
                        "responseType": "FAIL",
                        "msg": msg,
                        "error": error,
                        "retry": retry
                    })
                }
            },
            null,
            {"ackOnNextTickTimer": null}
        )
    }.bindenv(this))
}

On this one we were not using Promises, but I’m very interested in understanding the root cause of this issue as well. We did extensive work with Promises in more or less the same way in another project, where we have a memory leak that we never managed to track down. As it took several days to deplete the memory, we finally decided to live with it and make the code as stateless as possible so that restarts didn’t matter too much.
As for the particular issue that led to this post, we tracked down the offending code by accident yesterday. It is still not at all clear why we lose memory there, so it’s with @Peter for analysis as well (especially since the leak disappeared when moving an imp to 41.22…)

I’d be very interested in your work on using Promises in the protocol layer. We made some attempts at that over a serial connection but never reached a well-working prototype. At that time, though, Message Manager wasn’t around yet to start from.