Help please: Memory leak in Hannah/lis3dh interrupt handler

I am having trouble finding and fixing a serious memory leak. I have optimised the code as much as possible and have narrowed the leak down to the lis3dh handler on the Hannah board. A couple of things I can now state categorically:
A: The memory leak is directly related to the IRQ handler and proportional to the rate of IRQs.
B: Full execution of the ISR takes between 2100 µs and 2250 µs.
C: The ISR is definitely never called while it is already running, so there is no re-entry or recursion whatsoever…
D: The only real ISR code being executed is lines 24-28 and 40-55.
E: The logic of the ISR works flawlessly: all buttons and temperature alarms are recognised, 4D/6D orientation is correctly identified, and movement and at-rest states are detected.
F: If I disable lis3dh interrupts, no memory leakage occurs.

Example 1: ODR set to 25 Hz, servicing 225 IRQs per minute, 53K RAM gone in about 4 min

2014-02-19 10:16:03 UTC+2: [Device] Code => Hannah v0.46
2014-02-19 10:16:03 UTC+2: [Device] Mode=G3
2014-02-19 10:16:05 UTC+2: [Device] Mem=52900, Irq=6
2014-02-19 10:16:05 UTC+2: [Device] Z:Up
2014-02-19 10:16:05 UTC+2: [Device] tmp: > TempHI
2014-02-19 10:16:34 UTC+2: [Device] Mem=44652, Irq=120
2014-02-19 10:17:03 UTC+2: [Device] Mem=42028, Irq=232
2014-02-19 10:17:33 UTC+2: [Device] Mem=34000, Irq=344
2014-02-19 10:18:02 UTC+2: [Device] Mem=30456, Irq=456
2014-02-19 10:18:31 UTC+2: [Device] Mem=21364, Irq=568
2014-02-19 10:19:00 UTC+2: [Device] Mem=17304, Irq=680
2014-02-19 10:19:29 UTC+2: [Device] Mem=13176, Irq=792
2014-02-19 10:19:58 UTC+2: [Device] Mem=8888, Irq=905

Example 2: ODR set to 10 Hz, servicing 90 IRQs per minute, 53K RAM gone in about 10 min

2014-02-19 10:20:30 UTC+2: [Device] Code => Hannah v0.46
2014-02-19 10:20:30 UTC+2: [Device] Mode=G3
2014-02-19 10:20:31 UTC+2: [Device] Mem=52900, Irq=4
2014-02-19 10:20:32 UTC+2: [Device] Z:Up
2014-02-19 10:20:32 UTC+2: [Device] tmp: > TempHI
2014-02-19 10:21:01 UTC+2: [Device] Mem=49160, Irq=48
2014-02-19 10:21:30 UTC+2: [Device] Mem=46112, Irq=94
2014-02-19 10:21:59 UTC+2: [Device] Mem=44004, Irq=138
2014-02-19 10:22:29 UTC+2: [Device] Mem=42068, Irq=184
2014-02-19 10:22:58 UTC+2: [Device] Mem=37996, Irq=229
2014-02-19 10:23:27 UTC+2: [Device] Mem=35960, Irq=274
2014-02-19 10:23:56 UTC+2: [Device] Mem=33864, Irq=319

Code is below:

Any and all help and suggestions are appreciated. I am totally out of ideas, other than maybe a bug outside my code?

Andre

@beardedinventor Apologies, should have been posted to Software category, can you please move?

    function sx1509IrqHandler() {
        // ProcessTimerStart(0, 1);
        local RegIrqS = 0, lisInt1Src = 0, lisFifoSrc = 0, FifoLevel = 0, lisStatReg2 = 0;
        local bytes = 0, status = 0, value = 0, temp = 0, i = 0, s = "";

        if (IrqFlag & 0x8000)                   // Check for IRQ handler re-entry....
            IrqFlag = IrqFlag | 0x4000;         // Yes, flag as Re-entry !!
        else
            IrqFlag = IrqFlag | 0x8000;         // mark as in IrqHandler

        RegIrqS = i2cPort.read(i2c_ioexp, "\x19", 1)[0];    // READ = sx1509 RegIrqSrcA
        if (!(RegIrqS & 0x1F)) {
            IrqFlag = IrqFlag & 0x7FFF;         // mark as DONE in IrqHandler
            // ProcessTimerShow(0);
            return;                             // If No valid interrupts exist, return..
        }
        CountIrq++;

        // Unexpected IRQs !!! Clear 3MSB
        // if (RegIrqS & 0xE0) { i2cPort.write(i2c_ioexp, "\x19\xE0"); IrqFlag = IrqFlag | 0x2000; }

        // ---------------------------- Button 1 & 2, Hall switch ----------------------
        if (RegIrqS & 0x01) { Button1 = TRUE; sx1509SwitchSSR(LedR, LastPot); sx1509ClearIrq(0); IrqFlag = IrqFlag | 0x0040; }
        if (RegIrqS & 0x02) { Button2 = TRUE; sx1509SwitchSSR(LedG, LastPot); sx1509ClearIrq(1); IrqFlag = IrqFlag | 0x0080; }
        if (RegIrqS & 0x04) { HallSW = TRUE; sx1509SwitchSSR(LedB, LastPot); sx1509ClearIrq(2); IrqFlag = IrqFlag | 0x0100; }

        //------------------------------------ lis3dh ----------------------------------
        if (RegIrqS & 0x08) {
            lisStatReg2 = i2cPort.read(i2c_accel, "\x27", 1)[0];        // StatReg2: XYZOR-Z-Y-X-ZYXDA-Z-Y-X
            if (lisStatReg2) {                                          // ZYX DataAvail or OverRun Int ?
                lis3dhZYXBuf = i2cPort.read(i2c_accel, "\xA8", 6);      // Multibyte read XYZ regs into buffer
                // IrqFlag = IrqFlag | 0x1000;
            }
            lisFifoSrc = i2cPort.read(i2c_accel, "\x2F", 1)[0];         // FIFO: WTM - ORun - EMTY - Ths4->0
            if (lisFifoSrc & 0xC0) {                                    // Fifo Watermark and/or Overrun Int?
                FifoLevel = lisFifoSrc & 0x1F;
                bytes = (FifoLevel << 2) + (FifoLevel << 1);            // Multiply * 6 bytes/reading in FIFO (x4 + x2)
                lis3dhFifoBuf = i2cPort.read(i2c_accel, "\xA8", bytes); // Read all data in the Fifo buffer
                s = "\x2E\x00";
                i2cPort.write(i2c_accel, s);                            // Reset to Bypass mode, TR=0 WM=0
                s[1] = lis3dhRegBuf[lisFifoCtrl];                       // then reset to correct Fifo mode...
                i2cPort.write(i2c_accel, s);                            // Restore to original mode
                // IrqFlag = IrqFlag | 0x0800;
            }
            // Logic to handle ModeXX conditions
            lisInt1Src = i2cPort.read(i2c_accel, "\x31", 1)[0];         // Read and save status, reset IA bit (ALSO screw up LIR bit!!!)
            switch (lisInt1Src) {                                       // Deal with 4D/6D movement/direction/position
                case 0x01: break;                                       // Ignore these values
                case 0x02: break;
                case 0x04: break;
                case 0x08: break;
                case 0x10: break;
                case 0x20: break;
                case 0x41: lisCurOrient = 2; break;                     // Deal with 4D/6D Position
                case 0x42: lisCurOrient = 3; break;
                case 0x44: lisCurOrient = 4; break;
                case 0x48: lisCurOrient = 5; break;
                case 0x50: lisCurOrient = 6; break;
                case 0x60: lisCurOrient = 7; break;
                case 0x55: lisCurOrient = 1; break;                     // At rest
                default:   lisCurOrient = 8; break;                     // Movement !!!
            }
            sx1509ClearIrq(3);
        }

        // ----------------------------------- tmp112 ----------------------------------
        if (RegIrqS & 0x10) {                                           // tmp 112 Thermometer
            status = ReadWord(i2c_temp, tmp112RegConf);                 // Read config register
            value = ReadWord(i2c_temp, tmp112RegTemp);                  // Read Temp register
            if (value > tmp112TrigHi) IrqFlag = IrqFlag | 0x0400;
            if (value < tmp112TrigLo) IrqFlag = IrqFlag | 0x0200;
            sx1509ClearIrq(4);
        }

        DataChanged = TRUE;
        // ProcessTimerShow(1);
        IrqFlag = IrqFlag & 0x7FFF;                                     // mark as DONE in IrqHandler
    }

Update on the above.

I have optimised a bit further, replacing the 3 nested levels of general-purpose sub-functions from sx1509ClearIrq() and below with a single hardcoded routine to clear interrupts.

This made the ISR another 300 µs faster but made absolutely no difference to the memory leak.
There is also no difference in the time taken to run out of memory: it still happens in ~10 min using an ODR of 10 Hz.

A luta continua…

Got the whole code? Suspicion is that you might be re-queueing a wakeup or something else which will grow unbounded…

@hugo
Possible… but unlikely. I have about 4 wakeups running: 2 at 0.5 sec and 2 at 17-30 seconds.
The ISR takes <2 ms to complete, so at 45 IRQs every 30 sec it should not take more than 100 ms in total?

The exact same code, but with the lis3dh configured for wakeup and at-rest IRQs (i.e. an IRQ only now and then), runs for extended periods with no memory leak.

Happy to PM you the code, as I would rather not post it publicly. Shall I do so?

Apologies for the multiple postings. The forum came up with a “Forum Error” message and logo. I only posted once; it seems the auto-saved draft was posted as well…

@ammaree - I deleted all the extra posts :slight_smile:

Seems like the issue was multiple wakeups being queued inadvertently.

Typically what happens is this:

```
function regulartask() {
    // do something

    // re-queue
    imp.wakeup(1, regulartask);
}

// start regular task
regulartask();

// some other event
agent.on("data", function(v) {
    // do something

    // run regular task
    regulartask();
});
```

… so when the code runs, regulartask is being run once per second. Then you get a message from the agent, and regulartask is being run twice per second. Every time you get an agent message, another regulartask is being queued… repeat until system is out of memory or completely waterlogged in callbacks.
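
One way to avoid the pile-up is to remember the handle that `imp.wakeup()` returns and cancel any pending wakeup with `imp.cancelwakeup()` before queueing a new one. This is only a minimal sketch of that idea, not the code from the thread: the global `regularTimer` is a hypothetical name introduced here for illustration, and the placeholder comments stand in for the real work.

```squirrel
// Handle of the currently pending wakeup, or null when nothing is queued.
// (regularTimer is a hypothetical name used only for this sketch.)
regularTimer <- null;

function regulartask() {
    // If a wakeup is still pending (e.g. we were called directly from an
    // event handler), cancel it so only one timer is ever queued.
    if (::regularTimer != null) {
        imp.cancelwakeup(::regularTimer);
        ::regularTimer = null;
    }

    // do something

    // re-queue, keeping the handle so the next call can cancel it
    ::regularTimer = imp.wakeup(1, function() {
        // This timer has fired, so its handle is no longer pending
        ::regularTimer = null;
        regulartask();
    });
}

// start regular task
regulartask();

// some other event
agent.on("data", function(v) {
    // do something

    // run regular task now; the pending wakeup is replaced, not duplicated
    regulartask();
});
```

With this shape, calling `regulartask()` from any number of event handlers restarts the one-second cycle instead of adding another one, so the number of queued callbacks should stay at one.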

@Hugo, thanks…
As embarrassingly simple as the solution was, it was worth it, just so I know the really complicated stuff is working well underneath.

I owe you some good SA red wine…

All gifts of alcohol gladly accepted and usually shared during office celebrations :slight_smile:

Plenty of people run into this, so I posted the example in case anyone else is searching for a solution in the future!