Hang in server.configure()

I have been working on a way to monitor and control my pool. As of yesterday, I finally got things set up to read data from the pool control and output it to cosm. It worked for a day or two straight with no interruptions. It is possible I may have changed some code, but I don’t remember. Anyway, my program would run, collect some data, then call server.sleepfor(120).


Last night I noticed my data was stale, and sure enough, my imp was hung (it hung at some point while I was away, so it wasn’t due to a direct change of mine). I reset the power and it worked for about 10 more hours, then it hung again. OK, time to reset again. This time, it seems to always hang. I traced the hang to server.configure. It doesn’t seem to depend on the number of outputs I have configured (I have 8, for the 8 things I want to keep track of in graphs on cosm). Setting it to [] still hangs…

My code is something like this:

…setup output objects…
server.log("setting up");
server.configure("Pool", [], []);
…about 10 functions, handling the reading of data from the RS232 connection, checking checksums, etc…
do {
// read packet
// interpret packets
// check if enough is read
// If we have looped too many times, break out, something might be busted
} while (!enoughDataRead);

server.sleepfor(120);

If I add a server.sleepfor(12) right after the configure call, it does not hang. Maybe I hit a limit on code size? My code is 10k of text (including commented-out code, etc). The code is pre-compiled on the server, right? So comments don’t count, right?

This is pretty difficult to debug, since there is no output telling me what is wrong. Additionally, it worked for almost 2 straight days (I got some cool graphs about my pool temperature, air temperature, and when the pool runs at what speed, etc)…

Any ideas?

Thanks!


Of course, when it is hung in server.configure(), my imp is blinking the standard slow green blink and I cannot upload new firmware without a hard reboot…

Whoops, I forgot to mention the server.log("done configure") that comes right after…


Basically, I get the log before configure, but not the one after it. Either way, I am bringing the software up a little at a time to see when it starts failing…

So, I created a new program, and all is working now (same code, just pasted in a little at a time). Going back to the old one still hangs. The differences are very minor…

This could be code size, yes. Comments don’t appear in the bytecode so they shouldn’t make any difference.


This will be less of an issue when release-5 comes out, as most of the code is executed directly from flash.

If you can PM me the code, I can try to replicate it here. I can also push release-5 to your imps, as long as you’ll let us know if you see any regressions; we’re testing it in-house now.

OK, wacky. My new program is now doing the same thing. Note that it was working all day, until I disconnected one of the digital pins (when the imp is off, the pin floats, and I need it to be at 0 volts). I didn’t change the code, though. I reconnected the pin, and the code is exactly as it was when it ran fine all day, but now it is busted. It still hangs in configure…

OK, interesting. One time it did complete, and I got this error:


ERROR: stack overflow, cannot resize stack while in  a metamethod

Can you give me the mac address of the card, and try the other one you have? We may be seeing an issue with certain cards not updating. You are showing as having two cards, one of which is on release-2 and one on release-3.

I sent one of the cards back (Attn: Nong), as requested by Nong. The other one ends in 2e2, which I believe you mentioned in your direct e-mail. Is there a way for me to check the release?

OK, I solved it (thanks Hugo). I did have an infinite loop that I hadn’t noticed. For some reason, my RS-232<->RS-485 board (which I was reading data from) died, so hardware.uart12.read() always returned -1 and I was looping forever.


Being more fault tolerant fixes that. I am still confused about how the server.log before “configure” printed fine, but the one afterwards did not. I guess a tight loop makes it impossible for the logs to be sent upstream.

I am rewriting my code to be more tolerant of this sort of thing while I figure out why my RS485 board died.
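
For anyone who hits the same thing, a bounded version of the read loop might look like the sketch below. It is not the original pool code: MAX_POLLS, PACKET_LEN and readPacket() are made-up names, the values are placeholders, and it assumes hardware.uart12 has already been configured elsewhere in the program. The only behaviour taken from this thread is that hardware.uart12.read() returns -1 when there is nothing to read.

```squirrel
// Sketch only: MAX_POLLS, PACKET_LEN and readPacket() are hypothetical names.
const MAX_POLLS = 2000;   // assumed cap on consecutive empty reads
const PACKET_LEN = 16;    // assumed packet length; the thread does not give one

// Collect PACKET_LEN bytes, giving up after MAX_POLLS consecutive empty reads.
// readByte is any function that returns a byte value, or -1 when no data is
// waiting (which is what hardware.uart12.read() was returning in this thread).
function readPacket(readByte) {
    local bytes = [];
    local idle = 0;
    while (bytes.len() < PACKET_LEN && idle < MAX_POLLS) {
        local b = readByte();
        if (b < 0) {
            idle++;            // nothing to read: count it instead of spinning forever
        } else {
            bytes.append(b);
            idle = 0;          // got data, so restart the empty-read count
        }
    }
    // null tells the caller that the read gave up before a full packet arrived
    return (bytes.len() == PACKET_LEN) ? bytes : null;
}

// Assumes hardware.uart12 was configured earlier in the program.
local packet = readPacket(function() { return hardware.uart12.read(); });
if (packet == null) {
    server.log("no data from pool controller, skipping this cycle");
}
server.sleepfor(120);
```

The cap of 2000 empty reads is arbitrary and would need tuning against the baud rate. The point is only that a dead link ends the loop and lets the imp reach server.sleepfor() instead of spinning forever.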