Stack Flattening

coverdriven · July 11, 2017, 3:40am

I’m completing some refactoring on my device code and struck a familiar problem.

As I approach the stack limit on the imp, normally routine api calls such as format() or blob() fail. I’ll get runtime errors such as “the index ‘format’ does not exist (line xxxx)”.

My usual solution is to step through the calling hierarchy as find convenient place to replace f() with imp.wakeup(0,f). This flattens the stack enough to continue.

I’m also curious about other options that I have.

Reading the Squirrel documentation about collapsing the stack using tail recursion. Could this be clarified further? In which of these scenarios will the stack be collapsed?

function rowdy(callback) {
  ...
  if (finished)
    return callback(true)
  ...
  rowdy(callback)
}
-------------------------------
function rowdy(callback) {
  ...
  yates(callback)
}
-------------------------------
function rowdy(callback) {
  ...
  return yates(callback)
}
-----------------------------------
function charlton(callback) {
  if (finished)
    return heston(callback)
  ...
}

Also, I’m trying to understand what local variables Squirrel holds onto when I create a closure. Does a closure hold a reference to ALL local variables available to its parent, or only the ones that are used in the closure? The impact of lots of imp.wakeups is brief periods of elevated memory use. Are there steps I could take to reduce the number of things that persist in memory until the closure actually runs?

coverdriven · July 11, 2017, 10:29am

As an additional comment, it would be great if Electric Imp could offer a debug OS that could be loaded into dev devices selectively. The debug OS would be identical to the current release, but would make extra tools available like a profiler, stack level monitoring, traffic logging etc.

peter · July 11, 2017, 3:05pm

If you can supply Squirrel which reproduces that, I’d be very interested to see it.

Only the ones used in the closure.

Peter

peter · July 11, 2017, 3:09pm

Do you have a link to that documentation? I can check, but I don’t think Squirrel can optimise any of your quoted cases. Even in the first one, the name of a function is not inherently in scope within that function, and so an assignment to the root table can mean that the call to “rowdy” at the end is to a completely different function.

Peter

coverdriven · July 11, 2017, 10:25pm

Thanks Peter,

interested in your comment on this: http://squirrel-lang.org/forums/default.aspx?g=posts&t=4583

I should have been more general and said “tail calling” rather than “tail recursion”

coverdriven · July 11, 2017, 11:38pm

Peter, I’ve posted some sample code in the support channel (#4325)

peter · July 12, 2017, 12:51pm

Ah OK, it does go through the whole call sequence, just on top of the same stack frame. As your link implies, you can detect tail-call optimisation yourself by throwing an exception and checking the stack backtrace.

And the answers are that only these calls are tail-optimised: rowdy1’s call to callback; rowdy3’s call to yates, and charlton’s call to heston.

There doesn’t seem to be any intrinsic reason why rowdy1’s recursive call and rowdy2’s call to yates could not be optimised too, but Squirrel does not offer that optimisation.

Peter

coverdriven · July 13, 2017, 2:14am

Thanks for clarification. I’ll be mindful of preceding tail calls with return, even if it seems redundant.

Looking at this in retrospect, device Squirrel code capacity is now ~4 times what is was initially (2x capacity and compression improvements) and RAM on newer devices is ~10 times greater. Yet, stack limit is lower than the $0.50 PIC I have on my boards.

peter · July 13, 2017, 11:05am

Just to clarify, there ought to be no limit on Squirrel stack size, except that imposed by overall memory (and by fragmentation of the heap). Any circumstances in which a large Squirrel stack causes a problem, other than when completely out-of-memory, constitute a bug, and I’m still eager to hear from anyone who can reliably reproduce those bugs.

There is, however, a separate limit on the C stack size – a stack whose usage only increases when the entire Squirrel interpreter recurses. There are a small number of places where this happens: in agent Squirrel only for metamethods, and in device Squirrel additionally for these APIs: closure.[a][p]call(), array.apply(), array.map(), array.reduce(), array.filter(), and array.sort(). In most or all cases, these can be avoided by open-coding the operations in pure Squirrel (with, in the case of metamethods, admittedly a slight decline in readability).

The stack-flattening techniques discussed on this thread only affect the Squirrel stack, not the C stack, and do not help with the limit on recursive interpreter invocations.

At some point we’d like to clean up agent Squirrel metamethods so that even they do not cause the interpreter to recurse, as that would remove some oddities from our agent-scheduler – but there isn’t a firm plan for doing so.

Peter

peter · July 14, 2017, 3:23pm

For instance, here’s something that’s very like f0.acall(a):

function acall2(f0,a) {
  local f = f0.bindenv(a[0]);
  switch (a.len()) {
  case 1: return f();
  case 2: return f(a[0]);
  case 3: return f(a[0],a[1]);
  ... as many of these as you need ...
  }
  throw "too many parameters for acall2";
}

…although it’s not quite the same in the case where “f0” has already been bindenv’d.

Peter

deldrid1 · July 16, 2017, 8:26pm

Thanks for posting this @peter! A support ticket was about to be on its way until I saw this…

@smittytone might want to make a dev center article about this limitation