Hardware.sampler multiple channel mixup

iceled · January 13, 2020, 5:09pm

I’m sampling two analog pins with an imp001 and don’t understand the results I’m getting from the interleaved data. Every so often, the two channel values read from the buffer returned by the callback swap position. When the device starts, pin 1 readings are always in the first pair of bytes and pin 2 readings in the second pair. Then, after maybe a few hours, pin 1 data appears in the second byte pair and pin 2 data in the first. A few hours later it may return to the correct sequence, and so on.

Here’s the code that does it:

//sampler buffers
buffer1 <- blob(2000);  //2000 bytes / (2 channels x 2 bytes x 50Hz) = 10 sec per buffer
buffer2 <- blob(2000);  //double buffer

//sampler callback every 10 sec...
//(defined before configure() so the name exists when it is passed in)
function samplesReady(buffer, length) {
    local ava = 0.0;
    local avv = 0.0;
    if (length > 0) {
        length /= 4;     //bytes -> sample pairs (2 x 16bit samples per pair)
        for (local i = 0; i < length; i++) {
            ava += buffer.readn('w');   //pin 1 sample (unsigned 16bit)
            avv += buffer.readn('w');   //pin 2 sample (unsigned 16bit)
        }
        ava /= length;     //take avg. amps
        avv /= length;     //take avg. volts

        server.log("check: A=" + ava + " V=" + avv);
    } else {
        //a zero length signals a sampler overrun
        server.error("Overrun");
    }
}

// Configure analogue pins 1 & 2 to be sampled 50 times per second
hardware.sampler.configure([hardware.pin1, hardware.pin2], 50, [buffer1, buffer2], samplesReady);

// Start sampling
hardware.sampler.start();

Testing with two different fixed voltage levels at the inputs, it’s very obvious that the order is randomly swapping. Am I retrieving the interleaved data correctly by doing two sequential buffer.readn('w') calls?

hugo · January 13, 2020, 8:30pm

Yes, you should just be able to read sequentially like that. The interleaving order shouldn’t be changing. How are you generating your fixed voltages? (I just want to check that there’s no possibility of charge issues when the ADC MUX switches.)

iceled · January 13, 2020, 9:46pm

Oh, that’s very odd then. I originally had a hall-effect current transducer with buffered output on pin 1 and a potential divider of 39K to a solar panel / 2K to 0V for voltage monitoring on pin 2, so the source impedance was reasonably low. This worked great but the channels kept swapping so I hooked up a couple of 10K pots across 3V3 as a sanity check and got the same odd result.

I have tried different sample rates and would say that it flips more frequently at lower rates but it’s hard to be certain. It seemed to settle down yesterday so I put it all back but got this set of readings today:

It was stable last night then started flipping around 8:30AM
The software does some additional scaling and offsetting to get calibrated readings but my sanity check was done to eliminate everything but the basic sampler output.

hugo · January 13, 2020, 10:00pm

Can you put (eg) a 100nF cap on the input pins?

Is that picture from the pots or from the solar panel etc?

iceled · January 14, 2020, 10:12am

Can try caps but I might even have bypassed inputs with a few nF already (can’t remember but I usually do to bypass RF).

The screenshot shows the solar data, just to give an idea of the frequency of flipping.

Will try to set up another imp to bench test as this one is quite hard to access outdoors and we’ve got 60mph winds & rain today.

Just another data point - the log below shows raw data values for A and V (as per above code snippet) when it flipped soon after a device reconnection about half an hour ago:

|2020-01-14T10:23:28.972 +00:00|[Device]|check: A=34441.2 V=53339.4
|2020-01-14T10:23:38.946 +00:00|[Device]|check: A=34277.1 V=53376.2
|2020-01-14T10:23:48.945 +00:00|[Device]|check: A=34421.1 V=53354.7
|2020-01-14T10:23:59.100 +00:00|[Device]|check: A=34289.3 V=53384.5
|2020-01-14T10:25:49.644 +00:00|[Status]|Device disconnected
|2020-01-14T10:26:01.408 +00:00|[Device]|check: A=34537.3 V=53388.2
|2020-01-14T10:26:01.409 +00:00|[Device]|ERROR: Overrun
|2020-01-14T10:26:11.105 +00:00|[Device]|check: A=53053.5 V=34266.9
|2020-01-14T10:26:21.157 +00:00|[Device]|check: A=53402.1 V=34461.8
|2020-01-14T10:26:31.125 +00:00|[Device]|check: A=53439.9 V=34253.4
|2020-01-14T10:26:41.099 +00:00|[Device]|check: A=53408.2 V=34433.6

This is from live data so the values are not steady but I think it clearly shows the two channels swapping. Maybe the bumpy restart and subsequent overrun has something to do with it?

Also, there seems to be a correlation between the frequency of ‘flipping’ and the weather - with heavy rain possibly interrupting the WiFi signal and causing a temporary disconnect.

hugo · January 14, 2020, 11:32am

You’re not queuing up the (now emptied) buffer in your ready callback, which means after the two buffers have been consumed, you will always hit an overrun.

Quite possible that when there’s an overrun things get out of sync, so the first thing to do is stop the overruns.

iceled · January 14, 2020, 11:43am

I didn’t realise there was anything else to do other than read the contents of the particular buffer returned in the callback. I understood from the documentation that the buffer switching happened in the background. If this wasn’t the case then surely I’d be getting overruns every callback? I’m not and the code has been running mostly well for months on end. Only the recent storms seem to have intensified the issue.

I don’t see anything in the code examples to suggest what to do with an emptied buffer other than to leave it to the device to refill… or am I misunderstanding you?

hugo · January 14, 2020, 12:19pm

Hmm, I think I may have been getting confused with ffdac (been using that more recently).

Yes, the buffer swaps are automatic. Generally you shouldn’t call server.log within the callback if you can avoid it, because it’s high latency and the buffer isn’t recycled until the callback exits. Alternatively, you could just add a third buffer to cover potential latency.

That’d be my advice for an easy experiment - move to 3 buffers from 2?
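The two suggestions above (a third buffer, and keeping server.log out of the callback) could be combined roughly as follows. This is only a sketch and has not been run against the original hardware. It reuses the 2000-byte buffer size and 50Hz rate from the first post, and it assumes that deferring the log with imp.wakeup(0, ...) is enough to let the callback return quickly. The names buffers, pairs, sumA, sumV, avgA and avgV are made up for illustration.

```squirrel
// Three 2000-byte buffers instead of two, to cover callback latency
buffers <- [blob(2000), blob(2000), blob(2000)];

function samplesReady(buffer, length) {
    if (length == 0) {
        // Overrun: defer the report so the callback still returns promptly
        imp.wakeup(0, function() { server.error("Overrun"); });
        return;
    }

    // Each sample pair is 2 channels x 2 bytes = 4 bytes
    local pairs = length / 4;
    local sumA = 0.0;
    local sumV = 0.0;
    for (local i = 0; i < pairs; i++) {
        sumA += buffer.readn('w');   // pin 1 sample
        sumV += buffer.readn('w');   // pin 2 sample
    }
    local avgA = sumA / pairs;
    local avgV = sumV / pairs;

    // Only the computed averages are captured, not the buffer itself,
    // so the sampler can recycle the buffer as soon as this callback exits
    imp.wakeup(0, function() {
        server.log("check: A=" + avgA + " V=" + avgV);
    });
}

hardware.sampler.configure([hardware.pin1, hardware.pin2], 50, buffers, samplesReady);
hardware.sampler.start();
```

With three buffers the sampler has roughly 20 seconds of slack rather than 10 before an overrun, so this may reduce overruns but would not be expected to survive a long WiFi outage on its own.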

iceled · January 14, 2020, 12:35pm

OK, that’s fine. More than happy to try anything from the comfort of indoors.

I’m absolutely certain now that the overrun only happens during disconnect/reconnect due to WiFi dropout so the latency would more likely be due to time spent reestablishing the connection - although I agree logging in the callback looks like a bad idea as it will only compound the effect.

I say this because my latest debug setup is showing me that the overruns are always happening after disconnection/reconnection.

As a general approach, would it be sensible to call hardware.sampler.stop(); then hardware.sampler.start(); when an overrun is reported by the callback?

hugo · January 14, 2020, 1:27pm

Yes, absolutely you should do stop and start after an overrun to ensure a consistent state on restart.

If you run in RETURN_ON_ERROR mode then you should get rid of the overruns due to disconnections - look at the connectionmanager library for this. Then, when comms become stalled due to a wifi dropout, the buffers will still be processed and you shouldn’t overrun.
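For reference, a minimal sketch of RETURN_ON_ERROR mode using the raw imp API rather than the connectionmanager library mentioned above. The 10-second send timeout, the 60-second retry delay, the 30-second connect timeout, and the function names reconnect and onConnected are all arbitrary choices for illustration, not values from this thread.

```squirrel
// Keep Squirrel running when the connection drops, instead of
// blocking while the imp tries to reconnect
server.setsendtimeoutpolicy(RETURN_ON_ERROR, WAIT_TIL_SENT, 10);

// Try to reconnect in the background (hypothetical helper)
function reconnect() {
    if (!server.isconnected()) {
        server.connect(onConnected, 30);
    }
}

// Called with the outcome of each connection attempt
function onConnected(reason) {
    if (reason != SERVER_CONNECTED) {
        // Attempt failed: try again later
        imp.wakeup(60, reconnect);
    }
}

// Called when the connection is lost unexpectedly
server.onunexpecteddisconnect(function(reason) {
    imp.wakeup(60, reconnect);
});
```

In this mode, calls such as server.log return an error code while the device is offline instead of stalling, so the sampler callback should keep returning promptly during a WiFi dropout.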

iceled · January 15, 2020, 9:36pm

I tried stopping and then restarting the sampler inside the callback whenever it reported an overrun, but after the first overrun this resulted in a never-ending succession of overruns.

I didn’t have time to explore that any further so instead implemented RETURN_ON_ERROR mode. That appears to have solved my problem by keeping the sampler running during WiFi outages. Thanks for the tip.