Log Analysis - Pixhawk 2.1 crashing after GPS_PRIMARY_CHANGED

Hello,

We’ve had two recent back-to-back crashes with a Cube-based hexacopter running 3.6.9 with Here2s (over I2C). The drone operated correctly for the full operational day but then had two crashes during the same mission. In both of them, the Cube stopped logging mid-air (in one of them it looks almost as if the Cube restarted mid-air). We are aware of the yaw-imbalance issue that exists for this system. However, we don’t think that it is related to this current issue. Is this just a faulty Cube/GPS or is something else going on? We’re not sure why this started happening so suddenly. Both logs are attached.

Any help is appreciated. Thank you in advance.

U012_1564201580_stopped_logging.bin (1.3 MB)
U012_1564208138_ekf_variance_stopped_logging.bin (1.3 MB)

I haven’t looked yet, but I will ask a few questions.

  1. Are arming checks set to 1?
  2. Have you followed all SB instructions?
  1. All the arming checks are enabled except for “RC Channels” (we fly without RC) and “System” (bit 13); I don’t know what that check does.

  2. Everything except 1: We were not able to turn on INS_USE3. Doing that gives us NavEKF2: allocation failed errors. Not ideal but we’ve flown 2000+ flights on 3.6.5 which had effectively no IMU redundancy so this was still a step up for us.

We had another crash happen today on an entirely different drone. One interesting note: it seems like with all 3 of these crashes, we have lost telemetry just slightly before the crash. That’s been consistent so it seems linked, but I don’t know exactly how.

Latest crash log (this one also stops logging midair): https://www.dropbox.com/s/bf95whlgniu2c8x/U013_1564329838.bin?dl=0

Looks like we have simultaneous dual GPS failure. Here are the GPS lat/lng graphs for the last 2 crashes:


This doesn’t look like a glitch, since the Lat/Lng values go straight to 0 immediately. It also happens on both GPSs at the same time.

Here are some graphs of the GPS innovations (notice the flatlining right before the crash):


Here is the Vcc graph:

Main power never dips below 4.7V, but is that really low enough to cause this? We’re checking for bad wiring currently but is there anything else it could be? I2C bus lockup? It makes no sense that both of them would be going down at the same exact time.

We’ve also seen this issue confirmed on a copter running on 3.6.5.

Power issue confirmed. VCC power flags indicate peripheral overcurrent at the exact moment the GPS dies:

I read through this thread with similar issues: [Solved] Peripheral power / brownout / bootloader / PSM

However, we are running on 3.6.9 with ChibiOS and still having the issue. Is it possible that 2 GPSs simply draw too much power? The inconsistency is still really strange. The entire fleet is operating with the same configuration and only these 2 drones are seeing this problem.

Here is our full Cube port configuration:

Power1 - Mauch HYB-BEC
Power2 - Mauch Backup HYB-BEC
Servo Rail - Powered independently by another Mauch BEC (5.3V)
GPS1 - HERE2 (I2C)
GPS2 - HERE2 (I2C)
Telem1 - RFD900+
Telem2 - LiDAR (only data lines, powered through servo rail)
USB (the JST-GH port) - buzzer
Serial5/Cons - LiDAR2 (only data lines, powered through servo rail)
I2C - flow sensor (only data lines, powered through servo rail)

Since the RFD is on its own dedicated rail, the only 2 things sharing the 2nd peripheral rail are the GPSs. Is it really possible for just these 2 things to overdraw the 800mA limit?

The biggest issue I see with that setup is that you have the RFD900+ powered by the system. The RFD900 is NOT to be powered by the carrier board at all; it must have an independent power supply.
Though serial 1 has its own current limiter, it is not fast enough for the RFD900.
I do worry about the amount of noise on your VCC.

Looking at your power flags, we have 3 at the beginning, then 35, then 43. That equates to:
3 = 00000011
35 = 00100011
43 = 00101011

BRICK_VALID=1
BACKUP_VALID=2
USB_CONNECTED=4
PERIPH_OVERCURRENT=8
PERIPH_HIPOWER_OVERCURRENT=16
STATUS_CHANGED=32

So yes, it appears that for some reason you have had an over current on periph power.
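
As a minimal sketch of the decoding above, the flag values can be split into named bits with a few lines of Python. The bit names and values are taken from the list in this post; the function name `decode_power_flags` and the table `POWER_FLAGS` are made up for illustration and are not part of any ArduPilot tool.

```python
# Bit values as listed in this thread (names copied from the post above).
POWER_FLAGS = {
    1: "BRICK_VALID",
    2: "BACKUP_VALID",
    4: "USB_CONNECTED",
    8: "PERIPH_OVERCURRENT",
    16: "PERIPH_HIPOWER_OVERCURRENT",
    32: "STATUS_CHANGED",
}


def decode_power_flags(value):
    """Return the names of all bits set in a power-flags value (hypothetical helper)."""
    return [name for bit, name in POWER_FLAGS.items() if value & bit]


if __name__ == "__main__":
    # The three values seen in the log: 3, then 35, then 43.
    for value in (3, 35, 43):
        names = ", ".join(decode_power_flags(value))
        print(f"{value:2d} = {value:08b} -> {names}")
```

Only the last value, 43, has the PERIPH_OVERCURRENT bit (8) set; the PERIPH_HIPOWER_OVERCURRENT bit (16) is not set in any of the three.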

Though you didn’t have one on the Hi-power, I still would not rule out the RFD900 power, as we have seen many interesting side effects from people sharing power with it from serial 1.

Can you do me a favour and measure the current of your whole carrier board (please do not include the RFD900…), then check with each Here2, one at a time, then with no accessories?

@jschall

Thanks for the advice, Philip. We’ve given the RFD its own supply and are testing now. I’ll let you know if the issue comes up again.

Can you do me a favour and measure the current of your whole carrier board (please do not include the RFD900…), then check with each Here2, one at a time, then with no accessories?

We are in the process of doing this test, but it’s currently low priority for us. We will do it immediately if the RFD fix doesn’t work.

Total current draw of two Here2s is around 0.5A.