CAN Driver Failing Inconsistently

Hi. We are currently encountering an issue with the DroneCAN protocol (specifically the CAN drivers) on Cube flight controllers. The issue has come up on both the Cube Blue H7 and the Cube Orange+.

Currently, the CAN2 port is connected to 2 Here4 GPS units, and the CAN1 port is connected to 8 ESCs (the frame type is a quadcopter with dual motors on each arm). Both the Cube Blue and the Cube Orange+ are running the latest firmware; both were reflashed through Mission Planner (MP) in an attempt to resolve the problem.

Below are the steps and observations that led up to this issue.

  1. (Using Cube Blue H7) Over the SiK radio, parameter loading speed and consistency decrease over the course of 3-4 hours. Parameters load only about half the time.
  2. During regular operator flight (with no inconsistencies to previous flights), a compass failsafe is triggered. Operator lands drone immediately.
  3. Upon rebooting the flight controller, Ardupilot failed to boot on the Cube. This is indicated by the messages console constantly displaying the message ‘Initializing Ardupilot’.
  4. The problem was then debugged for about 7 hours. In the process, the Cube Blue H7 was swapped out for the Cube Orange+ (both showed the same issue), the GPS units were removed and reinstalled, and the firmware was reflashed, among many other attempts to resolve the problem. Eventually, it appeared that Ardupilot booted on the Cube when both CAN drivers (1 and 2) were turned off, and failed to boot when either one was turned on.
  5. However, at the end of the 7 hours, the drivers seemed to resume regular operation at random. This did not follow a firmware flash, a parameter rewrite, or any other deliberate change.
    1. This resumption was verified by the spin test of 2 out of the 8 motors.
    2. All 8 ESCs were then connected to the Cube, and the issue came back as Ardupilot failed to boot once again. No software changes were made between the spin test and the Ardupilot failing to boot.

The above steps were all performed within a 48-hour window. After a 24-hour break during which the Cube was not powered, the issue was debugged for another ~4 hours. The Cube was then left alone for a week and a half before debugging resumed. Upon booting, it appeared that the drivers had resumed normal operation (no parameters were changed). This time, the following procedure was followed.

  1. One individual ESC and motor is plugged into the Cube.
  2. Cube is power cycled 3 times. On each power cycle, it is verified with the message console that the Cube boots properly. On power cycle 2 and 3, the motor is spin tested.
    After motors 1, 6, 2, and 5 (the front set) have each been tested individually, the 4 motors are tested as a set, following the same procedure as above. The same is done for motors 4, 7, 3, and 8 (the back set) once each of them has been tested individually. Then, all 8 motors are spun together, which succeeds consistently through 5 consecutive power cycles. In this entire process, no parameters are changed, and the CAN drivers are not turned off in any way.

It seems to us that the problem is neither consistently present nor consistently reproducible. If anyone has any ideas as to where the problem originates and/or a consistent fix for it, please let us know.

This seems suspicious. Have you tested without the radio connected? What sort of radio is it?

I would set LOG_DISARMED,1 and gather a working instance and non-working instance.
Upload the .bin logs to a filesharing service and provide the links to working and non-working logs here.

I am another person on this team and I can answer some questions, but both of us are away from the drone for a few days.

The radio is a SiK radio (3DR, I think), but we also recreated the failure to initialize over USB multiple times, since we generally changed parameters over USB due to the radio failing.

Here’s a link to all of our logs for one of the autopilots.
Cube Blue Logs

I believe the last flight before the failure was 141, but the logs from the most recent 2 days could also be useful.

Unfortunately, I am not with the hardware to do the logging while disarmed, though the system now appears to work. We aren’t sure how to recreate the issue.

UPDATE: This problem was traced to the parameter ESC_CALIBRATE, which is why it appeared irregular and was difficult to pinpoint. Essentially, when the FC is powered on with the RC throttle set at full, the ESCs go into calibration mode, which prevents the FC from booting up. With this parameter set to 0 for ‘Normal Operation’, the problem was resolved.
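
For anyone who hits the same symptom, here is a rough sketch of how a saved parameter dump could be checked for this condition before power-up. It assumes the dump is plain text with one `NAME,VALUE` pair per line and `#` comment lines; the function names and the sample values are made up for illustration. The parameter name is taken as written in this thread and is passed in as an argument, so adjust it to whatever your firmware's parameter list shows.

```python
def parse_params(text):
    """Parse 'NAME,VALUE' lines into a dict, skipping blanks and '#' comments.

    Assumption: one parameter per line, comma-separated, numeric value.
    """
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(",")
        if not sep:
            continue  # not a NAME,VALUE line; ignored in this sketch
        try:
            params[name.strip()] = float(value.strip())
        except ValueError:
            continue  # non-numeric value; ignored in this sketch
    return params


def calibration_requested(params, name="ESC_CALIBRATE"):
    """Return True if the named parameter is present and nonzero.

    The default name is the one used in this thread (an assumption here).
    """
    return params.get(name, 0.0) != 0.0


if __name__ == "__main__":
    # Hypothetical excerpt of a parameter dump, not taken from the real vehicle.
    sample = "# example dump\nLOG_DISARMED,1\nESC_CALIBRATE,1\n"
    if calibration_requested(parse_params(sample)):
        print("ESC calibration parameter is nonzero; set it to 0 before powering on.")
```

This only inspects a file saved on disk; it does not talk to the flight controller, so the value on the vehicle still needs to be confirmed in the ground station.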