Watchdog - Application
Since 0.5.0:
A watchdog timer is designed to rescue your device should an unexpected problem prevent code from running. This could be the device locking up or freezing due to a bug in code, incorrect access to a shared resource, memory corruption, or other causes.
Device OS includes a software-based watchdog, ApplicationWatchdog, that is based on a FreeRTOS thread. In theory, it can help when the user application enters an infinite loop. However, it does not guard against more serious problems such as a deadlock caused by accessing a mutex from multiple threads with thread swapping disabled, an infinite loop with interrupts disabled, or an unpredictable hang caused by memory corruption. Only a hardware watchdog can handle those situations. In practice, the application watchdog is rarely effective.
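To make the check-in mechanism concrete, the following is a minimal, portable C++ model of a software watchdog. It is an illustration only and is not the Device OS source: SoftWatchdogModel, checkin(nowMs), and poll(nowMs) are hypothetical names, and the time is passed in explicitly so the logic can run off-device.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Illustrative model only; NOT the Device OS implementation.
// SoftWatchdogModel and its members are hypothetical names.
class SoftWatchdogModel {
public:
    SoftWatchdogModel(uint32_t timeoutMs, std::function<void()> onTimeout)
        : timeoutMs_(timeoutMs), onTimeout_(std::move(onTimeout)) {}

    // The application calls this to show it is still making progress.
    void checkin(uint32_t nowMs) { lastCheckinMs_ = nowMs; }

    // The watchdog thread would call this each time it is scheduled.
    // Returns true if the timeout handler was invoked.
    bool poll(uint32_t nowMs) {
        // Unsigned subtraction stays correct when the millisecond
        // counter wraps around.
        if (nowMs - lastCheckinMs_ >= timeoutMs_) {
            if (onTimeout_) {
                onTimeout_();
            }
            // Restart the interval so the handler is not called again
            // on the very next poll (a choice made for this sketch).
            lastCheckinMs_ = nowMs;
            return true;
        }
        return false;
    }

private:
    uint32_t timeoutMs_;
    std::function<void()> onTimeout_;
    uint32_t lastCheckinMs_ = 0;
};
```

In this model the timeout is only detected when poll() runs. That mirrors the limitation described above: if the watchdog thread is never scheduled, for example because interrupts or thread swapping are disabled, nothing notices the missed check-in.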
Starting with Device OS 5.3.0, Gen 3 devices based on the nRF52840 (Boron, B-Series SoM, Argon, Tracker SoM, E404X) and RTL872x (P2, Photon 2) can use the hardware watchdog built into the MCU. This is highly effective at resetting the device when user or system firmware freezes in normal operating mode. It only operates in normal operating mode, not in DFU or safe mode. Using it instead of the application watchdog is recommended.
The page on watchdog timers has information about external hardware watchdog timers, and hardware and software designs for the TPL5010 and AB1805.
// PROTOTYPES
ApplicationWatchdog(unsigned timeout_ms,
                    std::function<void(void)> fn,
                    unsigned stack_size=DEFAULT_STACK_SIZE);

ApplicationWatchdog(std::chrono::milliseconds ms,
                    std::function<void(void)> fn,
                    unsigned stack_size=DEFAULT_STACK_SIZE);
// EXAMPLE USAGE
// Global variable to hold the watchdog object pointer
ApplicationWatchdog *wd;
void watchdogHandler() {
  // Do as little as possible in this function, preferably just
  // calling System.reset().
  // Do not call Particle.publish(), Cellular.command(), or
  // similar functions. You can safely save data to a retained
  // variable here so you know the watchdog triggered when you
  // restart.
  // In 2.0.0 and later, RESET_NO_WAIT prevents notifying the cloud of a pending reset
  System.reset(RESET_NO_WAIT);
}
void setup() {
  // Start watchdog. Reset the system after 60 seconds if
  // the application is unresponsive.
  wd = new ApplicationWatchdog(60000, watchdogHandler, 1536);
}

void loop() {
  while (some_long_process_within_loop) {
    ApplicationWatchdog::checkin(); // resets the AWDT count
  }
}
// AWDT count reset automatically after loop() ends
stack_size is an optional parameter that sets the stack size of the watchdog thread, and it can be made larger or smaller as needed. The default of 512 is generally too small, and it's best to use a minimum of 1536 bytes. If not enough stack memory is allocated, the application will crash due to a stack overflow, and the RGB LED will flash a red SOS pattern followed by 13 blinks.
The application watchdog requires interrupts to be active in order to function. Enabling the hardware watchdog in combination with this is recommended, so that the system resets in the event that interrupts are not firing.
The Particle.process() function calls ApplicationWatchdog::checkin() internally, so you can also use that to service the application watchdog.
Your watchdog handler should have the prototype:
void myWatchdogHandler(void);
You should generally not try to do anything other than call System.reset() or perhaps set some retained variables in your application watchdog callback. In particular:
- Do not call any cloud functions like Particle.publish() or even Particle.disconnect().
- Do not call Cellular.command().
Calling these functions will likely cause the system to deadlock and not reset.
The following is a recommended watchdog callback implementation. It sets a reset reason, and does not wait for the cloud connection to gracefully close, since the device is in a bad state and that will likely never complete.
void myWatchdogHandler(void) {
  System.reset(RESET_REASON_USER_APPLICATION_WATCHDOG, RESET_NO_WAIT);
}
Note: waitFor and waitUntil do not tickle the application watchdog. If the condition you are waiting for takes longer than the application watchdog timeout, the device will reset.
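One way to wait on a slow condition without tripping the watchdog is to poll it in your own loop and check in on every pass. The helper below is a minimal sketch of that idea in portable C++. waitForWithCheckin is a hypothetical name and is not part of Device OS; the check-in action and the millisecond clock are passed in as parameters so the logic can be exercised off-device.

```cpp
#include <cstdint>

// Hypothetical helper, NOT part of Device OS.
// Polls `condition` until it returns true or `timeoutMs` has elapsed,
// calling `checkin` on every pass so the watchdog keeps being serviced.
// `nowMs` must return the current time in milliseconds as a uint32_t.
template <typename Condition, typename Checkin, typename Clock>
bool waitForWithCheckin(Condition condition, uint32_t timeoutMs,
                        Checkin checkin, Clock nowMs) {
    const uint32_t start = nowMs();
    while (!condition()) {
        checkin();  // service the watchdog while waiting
        // Unsigned subtraction handles wrap-around of the counter.
        if (static_cast<uint32_t>(nowMs() - start) >= timeoutMs) {
            return condition();  // one last check before giving up
        }
    }
    return true;
}
```

On a device, the assumption is that you would pass a lambda that calls ApplicationWatchdog::checkin() (or Particle.process(), which checks in internally) as checkin, and a lambda that returns millis() as nowMs. Keep timeoutMs bounded so that a condition that never becomes true does not defeat the watchdog.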
Since 1.5.0:
You can also specify a value using chrono literals, for example: wd = new ApplicationWatchdog(60s, System.reset) for 60 seconds.
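As a quick check of how chrono literals relate to the millisecond form of the constructor, the standard C++ snippet below converts a few literals to millisecond counts. toTimeoutMs is a hypothetical helper used only for illustration; it assumes a C++14 or later compiler and makes the literals available with a using-directive.

```cpp
#include <chrono>

using namespace std::chrono_literals;

// Hypothetical helper for illustration only: converts a duration to the
// millisecond count that the unsigned timeout_ms prototype would take.
constexpr unsigned toTimeoutMs(std::chrono::milliseconds ms) {
    return static_cast<unsigned>(ms.count());
}

// Seconds and minutes convert implicitly to milliseconds.
static_assert(toTimeoutMs(60s) == 60000, "60s is 60000 ms");
static_assert(toTimeoutMs(2min) == 120000, "2min is 120000 ms");
static_assert(toTimeoutMs(1500ms) == 1500, "1500ms is unchanged");
```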