15. Handle crashes and view snapshots#
When a system crashes it can be difficult to debug the cause of the crash. To help with this, pw_system provides a crash handler built on pw_cpu_exception which is invoked when a CPU exception is triggered.
The pw_system crash handler will automatically create a crash snapshot on exception and reboot the system. This snapshot can then be downloaded onto a host system for analysis.
Generate a crash#
pw_system
provides an RPC to crash the system by triggering a
HardFault.
To invoke this RPC, type the following into the Python Repl
>>> device.rpcs.pw.system.proto.DeviceService.Crash()
If you’re using the full setup, on system restart, the presence of a crash snapshot will be detected, and the following will be in the system logs:
INF pw_system RpcDevice 00:00:00.000 pw_system main
ERR pw_system RpcDevice 00:00:00.000 ==========================
ERR pw_system RpcDevice 00:00:00.000 ======CRASH DETECTED======
ERR pw_system RpcDevice 00:00:00.000 ==========================
ERR pw_system RpcDevice 00:00:00.000 Crash snapshots available.
ERR pw_system RpcDevice 00:00:00.000 Run `device.get_crash_snapshots()` to download and clear the snapshots.
INF pw_system RpcDevice 00:00:00.000 System init
On a basic setup, the snapshot
will still be generated, but logs won’t be visible in the console. When
the system restarts, the connection to the console is broken so logs
will not be displayed. After invoking the Crash()
RPC, exit the console
and start it again to
re-establish the connection.
View a crash snapshot#
The crash snapshot contains relevant information to debug crashes, such as register state thread backtraces and un-flushed logs. If there is a crash snapshot on the system, it can be downloaded to the host with the following RPC.
>>> device.get_crash_snapshots()
This RPC will download the snapshot, decode it and save it in a temporary directory, the location of which will be printed to the console as follows:
INF Wrote crash snapshot to: /var/folders/2j/sjk9390d5rxc3c9ycwcf3mdh0103lh/T/crash_0.txt
It’s also possible to specify the path as part of the RPC call:
>>> device.get_crash_snapshots("/path/")
The decoded text file should look similar to this truncated example:
Device crash cause:
pw_system/device_service_pwpb.cc:38 Crash: RPC triggered crash
Reason token: 0x735f7770
CPU Arch: ARMV8M
Exception caused by a usage fault.
Active Crash Fault Status Register (CFSR) fields:
UNDEFINSTR Undefined Instruction UsageFault.
The processor has attempted to execute an undefined
instruction. When this bit is set to 1, the PC value stacked
for the exception return points to the undefined instruction.
An undefined instruction is an instruction that the processor
cannot decode.
All registers:
pc 0x10000f0a pw::system::DeviceServicePwpb::Crash(pw::system::proto::pwpb::CrashRequest::Message const&, pw::system::proto::pwpb::CrashResponse::Message&) (/b/pw_system/device_service_pwpb.cc:38)
lr 0x10012787 pw::StringBuilder::FormatVaList(char const*, std::__va_list) (/build/pw_string/string_builder.cc:102)
psr 0x41000000
msp 0x20081fe0 __scratch_y_end__ (??:?)
psp 0x2000a100 pw::system::rpc_thread_context (freertos_target_hooks.cc:0)
exc_return 0xfffffffd
cfsr 0x00010000
msplim 0x00000000
psplim 0x20002288
mmfar 0xe000ed34 __scratch_y_end__ (??:?)
bfar 0xe000ed38 __scratch_y_end__ (??:?)
icsr 0x00400806
hfsr 0x00000000
shcsr 0x00070008
control 0x00000000
r0 0x2000a0e0 pw::system::rpc_thread_context (freertos_target_hooks.cc:0)
r1 0x0000003e pw_assert_basic_HandleFailure (/b/pw_assert_basic/basic_handler.cc:74)
r2 0x0000002b pw_assert_basic_HandleFailure (/b/pw_assert_basic/basic_handler.cc:74)
r3 0x2000a100 pw::system::rpc_thread_context (freertos_target_hooks.cc:0)
r4 0x10019596
r5 0x2000a178 pw::system::rpc_thread_context (freertos_target_hooks.cc:0)
r6 0x10019eec pw::system::proto::pw_rpc::pwpb::DeviceService::Service<pw::system::DeviceServicePwpb>::kPwRpcMethods (??:?)
r7 0x2000a108 pw::system::rpc_thread_context (freertos_target_hooks.cc:0)
r8 0x2000a118 pw::system::rpc_thread_context (freertos_target_hooks.cc:0)
r9 0x2000a16e pw::system::rpc_thread_context (freertos_target_hooks.cc:0)
r10 0x2000b4f0 pw::system::(anonymous namespace)::server (hdlc_rpc_server.cc:0)
r11 0x2000a22c pw::system::rpc_thread_context (freertos_target_hooks.cc:0)
r12 0x00000008 pw_assert_HandleFailure (/b/pw_assert_basic/assert_basic.cc:20)
Thread State
6 threads running, RpcThread active at the time of capture.
~~~~~~~~~
Thread (RUNNING): RpcThread <-- [ACTIVE]
Est CPU usage: unknown
Stack info
Current usage: 0x2000a288 - 0x2000a100 (392 bytes, 1.20%)
Est peak usage: 944 bytes, 2.88%
Stack limits: 0x2000a288 - 0x2000228c (32764 bytes)
Stack Trace (most recent call first):
1: at void pw::rpc::internal::PwpbMethod::CallSynchronousUnary<pw::system::proto::pwpb::RebootRequest::Message, pw::system::proto::pwpb::RebootResponse::Message>(pw::rpc::internal::CallContext const&, pw::rpc::internal::Packet const&, pw::system::proto::pwpb::RebootRequest::Message&, pw::system::proto::pwpb::RebootResponse::Message&) const (0x10000F59)
in /build/pw_rpc/pwpb/public/pw_rpc/pwpb/internal/method.h:258
2: at void pw::rpc::internal::PwpbMethod::CallSynchronousUnary<pw::system::proto::pwpb::CrashRequest::Message, pw::system::proto::pwpb::CrashResponse::Message>(pw::rpc::internal::CallContext const&, pw::rpc::internal::Packet const&, pw::system::proto::pwpb::CrashRequest::Message&, pw::system::proto::pwpb::CrashResponse::Message&) const (0x10001137)
in /build/pw_rpc/pwpb/public/pw_rpc/pwpb/internal/method.h:267
3: at xQueueSemaphoreTake (0x10013049)
in /build/external/freertos+/queue.c:1555
4: at void pw::rpc::internal::PwpbMethod::SynchronousUnaryInvoker<pw::system::proto::pwpb::CrashRequest::Message, pw::system::proto::pwpb::CrashResponse::Message>(pw::rpc::internal::CallContext const&, pw::rpc::internal::Packet const&) (0x10000F4F)
in /build/pw_rpc/pwpb/public/pw_rpc/pwpb/internal/method.h:322
5: at pw::rpc::Server::ProcessPacket(pw::rpc::internal::Packet) (0x1000EA9D)
in /build/pw_rpc/public/pw_rpc/internal/method.h:0
6: at pw::rpc::Server::ProcessPacket(pw::span<std::byte const, 4294967295u>) (0x1000E9CD)
in /build/pw_rpc/server.cc:40
7: at pw::system::RpcDispatchThread::Run() (0x10008625)
in /build/pw_system/hdlc_rpc_server.cc:127
8: at pw::thread::freertos::Context::ThreadEntryPoint(void*) (0x1000EFA5)
in /build/third_party/fuchsia/repo/sdk/lib/fit/include/lib/fit/internal/function.h:362
9: at prvTaskExitError (0x100137C9)
in /build/external/freertos+/portable/GCC/ARM_CM33_NTZ/non_secure/port.c:634
...
Device Logs:
[RpcDevice] pw_system 0 pw_system main targets/rp2040/boot.cc:56
[RpcDevice] pw_system 0 System init pw_system/init.cc:65
[RpcDevice] pw_system 0 Registering RPC services pw_system/init.cc:75
...
Summary#
On this page, we met pw_cpu_exception
, the CPU exception handler entry point.
We also learned how to generate crashes and download the resulting crash snapshot.
Next, head over to 16. Wrapping up to wrap up your tour of Pigweed.