Jetson AGX Orin + JetPack 5.1.2
Question description:
On Jetson Orin (Ubuntu 20.04, JetPack 5.1.2), the system fails to shut down cleanly from the graphical interface. After issuing a shutdown command, the console stalls for about one minute and then prints RCU-related warnings and stack traces.
From the kernel logs, swapoff and the netns worker running cleanup_net are blocked in synchronize_rcu(), while the sugov threads are blocked on a mutex inside icc_set_bw(). The traces typically look like:
tegra194_cpufreq_set_target → icc_set_bw → mutex_lock → … → RCU
swapoff → synchronize_rcu
cleanup_net → synchronize_rcu
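This suggests that the hang is caused by an RCU grace period that never completes: the rcu_preempt stall is self-detected on CPU 0 while migration/0 is the running task. As a result, anything that waits on synchronize_rcu() during shutdown gets permanently stuck, either directly (swapoff, network cleanup) or, apparently, behind the mutex taken in icc_set_bw() (cpufreq, interconnect bandwidth control).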
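To check this grouping against a full dmesg capture, the blocked-task dump can be summarized mechanically. The following is a minimal Python sketch, not an NVIDIA tool: the function name blocked_tasks, the GENERIC skip list, and the regular expressions are my own assumptions based on the log format shown in this post. For each task in state D it reports the first stack frame that is not a scheduler, completion, or mutex helper.

```python
import re
from collections import defaultdict

# Task header, e.g. "task:swapoff state:D stack: 0 pid: 3625 ppid: 1 flags:..."
TASK_RE = re.compile(r"task:(?P<name>\S+)\s+state:(?P<state>\S).*?pid:\s*(?P<pid>\d+)")
# Stack frame, e.g. "[ 423.166044] synchronize_rcu+0x8c/0xa0"
FRAME_RE = re.compile(r"\]\s+(?P<func>[A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Assumed list of helpers that only show *how* a task sleeps, not why.
GENERIC = {
    "__switch_to", "__schedule", "schedule", "schedule_timeout",
    "schedule_preempt_disabled", "wait_for_completion", "__wait_rcu_gp",
    "__mutex_lock", "__mutex_lock_slowpath", "mutex_lock",
}


def blocked_tasks(log_text):
    """Map each blocking function to the (task, pid) pairs waiting in it."""
    groups = defaultdict(list)
    current = None  # D-state task whose trace is being read, if any
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            # Only uninterruptible (state D) tasks are of interest here.
            is_blocked = task["state"] == "D"
            current = (task["name"], int(task["pid"])) if is_blocked else None
            continue
        frame = FRAME_RE.search(line)
        if current and frame:
            func = frame["func"].split(".")[0]  # drop .constprop.0 / .isra.0
            if func not in GENERIC:
                groups[func].append(current)
                current = None  # keep only the first meaningful frame
    return dict(groups)


# Shortened excerpt of the log in this post, used as sample input.
SAMPLE = """\
[ 423.166010] task:kworker/u24:1 state:D stack: 0 pid: 128 ppid: 2 flags:0x00000028
[ 423.166017] Workqueue: netns cleanup_net
[ 423.166024] Call trace:
[ 423.166024] __switch_to+0xc8/0x120
[ 423.166036] wait_for_completion+0x8c/0x120
[ 423.166040] __wait_rcu_gp+0x184/0x190
[ 423.166044] synchronize_rcu+0x8c/0xa0
[ 423.166047] cleanup_net+0x218/0x390
[ 423.166132] task:sugov:8 state:D stack: 0 pid: 543 ppid: 2 flags:0x00000028
[ 423.166136] Call trace:
[ 423.166149] __mutex_lock.isra.0+0x18c/0x560
[ 423.166157] mutex_lock+0x60/0x70
[ 423.166160] icc_set_bw+0x54/0x2d0
[ 423.166165] tegra194_cpufreq_set_target+0x120/0x150
[ 423.166258] task:swapoff state:D stack: 0 pid: 3625 ppid: 1 flags:0x00000000
[ 423.166276] __wait_rcu_gp+0x184/0x190
[ 423.166278] synchronize_rcu+0x8c/0xa0
[ 423.166281] __arm64_sys_swapoff+0x230/0x650
"""

for func, tasks in blocked_tasks(SAMPLE).items():
    print(func, tasks)
```

On this excerpt the netns worker and swapoff end up under synchronize_rcu, and sugov:8 ends up under icc_set_bw, which is where that thread is waiting for the mutex. To run it on a real capture, pass the text of a saved dmesg file to blocked_tasks() instead of SAMPLE.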
Analysis
The root cause appears to be a kernel-level RCU + cpufreq/ICC + swap interaction bug on the Tegra194/Orin platform, rather than a direct failure of a single external peripheral driver (Wi-Fi, Bluetooth, Ethernet).
External devices (e.g., netdev cleanup) may exacerbate the issue but are not the primary cause; even without loading certain Wi-Fi/BT modules, the RCU stalls still occur.
Disabling swap or locking the CPU governor to performance can sometimes mitigate the issue, but this is not a true fix.
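For anyone who wants to reproduce these two mitigations, here is a rough Python sketch of what I mean. It is a workaround under stated assumptions, not a fix: the helper names set_governor and disable_swap are hypothetical, the sysfs path is the standard Linux cpufreq policy directory, and swap is turned off with the regular swapoff -a command. Both steps need root, and other power-management tooling on the board may change the governor again afterwards.

```python
import glob
import os
import subprocess


def set_governor(governor="performance",
                 cpufreq_root="/sys/devices/system/cpu/cpufreq"):
    """Write `governor` into every cpufreq policy; return the policy dirs changed."""
    changed = []
    pattern = os.path.join(cpufreq_root, "policy*", "scaling_governor")
    for path in sorted(glob.glob(pattern)):
        with open(path, "w") as f:
            f.write(governor)
        changed.append(os.path.dirname(path))
    return changed


def disable_swap():
    """Turn off all active swap devices by running `swapoff -a` (needs root)."""
    subprocess.run(["swapoff", "-a"], check=True)
```

Calling set_governor() and then disable_swap() as root before issuing the shutdown is the sequence that sometimes avoids the hang here. The cpufreq_root parameter exists only so the function can be pointed at a scratch directory for a dry run.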
Request for Guidance
Could NVIDIA please confirm:
Is this a known shutdown hang/RCU bug on JetPack 5.1.2 (L4T kernel for Orin)?
Is there an upstream patch, BSP update, or kernel workaround available to resolve this issue?
Are there recommended mitigations (e.g., disabling swap, using a specific CPU governor, disabling ICC paths) until an official fix is provided?
System details:
Hardware: Jetson Orin
OS: Ubuntu 20.04
JetPack: 5.1.2
Kernel: 5.10.120-rt70-tegra (the -rt70 suffix means this is the PREEMPT_RT real-time kernel)
Error log:
[ 423.165816] rcu: INFO: rcu_preempt self-detected stall on CPU
[ 423.165826] rcu: 0-…: (5371 ticks this GP) idle=ab6/1/0x4000000000000002 softirq=8944/8945 fqs=2013
[ 423.165833] (t=5250 jiffies g=13105 q=6037)
[ 423.165836] Task dump for CPU 0:
[ 423.165838] task:migration/0 state:R running task stack: 0 pid: 14 ppid: 2 flags:0x0000002a
[ 423.165845] Call trace:
[ 423.165846] dump_backtrace+0x0/0x1d0
[ 423.165859] show_stack+0x30/0x40
[ 423.165863] sched_show_task+0x148/0x170
[ 423.165871] dump_cpu_task+0x4c/0x58
[ 423.165879] rcu_dump_cpu_stacks+0xb8/0xf4
[ 423.165882] rcu_sched_clock_irq+0xb14/0xec0
[ 423.165888] update_process_times+0x68/0xa0
[ 423.165893] tick_sched_handle.isra.0+0x38/0x70
[ 423.165897] tick_sched_timer+0x54/0xb0
[ 423.165899] __hrtimer_run_queues+0x148/0x360
[ 423.165902] hrtimer_interrupt+0xf0/0x250
[ 423.165906] arch_timer_handler_phys+0x40/0x50
[ 423.165913] handle_percpu_devid_irq+0x90/0x280
[ 423.165917] generic_handle_irq+0x40/0x60
[ 423.165920] __handle_domain_irq+0x70/0xd0
[ 423.165922] gic_handle_irq+0x68/0x134
[ 423.165924] el1_irq+0xd0/0x1c0
[ 423.165926] _raw_spin_unlock_irq+0x2c/0x70
[ 423.165930] __schedule.constprop.0+0x814/0x8c0
[ 423.165934] schedule+0x8c/0x100
[ 423.165937] smpboot_thread_fn+0x25c/0x270
[ 423.165941] kthread+0x16c/0x190
[ 423.165947] ret_from_fork+0x10/0x24
[ 423.165950] rcu: ====For debug only: Start Printing Blocked Tasks====<print_cpu_stall>
[ 423.166010] task:kworker/u24:1 state:D stack: 0 pid: 128 ppid: 2 flags:0x00000028
[ 423.166017] Workqueue: netns cleanup_net
[ 423.166024] Call trace:
[ 423.166024] __switch_to+0xc8/0x120
[ 423.166028] __schedule.constprop.0+0x320/0x8c0
[ 423.166031] schedule+0x8c/0x100
[ 423.166035] schedule_timeout+0x2c0/0x320
[ 423.166036] wait_for_completion+0x8c/0x120
[ 423.166040] __wait_rcu_gp+0x184/0x190
[ 423.166044] synchronize_rcu+0x8c/0xa0
[ 423.166047] cleanup_net+0x218/0x390
[ 423.166048] process_one_work+0x1c4/0x490
[ 423.166051] worker_thread+0x54/0x430
[ 423.166053] kthread+0x16c/0x190
[ 423.166056] ret_from_fork+0x10/0x24
[ 423.166132] task:sugov:8 state:D stack: 0 pid: 543 ppid: 2 flags:0x00000028
[ 423.166136] Call trace:
[ 423.166136] __switch_to+0xc8/0x120
[ 423.166139] __schedule.constprop.0+0x320/0x8c0
[ 423.166143] schedule+0x8c/0x100
[ 423.166146] schedule_preempt_disabled+0x2c/0x50
[ 423.166149] __mutex_lock.isra.0+0x18c/0x560
[ 423.166153] __mutex_lock_slowpath+0x28/0x40
[ 423.166157] mutex_lock+0x60/0x70
[ 423.166160] icc_set_bw+0x54/0x2d0
[ 423.166165] tegra194_cpufreq_set_target+0x120/0x150
[ 423.166172] __cpufreq_driver_target+0x1b0/0x5c0
[ 423.166175] sugov_work+0x64/0x80
[ 423.166178] kthread_worker_fn+0xa0/0x170
[ 423.166181] kthread+0x16c/0x190
[ 423.166184] ret_from_fork+0x10/0x24
[ 423.166187] task:sugov:4 state:D stack: 0 pid: 547 ppid: 2 flags:0x00000028
[ 423.166191] Call trace:
[ 423.166191] __switch_to+0xc8/0x120
[ 423.166195] __schedule.constprop.0+0x320/0x8c0
[ 423.166198] schedule+0x8c/0x100
[ 423.166201] schedule_preempt_disabled+0x2c/0x50
[ 423.166204] __mutex_lock.isra.0+0x18c/0x560
[ 423.166207] __mutex_lock_slowpath+0x28/0x40
[ 423.166211] mutex_lock+0x60/0x70
[ 423.166214] icc_set_bw+0x54/0x2d0
[ 423.166216] tegra194_cpufreq_set_target+0x120/0x150
[ 423.166219] __cpufreq_driver_target+0x1b0/0x5c0
[ 423.166222] sugov_work+0x64/0x80
[ 423.166223] kthread_worker_fn+0xa0/0x170
[ 423.166226] kthread+0x16c/0x190
[ 423.166229] ret_from_fork+0x10/0x24
[ 423.166258] task:swapoff state:D stack: 0 pid: 3625 ppid: 1 flags:0x00000000
[ 423.166261] Call trace:
[ 423.166261] __switch_to+0xc8/0x120
[ 423.166264] __schedule.constprop.0+0x320/0x8c0
[ 423.166267] schedule+0x8c/0x100
[ 423.166270] schedule_timeout+0x2c0/0x320
[ 423.166272] wait_for_completion+0x8c/0x120
[ 423.166276] __wait_rcu_gp+0x184/0x190
[ 423.166278] synchronize_rcu+0x8c/0xa0
[ 423.166281] __arm64_sys_swapoff+0x230/0x650
[ 423.166286] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166292] do_el0_svc+0x38/0xb0
[ 423.166295] el0_svc+0x1c/0x30
[ 423.166299] el0_sync_handler+0xa8/0xb0
[ 423.166302] el0_sync+0x16c/0x180
[ 423.166305] task:swapoff state:D stack: 0 pid: 3626 ppid: 1 flags:0x00000000
[ 423.166308] Call trace:
[ 423.166308] __switch_to+0xc8/0x120
[ 423.166312] __schedule.constprop.0+0x320/0x8c0
[ 423.166315] schedule+0x8c/0x100
[ 423.166318] schedule_timeout+0x2c0/0x320
[ 423.166319] wait_for_completion+0x8c/0x120
[ 423.166323] __wait_rcu_gp+0x184/0x190
[ 423.166325] synchronize_rcu+0x8c/0xa0
[ 423.166328] __arm64_sys_swapoff+0x230/0x650
[ 423.166329] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166333] do_el0_svc+0x38/0xb0
[ 423.166336] el0_svc+0x1c/0x30
[ 423.166339] el0_sync_handler+0xa8/0xb0
[ 423.166342] el0_sync+0x16c/0x180
[ 423.166345] task:swapoff state:D stack: 0 pid: 3627 ppid: 1 flags:0x00000000
[ 423.166348] Call trace:
[ 423.166348] __switch_to+0xc8/0x120
[ 423.166352] __schedule.constprop.0+0x320/0x8c0
[ 423.166355] schedule+0x8c/0x100
[ 423.166358] schedule_timeout+0x2c0/0x320
[ 423.166359] wait_for_completion+0x8c/0x120
[ 423.166362] __wait_rcu_gp+0x184/0x190
[ 423.166365] synchronize_rcu+0x8c/0xa0
[ 423.166367] __arm64_sys_swapoff+0x230/0x650
[ 423.166369] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166372] do_el0_svc+0x38/0xb0
[ 423.166376] el0_svc+0x1c/0x30
[ 423.166379] el0_sync_handler+0xa8/0xb0
[ 423.166381] el0_sync+0x16c/0x180
[ 423.166384] task:swapoff state:D stack: 0 pid: 3628 ppid: 1 flags:0x00000000
[ 423.166386] Call trace:
[ 423.166387] __switch_to+0xc8/0x120
[ 423.166390] __schedule.constprop.0+0x320/0x8c0
[ 423.166393] schedule+0x8c/0x100
[ 423.166396] schedule_timeout+0x2c0/0x320
[ 423.166397] wait_for_completion+0x8c/0x120
[ 423.166400] __wait_rcu_gp+0x184/0x190
[ 423.166403] synchronize_rcu+0x8c/0xa0
[ 423.166405] __arm64_sys_swapoff+0x230/0x650
[ 423.166407] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166410] do_el0_svc+0x38/0xb0
[ 423.166413] el0_svc+0x1c/0x30
[ 423.166416] el0_sync_handler+0xa8/0xb0
[ 423.166418] el0_sync+0x16c/0x180
[ 423.166421] task:swapoff state:D stack: 0 pid: 3629 ppid: 1 flags:0x00000000
[ 423.166423] Call trace:
[ 423.166424] __switch_to+0xc8/0x120
[ 423.166427] __schedule.constprop.0+0x320/0x8c0
[ 423.166430] schedule+0x8c/0x100
[ 423.166433] schedule_timeout+0x2c0/0x320
[ 423.166435] wait_for_completion+0x8c/0x120
[ 423.166438] __wait_rcu_gp+0x184/0x190
[ 423.166440] synchronize_rcu+0x8c/0xa0
[ 423.166442] __arm64_sys_swapoff+0x230/0x650
[ 423.166444] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166447] do_el0_svc+0x38/0xb0
[ 423.166450] el0_svc+0x1c/0x30
[ 423.166453] el0_sync_handler+0xa8/0xb0
[ 423.166456] el0_sync+0x16c/0x180
[ 423.166460] task:swapoff state:D stack: 0 pid: 3630 ppid: 1 flags:0x00000000
[ 423.166462] Call trace:
[ 423.166463] __switch_to+0xc8/0x120
[ 423.166467] __schedule.constprop.0+0x320/0x8c0
[ 423.166470] schedule+0x8c/0x100
[ 423.166473] schedule_timeout+0x2c0/0x320
[ 423.166474] wait_for_completion+0x8c/0x120
[ 423.166477] __wait_rcu_gp+0x184/0x190
[ 423.166480] synchronize_rcu+0x8c/0xa0
[ 423.166482] __arm64_sys_swapoff+0x230/0x650
[ 423.166484] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166487] do_el0_svc+0x38/0xb0
[ 423.166491] el0_svc+0x1c/0x30
[ 423.166493] el0_sync_handler+0xa8/0xb0
[ 423.166496] el0_sync+0x16c/0x180
[ 423.166499] task:swapoff state:D stack: 0 pid: 3631 ppid: 1 flags:0x00000000
[ 423.166501] Call trace:
[ 423.166502] __switch_to+0xc8/0x120
[ 423.166505] __schedule.constprop.0+0x320/0x8c0
[ 423.166508] schedule+0x8c/0x100
[ 423.166512] schedule_timeout+0x2c0/0x320
[ 423.166513] wait_for_completion+0x8c/0x120
[ 423.166516] __wait_rcu_gp+0x184/0x190
[ 423.166519] synchronize_rcu+0x8c/0xa0
[ 423.166521] __arm64_sys_swapoff+0x230/0x650
[ 423.166523] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166526] do_el0_svc+0x38/0xb0
[ 423.166529] el0_svc+0x1c/0x30
[ 423.166531] el0_sync_handler+0xa8/0xb0
[ 423.166534] el0_sync+0x16c/0x180
[ 423.166537] task:swapoff state:D stack: 0 pid: 3632 ppid: 1 flags:0x00000000
[ 423.166539] Call trace:
[ 423.166539] __switch_to+0xc8/0x120
[ 423.166543] __schedule.constprop.0+0x320/0x8c0
[ 423.166546] schedule+0x8c/0x100
[ 423.166549] schedule_timeout+0x2c0/0x320
[ 423.166551] wait_for_completion+0x8c/0x120
[ 423.166554] __wait_rcu_gp+0x184/0x190
[ 423.166556] synchronize_rcu+0x8c/0xa0
[ 423.166558] __arm64_sys_swapoff+0x230/0x650
[ 423.166560] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166563] do_el0_svc+0x38/0xb0
[ 423.166566] el0_svc+0x1c/0x30
[ 423.166569] el0_sync_handler+0xa8/0xb0
[ 423.166571] el0_sync+0x16c/0x180
[ 423.166575] task:swapoff state:D stack: 0 pid: 3633 ppid: 1 flags:0x00000000
[ 423.166576] Call trace:
[ 423.166577] __switch_to+0xc8/0x120
[ 423.166580] __schedule.constprop.0+0x320/0x8c0
[ 423.166583] schedule+0x8c/0x100
[ 423.166586] schedule_timeout+0x2c0/0x320
[ 423.166588] wait_for_completion+0x8c/0x120
[ 423.166591] __wait_rcu_gp+0x184/0x190
[ 423.166593] synchronize_rcu+0x8c/0xa0
[ 423.166595] __arm64_sys_swapoff+0x230/0x650
[ 423.166597] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166601] do_el0_svc+0x38/0xb0
[ 423.166604] el0_svc+0x1c/0x30
[ 423.166606] el0_sync_handler+0xa8/0xb0
[ 423.166609] el0_sync+0x16c/0x180
[ 423.166613] task:swapoff state:D stack: 0 pid: 3634 ppid: 1 flags:0x00000000
[ 423.166615] Call trace:
[ 423.166615] __switch_to+0xc8/0x120
[ 423.166619] __schedule.constprop.0+0x320/0x8c0
[ 423.166622] schedule+0x8c/0x100
[ 423.166625] schedule_timeout+0x2c0/0x320
[ 423.166626] wait_for_completion+0x8c/0x120
[ 423.166630] __wait_rcu_gp+0x184/0x190
[ 423.166632] synchronize_rcu+0x8c/0xa0
[ 423.166634] __arm64_sys_swapoff+0x230/0x650
[ 423.166636] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166639] do_el0_svc+0x38/0xb0
[ 423.166642] el0_svc+0x1c/0x30
[ 423.166645] el0_sync_handler+0xa8/0xb0
[ 423.166647] el0_sync+0x16c/0x180
[ 423.166650] task:swapoff state:D stack: 0 pid: 3635 ppid: 1 flags:0x00000000
[ 423.166652] Call trace:
[ 423.166652] __switch_to+0xc8/0x120
[ 423.166656] __schedule.constprop.0+0x320/0x8c0
[ 423.166659] schedule+0x8c/0x100
[ 423.166662] schedule_timeout+0x2c0/0x320
[ 423.166663] wait_for_completion+0x8c/0x120
[ 423.166666] __wait_rcu_gp+0x184/0x190
[ 423.166669] synchronize_rcu+0x8c/0xa0
[ 423.166671] __arm64_sys_swapoff+0x230/0x650
[ 423.166673] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166676] do_el0_svc+0x38/0xb0
[ 423.166679] el0_svc+0x1c/0x30
[ 423.166682] el0_sync_handler+0xa8/0xb0
[ 423.166684] el0_sync+0x16c/0x180
[ 423.166688] task:swapoff state:D stack: 0 pid: 3636 ppid: 1 flags:0x00000000
[ 423.166690] Call trace:
[ 423.166690] __switch_to+0xc8/0x120
[ 423.166693] __schedule.constprop.0+0x320/0x8c0
[ 423.166696] schedule+0x8c/0x100
[ 423.166699] schedule_timeout+0x2c0/0x320
[ 423.166701] wait_for_completion+0x8c/0x120
[ 423.166704] __wait_rcu_gp+0x184/0x190
[ 423.166706] synchronize_rcu+0x8c/0xa0
[ 423.166708] __arm64_sys_swapoff+0x230/0x650
[ 423.166710] el0_svc_common.constprop.0+0x80/0x1d0
[ 423.166714] do_el0_svc+0x38/0xb0
[ 423.166717] el0_svc+0x1c/0x30
[ 423.166720] el0_sync_handler+0xa8/0xb0
[ 423.166722] el0_sync+0x16c/0x180
[ 423.166725] rcu: ====For debug only: End Printing Blocked Tasks====<print_cpu_stall>