Describe the bug
Sustained FireWire AV/C operations (specifically trickplay + timecode commands sent via dvcont / libavc1394) eventually cause processes to enter uninterruptible sleep (D state). In one instance, the hung task detector showed the process stuck in fw_device_op_release(), but I have not confirmed this is the case for every occurrence. Once triggered, modprobe -r firewire_ohci also hangs and the system cannot shut down cleanly: systemd-shutdown blocks indefinitely waiting for the stuck processes.
No DMA or OHCI errors appear in dmesg. I'm not sure whether the root cause is in firewire_core, in the interaction between libavc1394 and the kernel, or something specific to this hardware/platform combination. However, regardless of the real cause, I think my operations shouldn't be able to leave processes in an unkillable D state.
Steps to reproduce the behaviour
- Connect a MiniDV device (camera or deck) via FireWire to a Pi 5 with a PCIe OHCI card
- Run the following loop:
for i in $(seq 1 500); do
    dvcont trickplay 5
    sleep 0.2
    dvcont timecode
    sleep 0.2
    dvcont trickplay -5
    sleep 0.2
    dvcont timecode
    sleep 0.2
    echo "$i"
done
- The loop will eventually hang (one dvcont process enters D state and never returns)
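For anyone reproducing this, the loop above could be wrapped so that it notices the hang itself instead of blocking the terminal. This is only a sketch, assuming bash and a sleep that accepts fractional seconds (as the loop above already does). The helper name `run_with_deadline` and the return code 124 are made up for illustration and are not part of dvcont or libavc1394. The command runs in the background and is polled, because a task stuck in D state does not respond to signals and a plain wait would block along with it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of dvcont or libavc1394).
# Runs a command in the background and polls it every 0.1 s until it
# exits or the deadline (whole seconds) passes. Returns the command's
# exit status, or 124 if it is still running at the deadline. A stuck
# process is left alone, since a D-state task cannot be killed anyway.
run_with_deadline() {
    local limit=$(( $1 * 10 )) ticks=0 pid
    shift
    "$@" &
    pid=$!
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$ticks" -ge "$limit" ]; then
            return 124
        fi
        sleep 0.1
        ticks=$((ticks + 1))
    done
    wait "$pid"
}
```

Inside the loop, each call would then look like `run_with_deadline 10 dvcont timecode`, followed by a check for status 124 to print `$i` and break. The 10-second limit is an arbitrary choice, and the polling adds a small delay per command, so timing will differ slightly from the original loop.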
In my tests:
- Clean boot, trickplay+timecode loop: hangs at ~487 iterations
- After 500× dvcont status (which alone does not hang), trickplay+timecode loop: hangs at ~22 iterations
- Pure dvcont status ×500: does not hang
- Pure open()/close() on /dev/fw1 ×500: does not hang, but a single dvcont status after this immediately hangs.
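The open()/close() case can be approximated from the shell, roughly as below. This is a sketch under assumptions: the report does not say how the device node was opened in that test, so the read-write mode here is a guess, and `open_close_loop` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: open and immediately close a device node N times.
# Assumption: read-write open; the original test may have used another mode.
open_close_loop() {
    local dev=$1 count=$2 i
    # Refuse missing paths, since "<>" would otherwise create a regular file.
    [ -e "$dev" ] || return 1
    for ((i = 1; i <= count; i++)); do
        # The redirection opens the node on fd 3 for the duration of
        # the no-op command, then closes it again.
        true 3<>"$dev" || return 1
    done
}
```

For the case described above this would be called as `open_close_loop /dev/fw1 500`, followed by a single `dvcont status`.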
Device(s)
Raspberry Pi 5
System
https://pastebin.com/kV6EecHZ
Logs
Kernel stack trace (from hung task detector):
INFO: task dvcont:4136 blocked for more than 120 seconds.
Not tainted 6.12.78-v8-16k+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:dvcont state:D stack:0 pid:4136 tgid:4136 ppid:1 flags:0x0000000d
Call trace:
__switch_to+0xe8/0x148
__schedule+0x390/0xb68
schedule+0x3c/0x148
fw_device_op_release+0x2d4/0x320 [firewire_core]
__fput+0xd0/0x2e0
____fput+0x1c/0x30
task_work_run+0x88/0x100
do_exit+0x2e8/0x9a8
do_group_exit+0x3c/0xa0
get_signal+0x960/0xa20
do_signal+0xfc/0x1060
do_notify_resume+0xd8/0x160
el0_svc+0xd4/0xf8
el0t_64_sync_handler+0x120/0x130
el0t_64_sync+0x190/0x198
dmesg (no firewire errors prior to hang):
[ 5.835977] firewire_ohci 0001:02:00.0: enabling device (0000 -> 0002)
[ 5.896418] firewire_ohci 0001:02:00.0: added OHCI v1.10 device as card 0, 8 IR + 8 IT contexts, quirks 0x2
[ 6.431656] firewire_core 0001:02:00.0: created device fw0: GUID 7856341278563412, S800
[ 1765.948922] firewire_core 0001:02:00.0: phy config: new root=ffc1, gap_count=5
[ 1766.681399] firewire_core 0001:02:00.0: created device fw1: GUID 0080458020f683bb, S100
[ 2146.157674] firewire_core 0001:02:00.0: phy config: new root=ffc1, gap_count=5
[ 2146.899609] firewire_core 0001:02:00.0: created device fw1: GUID 0080458020f683bb, S100
Process state when hung:
yanqihe 4136 0.0 0.0 0 0 pts/0 D 21:46 0:00 [dvcont]
yanqihe 4140 0.0 0.0 0 0 pts/0 D 21:46 0:00 [dvcont]
yanqihe 4144 0.0 0.0 0 0 pts/0 D 21:46 0:00 [dvcont]
yanqihe 4149 0.0 0.0 0 0 pts/0 D 21:46 0:00 [dvcont]
yanqihe 4154 0.0 0.0 0 0 pts/0 D+ 21:47 0:00 [dvcont]
root 4170 0.0 0.0 11248 3936 pts/2 D+ 21:48 0:00 modprobe -r firewire_ohci
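A listing like the one above can also be collected directly from /proc, together with the kernel wait channel of each blocked task. The following is a minimal sketch, assuming a Linux /proc and awk; `list_d_state` is a hypothetical name, and the wchan column may be empty or 0 depending on kernel configuration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print "PID NAME WCHAN" for every task currently in
# uninterruptible sleep, using only /proc (no ps needed).
list_d_state() {
    local dir pid state
    for dir in /proc/[0-9]*; do
        pid=${dir#/proc/}
        # The State: line looks like "State:  D (disk sleep)".
        state=$(awk '/^State:/ {print $2; exit}' "$dir/status" 2>/dev/null) || continue
        [ "$state" = "D" ] || continue
        printf '%s %s %s\n' "$pid" \
            "$(cat "$dir/comm" 2>/dev/null)" \
            "$(cat "$dir/wchan" 2>/dev/null)"
    done
}
```

For a full kernel stack of one stuck task, reading /proc/PID/stack as root should give the same kind of trace as the hung task detector, provided the kernel was built with stack trace support.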
Cascading failure:
Once the hang occurs:
- sudo modprobe -r firewire_ohci hangs
- sudo reboot blocks at shutdown, waiting indefinitely for the stuck processes
The attached log:
firewire_hang_dmesg.txt
Additional context
The use case is an art installation that uses dvcont trickplay commands to seek a MiniDV tape to specific timecodes based on real-time audio input. The hang occurs reliably during normal operation after an extended period. I have not been able to test with a second FireWire card or on another platform, so I can't rule out a hardware-specific issue with this particular controller. I'll try another FireWire card later.
lspci -v:
0001:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries BCM2712 PCIe Bridge (rev 30) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 39
Bus: primary=00, secondary=01, subordinate=02, sec-latency=0
Memory behind bridge: 80000000-801fffff [size=2M] [32-bit]
Prefetchable memory behind bridge: [disabled] [64-bit]
Capabilities: <access denied>
Kernel driver in use: pcieport
0001:01:00.0 PCI bridge: Texas Instruments XIO2213A/B/XIO2221 PCI Express to PCI Bridge [Cheetah Express] (rev 01) (prog-if 00 [Normal decode])
Subsystem: Device 3412:7856
Flags: bus master, fast devsel, latency 0
Memory at 1b80100000 (32-bit, non-prefetchable) [size=4K]
Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
I/O behind bridge: [disabled] [32-bit]
Memory behind bridge: 80000000-800fffff [size=1M] [32-bit]
Prefetchable memory behind bridge: [disabled] [64-bit]
Capabilities: <access denied>
0001:02:00.0 FireWire (IEEE 1394): Texas Instruments XIO2213A/B/XIO2221 IEEE-1394b OHCI Controller [Cheetah Express] (rev 01) (prog-if 10 [OHCI])
Subsystem: Device 3412:7856
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 39
Memory at 1b80004000 (32-bit, non-prefetchable) [size=2K]
Memory at 1b80000000 (32-bit, non-prefetchable) [size=16K]
Capabilities: <access denied>
Kernel driver in use: firewire_ohci
Kernel modules: firewire_ohci