Process enters uninterruptible sleep during sustained FireWire AV/C operations on Pi 5 #7352

@YanqiHe03

Describe the bug

Sustained FireWire AV/C operations (specifically trickplay and timecode commands sent via dvcont / libavc1394) eventually cause processes to enter uninterruptible sleep (D state). In one instance, the hung task detector showed the process stuck in fw_device_op_release(), but I have not confirmed that this is the case for every occurrence. Once the hang is triggered, modprobe -r firewire_ohci also hangs and the system cannot shut down cleanly: systemd-shutdown blocks indefinitely waiting for the stuck processes.

No DMA or OHCI errors appear in dmesg. I'm not sure whether the root cause is in firewire_core, in the interaction between libavc1394 and the kernel, or in something specific to this hardware/platform combination. Regardless of the root cause, I don't think userspace operations like these should be able to leave processes in an unkillable D state.

Steps to reproduce the behaviour

  1. Connect a MiniDV device (camera or deck) via FireWire to a Pi 5 with a PCIe OHCI card
  2. Run the following loop:
for i in $(seq 1 500); do
  dvcont trickplay 5
  sleep 0.2
  dvcont timecode
  sleep 0.2
  dvcont trickplay -5
  sleep 0.2
  dvcont timecode
  sleep 0.2
  echo "$i"
done
  3. The loop will eventually hang (one dvcont process enters D state and never returns)

In my tests:

  • Clean boot, trickplay+timecode loop: hangs at ~487 iterations
  • After 500× dvcont status (which alone does not hang), trickplay+timecode loop: hangs at ~22 iterations
  • Pure dvcont status ×500: does not hang
  • Pure open()/close() on /dev/fw1 ×500: does not hang by itself, but a single dvcont status run afterwards hangs immediately.
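
For anyone who wants to repeat the pure open()/close() case without dvcont, here is a minimal shell sketch. The helper name open_close_loop is hypothetical, and opening the node read-write is an assumption (the original test may have used different open flags). It only opens and closes the character device; it sends no AV/C commands.

```shell
#!/bin/sh
# Hypothetical helper: open PATH read-write and close it again, COUNT times.
# Prints the number of opens that succeeded.
open_close_loop() {
    path=$1
    count=$2
    ok=0
    i=0
    # Bail out early so that "<>" never creates a missing file by accident.
    [ -e "$path" ] || { echo 0; return 1; }
    while [ "$i" -lt "$count" ]; do
        # The redirection opens the node on fd 3 only for the duration of
        # the `true` builtin; the shell closes fd 3 again right afterwards.
        if true 2>/dev/null 3<>"$path"; then
            ok=$((ok + 1))
        fi
        i=$((i + 1))
    done
    echo "$ok"
}

# Example (assumes the camera/deck is enumerated as /dev/fw1):
#   open_close_loop /dev/fw1 500
```

If this prints 500 and a following dvcont status then hangs, that would match the last bullet above.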

Device(s)

Raspberry Pi 5

System

https://pastebin.com/kV6EecHZ

Logs

Kernel stack trace (from hung task detector):

INFO: task dvcont:4136 blocked for more than 120 seconds.
      Not tainted 6.12.78-v8-16k+ #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:dvcont          state:D stack:0     pid:4136  tgid:4136  ppid:1      flags:0x0000000d
Call trace:
  __switch_to+0xe8/0x148
  __schedule+0x390/0xb68
  schedule+0x3c/0x148
  fw_device_op_release+0x2d4/0x320 [firewire_core]
  __fput+0xd0/0x2e0
  ____fput+0x1c/0x30
  task_work_run+0x88/0x100
  do_exit+0x2e8/0x9a8
  do_group_exit+0x3c/0xa0
  get_signal+0x960/0xa20
  do_signal+0xfc/0x1060
  do_notify_resume+0xd8/0x160
  el0_svc+0xd4/0xf8
  el0t_64_sync_handler+0x120/0x130
  el0t_64_sync+0x190/0x198

dmesg (no firewire errors prior to hang):

[    5.835977] firewire_ohci 0001:02:00.0: enabling device (0000 -> 0002)
[    5.896418] firewire_ohci 0001:02:00.0: added OHCI v1.10 device as card 0, 8 IR + 8 IT contexts, quirks 0x2
[    6.431656] firewire_core 0001:02:00.0: created device fw0: GUID 7856341278563412, S800
[ 1765.948922] firewire_core 0001:02:00.0: phy config: new root=ffc1, gap_count=5
[ 1766.681399] firewire_core 0001:02:00.0: created device fw1: GUID 0080458020f683bb, S100
[ 2146.157674] firewire_core 0001:02:00.0: phy config: new root=ffc1, gap_count=5
[ 2146.899609] firewire_core 0001:02:00.0: created device fw1: GUID 0080458020f683bb, S100

Process state when hung:

yanqihe     4136  0.0  0.0      0     0 pts/0    D    21:46   0:00 [dvcont]
yanqihe     4140  0.0  0.0      0     0 pts/0    D    21:46   0:00 [dvcont]
yanqihe     4144  0.0  0.0      0     0 pts/0    D    21:46   0:00 [dvcont]
yanqihe     4149  0.0  0.0      0     0 pts/0    D    21:46   0:00 [dvcont]
yanqihe     4154  0.0  0.0      0     0 pts/0    D+   21:47   0:00 [dvcont]
root        4170  0.0  0.0  11248  3936 pts/2    D+   21:48   0:00 modprobe -r firewire_ohci
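
A listing like the one above can be narrowed to just the blocked tasks with a small filter. This is only a sketch: the function name d_state_only is hypothetical, and it assumes the procps ps column layout where STAT is the second field.

```shell
#!/bin/sh
# Hypothetical helper: read `ps` output on stdin and keep the header line
# plus every task whose STAT column (field 2) starts with "D".
d_state_only() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Example:
#   ps -eo pid,stat,wchan:32,comm | d_state_only
```

The wchan column may only show "-" on some kernels. Where sysrq is enabled, running echo w > /proc/sysrq-trigger as root asks the kernel to dump the stacks of all blocked tasks to dmesg, which could help confirm whether every stuck dvcont is in fw_device_op_release().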

Cascading failure:

Once the hang occurs:

  • sudo modprobe -r firewire_ohci hangs
  • sudo reboot blocks at shutdown, waiting for the stuck processes indefinitely

The attached log:

firewire_hang_dmesg.txt

Additional context

The use case is an art installation that uses dvcont trickplay commands to seek a MiniDV tape to specific timecodes based on real-time audio input. The hang occurs reliably during normal operation after an extended period. I have not been able to test with a second FireWire card or on another platform, so I can't rule out a hardware-specific issue with this particular controller. I'll try another FireWire card later.

lspci -v:

0001:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries BCM2712 PCIe Bridge (rev 30) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 39
        Bus: primary=00, secondary=01, subordinate=02, sec-latency=0
        Memory behind bridge: 80000000-801fffff [size=2M] [32-bit]
        Prefetchable memory behind bridge: [disabled] [64-bit]
        Capabilities: <access denied>
        Kernel driver in use: pcieport
 
0001:01:00.0 PCI bridge: Texas Instruments XIO2213A/B/XIO2221 PCI Express to PCI Bridge [Cheetah Express] (rev 01) (prog-if 00 [Normal decode])
        Subsystem: Device 3412:7856
        Flags: bus master, fast devsel, latency 0
        Memory at 1b80100000 (32-bit, non-prefetchable) [size=4K]
        Bus: primary=01, secondary=02, subordinate=02, sec-latency=0
        I/O behind bridge: [disabled] [32-bit]
        Memory behind bridge: 80000000-800fffff [size=1M] [32-bit]
        Prefetchable memory behind bridge: [disabled] [64-bit]
        Capabilities: <access denied>
 
0001:02:00.0 FireWire (IEEE 1394): Texas Instruments XIO2213A/B/XIO2221 IEEE-1394b OHCI Controller [Cheetah Express] (rev 01) (prog-if 10 [OHCI])
        Subsystem: Device 3412:7856
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 39
        Memory at 1b80004000 (32-bit, non-prefetchable) [size=2K]
        Memory at 1b80000000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: firewire_ohci
        Kernel modules: firewire_ohci
