Closed
Issue created Mar 19, 2023 by laohuijiadezhu (@laohuijiadezhu) · 4 of 6 checklist items completed

av.open UnicodeDecodeError

Overview

I have a text file in which each line is the path to a video. When I call av.open(path), some of the paths raise a UnicodeDecodeError.

Expected behavior

I expect all paths can be opened.

Actual behavior

Some of the paths raise a UnicodeDecodeError when passed to av.open(path). As far as I know, all the lines in the same text file should share one encoding, so I would not expect some paths to work while others fail.

Traceback: the failing paths include /gemini/data-3/hmdb51_video/shoot_gun/fastestgunalive_shoot_gun_u_cm_np1_fr_med_3.avi, /gemini/data-3/hmdb51_video/shoot_gun/deserteagle_shoot_gun_u_cm_np1_ri_goo_0.avi and /gemini/data-3/hmdb51_video/wave/Newall_Green_High_Students_Waving_Goodbye_wave_u_cm_np1_fr_med_0.avi, among others. Every failing path reports the same error, UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 65: invalid start byte. Concretely,

Traceback (most recent call last):
  File "main.py", line 235, in <module>
    if __name__ == '__main__': main()
  File "main.py", line 147, in main
    for i, (data, labels) in enumerate(train_loader, resume_step):
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.8/dist-packages/torch/_utils.py", line 433, in reraise
    raise RuntimeError(msg) from None
RuntimeError: Caught UnicodeDecodeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/gemini/code/efficient-video-recognition-master/video_dataset/dataset.py", line 56, in __getitem__
    container = av.open(path)
  File "av/container/core.pyx", line 401, in av.container.core.open
  File "av/container/input.pyx", line 84, in av.container.input.InputContainer.__cinit__
  File "av/stream.pyx", line 52, in av.stream.wrap_stream
  File "av/stream.pyx", line 86, in av.stream.Stream._init
  File "av/utils.pyx", line 26, in av.utils.avdict_to_dict
  File "av/utils.pyx", line 14, in av.utils._decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 65: invalid start byte

However, /gemini/data-3/hmdb51_video/drink/AllThePresidentMen_drink_f_nm_np1_ri_med_8.avi can be opened without error.
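
The last frames of the traceback are in av.utils.avdict_to_dict and av.utils._decode, called while a stream is being initialised, and the failing paths quoted above are plain ASCII. That suggests, though it is only an assumption here, that the undecodable byte comes from data read out of the file (for example stream metadata) rather than from the path string. The sketch below reproduces the same error message with the standard library alone; `decode_metadata` and the sample bytes are hypothetical stand-ins, not PyAV code.

```python
# Hypothetical stand-in for a strict UTF-8 decode of bytes read from a file.
def decode_metadata(raw: bytes, errors: str = "strict") -> str:
    return raw.decode("utf-8", errors)


# One of the failing paths from the report: it contains only ASCII characters,
# so the path itself cannot hold a 0x8c byte.
path = "/gemini/data-3/hmdb51_video/wave/Newall_Green_High_Students_Waving_Goodbye_wave_u_cm_np1_fr_med_0.avi"
path_is_ascii = path.isascii()

# Made-up byte string with 0x8c at index 65, mirroring the reported position.
raw = b"a" * 65 + b"\x8c" + b"b"

try:
    decode_metadata(raw)
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(path_is_ascii)
print(message)

# A lenient error handler substitutes U+FFFD instead of raising.
print(decode_metadata(raw, errors="replace")[-3:])
```

0x8c is a UTF-8 continuation byte, so it can never start a character; this is typical of text stored in a legacy single-byte encoding.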

Investigation

First, the videos themselves play normally. Then, to determine whether the text file was the problem, I typed one of the paths directly into the Python console. Unfortunately, the same error occurred.

av.open('/gemini/data-3/hmdb51_video/wave/Newall_Green_High_Students_Waving_Goodbye_wave_u_cm_np1_fr_med_0.avi')
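
To narrow down which files are affected, the manual check above can be repeated over the whole list. This is a minimal sketch, not part of the original report: `find_undecodable` is a hypothetical helper, and the opener is passed in as an argument so the function can be exercised without PyAV installed.

```python
from typing import Callable, Iterable, List, Tuple


def find_undecodable(
    paths: Iterable[str],
    opener: Callable[[str], object],
) -> List[Tuple[str, str]]:
    """Return (path, error message) for every path whose open raises UnicodeDecodeError."""
    failures = []
    for path in paths:
        try:
            container = opener(path)
        except UnicodeDecodeError as exc:
            failures.append((path, str(exc)))
        else:
            # Close whatever was opened, if the object supports it.
            close = getattr(container, "close", None)
            if callable(close):
                close()
    return failures


# Intended usage with PyAV (assumes `av` is installed and paths.txt holds one
# path per line; the file name is made up for illustration):
#
#     import av
#     with open("paths.txt", encoding="utf-8") as f:
#         paths = [line.strip() for line in f if line.strip()]
#     for path, error in find_undecodable(paths, av.open):
#         print(path, error)
```

The PyAV documentation for av.open also lists metadata_encoding and metadata_errors parameters; whether passing something like metadata_errors='ignore' avoids this particular error has not been verified here.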
Research

I have done the following:

  • Checked the PyAV documentation
  • Searched on Google
  • Searched on Stack Overflow
  • Looked through old GitHub issues
  • Asked on PyAV Gitter
  • ... and waited 72 hours for a response.
