av.open UnicodeDecodeError
Overview
I have a text file in which each line is the path of a video. When I call av.open(path), some values of path raise UnicodeDecodeError.
Expected behavior
I expect every path in the file to open without error.
Actual behavior
For some values of path, av.open(path) raises UnicodeDecodeError. As far as I know, all the text in one file should share the same encoding, so it should not be possible for some paths to work while others fail.
Traceback:
The failing values of path include
/gemini/data-3/hmdb51_video/shoot_gun/fastestgunalive_shoot_gun_u_cm_np1_fr_med_3.avi
/gemini/data-3/hmdb51_video/shoot_gun/deserteagle_shoot_gun_u_cm_np1_ri_goo_0.avi
/gemini/data-3/hmdb51_video/wave/Newall_Green_High_Students_Waving_Goodbye_wave_u_cm_np1_fr_med_0.avi
among others. Every problematic path reports the same error, UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 65: invalid start byte. Concretely,
Traceback (most recent call last):
  File "main.py", line 235, in <module>
    if __name__ == '__main__': main()
  File "main.py", line 147, in main
    for i, (data, labels) in enumerate(train_loader, resume_step):
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/dataloader.py", line 1229, in _process_data
    data.reraise()
  File "/usr/local/lib/python3.8/dist-packages/torch/_utils.py", line 433, in reraise
    raise RuntimeError(msg) from None
RuntimeError: Caught UnicodeDecodeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/gemini/code/efficient-video-recognition-master/video_dataset/dataset.py", line 56, in __getitem__
    container = av.open(path)
  File "av/container/core.pyx", line 401, in av.container.core.open
  File "av/container/input.pyx", line 84, in av.container.input.InputContainer.__cinit__
  File "av/stream.pyx", line 52, in av.stream.wrap_stream
  File "av/stream.pyx", line 86, in av.stream.Stream._init
  File "av/utils.pyx", line 26, in av.utils.avdict_to_dict
  File "av/utils.pyx", line 14, in av.utils._decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 65: invalid start byte
However, /gemini/data-3/hmdb51_video/drink/AllThePresidentMen_drink_f_nm_np1_ri_med_8.avi opens without error.
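To make the error message concrete, here is a minimal standard-library sketch. It checks that the failing paths quoted above are plain ASCII (and therefore valid UTF-8), and it reproduces the same message from a made-up byte string. The value raw is purely hypothetical and is not the real content of these files. Since the last frames of the traceback are av.stream.Stream._init and av.utils.avdict_to_dict, my guess is that the undecodable byte comes from metadata stored inside the video file rather than from the path string, but I have not confirmed this.

```python
# Paths copied from this report; all of them fail in av.open.
failing_paths = [
    "/gemini/data-3/hmdb51_video/shoot_gun/fastestgunalive_shoot_gun_u_cm_np1_fr_med_3.avi",
    "/gemini/data-3/hmdb51_video/shoot_gun/deserteagle_shoot_gun_u_cm_np1_ri_goo_0.avi",
    "/gemini/data-3/hmdb51_video/wave/Newall_Green_High_Students_Waving_Goodbye_wave_u_cm_np1_fr_med_0.avi",
]

# ASCII text is always valid UTF-8, so the path strings themselves
# cannot contain the byte 0x8c. (str.isascii needs Python 3.7+.)
paths_are_ascii = all(p.isascii() for p in failing_paths)
print("paths are pure ASCII:", paths_are_ascii)

# Hypothetical stand-in for a metadata value: 65 ASCII bytes, then 0x8c.
# 0x8c is a UTF-8 continuation byte and cannot start a character.
raw = b"x" * 65 + b"\x8c"

try:
    raw.decode("utf-8")  # strict decoding, the default
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)
print(message)

# A lenient error handler substitutes U+FFFD instead of raising.
lenient = raw.decode("utf-8", errors="replace")
print(repr(lenient[-1]))
```

If the guess about metadata is right, the files that open correctly would simply be the ones whose embedded metadata happens to be valid UTF-8.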
Investigation
First, the video itself plays normally. Then, to rule out a problem with the text file, I typed the path manually in a Python console. The same error occurred:
av.open('/gemini/data-3/hmdb51_video/wave/Newall_Green_High_Students_Waving_Goodbye_wave_u_cm_np1_fr_med_0.avi')
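The manual check above can be repeated over the whole list. This is only a sketch: the helper name find_failing_paths, its opener parameter, and the file name videos.txt are made up for illustration. The opener is injectable so that the helper can be exercised without PyAV installed; by default it falls back to av.open.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def find_failing_paths(
    paths: Iterable[str],
    opener: Optional[Callable[[str], object]] = None,
) -> List[Tuple[str, str]]:
    """Try to open every path; return (path, error message) for each failure."""
    if opener is None:
        import av  # imported lazily so the helper also runs without PyAV

        opener = av.open

    failures = []
    for line in paths:
        path = line.strip()  # drop the trailing newline from the text file
        if not path:
            continue  # skip blank lines
        try:
            container = opener(path)
        except UnicodeDecodeError as exc:
            failures.append((path, str(exc)))
        else:
            # Close the container if the opener returned something closable.
            close = getattr(container, "close", None)
            if callable(close):
                close()
    return failures


# Hypothetical usage, assuming the path list is stored in videos.txt:
#
# with open("videos.txt", encoding="utf-8") as f:
#     for path, error in find_failing_paths(f):
#         print(path, "->", error)
```

As far as I can tell from the PyAV documentation, av.open also accepts metadata_encoding and metadata_errors keyword arguments, so passing functools.partial(av.open, metadata_errors="ignore") as the opener might be a way to test whether the failures come from file metadata. I have not verified this against the PyAV version installed here.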

Research
I have done the following:
- Checked the PyAV documentation
- Searched on Google
- Searched on Stack Overflow
- Looked through old GitHub issues
- Asked on PyAV Gitter
- ... and waited 72 hours for a response.