`LookupError: unknown encoding: ascii` when parsing Buck files in virtualenv
Created by: jiangty-addepar
Problem
Reproduction steps on current Buck master
Example commit: https://github.com/facebook/buck/commit/a20f04a8c9202abd1207abd425d89a10522590b8
- In a `buck` project that uses the Python DSL, find a BUCK file that uses `glob`: for example, `foo/BUCK`.
- Enter a virtual environment that uses Python 2.7. If your default Python is already 2.7, you can just do
virtualenv -q venv
source venv/bin/activate
- Parse that file with `buck`, for example by doing
buck targets //foo:
These steps worked on both Ubuntu and Mac OS X.
Result:
The following error is thrown:
Buck wasn't able to parse /repo/foo/BUCK:
LookupError: unknown encoding: ascii
Call stack:
File "/repo/foo/BUCK", line 4
['src/main/java/**/*.java'],
File "/repo/.buckd/resources/fff631c0eeb99191988f1cf0304b26113626ffbe/pathlib.py", line 984, in __new__
self = cls._from_parts(args, init=False)
File "/repo/.buckd/resources/fff631c0eeb99191988f1cf0304b26113626ffbe/pathlib.py", line 627, in _from_parts
drv, root, parts = self._parse_args(args)
File "/repo/.buckd/resources/fff631c0eeb99191988f1cf0304b26113626ffbe/pathlib.py", line 620, in _parse_args
return cls._flavour.parse_parts(parts)
File "/repo/.buckd/resources/fff631c0eeb99191988f1cf0304b26113626ffbe/pathlib.py", line 75, in parse_parts
parts = _py2_fsencode(parts)
File "/repo/.buckd/resources/fff631c0eeb99191988f1cf0304b26113626ffbe/pathlib.py", line 58, in _py2_fsencode
else part for part in parts]
Expected: `buck targets` succeeds.
Investigation
After doing a `git bisect`, we identified https://github.com/facebook/buck/commit/f947921a1afb7f766021bb9ad977de4b70e5d87d as the offending commit. Within that commit, we found that deleting all 3 of the `.encode("ascii")` calls was the cause.
In fact, if we insert the line `"123".encode("ascii")` at (almost) any point in `buck.py` (for example, at the top of the file, or at a random line in the `process_with_diagnostics` method), the error is fixed.
It's not clear what the root cause is, but apparently, unless we call `.encode("ascii")` sometime early in the program's execution, we get the above error. The call has to name the ASCII codec specifically: calling `.encode("utf-8")` instead, for example, won't fix it.
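For illustration, here is a minimal sketch of the same idea written as an explicit step instead of a throwaway `"123".encode("ascii")` line. It rests on an assumption that has not been verified against Buck: that the early `.encode("ascii")` helps only because it makes Python load and cache the ASCII codec before the parser runs. `codecs.lookup` is the standard-library call that does that lookup; `preload_codecs` is a hypothetical helper name and is not part of `buck.py`.

```python
import codecs


def preload_codecs(names=("ascii",)):
    """Look up each named codec so it is imported and cached up front.

    Hypothetical helper, not part of buck.py. Assumes the failure comes
    from the ASCII codec being looked up for the first time too late.
    """
    loaded = []
    for name in names:
        # codecs.lookup raises LookupError if the codec cannot be found,
        # which is the same error type shown in the report above.
        info = codecs.lookup(name)
        loaded.append(info.name)
    return loaded


# Call once, early, before any BUCK file is parsed.
preload_codecs()
```

If this assumption holds, the call would go near the top of `buck.py`, in the same place where inserting `"123".encode("ascii")` was observed to work. If it does not make the error go away, that would suggest the cause is something other than a late first lookup.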
I'm not sure what a good fix will be, but we're just going to do https://github.com/Addepar/buck/commit/6a4c88c439b00206db5128ae736c288b52e30170 on our fork for now as a workaround.