in2csv: UnicodeEncodeError on a UTF-8 file on Windows 10
There is one other report about encoding errors on Windows 10, but this one may be different, so I'm including more information.
Using PowerShell, I retrieve data from a site with curl.exe and save it as a UTF-16 file. csvkit conversion fails on that file unless I convert it to UTF-8 first. After re-encoding with PowerShell's Set-Content, every file converts from JSON to CSV successfully except one, which fails with a complaint about a newline-like character (U+2028, per the traceback below). I've tried converting it several ways with several apps (Notepad, Notepad++, etc.) and several encodings (UTF-8, ANSI, and back), but nothing works for this one text file.
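For comparison, the re-encoding step could also be done in Python instead of Set-Content. This is only a sketch: the helper name `reencode_utf16_to_utf8` and the file names in the usage comment are hypothetical, and it assumes the downloaded file really is UTF-16 with a byte-order mark (Python's `utf-16` codec uses the BOM to pick the byte order).

```python
from pathlib import Path


def reencode_utf16_to_utf8(src: Path, dst: Path) -> None:
    """Decode a UTF-16 file and write the same text back out as UTF-8.

    Working on bytes keeps the original line endings untouched.
    (Hypothetical helper, not part of csvkit.)
    """
    text = src.read_bytes().decode("utf-16")  # BOM is consumed here
    dst.write_bytes(text.encode("utf-8"))     # written without a BOM


# Example call (hypothetical file names):
# reencode_utf16_to_utf8(Path("data.utf16.json"), Path("data.json"))
```

Re-encoding only changes how the characters are stored. A U+2028 present in the UTF-16 file is still present in the UTF-8 file, which would be consistent with this one file still failing after conversion.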
Called from PowerShell. Versions: in2csv.py 1.0.6, Python 3.10.1, Windows 10 Version 10.0.19043.1415.
python.exe : Traceback (most recent call last):
At C:\...\Curl.ps1:8 char:3
+ python.exe $script -v -k items $folder$i.json > $folder$i.csv
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (Traceback (most recent call last)::String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
File "C:\...\AppData\Local\Programs\Python\Python310\Scripts\csvkit-master\csvkit\utilities\in2csv.py", line 207, in <module>
launch_new_instance()
File "C:\...\AppData\Local\Programs\Python\Python310\Scripts\csvkit-master\csvkit\utilities\in2csv.py", line 203, in launch_new_instance
utility.run()
File "C:\...\AppData\Local\Programs\Python\Python310\lib\site-packages\csvkit\cli.py", line 118, in run
self.main()
File "C:\...\AppData\Local\Programs\Python\Python310\Scripts\csvkit-master\csvkit\utilities\in2csv.py", line 171, in main
table.to_csv(self.output_file, **self.writer_kwargs)
File "C:\...\AppData\Local\Programs\Python\Python310\lib\site-packages\agate\table\to_csv.py", line 43, in to_csv
writer.writerow(tuple(csv_funcs[i](d) for i, d in enumerate(row)))
File "C:\...\AppData\Local\Programs\Python\Python310\lib\site-packages\agate\csv_py3.py", line 92, in writerow
self.writer.writerow(row)
File "C:\...\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2028' in position 11113: character maps to <undefined>
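The last frame suggests the failure happens when the CSV row is written out through the cp1252 codec, not when the JSON is read. A minimal sketch that seems to reproduce the same error outside csvkit is below. The helper `write_row` is hypothetical, and the in-memory stream only stands in for the redirected output; the assumption is that Python picks cp1252 for stdout when the output is redirected with `>` on this machine.

```python
import csv
import io


def write_row(row, encoding):
    """Write one CSV row through a text stream that uses `encoding`.

    Returns the encoded bytes. (Hypothetical helper that mimics writing
    CSV to an output stream with a fixed encoding.)
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="")
    csv.writer(stream).writerow(row)
    stream.flush()
    return raw.getvalue()


if __name__ == "__main__":
    row = ["before\u2028after"]  # U+2028 is LINE SEPARATOR

    # UTF-8 can represent U+2028 (bytes E2 80 A8).
    print(write_row(row, "utf-8"))

    # cp1252 has no mapping for U+2028, so the write fails.
    try:
        write_row(row, "cp1252")
    except UnicodeEncodeError as exc:
        print("cp1252 failed:", exc)
```

If that assumption holds, setting the `PYTHONIOENCODING=utf-8` or `PYTHONUTF8=1` environment variable before calling python.exe may avoid the error, although I have not confirmed this with in2csv.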