I was unable to read a client's data files the way I normally would because of an odd encoding.
Normally I would open the files in Notepad++ and convert the encoding there, but all but one of the files were too large for Notepad++ to open. For the one file I could open, Notepad++ reported the encoding as "UCS-2 LE BOM", which is little-endian UTF-16 with a byte order mark.
To read that with Pandas, pass encoding="utf_16_le" to read_csv:
df = pd.read_csv(IMPORT_FILE, sep="\t", low_memory=False, encoding="utf_16_le")
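For the files that are too large to open in an editor, the encoding can often be guessed from the first few bytes instead. The following is only a sketch, not something from my original workflow: detect_bom_encoding is a hypothetical helper name, and it assumes the file actually starts with a byte order mark. It returns Python's generic "utf_16" codec name, which reads the BOM to pick the byte order, rather than "utf_16_le".

```python
import codecs

# Checked in this order because the UTF-32 LE BOM (FF FE 00 00) begins
# with the same two bytes as the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf_32"),
    (codecs.BOM_UTF32_BE, "utf_32"),
    (codecs.BOM_UTF8, "utf_8_sig"),
    (codecs.BOM_UTF16_LE, "utf_16"),
    (codecs.BOM_UTF16_BE, "utf_16"),
]


def detect_bom_encoding(path, default=None):
    """Guess a codec name from the file's byte order mark (hypothetical helper).

    Only the first 4 bytes are read, so the file size does not matter.
    Returns `default` when no known BOM is found.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return default
```

The result can then be passed as the encoding argument of read_csv in place of the hard-coded value, for example encoding=detect_bom_encoding(IMPORT_FILE, default="utf_8"). A file without a BOM gives no answer here, so the default is still a guess.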