Mateusz Soltysik Blog

On tech, software engineering, life & more.

03 Oct 2020

Convert a file of unknown encoding to UTF-8

The easiest way is to ask file for the charset, though here it cannot identify the encoding:

soltysik@mbp ~ $ file -I Downloads/Lista_transakcji_nr_0076302940_070620.csv
Downloads/Lista_transakcji_nr_0076302940_070620.csv: text/plain; charset=unknown-8bit

The enca way, which detects the encoding given a language hint:

soltysik@mbp ~ $ brew install enca
soltysik@mbp ~ $ enca --list languages
soltysik@mbp ~ $ enca -L polish Downloads/Lista_transakcji_nr_0076302940_070620.csv

MS-Windows code page 1250
    LF line terminators
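
If enca is not available, one rough fallback is to decode the first line with a few candidate encodings and pick the one that reads correctly. This is only a sketch: the sample file and the candidate list (Windows-1250, ISO-8859-2, CP852) are assumptions for illustration, not part of the original workflow.

```shell
# Hypothetical sample file: "zażółć gęślą" written as Windows-1250 bytes
sample=$(mktemp)
printf 'za\277\363\263\346 g\352\234l\271\n' > "$sample"

# Candidate encodings often seen for Polish text (an assumption, extend as needed)
for enc in WINDOWS-1250 ISO-8859-2 CP852; do
    printf '%-13s ' "$enc"
    # Decode the first line as $enc; note it if the bytes are invalid in that encoding
    head -n 1 "$sample" | iconv -f "$enc" -t UTF-8 || echo "(not valid $enc)"
done
```

Only one of the printed lines should show readable Polish; the others show wrong or missing characters.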

Convert the document to the UTF-8 encoding:

iconv --from-code=Windows-1250 --to-code=UTF-8 Downloads/Lista_transakcji_nr_0076302940_070620.csv > Downloads/Lista_transakcji_nr_0076302940_070620.csv.utf8
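
To check the result, a UTF-8 to UTF-8 pass through iconv can serve as a quick validity test, since iconv stops with an error on the first invalid byte sequence. A minimal sketch, assuming a small hypothetical sample file in place of the real CSV:

```shell
# Hypothetical sample standing in for the CSV: "zażółć" as Windows-1250 bytes
workdir=$(mktemp -d)
printf 'za\277\363\263\346\n' > "$workdir/sample.csv"

# Same conversion as above, using the short option names -f and -t
iconv -f Windows-1250 -t UTF-8 "$workdir/sample.csv" > "$workdir/sample.csv.utf8"

# A UTF-8 to UTF-8 pass fails if the file contains invalid UTF-8
if iconv -f UTF-8 -t UTF-8 "$workdir/sample.csv.utf8" > /dev/null 2>&1; then
    utf8_status="valid UTF-8"
else
    utf8_status="not valid UTF-8"
fi
echo "$utf8_status"
```

This only confirms that the output is well-formed UTF-8. A wrong source encoding can still produce valid UTF-8 with the wrong letters, so it is worth opening the file and looking at the Polish characters.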