Networking useful stuff

Monday, January 23, 2023

Python: reading a text file - U+FEFF (BOM) character

Situation:

When reading a text file (CSV or TXT) in Python 3, the file may start with a character that shows up when you use "more" in a terminal but is invisible, and so harder to spot, once the text has been read into Python.

Example:

$ more epa.csv

<U+FEFF>the text

Problem:

Python 3 reads the file without raising an error, but that invisible character stays at the start of the data (in variables, strings, and so on) and can cause problems, for example when comparing strings or looking up a CSV column by its header name.
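To make the problem concrete, here is a minimal sketch that builds a small sample file and reads it with plain 'utf-8'. The column names and file contents are invented for illustration; only the file name epa.csv comes from the example above.

```python
import csv
import os
import tempfile

# Hypothetical sample file: a UTF-8 CSV that starts with a BOM (bytes EF BB BF).
path = os.path.join(tempfile.mkdtemp(), "epa.csv")
with open(path, "wb") as f:
    f.write(b"\xef\xbb\xbfname,value\nfoo,1\n")

# Plain 'utf-8' decodes the BOM as the character U+FEFF and keeps it in the data.
with open(path, encoding="utf-8", newline="") as f:
    first_row = next(csv.DictReader(f))

header_keys = list(first_row.keys())
print(repr(header_keys[0]))   # repr() makes the hidden prefix visible
print("name" in first_row)    # the lookup by the expected column name fails
```

Printing with repr() is a handy way to reveal the character, since a normal print() shows nothing unusual.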

Solution:

The solution is to specify the encoding when opening the file. It is as simple as:

FILENAME = "epa.csv"

# 'utf-8-sig' skips a leading BOM, if there is one
with open(FILENAME, encoding='utf-8-sig') as file:
    for line in file:
        print(line)
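As a quick check, the sketch below suggests that 'utf-8-sig' is safe to use whether or not the file actually has a BOM. The helper read_first_line and the two temporary files are made up for this illustration.

```python
import os
import tempfile


def read_first_line(path):
    """Return the first line of a UTF-8 file, without BOM or trailing newline."""
    with open(path, encoding="utf-8-sig") as f:
        return f.readline().rstrip("\n")


tmp_dir = tempfile.mkdtemp()
with_bom = os.path.join(tmp_dir, "with_bom.txt")
without_bom = os.path.join(tmp_dir, "without_bom.txt")

# Same text in both files; only the first one starts with the UTF-8 BOM.
with open(with_bom, "wb") as f:
    f.write(b"\xef\xbb\xbfthe text\n")
with open(without_bom, "wb") as f:
    f.write(b"the text\n")

line_a = read_first_line(with_bom)
line_b = read_first_line(without_bom)
print(repr(line_a), repr(line_b))
```

Both calls return the same clean string, so the encoding can be used by default for files of unknown origin, as long as they are UTF-8.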

Explanation (taken from: https://stackoverflow.com/questions/17912307/u-ufeff-in-python-string):

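The Unicode character U+FEFF is the byte order mark, or BOM, and is used to tell the difference between big- and little-endian UTF-16 encoding.

In a UTF-8 file the BOM carries no byte-order information; it only marks the file as UTF-8. The 'utf-8-sig' codec skips it when reading, while plain 'utf-8' keeps it as the first character of the text.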
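For the curious, the standard codecs module exposes the BOM byte sequences as constants. The sketch below uses them in a small helper, detect_bom, which is a hypothetical name and only a heuristic: it looks at the first bytes and nothing else.

```python
import codecs

# The same character U+FEFF encodes to different bytes depending on byte order.
le_bytes = "\ufeff".encode("utf-16-le")
be_bytes = "\ufeff".encode("utf-16-be")
print(le_bytes == codecs.BOM_UTF16_LE, be_bytes == codecs.BOM_UTF16_BE)


def detect_bom(raw):
    """Guess an encoding name from a leading BOM, or return None if there is none."""
    # UTF-32 is checked before UTF-16 because the UTF-32 LE BOM
    # starts with the same two bytes as the UTF-16 LE BOM.
    candidates = [
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    for bom, name in candidates:
        if raw.startswith(bom):
            return name
    return None


print(detect_bom(b"\xef\xbb\xbfthe text"))
print(detect_bom(b"the text"))
```

In practice, reading the first four bytes of a file in binary mode and passing them to a helper like this is enough to tell whether a BOM is present.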

Good luck,

