
I have a CSV file whose first two data rows look like this:

NewDateTime ResourceName    
9/18/12 1:00    ANACACHO_ANA    
9/18/12 2:00    ANACACHO_ANA    

When I read it into a pandas DataFrame with:

import pandas as pd
df1 = pd.read_csv(r'MyFile.csv')

I get

df1.columns
Index([u'NewDateTime', u'ResourceName'], dtype='object')

However, when I try

df1['NewDateTime']

I get an error:

UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in position 5: ordinal not in range(128)

Also, in my PyCharm interpreter df1['NewDateTime'] is displayed with a small dash, as in df1['-NewDateTime'], but when I paste it here the dash doesn't show up.

    According to the docs, read_csv by default only handles comma-separated files. I also suspect you have an encoding issue. Take this small CSV file and recreate it by hand: does the problem still occur? What happens when you simply read the file contents and print them in Python? Is there anything odd? Commented Jul 19, 2016 at 22:29
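The check suggested in this comment (read the raw file contents and look for anything odd) could be sketched roughly as below. This is only an illustration in Python 3 using the standard library: `detect_bom` and `write_temp` are hypothetical helper names, and the sample bytes assume a comma-separated layout, which may not match the real file.

```python
import codecs
import os
import tempfile

# BOMs to look for; UTF-32 is listed before UTF-16 because
# BOM_UTF32_LE begins with the same bytes as BOM_UTF16_LE.
BOMS = [
    (codecs.BOM_UTF8, 'utf-8-sig'),
    (codecs.BOM_UTF32_LE, 'utf-32'),
    (codecs.BOM_UTF32_BE, 'utf-32'),
    (codecs.BOM_UTF16_LE, 'utf-16'),
    (codecs.BOM_UTF16_BE, 'utf-16'),
]

def detect_bom(path):
    """Return a codec name matching the BOM at the start of the file, or None."""
    with open(path, 'rb') as f:
        head = f.read(4)  # the longest BOM checked here is 4 bytes
    for bom, codec in BOMS:
        if head.startswith(bom):
            return codec
    return None

def write_temp(data):
    """Write bytes to a temporary .csv file and return its path (demo helper)."""
    fd, path = tempfile.mkstemp(suffix='.csv')
    with os.fdopen(fd, 'wb') as f:
        f.write(data)
    return path

# Assumed sample content; the real file's delimiter may differ.
sample = b'NewDateTime,ResourceName\n9/18/12 1:00,ANACACHO_ANA\n'
bom_path = write_temp(codecs.BOM_UTF8 + sample)
plain_path = write_temp(sample)

with_bom = detect_bom(bom_path)
without_bom = detect_bom(plain_path)
print(with_bom, without_bom)

# Clean up the temporary demo files.
os.remove(bom_path)
os.remove(plain_path)
```

If `detect_bom` returns a codec name for the real file, passing that name as the `encoding` argument when reading it is a reasonable first thing to try.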

1 Answer


It looks like your CSV file starts with a BOM (byte order mark), so try parsing it with 'utf-8-sig', 'utf-16', or another BOM-aware encoding:

df = pd.read_csv(r'MyFile.csv', encoding='utf-8-sig')

Here is a small demo:

In [18]: pd.read_csv(fn).columns
Out[18]: Index([u'?NewDateTime', u'ResourceName'], dtype='object')

In [19]: pd.read_csv(fn, encoding='utf-8-sig').columns
Out[19]: Index([u'NewDateTime', u'ResourceName'], dtype='object')

In my IPython terminal the BOM is shown as ? in u'?NewDateTime'; in your case it appears as a dash: df1['-NewDateTime']
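To see why the column lookup fails, the effect of the two encodings can be reproduced without pandas. The following is a minimal sketch in Python 3 using only the standard library; `read_header` is a hypothetical helper, and the byte string is an assumed stand-in for the real file (a UTF-8 BOM followed by a comma-separated header and one row).

```python
import csv
import io

# Assumed file contents: UTF-8 BOM, then a header line and one data row.
raw = b'\xef\xbb\xbfNewDateTime,ResourceName\n9/18/12 1:00,ANACACHO_ANA\n'

def read_header(data, encoding):
    """Decode the bytes with the given encoding and return the CSV header row."""
    text = io.StringIO(data.decode(encoding), newline='')
    return next(csv.reader(text))

plain_header = read_header(raw, 'utf-8')      # BOM survives as '\ufeff'
sig_header = read_header(raw, 'utf-8-sig')    # BOM is stripped while decoding

# ascii() makes the otherwise invisible character visible.
print([ascii(name) for name in plain_header])
print([ascii(name) for name in sig_header])
```

With plain 'utf-8' the first column name is really '\ufeffNewDateTime', so a lookup by 'NewDateTime' cannot match it; with 'utf-8-sig' the name comes out clean.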
