I want to read a CSV file and load it into a pandas DataFrame.

But I have a problem because I have rows like this:

BG,6141.6,6141.6,,3.0,,,ic

As you can see, there appear to be three different separators: ',,,', ',,' and ','.

How can I load it correctly into pandas?

  • You should take a close look at the data. Most likely the multiple separators mean that your data has 8 columns, but columns 4, 6 and 7 in this particular row are blank. Commented Jun 22, 2017 at 16:16
  • Is that because the columns are empty or some other reason? Commented Jun 22, 2017 at 16:16
  • You should indicate what you already tried, and why you consider it did not work. Commented Jun 22, 2017 at 16:19
  • What is the header of the CSV? Are there column names? Commented Jun 22, 2017 at 16:21
  • @jezrael Yes, there are. The problem is that pandas reads only 45000 rows instead of 1 million. I didn't use any special options, I only provided the path. The structure of the data is shown in the post. What could the solution be? Commented Jun 22, 2017 at 16:27
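
Following up on the suggestion to look closely at the data: a minimal sketch of one way to check how many fields each line actually has, using only the standard library. The helper name field_count_histogram is made up for illustration, and the sample is just the single row from the question, so treat this as a starting point rather than a diagnosis of the missing rows.

```python
import csv
from collections import Counter
from io import StringIO


def field_count_histogram(handle):
    """Return a Counter mapping 'fields per row' to 'number of rows'."""
    return Counter(len(row) for row in csv.reader(handle))


# The row from the question. For a real file, pass
# open("filename.csv", newline="") instead of StringIO(...).
sample = StringIO("BG,6141.6,6141.6,,3.0,,,ic\n")
counts = field_count_histogram(sample)
print(counts)
```

If nearly every line reports the same field count, the repeated commas are probably just empty columns rather than different separators.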

1 Answer

Use the regex separator [,]+ (one or more commas):

from io import StringIO  # pandas.compat.StringIO is not available in recent pandas versions

import pandas as pd

temp = u"""iBG,6141.6,6141.6,,3.0,,,ic"""
# After testing, replace StringIO(temp) with 'filename.csv'.
# A regex separator requires the python parsing engine.
df = pd.read_csv(StringIO(temp), sep=r"[,]+", header=None, engine='python')
print(df)
     0       1       2    3   4
0  iBG  6141.6  6141.6  3.0  ic
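
One thing to keep in mind, in line with the first comment on the question: collapsing repeated commas drops the empty fields, so this row ends up with 5 values instead of 8. If the file really has 8 columns and some are simply blank, rows with blanks in different positions may no longer line up. A minimal sketch comparing the two readings on the row from the question (the variable names are only for illustration):

```python
from io import StringIO

import pandas as pd

row = "BG,6141.6,6141.6,,3.0,,,ic"

# Default comma separator: empty fields are kept and read as NaN.
df_default = pd.read_csv(StringIO(row), header=None)

# Regex separator: runs of commas are treated as a single delimiter.
df_regex = pd.read_csv(StringIO(row), sep=r"[,]+", header=None, engine="python")

print(df_default.shape)  # 8 columns, three of them NaN
print(df_regex.shape)    # 5 columns, empty fields dropped
print(df_default.isna().sum().tolist())
```

Which of the two is right depends on whether the blanks are missing values or genuinely part of the delimiter, so it is worth checking against the header row first.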