2

I'm trying to import a CSV file (25 MB, 80,000 rows) into a pandas DataFrame, but it's not displaying correctly. When I call df.columns, the column names come back as a single string separated by ';'.

  Originated GeoZone;Booking ...
1  PARIS;PARIS;;MARKer;EQDff;...
2  PARIS;PARIS;;MARKer;EQDff;...
3  PARIS;PARIS;;MARKer;EQDff;...
4  PARIS;PARIS;;MARKer;EQDff;...

The CSV file displays perfectly in Excel. Why is pandas not interpreting it correctly?

6
  • Could you share what all you've tried? Commented May 7, 2019 at 16:23
  • 3
    Can you try: pd.read_csv('your df', sep = ';') ? Commented May 7, 2019 at 16:23
  • 3
    Clearly your delimiter is ; while the read_csv default is , Commented May 7, 2019 at 16:24
  • 1
    CSV literally means "comma separated values". How various programs from python to excel parse non-comma separators tends to vary. General solution is to specify the separator explicitly, as @Junkrat does above Commented May 7, 2019 at 16:25
  • @Junkrat yes that works! Commented May 7, 2019 at 16:26

3 Answers

5

You should use the sep (or delimiter) parameter of read_csv:

sep : str, default ‘,’

Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and automatically detect the separator by Python’s builtin sniffer tool, csv.Sniffer. In addition, separators longer than 1 character and different from '\s+' will be interpreted as regular expressions and will also force the use of the Python parsing engine. Note that regex delimiters are prone to ignoring quoted data. Regex example: '\r\t'.

delimiter : str, default None

Alias for sep.

df = pd.read_csv('waka.csv', sep=';')
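
To make the difference concrete, here is a minimal sketch using an in-memory sample instead of a real file. The sample data and the column names beyond those in the question are made up for illustration. It compares the default parse, an explicit sep=';', and the sep=None variant described in the documentation excerpt above, which relies on the Python engine to sniff the separator.

```python
import io

import pandas as pd

# Hypothetical sample mimicking the semicolon-separated file in the question.
raw = (
    "Originated GeoZone;Booking GeoZone;Market\n"
    "PARIS;PARIS;FR\n"
    "PARIS;LYON;FR\n"
)

# Default sep=',' finds no commas, so every line ends up in one column.
wrong = pd.read_csv(io.StringIO(raw))

# Passing the separator explicitly splits the fields as intended.
right = pd.read_csv(io.StringIO(raw), sep=";")

# sep=None with the Python engine asks pandas to detect the separator itself.
sniffed = pd.read_csv(io.StringIO(raw), sep=None, engine="python")

print(wrong.shape)            # one column
print(right.shape)            # three columns
print(list(sniffed.columns))
```

If you know the separator, passing it explicitly is likely the safer choice: detection is a heuristic, and the Python engine is generally slower than the C engine on a file of this size.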

3

This problem might be caused by Excel using a different separator. Which one it uses usually depends on the regional settings for your country.

You can try this:

pd.read_csv("your file", sep = ";")

This should work.

Theory: this is caused by CSV files using different separators. pandas uses a comma by default, but some files use something other than a comma, for example ";" or "|".
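
If you don't know in advance which separator a file uses, one option is to check it with the standard library's csv.Sniffer before calling read_csv. This is only a sketch: the sample text is invented, and the list of candidate delimiters is an assumption you would adjust for your own files.

```python
import csv

# Hypothetical first few lines of a file whose separator is unknown.
sample = (
    "Originated GeoZone;Booking GeoZone;Market\n"
    "PARIS;PARIS;FR\n"
    "PARIS;LYON;FR\n"
)

# Restrict the guess to a few plausible candidates (an assumption, not a rule).
dialect = csv.Sniffer().sniff(sample, delimiters=",;|\t")
print(dialect.delimiter)

# The detected dialect can then drive a normal csv.reader.
rows = list(csv.reader(sample.splitlines(), dialect))
print(rows[0])
```

With a real file you would read only the first few kilobytes into sample and then pass dialect.delimiter as sep to read_csv.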

2

Solution:

pd.read_csv('your file', sep = ';')

As @RafaelC pointed out, your file's delimiter is ; while read_csv defaults to a comma, so you have to specify it explicitly.
