Passing column names to Pandas read_csv() function

Question

I have a csv file that looks as follows:

,time,o,h,l,c
0,2021-03-17 09:30:00,101.25,103.98,90.17,101.78
1,2021-03-17 09:45:00,102.83,107.49,95.22,101.93
2,2021-03-17 10:00:00,100.92,102.90,97.19,102.80

I'm reading in this data using the read_csv function, as follows:

import pandas as pd

cols = ['indx', 'timestamp', 'open', 'high', 'low', 'close']
df = pd.read_csv('prices.csv', names=cols)

The resulting dataframe looks as follows:

   Unnamed: 0                 time       o       h      l       c
0           0  2021-03-17 09:30:00  101.25  103.98  90.17  101.78
1           1  2021-03-17 09:45:00  102.83  107.49  95.22  101.93
2           2  2021-03-17 10:00:00  100.92  102.90  97.19  102.80

Passing the names parameter seems to have no effect on the column names.

EDIT:

Passing the header parameter (per the docs) results in the same behavior:

cols = ['indx', 'timestamp', 'open', 'high', 'low', 'close']
df = pd.read_csv('prices.csv', header=0, names=cols)

Do I have to pass another parameter to the read_csv function to customize the column names?

Thanks!

Could you show a sample of your raw CSV file, including the header and a few rows?
– user17242583, Commented Jan 25, 2022 at 23:41
From the docs (emphasis mine): "names: List of column names to use. If the file contains a header row, then you should explicitly pass header=0 to override the column names. Duplicates in this list are not allowed."
– G. Anderson, Commented Jan 25, 2022 at 23:43
@G.Anderson -- passing the header parameter results in the same behavior.
– equanimity, Commented Jan 25, 2022 at 23:48
I'm not able to reproduce this; the given file produces the intended result.
– G. Anderson, Commented Jan 25, 2022 at 23:58

user17242583 · Accepted Answer · 2022-01-25 23:48:07Z

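My guess is that you need to pass header=0 to read_csv. Your file already has a header row, and header=0 tells read_csv to replace that row's labels with names instead of reading it as data: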

df = pd.read_csv('prices.csv', names=cols, header=0)

Output:

>>> df
   indx            timestamp    open    high    low   close
0     0  2021-03-17 09:30:00  101.25  103.98  90.17  101.78
1     1  2021-03-17 09:45:00  102.83  107.49  95.22  101.93
2     2  2021-03-17 10:00:00  100.92  102.90  97.19  102.80
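
To see the difference without needing a file on disk, here is a minimal sketch that reads the sample from the question out of an in-memory string. Using io.StringIO in place of prices.csv is an assumption made for the example, and the variable names csv_text, without_header, and with_header are illustrative only. It compares names on its own with names plus header=0.

```python
import io

import pandas as pd

# The sample rows from the question, held in memory instead of prices.csv.
csv_text = """,time,o,h,l,c
0,2021-03-17 09:30:00,101.25,103.98,90.17,101.78
1,2021-03-17 09:45:00,102.83,107.49,95.22,101.93
2,2021-03-17 10:00:00,100.92,102.90,97.19,102.80
"""

cols = ['indx', 'timestamp', 'open', 'high', 'low', 'close']

# names alone: read_csv behaves as if header=None, so the file's own
# header line is kept as the first data row.
without_header = pd.read_csv(io.StringIO(csv_text), names=cols)

# names plus header=0: line 0 is consumed as the header and its labels
# are replaced by cols.
with_header = pd.read_csv(io.StringIO(csv_text), names=cols, header=0)

print(without_header.shape)       # old header line plus the 3 data rows
print(with_header.shape)          # only the 3 data rows
print(list(with_header.columns))
```

In the first frame the old labels ('time', 'o', 'h', ...) sit in row 0, so the price columns are read as strings rather than floats. In the second frame they are parsed as numbers.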

johnDanger Over a year ago

While the docs say to explicitly pass header=0, I think they really mean header=[actual header row number]. That way you can skip any text above the header while still replacing the header's column names with the desired ones.
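
A small sketch of the case johnDanger describes, assuming a hypothetical export that has two lines of free text above the header row. The preamble lines and the in-memory csv_text are made up for illustration; header and names are the same read_csv parameters discussed above.

```python
import io

import pandas as pd

# Hypothetical file: two lines of notes, then the header on line 2
# (zero-indexed), then the data rows from the question.
csv_text = """Exported price data
Source: example feed
,time,o,h,l,c
0,2021-03-17 09:30:00,101.25,103.98,90.17,101.78
1,2021-03-17 09:45:00,102.83,107.49,95.22,101.93
2,2021-03-17 10:00:00,100.92,102.90,97.19,102.80
"""

cols = ['indx', 'timestamp', 'open', 'high', 'low', 'close']

# header=2 marks line 2 as the header row: the lines above it are
# skipped, and the labels on that line are replaced by cols.
df = pd.read_csv(io.StringIO(csv_text), names=cols, header=2)

print(df)
```

With header=0 on the same text, the two note lines and the original header line would all be handled differently, so the header value should match wherever the real header row sits in the file.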
