Extract file name from read_csv - Python

Question

I have a script that currently reads raw data from a .csv file and performs some pandas analysis on it. At the moment the .csv file name is hardcoded and read in like this:

data = pd.read_csv('test.csv',sep="|", names=col)

I want to change 2 things:

I want to turn this into a loop that goes through a directory of .csv files and runs the pandas analysis further down in the script on each one.
I want to take each .csv file name, strip the '.csv', and store the result in another list variable, let's call it 'new_table_list'.

I think I need something like the code below, at least for the 1st point (though I know this isn't completely correct). I am not sure how to address the 2nd point.

Any help is appreciated

import os 

path = '\test\test\csvfiles'
table_list = []

for filename in os.listdir(path):
    if filename.endswith('.csv'):
        table_list.append(file)
data = pd.read_csv(table_list,sep="|", names=col)

You need to use os.path.join(path, filename) to get the full name of the file to read from.
– Barmar, Commented May 14, 2018 at 19:42
The first argument to read_csv needs to be a filename, not a list of filenames.
– Barmar, Commented May 14, 2018 at 19:43
@Barmar Ah, ok. So read_csv can't take a parameter? Has to be a single filename? Ok, I will need to change my approach I think. Thanks
– JD2775, Commented May 14, 2018 at 19:45
Yes, it takes a parameter. That parameter must be a filename or an already open file object that it can read from.
– Barmar, Commented May 14, 2018 at 19:48
read_csv can only read one CSV file at a time, not all the files in table_list.
– Barmar, Commented May 14, 2018 at 19:51
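
Taken together, these comments suggest building the full path with os.path.join and calling read_csv once per file. A minimal sketch of that loop follows. The column list `col` and the two sample files are assumptions made up for illustration (the question does not show them), and the files are written to a temporary directory only so that the example is self-contained.

```python
import os
import tempfile

import pandas as pd

col = ["id", "name"]  # hypothetical column names; the question does not show `col`

# Create a throwaway directory with two pipe-separated sample files.
path = tempfile.mkdtemp()
for name, body in [("users.csv", "1|ann\n2|bob\n"), ("orders.csv", "7|book\n")]:
    with open(os.path.join(path, name), "w") as fh:
        fh.write(body)

frames = {}
for filename in sorted(os.listdir(path)):
    if filename.endswith(".csv"):
        # read_csv takes a single path (or open file object), so call it once per file.
        full_path = os.path.join(path, filename)
        frames[filename] = pd.read_csv(full_path, sep="|", names=col)
```

Each value in `frames` is an ordinary DataFrame, so the existing analysis could run inside the loop body instead of being collected into a dictionary.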

Yuvraj Jaiswal · Accepted Answer · 2018-05-14 19:58:45Z

11

There are many ways to do it. Here is one:

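table_list = []      # one DataFrame per file
new_table_list = []  # file names without the '.csv' extension

for filename in os.listdir(path):
    if filename.endswith('.csv'):
        table_list.append(pd.read_csv(os.path.join(path, filename), sep="|"))
        new_table_list.append(filename.split(".")[0])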

One more

for filename in os.listdir(path):
    if filename.endswith('.csv'):
        table_list.append(pd.read_csv(os.path.join(path, filename), sep="|"))
        new_table_list.append(filename[:-4])  # drop the last 4 characters, '.csv'

and many more

As @barmar pointed out, it is better to pass read_csv the full path, os.path.join(path, filename), rather than the bare filename, so that the script works no matter which directory it is run from.
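
The two snippets above strip the extension with split(".")[0] and with [:-4]. As a hedged alternative that is not part of the original answer, os.path.splitext from the standard library removes only the final extension, which matters when a name contains more than one dot. The file names below are made up for illustration.

```python
import os

filenames = ["sales.csv", "sales.2018.csv", "notes.txt"]  # hypothetical names

new_table_list = []
for filename in filenames:
    stem, ext = os.path.splitext(filename)  # splits off only the last extension
    if ext.lower() == ".csv":
        new_table_list.append(stem)

print(new_table_list)  # ['sales', 'sales.2018']

# For comparison, split(".")[0] stops at the first dot:
print("sales.2018.csv".split(".")[0])  # sales
```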

edited May 14, 2018 at 19:58

answered May 14, 2018 at 19:42

Yuvraj Jaiswal

2 Comments

abarnert Over a year ago

What is the file in table_list.append(file)? Surely you want to append something useful like pd.read_csv(pathname, sep='|', names=col) somewhere, rather than just two different versions of the filename to two different lists?

Yuvraj Jaiswal Over a year ago

@abarnert yup, that makes much more sense. I was trying to solve just the second part, though. Will edit. Thanks a ton.

Paulo Scardine · Accepted Answer · 2018-05-14 19:47:28Z

4

You can try something like this:

import glob

data = {}
for filename in glob.glob('/path/to/csvfiles/*.csv'):
    data[filename[:-4]] = pd.read_csv(filename, sep="|", names=col)

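Then data.keys() gives the file paths without the ".csv" suffix (each key still includes the directory, because glob returns the full path; wrap the name in os.path.basename if you want only the file name), and data.values() gives one pandas DataFrame per file.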

answered May 14, 2018 at 19:47

Paulo Scardine

Drew Szurko · Accepted Answer · 2019-12-08 07:30:12Z

4

I'd start with using pathlib.

from pathlib import Path

And then leverage the stem attribute and glob method.

Let's make an import function.

def read_csv(f):
    return pd.read_csv(f, sep="|")

The most generic approach would be to store in a dictionary.

p = Path(r'\test\test\csvfiles')  # raw string so '\t' is not read as a tab
dod = {f.stem: read_csv(f) for f in p.glob('*.csv')}

And you can also use pd.concat to turn that into a dataframe.

df = pd.concat(dod)
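
To make the effect of pd.concat on a dictionary concrete, here is a small sketch. The two frames are made-up stand-ins for what read_csv would return, and the level names "table" and "row" are chosen only for illustration.

```python
import pandas as pd

# Two small stand-in frames; in the answer these would come from read_csv.
dod = {
    "orders": pd.DataFrame({"id": [7], "name": ["book"]}),
    "users": pd.DataFrame({"id": [1, 2], "name": ["ann", "bob"]}),
}

# The dictionary keys become the outer level of the resulting row index.
df = pd.concat(dod, names=["table", "row"])

users = df.loc["users"]  # rows that came from the "users" entry
tables = df.index.get_level_values("table").unique().tolist()
```

Because the file stem ends up in the index, the combined frame still records which file each row came from.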

edited Dec 8, 2019 at 7:30

Drew Szurko

answered May 14, 2018 at 20:16

piRSquared

Ziyad Moraished · Accepted Answer · 2018-05-14 20:04:59Z

2

To get the list of CSV files in the directory, use glob; it is simpler than os.listdir for this.

from glob import glob

# csvs is a list of the paths of all files whose names end in .csv
csvs = glob('your\\dir\\to\\csvs_folder\\*.csv')

# remove the trailing .csv from each path (the directory part is kept)
new_table_list = [csv[:-4] for csv in csvs]

# read the CSVs as dataframes
dfs = [pd.read_csv(csv, sep="|", names=col) for csv in csvs]

# concatenate all dataframes into a single dataframe
df = pd.concat(dfs, ignore_index=True)

edited May 14, 2018 at 20:04

answered May 14, 2018 at 19:49

Ziyad Moraished

2 Comments

Paulo Scardine Over a year ago

You may want to double the backslashes, use a raw string or replace them with forward slashes (most libraries work with forward-slash paths even if the path separator for the current OS is a backslash). In your example, the \to in the literal will be interpreted as a tab character (0x09) followed by the letter o.

Ziyad Moraished Over a year ago

yeah I forgot them. Thanks!
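
The backslash issue raised in the comment above can be checked directly. This is a small sketch using a made-up Windows-style path; it only compares string literals and touches no files.

```python
# Hypothetical path used only to show how the literals are parsed.
broken = 'C:\temp\new'      # "\t" becomes a tab and "\n" a newline
raw = r'C:\temp\new'        # raw string: backslashes are kept as written
doubled = 'C:\\temp\\new'   # escaped backslashes: same text as the raw string
forward = 'C:/temp/new'     # forward slashes need no escaping

print(repr(broken))         # shows the tab and newline escapes
print(raw == doubled)       # True
print(len(broken), len(raw))
```

The raw and doubled forms are the same string, while the unescaped literal is shorter because two of its character pairs collapsed into control characters.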

Joe · Accepted Answer · 2018-05-14 20:06:00Z

1

You can try this:

import os
path = 'your path'
all_csv_files = [f for f in os.listdir(path) if f.endswith('.csv')]
for f in all_csv_files:
    data = pd.read_csv(os.path.join(path, f), sep="|", names=col)
    # run the analysis on `data` here; it is overwritten on each iteration

# list without .csv
files = [f[:-4] for f in all_csv_files]

edited May 14, 2018 at 20:06

answered May 14, 2018 at 19:57

Joe

user3103059 · Accepted Answer · 2020-10-21 19:10:52Z

0

You can (at the moment of opening) store the filename in the DataFrame's attrs dictionary, as follows:

ds.attrs['filename'] = 'filename.csv'

You can subsequently query the dataframe for the name:

ds.attrs['filename']
'filename.csv'
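
A short sketch of this idea in context follows. It assumes pandas 1.0 or later, where DataFrame.attrs was introduced; the pandas documentation marks attrs as experimental, so it may not be carried through every operation. The file name, contents, and column names are made up for illustration, and an in-memory buffer stands in for the file.

```python
import io

import pandas as pd

filename = "filename.csv"            # hypothetical file name
raw = io.StringIO("1|ann\n2|bob\n")  # stands in for the file's contents

ds = pd.read_csv(raw, sep="|", names=["id", "name"])
ds.attrs["filename"] = filename  # attrs is a plain dict attached to the frame

print(ds.attrs["filename"])  # filename.csv
```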

answered Oct 21, 2020 at 19:10

user3103059
