Extract part of a filename shell script

Question

In bash I would like to extract part of many filenames and save that output to another file.

The files are formatted as coffee_{SOME NUMBERS I WANT}.freqdist.

#!/bin/sh
for f in $(find . -name 'coffee*.freqdist')

That code will find all the coffee_{SOME NUMBERS I WANT}.freqdist files. Now, how do I make an array containing just {SOME NUMBERS I WANT} and write that to a file?

I know that to write to a file one would end the line with the following.

  > log.txt

What I'm missing is the middle part: how to filter the list of filenames.

Actually no. I was querying Twitter for a clinical research project that involves comparing tweets from different locations. Twitter hung about 5% of the way into searching through 40k zip codes. But since I loaded the zip codes as a dictionary in Python (and so unordered), I only have the output files, labeled by zip code, to figure out which zip codes I have already searched. I figured this was a good reason to learn something about shell scripting rather than doing it in Python.
– mac389, Commented Sep 25, 2012 at 11:40

dogbane · Accepted Answer · 2012-09-25 11:52:54Z

17

You can do it natively in bash as follows:

filename=coffee_1234.freqdist
tmp=${filename#*_}
num=${tmp%.*}
echo "$num"

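This is a pure bash solution built on parameter expansion: ${filename#*_} strips everything up to and including the first underscore, and ${tmp%.*} strips the last dot and everything after it. No external commands (like sed) are involved, so no extra process is started per filename, which makes it faster inside a loop.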

Append these numbers to a file using:

echo "$num" >> file

(You will need to delete/clear the file before you start your loop.)
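Putting the pieces together, a minimal sketch of the whole loop might look like the following. It assumes the files follow the coffee_{SOME NUMBERS I WANT}.freqdist pattern from the question and that no filename contains a newline. The function name extract_ids and its two arguments are hypothetical, and the ${f##*/} step is an addition that drops the leading directory from each path printed by find.

```shell
# Hypothetical helper: extract_ids DIR OUTFILE
# Writes the part between "coffee_" and the last dot of every matching
# file under DIR to OUTFILE, one value per line.
# The body runs in a subshell ( ... ) so its variables stay local.
extract_ids() (
    dir=$1
    out=$2
    : > "$out"                    # clear (or create) the output file before the loop
    find "$dir" -name 'coffee_*.freqdist' |
    while IFS= read -r f; do      # assumes no newlines in filenames
        base=${f##*/}             # ./sub/coffee_12.freqdist -> coffee_12.freqdist
        tmp=${base#*_}            # coffee_12.freqdist -> 12.freqdist
        num=${tmp%.*}             # 12.freqdist -> 12
        echo "$num" >> "$out"
    done
)

# Example call: extract_ids . log.txt
```

The values come out in whatever order find visits the files, so pipe the result through sort if the order matters.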

edited Sep 25, 2012 at 11:52

answered Sep 25, 2012 at 11:47


Guru · Accepted Answer · 2012-09-25 11:42:58Z

7

If the intention is just to write the numbers to a file, you do not need the find command:

$ ls coffee*.freqdist
coffee112.freqdist  coffee12.freqdist  coffee234.freqdist

The following should do it; the output can then be redirected to a file:

$ ls coffee*.freqdist | sed 's/coffee\(.*\)\.freqdist/\1/'
112
12
234

Guru.
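As a variation on the same idea, the shell glob can feed sed directly, which avoids parsing the output of ls. This is only a sketch, not part of the answer above: it assumes the underscore form coffee_{SOME NUMBERS I WANT}.freqdist from the question, files in the current directory, and names without newlines. The function name list_ids is hypothetical.

```shell
# Hypothetical helper: print the IDs of coffee_*.freqdist files
# found in the current directory, one per line.
list_ids() {
    set -- coffee_*.freqdist      # expand the glob into the function's positional parameters
    [ -e "$1" ] || return 0       # no match: the pattern stays unexpanded, so print nothing
    printf '%s\n' "$@" |
        sed 's/^coffee_\(.*\)\.freqdist$/\1/'
}

# Example call: list_ids > log.txt
```

The anchors ^ and $ make sure only the fixed prefix and suffix are removed, even if the captured part itself contains dots.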

answered Sep 25, 2012 at 11:42


1 Comment

mac389 Over a year ago

I meant to take out the leading underscore too, so: 's/coffee_\(.*\)\.freqdist/\1/'.

James Waldby - jwpat7 · Accepted Answer · 2012-09-25 15:29:14Z

1

The previous answers have indicated some necessary techniques. This answer organizes the pipeline in a simple way that might apply to other jobs as well. (If your sed doesn't support ‘;’ as a separator, replace ‘;’ with ‘|sed’.)

$ ls */c*; ls c*
 fee/coffee_2343.freqdist
 coffee_18z8.x.freqdist  coffee_512.freqdist  coffee_707.freqdist
$ find . -name 'coffee*.freqdist' | sed 's/.*coffee_//; s/[.].*//' > outfile
$ cat outfile 
 512
 18z8
 2343
 707
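If ‘;’ is not accepted as a separator, another portable spelling is to pass each command with its own -e option. The sketch below applies the same two substitutions that way; the function name strip_names is hypothetical, and the sample paths are fed in with printf instead of find so the example can be tried in any directory.

```shell
# Same two substitutions as in the pipeline above, one -e per command.
strip_names() {
    sed -e 's/.*coffee_//' -e 's/[.].*//'
}

# Sample input standing in for the output of find
printf '%s\n' \
    ./coffee_512.freqdist \
    ./coffee_18z8.x.freqdist \
    ./fee/coffee_2343.freqdist | strip_names
# prints:
# 512
# 18z8
# 2343
```

In real use the function would sit at the end of the find pipeline, for example: find . -name 'coffee*.freqdist' | strip_names > outfile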

answered Sep 25, 2012 at 15:29


Abhishek Ghosh · Accepted Answer · 2024-05-07 07:19:43Z

1

Expanding on this topic, let's say the filename has this format:

first_second_third_requiredText_fourth_fifth_sixth

To get only requiredText, use the following:

filename=first_second_third_requiredText_fourth_fifth_sixth
removed_first_part=${filename#*_*_*_}
finalText=${removed_first_part%_*_*_*}

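'#' removes the shortest match of the pattern from the beginning of the string, '%' removes the shortest match from the end, and '*' matches any number of characters. This also works for full paths, provided the directory names themselves contain no underscores.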
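To make the operators concrete, here is a small sketch contrasting the shortest-match forms '#' and '%' with their longest-match counterparts '##' and '%%'. The variable names are made up for illustration; the sample string is the one used above.

```shell
name=first_second_third_requiredText_fourth_fifth_sixth

echo "${name#*_}"     # shortest prefix removed: second_third_requiredText_fourth_fifth_sixth
echo "${name##*_}"    # longest prefix removed:  sixth
echo "${name%_*}"     # shortest suffix removed: first_second_third_requiredText_fourth_fifth
echo "${name%%_*}"    # longest suffix removed:  first

# Combining the two shortest-match forms isolates the fourth field.
rest=${name#*_*_*_}       # requiredText_fourth_fifth_sixth
field=${rest%_*_*_*}      # requiredText
echo "$field"
```

The shortest-match forms are what make the field counting work: each *_ in the pattern consumes exactly one field.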

answered May 7, 2024 at 7:19
