Here is the grep command:

grep "%SWFPATH%/plugins/" filename 

And its output:

set(hotspot[hs_bg_%2].url,%SWFPATH%/plugins/textfield.swf);
set(hotspot[hs_%2].url,%SWFPATH%/plugins/textfield.swf);
url="%SWFPATH%/plugins/textfield.swf"
url="%SWFPATH%/plugins/scrollarea.swf"
alturl="%SWFPATH%/plugins/scrollarea.js"
url="%SWFPATH%/plugins/textfield.swf"

I'd like to generate a file containing the names of all the files in the 'plugins/' directory that are mentioned in a certain file.

Basically, I need to extract the file name and extension from every line. I can manage to remove the duplicates, but I can't figure out how to extract the information I need.

This would be the content of the file that I would like to get:

textfield.swf
scrollarea.swf
scrollarea.js

Thanks!!!

PS: The thread "Extract filename and extension in bash (14 answers)" explains how to get the filename and extension from a variable. What I'm trying to do is extract them from the contents of a file, which is a different problem.

3 Answers

Using awk (the three-argument form of match() is a GNU awk extension, so this needs gawk):

grep "%SWFPATH%/plugins/" filename | \
awk '{ match($0, /plugins\/([^\/[:space:]]+)\.([[:alnum:]]+)/,submatch);
     print "filename:"submatch[1];
     print "extension:"submatch[2];
    }'

Some explanation:

The match function takes each line processed by awk (indicated by $0) and searches it for the regex. The submatches (the parts of the line matched by the parenthesized groups of the regex) are saved in the array submatch. print is as straightforward as it looks: it just prints them.
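For awk implementations without the three-argument match(), a roughly equivalent sketch can use the two-argument form, which only sets RSTART and RLENGTH, and cut the pieces out with substr() and sub(). This is not part of the original answer; the inlined sample lines and the names sample, result, file, name and ext are illustrative, and the regex assumes the extension is purely alphanumeric.

```shell
# Three sample lines taken from the grep output in the question.
sample='set(hotspot[hs_bg_%2].url,%SWFPATH%/plugins/textfield.swf);
url="%SWFPATH%/plugins/scrollarea.swf"
alturl="%SWFPATH%/plugins/scrollarea.js"'

# Two-argument match() records where the match starts (RSTART) and how
# long it is (RLENGTH); the action runs only on lines that match.
result=$(printf '%s\n' "$sample" | awk '
match($0, /plugins\/[^\/ \t]+\.[A-Za-z0-9]+/) {
    # Skip the 8 characters of "plugins/" to keep only "name.ext".
    file = substr($0, RSTART + 8, RLENGTH - 8)
    ext = file;  sub(/.*\./, "", ext)        # keep what follows the last dot
    name = file; sub(/\.[^.]*$/, "", name)   # drop the last dot and extension
    print "filename:" name
    print "extension:" ext
}')
printf '%s\n' "$result"
```

For the first sample line this prints filename:textfield followed by extension:swf.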


1 Comment

Sorry, I get the following error: "awk: line 1: syntax error at or near ,"

For this specific problem:

awk '/\/plugins\// {sub(/.*\//, ""); sub(/(\);|")?$/, "");
   arr[$0] = $0} END {for (i in arr) print arr[i]}' filename

4 Comments

The second sub does not work well with the first two strings.
@blue, I've added a different solution.
Thanks! Worked like a charm. I didn't have any problems with the second 'sub'.
@RafaelGP, good to know! Blue pointed out an issue with the second sub in my original answer, which I was then able to fix, so you no longer see the issue in the latest answer. Cheers.

Use awk to extract the last /-separated field (the filename), then sed to strip the trailing ); or " characters.

awk -F/ '{print $NF}' filename | sed -e 's/);$//' -e 's/"$//'
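To get from there to the deduplicated list the question asks for, one option is to let sed do both the filtering and the extraction, then drop repeats with an order-preserving awk filter. This is a sketch rather than part of the original answer: the sample text is the grep output from the question, the variable names sample and result are illustrative, and it assumes file names never contain ", ) or ;.

```shell
# The six lines of grep output shown in the question.
sample='set(hotspot[hs_bg_%2].url,%SWFPATH%/plugins/textfield.swf);
set(hotspot[hs_%2].url,%SWFPATH%/plugins/textfield.swf);
url="%SWFPATH%/plugins/textfield.swf"
url="%SWFPATH%/plugins/scrollarea.swf"
alturl="%SWFPATH%/plugins/scrollarea.js"
url="%SWFPATH%/plugins/textfield.swf"'

# sed -n ... p prints only lines where the substitution succeeded, so
# lines without %SWFPATH%/plugins/ are skipped. The capture group keeps
# everything after "plugins/" up to the first ", ) or ;.
# awk '!seen[$0]++' prints each distinct line the first time it appears.
result=$(printf '%s\n' "$sample" |
    sed -n 's|.*%SWFPATH%/plugins/\([^");]*\).*|\1|p' |
    awk '!seen[$0]++')
printf '%s\n' "$result"
```

With the sample above this prints textfield.swf, scrollarea.swf and scrollarea.js, one per line. For the real file, pass the file name to sed in place of the printf pipe and redirect the output to the destination file.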
