
I want to extract data from a log file.

For opening the file:

a = open('access.log','rb')
lines = a.readlines()

Suppose lines[0] is:

123.456.678.89 - - [04/Aug/2014:12:01:41 +0530] "GET /123456789_10.10.20.111 HTTP/1.1" 404 537 "-" "Wget/1.14 (linux-gnu)"

I want to extract only 123456789 and 10.10.20.111 from "GET /123456789_10.10.20.111 HTTP/1.1"

The pattern is: the string starts with /, followed by a run of digits, then an underscore, then an IP address.

I tried the following, and it works, but I think it adds unnecessary overhead:

node = re.search(r'\"(.*)\"', line).group(1)
node = node.split(" ")[1]
node,ip = node.split("_")
node = node[1:]
print node,ip

How can I get this with a single pattern?

1 Answer

Would you like to do this in one line?

nodeip = re.search(r'([\d]{9})_([\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3})', line)

Now your node and IP are in groups 1 and 2:

print nodeip.group(1), nodeip.group(2)

Outputs:

123456789 10.10.20.111
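
If the node ID is not always exactly nine digits, a slightly more general sketch anchors on the leading slash and returns None when nothing matches. The names NODE_IP_RE and extract_node_ip are illustrative and not part of the original answer, and the octets are not range-checked:

```python
import re

# Hypothetical names; the pattern follows the format described in the question:
# "/" + run of digits + "_" + dotted IPv4-like address.
NODE_IP_RE = re.compile(r'/(?P<node>\d+)_(?P<ip>\d{1,3}(?:\.\d{1,3}){3})')


def extract_node_ip(line):
    """Return (node, ip) from a log line, or None when the pattern is absent."""
    match = NODE_IP_RE.search(line)
    if match is None:
        return None
    return match.group('node'), match.group('ip')


line = ('123.456.678.89 - - [04/Aug/2014:12:01:41 +0530] '
        '"GET /123456789_10.10.20.111 HTTP/1.1" 404 537 "-" "Wget/1.14 (linux-gnu)"')
print(extract_node_ip(line))
```

Compiling the pattern once helps when looping over every line of a large log, and checking for None avoids an AttributeError on lines that do not contain a node request.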

6 Comments

OK, but [0-9] matches exactly one digit (the same as [0-9]{1}). Use [0-9]+ (the same as [0-9]{1,}) for a non-fixed length.
Hey @doubleui, your solution works for me. I also want to extract the date from the log line above, so I used time_string = re.search(r'\[(.*)\]', line). That works, but it is a separate search. How can I get the date as a third group in the pattern above?
([\d]{2}/[A-Z]{1}[a-z]{2}/[\d]{4}:[\d]{2}:[\d]{2}:[\d]{2} \+[\d]{4})(?:.+?)([\d]{1,})_([\d]{1,3}\.[\d]{1,3}\.[\d]{1,3}\.[\d]{1,3})
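(?:.+?) breaks down as: ( starts a group, ?: makes it non-capturing, .+ matches any characters, and the trailing ? makes the match lazy, so it stops as soon as the next part of the pattern can match (without it, .+ would greedily consume as much of the line as possible), and ) ends the group. In other words, it lazily skips characters such as ] "GET /.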
You can replace [A-Z]{1}[a-z]{2} with (?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec).
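
As a sketch of how the three-group pattern from these comments could be put together, the version below uses re.VERBOSE so each part can be commented. The name LOG_RE is illustrative; the surrounding brackets, the leading slash, and accepting a negative UTC offset ([+-] instead of \+) are assumptions added here, not part of the original comments:

```python
import re

# Sketch: timestamp, node ID, and IP in one search (LOG_RE is a hypothetical name).
LOG_RE = re.compile(r'''
    \[(\d{2}/(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)/\d{4}
       :\d{2}:\d{2}:\d{2}\ [+-]\d{4})\]   # group 1: timestamp between the brackets
    .+?                                    # lazily skip up to the request path
    /(\d+)_                                # group 2: node ID
    (\d{1,3}(?:\.\d{1,3}){3})              # group 3: IP address
''', re.VERBOSE)

line = ('123.456.678.89 - - [04/Aug/2014:12:01:41 +0530] '
        '"GET /123456789_10.10.20.111 HTTP/1.1" 404 537 "-" "Wget/1.14 (linux-gnu)"')

match = LOG_RE.search(line)
if match:
    timestamp, node, ip = match.groups()
    print(timestamp)
    print(node)
    print(ip)
```

In verbose mode unescaped whitespace in the pattern is ignored, which is why the space between the time and the UTC offset is written as an escaped space.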