I want to extract data from a log file.
To open the file:
a = open('access.log','rb')
lines = a.readlines()
Suppose lines[0] is:
123.456.678.89 - - [04/Aug/2014:12:01:41 +0530] "GET /123456789_10.10.20.111 HTTP/1.1" 404 537 "-" "Wget/1.14 (linux-gnu)"
I want to extract only 123456789 and 10.10.20.111 from "GET /123456789_10.10.20.111 HTTP/1.1".
The pattern is: the string starts with /, followed by a run of digits, then an underscore, then an IP address.
I tried this, and it works, but I think it adds unnecessary overhead:
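import re
line = lines[0]
# Greedy match: everything between the first and the last double quote
node = re.search(r'\"(.*)\"', line).group(1)
# Second space-separated token is the request path, e.g. /123456789_10.10.20.111
node = node.split(" ")[1]
node,ip = node.split("_")
# Drop the leading slash
node = node[1:]
print node,ip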
How can I get this with a single regex pattern?
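For reference, the pattern described above (a slash, a run of digits, an underscore, then an IP address) could be written roughly as follows. This is only a minimal sketch: the names NODE_IP and extract_node_ip are hypothetical, it assumes the request is always of the form "METHOD /digits_ip ...", and the IP part only checks the dotted-quad shape, not that each octet is in range.

```python
import re

# Hypothetical name. Captures the digits and the dotted quad that follow
# the opening quote, the HTTP method, and the leading slash.
NODE_IP = re.compile(r'"[A-Z]+ /(\d+)_(\d{1,3}(?:\.\d{1,3}){3}) ')


def extract_node_ip(line):
    """Return (node, ip) from one log line, or None if the pattern is absent."""
    match = NODE_IP.search(line)
    return match.groups() if match else None


line = ('123.456.678.89 - - [04/Aug/2014:12:01:41 +0530] '
        '"GET /123456789_10.10.20.111 HTTP/1.1" 404 537 "-" '
        '"Wget/1.14 (linux-gnu)"')

result = extract_node_ip(line)
if result:
    node, ip = result
    print("%s %s" % (node, ip))  # prints: 123456789 10.10.20.111
```

Compiling the pattern once and calling search on each line avoids the intermediate splits. The sketch assumes each line is a text string, so a file read in binary mode under Python 3 would need decoding first.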