Boisselle: Extracting URL links from a File

Monday, 30 September 2013

Extracting URL links from a File

The following code is meant to extract /support/security/bulletins/*.html links
from a file (urlfile contains about 1000 links) and write them to a file named
urlsort using a regex. But I'm weak in regex; can anyone show me how to do that?
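#!/usr/bin/env python
import re

# "*" in the intended path is a wildcard; in a regex it becomes [^/\s]+? (one or more
# characters that are neither "/" nor whitespace), and the dot before "html" is escaped.
pattern = re.compile(r"/support/security/bulletins/[^/\s]+?\.html")

with open('urlfile', 'r') as infile, open('urlsort', 'w') as outfile:
    for line in infile:
        # With no capturing group, findall returns each whole match as a string.
        for link in pattern.findall(line):
            # Write the full link; link[0] would be only its first character.
            outfile.write(link + '\n')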
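
To check a pattern before running it on all 1000 links, it may help to try it on a few strings first. The sketch below is only an illustration: the helper name extract_bulletin_links and the sample lines are made up, and it assumes that bulletin file names contain no "/" or whitespace. Note that a pattern written as \/*.html means "zero or more slashes, then any one character, then html", so it does not behave like a shell wildcard.

```python
import re

# Assumption: a bulletin file name has no "/" or whitespace in it.
BULLETIN_RE = re.compile(r"/support/security/bulletins/[^/\s]+?\.html")


def extract_bulletin_links(text):
    """Return every bulletin path found in text, in order of appearance."""
    # No capturing group in the pattern, so findall yields whole matches.
    return BULLETIN_RE.findall(text)


# Made-up sample lines, not taken from the real urlfile.
samples = [
    'http://example.com/support/security/bulletins/abc-123.html',
    '<a href="/support/security/bulletins/xyz.html">xyz</a>',
    'http://example.com/support/downloads/tool.html',
]

for line in samples:
    print(extract_bulletin_links(line))
```

The first two sample lines each yield one bulletin path and the third yields an empty list, because it does not contain /support/security/bulletins/ at all.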
