
Revision: 59790
at October 1, 2012 21:45 by zhyar

Initial Code
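import re
import urllib.request

# URL of the page to parse (left empty in the original snippet).
url = ''
# In Python 3, urlopen lives in urllib.request and returns bytes, so decode them.
page = urllib.request.urlopen(url).read().decode('utf-8', errors='replace')
# Each article is a <div class="post article" id="..."> block: capture its id and body.
articles = re.findall(r'<div class="post article" id="(.+?)">(.+?)</div', page)
for article_id, body in articles:
	# Links whose href ends with the article id hold the article text fragments.
	links = re.findall(r'<a href="(.+?)' + re.escape(article_id) + r'" class="fmllink">(.+?)</a>', body)
	text = ''.join(link_text for _, link_text in links)
	# Print inside the loop so every article is reported, not only the last one.
	print(article_id + " : " + text)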

Initial URL

Initial Description
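A simple web page parser built on the standard urllib and re modules. It downloads a page, then uses regular expressions to extract each article's id and the text of its links.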
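
The parsing step can be tried without a network call. The sketch below is only an illustration: the `extract_articles` helper and the `SAMPLE` markup are hypothetical names and data, not part of the original snippet. The real target page's HTML is unknown because the URL is left empty.

```python
import re

# Hypothetical helper (not in the original snippet): applies the same two
# regular expressions to an HTML string instead of a downloaded page.
ARTICLE_RE = re.compile(r'<div class="post article" id="(.+?)">(.+?)</div')


def extract_articles(page):
    """Return a list of (article_id, text) pairs found in the HTML string."""
    results = []
    for article_id, body in ARTICLE_RE.findall(page):
        # Match links whose href ends with this article's id.
        link_re = (r'<a href="(.+?)' + re.escape(article_id)
                   + r'" class="fmllink">(.+?)</a>')
        text = ''.join(link_text for _, link_text in re.findall(link_re, body))
        results.append((article_id, text))
    return results


# Made-up sample markup, for illustration only.
SAMPLE = (
    '<div class="post article" id="42">'
    '<a href="/post/42" class="fmllink">Hello, </a>'
    '<a href="/post/42" class="fmllink">world.</a>'
    '</div>'
)

if __name__ == "__main__":
    for article_id, text in extract_articles(SAMPLE):
        print(article_id + " : " + text)
```

Because `.` does not match newlines by default, an article whose markup spans several lines would only match if the patterns were compiled with `re.DOTALL`.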

Initial Title
Example of web parser

Initial Tags
python, web

Initial Language