Published in: Python
Get all links from a website
from: http://www.pythonforbeginners.com/code/regular-expression-re-findall
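import urllib2
import re

# URL to scan (placeholder, not part of the original snippet; use your own)
url = "http://www.pythonforbeginners.com"

# connect to a URL
website = urllib2.urlopen(url)

# read html code
html = website.read()

# use re.findall to get all the links; the scheme group is non-capturing
# so each result is the URL string rather than a (url, scheme) tuple
links = re.findall('"((?:http|ftp)s?://.*?)"', html)

print links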
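The snippet above targets Python 2 (urllib2, print statement). A rough Python 3 sketch of the same idea follows, assuming urllib.request as the replacement for urllib2. The names find_links, fetch_html and LINK_PATTERN are made up for illustration, and the UTF-8 decoding is an assumption about the page's encoding. The pattern only catches absolute URLs wrapped in double quotes, so relative links and single-quoted attributes are missed.

```python
import re
from urllib.request import urlopen

# Double-quoted http(s)/ftp(s) URLs. The scheme group is non-capturing,
# so findall returns plain strings instead of tuples.
LINK_PATTERN = re.compile(r'"((?:http|ftp)s?://.*?)"')


def find_links(html):
    """Return every double-quoted absolute URL found in the html string."""
    return LINK_PATTERN.findall(html)


def fetch_html(url):
    """Download a page and decode it as text (UTF-8 is an assumption)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Offline sample so the example runs without a network connection.
    sample = (
        '<a href="http://example.com/a">A</a> '
        '<a href="ftps://example.org/f">F</a> '
        '<a href="/local">L</a>'
    )
    print(find_links(sample))
```

To scan a live page, pass the result of fetch_html(url) to find_links. For anything beyond a quick script, an HTML parser is usually a safer choice than a regular expression.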
URL: http://www.pythonforbeginners.com/code/regular-expression-re-findall