Published in: Python
                    
                                        
Get all links from a website 
from: http://www.pythonforbeginners.com/code/regular-expression-re-findall
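import urllib2
import re

# URL to scan (example value; the original snippet left `url` undefined)
url = "http://www.pythonforbeginners.com"

# connect to the URL
website = urllib2.urlopen(url)
# read the HTML source
html = website.read()

# use re.findall to get all double-quoted http(s)/ftp(s) links;
# the non-capturing group (?:...) makes findall return plain URL strings
# instead of (url, scheme) tuples
links = re.findall(r'"((?:http|ftp)s?://.*?)"', html)
print links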
URL: http://www.pythonforbeginners.com/code/regular-expression-re-findall
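
The snippet above is written for Python 2 (`urllib2`, `print` statement). As a rough sketch only, the same idea could look like this in Python 3. The names `extract_links`, `fetch_html`, and `LINK_PATTERN` are hypothetical and not part of the original snippet, and the UTF-8 decoding is an assumption about the page's encoding. The download step is wrapped in a function so the example can be tried on an inline HTML string without network access.

```python
import re
from urllib.request import urlopen

# Same pattern idea as the snippet: a double-quoted http(s) or ftp(s) URL.
# The non-capturing group (?:...) makes findall return plain strings.
LINK_PATTERN = re.compile(r'"((?:http|ftp)s?://.*?)"')


def extract_links(html):
    """Return every double-quoted http(s)/ftp(s) URL found in an HTML string."""
    return LINK_PATTERN.findall(html)


def fetch_html(url):
    """Download a page and decode it to text (UTF-8 is an assumption)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Inline sample so the example runs offline; relative links are not matched.
sample = (
    '<a href="http://example.com/a">A</a> '
    '<a href="ftps://example.org/f">F</a> '
    '<a href="/local">L</a>'
)
print(extract_links(sample))
# prints ['http://example.com/a', 'ftps://example.org/f']
```

To scan a live page, one would call `extract_links(fetch_html(url))` with a URL of one's choosing. Like the original, this only finds absolute URLs wrapped in double quotes, so relative or single-quoted links are skipped.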