Peaser

I basically remade beautiful soup

Apr 6th, 2015
Python
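import re
import urllib.request

# Prompt for a host name and build the URL to fetch.
url = "http://{}/".format(input("http://"))

# Raw-string patterns; non-greedy groups so each tag pair is matched separately.
REx = {
    "comments": r"<!--(.*?)-->",
    "paragraphs": r"<p>(.*?)</p>",
    "inline css paragraphs": r"<p\s[^>]*>(.*?)</p>",
}
matches = {key: [] for key in REx}

# Download the page and decode it so the str patterns can be applied.
sitedata = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")

for name, pattern in REx.items():
    # re.DOTALL lets "." span newlines, so multi-line comments and paragraphs match.
    for text in re.findall(pattern, sitedata, re.DOTALL):
        # Only comments are stripped of surrounding whitespace.
        matches[name].append(text.strip() if name == "comments" else text)

print(matches)  #:^)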
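
To try the three patterns without making a network request, here is a minimal Python 3 sketch that runs them on an inline string. The names `PATTERNS`, `extract`, and `SAMPLE_HTML` are hypothetical and not part of the paste, and the non-greedy groups with `re.DOTALL` are an assumption of this sketch rather than the author's exact regexes.

```python
import re

# Patterns adapted from the paste; non-greedy groups and raw strings are
# assumptions of this sketch, not the original author's exact regexes.
PATTERNS = {
    "comments": r"<!--(.*?)-->",
    "paragraphs": r"<p>(.*?)</p>",
    "inline css paragraphs": r"<p\s[^>]*>(.*?)</p>",
}


def extract(html):
    """Return a dict mapping each pattern name to the list of captured strings."""
    found = {}
    for name, pattern in PATTERNS.items():
        # re.DOTALL lets "." match newlines, so multi-line elements are captured.
        hits = re.findall(pattern, html, re.DOTALL)
        # Mirror the paste: strip whitespace only around comment text.
        found[name] = [h.strip() if name == "comments" else h for h in hits]
    return found


# Hypothetical sample page used only for illustration.
SAMPLE_HTML = """<html><body>
<!-- header note -->
<p>First</p><p>Second</p>
<p style="color: red">Styled</p>
</body></html>"""

if __name__ == "__main__":
    print(extract(SAMPLE_HTML))
```

Regexes like these only cover flat, well-formed markup; nested tags or a `-->` inside a script would need a real parser such as the standard library's `html.parser`.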