go6odn28

5_html_parser

Mar 27th, 2024
import re

# input() reads a single line; line breaks in the document are presumably
# given as the literal two characters "\n" (backslash followed by n).
text = input()

# Non-greedy capture so the match stops at the first closing </title> tag.
title_pattern = r"<title>(.+?)</title>"
title_match = re.search(title_pattern, text)
# Fall back to an empty string instead of crashing when the tag is missing.
title = title_match.group(1) if title_match else ""

body_pattern = r"<body>(.+?)</body>"
body_match = re.search(body_pattern, text)
body = body_match.group(1) if body_match else ""

# Keep only the text between tags by deleting every <...> sequence.
content = re.sub(r"<[^<>]*>", "", body)
# Remove the literal "\n" sequences left in the text.
content = re.sub(r"\\n", "", content)

print(f"Title: {title}")
print(f"Content: {content}")