Python253

html2csv

Mar 13th, 2024
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Filename: html2csv.py
# Version: 1.0.0
# Author: Jeoi Reqi

"""
Description:
This script converts an HTML file (.html) to a CSV file (.csv).
It uses BeautifulSoup to parse the HTML and writes the cell text of every
table row (<tr>) to the CSV file, one CSV row per table row.

Requirements:
- Python 3.6 or later (the script uses f-strings)
- BeautifulSoup library (install using: pip install beautifulsoup4)

Usage:
1. Save this script as 'html2csv.py'.
2. Ensure your HTML file ('example.html') is in the same directory as the script, and run the script from that directory.
3. Install the BeautifulSoup library using the command: 'pip install beautifulsoup4'
4. Run the script.
5. The converted CSV file ('html2csv.csv') will be generated in the same directory.

Note: Adjust the 'html_filename' and 'csv_filename' variables in the script as needed.
"""
import csv
from bs4 import BeautifulSoup


def html_to_csv(html_filename, csv_filename):
    """Write the text of every <td>/<th> cell in html_filename to csv_filename."""
    # Assumes the HTML file is UTF-8 encoded; change 'encoding' if yours is not.
    # newline='' lets the csv module control line endings, as its docs recommend.
    with open(html_filename, 'r', encoding='utf-8') as htmlfile, \
            open(csv_filename, 'w', newline='', encoding='utf-8') as csvfile:
        csvwriter = csv.writer(csvfile)
        soup = BeautifulSoup(htmlfile, 'html.parser')
        # Rows from every table in the document are written in document order.
        for row in soup.find_all('tr'):
            csvwriter.writerow([col.get_text(strip=True) for col in row.find_all(['td', 'th'])])


if __name__ == "__main__":
    html_filename = 'example.html'
    csv_filename = 'html2csv.csv'
    html_to_csv(html_filename, csv_filename)
    print(f"Converted '{html_filename}' to '{csv_filename}'.")
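For cases where installing BeautifulSoup is not an option, the same row-by-row extraction can be sketched with the standard library's html.parser module. This is not part of the original script: TableRowParser and html_table_to_csv_text are hypothetical names chosen for illustration, the sketch assumes every <tr>, <td> and <th> is explicitly closed, and its whitespace handling inside cells may differ slightly from get_text(strip=True).

```python
import csv
import io
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row.

    Hypothetical helper for illustration; assumes row and cell tags
    are explicitly closed in the input.
    """

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only text that appears inside an open cell is kept.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv_text(html_text):
    """Return CSV text with one line per <tr> found in html_text."""
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(parser.rows)
    return buffer.getvalue()


sample = (
    "<table>"
    "<tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td> Apple </td><td>3</td></tr>"
    "</table>"
)
print(html_table_to_csv_text(sample))
```

Working on a string rather than on files keeps the sketch easy to try out; to convert a file, read it into a string first and write the returned text to the output file opened with newline=''.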