4ndr0666

Search Optimizer and Dork Builder

Aug 29th, 2024
#!/usr/bin/python3

import os
import re
import subprocess
import sys

import requests

# Terminal colors
CYAN = "\033[38;5;51m"
RESET = "\033[0m"

def ask_user(prompt):
    """Prompt the user for input and return the trimmed response."""
    return input(prompt + "\n> ").strip()

def display_help():
    """Display the help information for building a Google dork using a pager."""
    help_text = """
    ### Searchmaster Dork Building

    ### Step 1: Adding the `inurl` Operator

    First, the script asks for a word or phrase to look for in the URL.
    The `inurl` operator matches pages whose URL contains that word or phrase.
    This is useful for finding certain types of pages, like login pages, admin panels, or specific file types.

    ### Example 1: Finding Login Pages
    - Prompt: Enter a path or parameter to look for in the URL (e.g., inurl:admin):
    - User Input: login
    - Explanation: If you want to find URLs that contain the word "login", this is how you would start.
    - Resulting Dork: inurl:"login"

    ### Example 2: Searching for Files in the URL
    - Prompt: Enter a path or parameter to look for in the URL (e.g., inurl:admin):
    - User Input: jpeg
    - Explanation: If you're trying to find URLs that include the word "jpeg", perhaps to locate images or directories of images, this input is appropriate.
    - Resulting Dork: inurl:"jpeg"

    ### Example 3: Searching for Video Files
    - Prompt: Enter a path or parameter to look for in the URL (e.g., inurl:admin):
    - User Input: mp4
    - Explanation: This input helps find URLs that have "mp4" in them, potentially leading to video files.
    - Resulting Dork: inurl:"mp4"

    ### Step 2: Adding the `intext` Operator

    Next, the script will ask for content you want to find within the page's text.

    ### Example 1: Finding Password Mentions
    - Prompt: Enter a string to search for within the page's content (e.g., intext:password):
    - User Input: password
    - Explanation: This finds pages that mention passwords, which can surface sensitive information.
    - Resulting Dork: intext:"password"

    ### Example 2: Searching for Image Descriptions
    - Prompt: Enter a string to search for within the page's content (e.g., intext:password):
    - User Input: sunset
    - Explanation: If you're looking for web pages that describe or discuss sunsets, you might use this input.
    - Resulting Dork: intext:"sunset"

    ### Example 3: Finding Mentions of Specific Formats
    - Prompt: Enter a string to search for within the page's content (e.g., intext:password):
    - User Input: high resolution
    - Explanation: This could be used to find pages discussing high-resolution images or videos.
    - Resulting Dork: intext:"high resolution"

    ### Step 3: Adding the `intitle` Operator

    After that, the script will ask for a term you want to find in the page's title.

    ### Example 1: Searching for Admin Pages
    - Prompt: Enter a string to search for within the page's title (e.g., intitle:login):
    - User Input: admin
    - Explanation: Use this to find pages with "admin" in the title, which often indicates an admin panel.
    - Resulting Dork: intitle:"admin"

    ### Example 2: Locating Indexes of Files
    - Prompt: Enter a string to search for within the page's title (e.g., intitle:login):
    - User Input: index of
    - Explanation: This is useful for finding directory listings or file indexes.
    - Resulting Dork: intitle:"index of"

    ### Example 3: Searching for Dashboard Titles
    - Prompt: Enter a string to search for within the page's title (e.g., intitle:login):
    - User Input: dashboard
    - Explanation: This helps find pages with "dashboard" in the title, likely leading to some form of control panel or admin interface.
    - Resulting Dork: intitle:"dashboard"

    ### Step 4: Adding the `filetype` Operator

    Next, the script will ask for the type of file you want to find.

    ### Example 1: Searching for PDF Files
    - Prompt: Enter the file type you're searching for (e.g., filetype:pdf):
    - User Input: pdf
    - Explanation: Useful when looking for PDF documents, such as reports or manuals.
    - Resulting Dork: filetype:pdf

    ### Example 2: Finding JPEG Images
    - Prompt: Enter the file type you're searching for (e.g., filetype:pdf):
    - User Input: jpeg
    - Explanation: Use this to find JPEG images, perhaps in directories or galleries.
    - Resulting Dork: filetype:jpeg

    ### Example 3: Locating MP4 Videos
    - Prompt: Enter the file type you're searching for (e.g., filetype:pdf):
    - User Input: mp4
    - Explanation: This will help you find MP4 video files across various websites.
    - Resulting Dork: filetype:mp4

    ### Step 5: Adding the `site` Operator

    Finally, the script asks if you want to restrict the search to a specific site.

    ### Example 1: Searching Within a Specific Domain
    - Prompt: Limit the search to a specific site (e.g., site:example.com):
    - User Input: example.com
    - Explanation: This restricts the search to example.com.
    - Resulting Dork: site:example.com

    ### Example 2: Searching Across Educational Sites
    - Prompt: Limit the search to a specific site (e.g., site:example.com):
    - User Input: edu
    - Explanation: This restricts the search to educational (.edu) domains, which is useful for academic resources.
    - Resulting Dork: site:edu

    ### Example 3: Limiting to Government Sites
    - Prompt: Limit the search to a specific site (e.g., site:example.com):
    - User Input: gov
    - Explanation: This limits the search to government (.gov) websites, which is useful for official documents or data.
    - Resulting Dork: site:gov

    ### Step 6: Combining and Using the Dork

    After you've entered all the components, the script combines them to create the final Google dork.

    Example Dork:
    inurl:"login" intext:"password" intitle:"admin" filetype:pdf site:example.com

    - Explanation: This dork searches for:
      - URLs containing "login".
      - Pages that mention "password" in the content.
      - Pages with "admin" in the title.
      - Files of type PDF.
      - Within the example.com domain.
    """
    # Use a pager to display the help text
    pager = os.getenv('PAGER', 'less')
    with subprocess.Popen(pager, stdin=subprocess.PIPE, shell=True) as proc:
        # communicate() writes the text, closes stdin, and waits for the pager;
        # it also tolerates the pager exiting before it has read everything.
        proc.communicate(help_text.encode('utf-8'))

def validate_date_format(date_string):
    """Validate the date string to ensure it matches expected formats."""
    date_formats = [
        r'\d{2}\.\d{2}\.\d{4}',  # DD.MM.YYYY
        r'\d{4}-\d{2}-\d{2}',    # YYYY-MM-DD
        r'\d{2}/\d{2}/\d{4}'     # MM/DD/YYYY
    ]
    return any(re.fullmatch(fmt, date_string) for fmt in date_formats)

def validate_site_format(site_string):
    """Validate that the site format matches a proper domain name."""
    return re.fullmatch(r'site:\S+\.\S+', site_string) is not None

def process_search_intent(intent):
    """
    Process the user's search intent, detecting and applying appropriate Google search operators.
    This function handles basic patterns such as 'after', 'before', and 'site'.
    """
    operators = []

    # Turn "after <date>" / "before <date>" into after:/before: operators
    for keyword in ('after', 'before'):
        match = re.search(rf'\b{keyword} (\S+)\b', intent, re.IGNORECASE)
        if match and validate_date_format(match.group(1)):
            # Remove only the matched phrase so other uses of the word survive
            intent = intent[:match.start()] + intent[match.end():]
            operators.append(f"{keyword}:{match.group(1)}")

    # Move an existing site: operator to the end of the query
    site = re.search(r'\bsite:(\S+)\b', intent, re.IGNORECASE)
    if site and validate_site_format(site.group(0)):
        intent = intent[:site.start()] + intent[site.end():]
        operators.append(f"site:{site.group(1)}")

    # Collapse the gaps left by the removed phrases, then append the operators
    return ' '.join(intent.split() + operators)

def probe_additional_parameters(intent):
    """
    Probe the user for additional search parameters such as date range, site limitation, file type,
    and exclusion criteria, then append these parameters to the search intent.
    """
    date_range = ask_user("Do you want to specify a date range? If yes, provide 'after' and/or 'before' dates (e.g., after 21.05.2023 or before 2023-05-21). If no, press Enter.")
    site = ask_user("Do you want to limit your search to a specific website? If yes, provide the site (e.g., site:nytimes.com). If no, press Enter.")
    filetype = ask_user("Are you looking for a specific file type (e.g., PDF)? If yes, specify the file type (e.g., filetype:pdf). If no, press Enter.")
    exclude = ask_user("Do you want to exclude any words from the search results? If yes, list them (e.g., -politics -economy). If no, press Enter.")

    # Convert each "after <date>" / "before <date>" into an after:/before: operator
    for keyword, value in re.findall(r'\b(after|before)[:\s]\s*(\S+)', date_range, re.IGNORECASE):
        if validate_date_format(value):
            intent += f" {keyword.lower()}:{value}"
    if site:
        # Accept the domain with or without a leading "site:"
        site = "site:" + re.sub(r'^site:', '', site, flags=re.IGNORECASE)
        if validate_site_format(site):
            intent += " " + site
    if filetype:
        # The prompt suggests typing "filetype:pdf", so avoid emitting "filetype:filetype:pdf"
        filetype = re.sub(r'^filetype:', '', filetype, flags=re.IGNORECASE).lstrip('.')
        if filetype:
            intent += f" filetype:{filetype}"
    if exclude:
        intent += " " + exclude

    return intent.strip()

def copy_to_clipboard(text):
    """
    Copy the given text to the clipboard using wl-copy.
    """
    try:
        subprocess.run(['wl-copy'], input=text.encode('utf-8'), check=True)
        print("\nYour query has been copied to the clipboard.")
    except Exception as e:
        print(f"\nFailed to copy to clipboard: {e}")

def handle_arguments():
    """Handle command-line arguments for flexibility in usage."""
    if len(sys.argv) > 1:
        return ' '.join(sys.argv[1:])
    return None

def build_google_dork():
    """Prompt the user to build a Google dork using common operators."""
    print("\nLet's build a dork using available search operators.\n")
    dork_parts = []

    def strip_operator(value, operator):
        """Drop a leading 'operator:' and surrounding quotes if the user typed them."""
        value = re.sub(rf'^{operator}:', '', value, flags=re.IGNORECASE)
        return value.strip().strip('"')

    # Prompt for inurl operator
    inurl = strip_operator(ask_user("Enter a path or parameter to look for in the URL (e.g., inurl:admin):"), 'inurl')
    if inurl:
        dork_parts.append(f'inurl:"{inurl}"')

    # Prompt for intext operator
    intext = strip_operator(ask_user("Enter a string to search for within the page's content (e.g., intext:password):"), 'intext')
    if intext:
        dork_parts.append(f'intext:"{intext}"')

    # Prompt for intitle operator
    intitle = strip_operator(ask_user("Enter a string to search for within the page's title (e.g., intitle:login):"), 'intitle')
    if intitle:
        dork_parts.append(f'intitle:"{intitle}"')

    # Prompt for filetype operator
    filetype = strip_operator(ask_user("Enter the file type you're searching for (e.g., filetype:pdf):"), 'filetype')
    if filetype:
        dork_parts.append(f'filetype:{filetype}')

    # Prompt for site operator
    site = strip_operator(ask_user("Limit the search to a specific site (e.g., site:example.com):"), 'site')
    if site:
        dork_parts.append(f'site:{site}')

    if not dork_parts:
        print("\nNo operators were entered, so there is no dork to build.")
        return

    # Combine all parts
    google_dork = ' '.join(dork_parts)

    print("\nHere is your dork:")
    print(google_dork)

    # Copy the dork to the clipboard
    copy_to_clipboard(google_dork)

def fetch_predefined_dorks():
    """Fetch the predefined Google dorks from the provided URL."""
    url = "https://pastebin.com/raw/RFYt8U22"  # Direct link to the raw paste
    try:
        # A timeout keeps the menu from hanging if the server never responds
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        # Skip blank lines so they do not show up as empty menu entries
        return [line.strip() for line in response.text.splitlines() if line.strip()]
    except requests.RequestException as e:
        print(f"\nFailed to fetch predefined dorks: {e}")
        return []

def select_predefined_dork(dorks):
    """Allow the user to select a predefined Google dork."""
    print("\nSelect a dork from the list below:\n")
    for i, dork in enumerate(dorks, start=1):
        print(f"{i}. {dork}")

    choice = ask_user("\nEnter the number of the dork you want to use:")
    try:
        index = int(choice) - 1
        if 0 <= index < len(dorks):
            selected_dork = dorks[index]
            print("\nHere is your selected dork:")
            print(selected_dork)
            copy_to_clipboard(selected_dork)
        else:
            print("Invalid choice. Please select a number from the list.")
    except ValueError:
        print("Invalid input. Please enter a number corresponding to the dork.")

def main():
    """
    Main function to interact with the user, process their search intent,
    build Google dorks, and provide an optimized search query.
    """
    while True:
        print(f"{CYAN}Searchmaster!{RESET}")
        print(f"({CYAN}1{RESET}) Optimize ({CYAN}2{RESET}) Build ({CYAN}3{RESET}) Choose ({CYAN}4{RESET}) Help")
        choice = ask_user(f"Enter {CYAN}1{RESET}, {CYAN}2{RESET}, {CYAN}3{RESET}, or {CYAN}4{RESET}:")

        if choice == '1':
            intent = handle_arguments() or ask_user("Describe your search query in plain English.")
            optimized_intent = process_search_intent(intent)
            final_query = probe_additional_parameters(optimized_intent)

            print("\nHere is your optimized search query:")
            print(f'"{final_query}"')

            copy_to_clipboard(final_query)

            print("\nExplanation:")
            print("Your query has been optimized with advanced search operators to improve the relevance of results.")
            print("You can use this query directly in your browser to perform your search.")
        elif choice == '2':
            build_google_dork()
        elif choice == '3':
            predefined_dorks = fetch_predefined_dorks()
            if predefined_dorks:
                select_predefined_dork(predefined_dorks)
            else:
                print("Could not fetch the predefined dorks. Please check your internet connection and try again.")
        elif choice == '4':
            display_help()
        else:
            # The loop shows the menu again, so no restart is needed
            print("Invalid choice. Please enter 1, 2, 3, or 4.")

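if __name__ == "__main__":
    try:
        main()
    except (KeyboardInterrupt, EOFError):
        # The menu loop has no quit option, so exit cleanly on Ctrl-C or Ctrl-D
        print("\nExiting.")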
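
The script tells the user the finished query can be used directly in the browser. As a minimal sketch of that last step, the hypothetical helper below (`dork_to_search_url` is not part of the paste, and the Google search endpoint with its `q` parameter is an assumption about where the query will be pasted) percent-encodes a dork into a URL using only the standard library:

```python
from urllib.parse import urlencode


def dork_to_search_url(dork, base="https://www.google.com/search"):
    """Return a search URL for the dork (hypothetical helper, not in the original script)."""
    # urlencode percent-encodes quotes and colons and turns spaces into '+'
    return f"{base}?{urlencode({'q': dork})}"


if __name__ == "__main__":
    print(dork_to_search_url('inurl:"login" site:example.com'))
```

Encoding matters here because dorks routinely contain double quotes, colons, and spaces, which are not safe to paste into a URL as-is.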