- Set 1: Basics – Literal Matches and Simple Character Classes
- Focus: Matching exact strings and basic patterns
- Match the word "India"
- Regex: India
- Explanation: Matches the literal string "India" wherever it occurs in the text, including inside longer words such as "Indian".
- Match either "Delhi" or "Mumbai"
- Regex: Delhi|Mumbai
- Explanation: Matches either of the two city names.
- Match any single digit (e.g., part of a pincode)
- Regex: [0-9]
- Explanation: Matches any single digit from 0 to 9.
- Match all vowels (useful for analyzing names)
- Regex: [aeiouAEIOU]
- Explanation: Matches any vowel (both lowercase and uppercase).
- Match the word "Namaste" exactly at the start of a line
- Regex: ^Namaste
- Explanation: ^ anchors the pattern to the start of the string; with the multiline flag it also matches at the start of each line.
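To try the Set 1 patterns yourself, here is a minimal Python sketch using the standard re module. The sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample text, used only for this demonstration.
text = "Namaste from India. Delhi and Mumbai have pincodes like 110001."

# Literal match: re.search finds the first occurrence anywhere in the text.
india = re.search(r"India", text)

# Alternation: findall returns every non-overlapping match, left to right.
cities = re.findall(r"Delhi|Mumbai", text)

# Single-digit class: each digit is returned as its own match.
digits = re.findall(r"[0-9]", text)

# Vowel class applied to a name.
vowels = re.findall(r"[aeiouAEIOU]", "Aishwarya")

# ^ anchors to the start of the string (or of each line with re.MULTILINE).
starts_with_namaste = re.search(r"^Namaste", text) is not None

print(india.group(), cities, digits, vowels, starts_with_namaste)
```

Note that [0-9] matches one digit at a time, so a six-digit pincode shows up as six separate matches.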
- Set 2: Quantifiers and Common Formats
- Focus: Using +, *, {n}, ?
- Match a sequence of 6 digits (Indian pincode)
- Regex: \b\d{6}\b
- Explanation: Matches exactly six digits; the \b word boundaries stop it from matching inside a longer run of digits.
- Match a word like “Ram”, “Rama”, “Ramaa”, etc.
- Regex: Rama*
- Explanation: Matches "Ram", "Rama", "Ramaa", etc. (a* = 0 or more a's)
- Match names starting with 'A' followed by any 2 letters
- Regex: A[a-zA-Z]{2}
- Explanation: Matches strings like “Ami”, “Aru”, etc.
- Match a string that ends with “ji” (e.g., Gandhi ji)
- Regex: ji$
- Explanation: $ anchors to the end of the string.
- Match any word containing at least one vowel
- Regex: \b\w*[aeiouAEIOU]\w*\b
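- Explanation: [aeiouAEIOU] requires one vowel, and the \w* on either side allows any word characters before and after it, so the pattern matches any word with at least one vowel.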
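The Set 2 quantifiers can be checked with a short Python sketch like the one below. The input strings are invented examples, chosen to show both what each pattern matches and what it skips.

```python
import re

# \b\d{6}\b: exactly six digits; the 10-digit and 7-digit numbers are skipped.
pincodes = re.findall(r"\b\d{6}\b", "Pune 411001, phone 9876543210, code 1234567")

# Rama*: "Ram" followed by zero or more "a". It also finds "Ram" inside "Ramesh".
rama_forms = re.findall(r"Rama*", "Ram Rama Ramaa Ramesh")

# A[a-zA-Z]{2}: "A" plus exactly two letters, so only "Aru" is taken from "Arun".
a_names = re.findall(r"A[a-zA-Z]{2}", "Ami Arun")

# ji$: "ji" must sit at the very end of the string.
ends_with_ji = re.search(r"ji$", "Gandhi ji") is not None
not_at_end = re.search(r"ji$", "ji hello") is not None

# Words with at least one vowel ("y" is not in the vowel class).
vowel_words = re.findall(r"\b\w*[aeiouAEIOU]\w*\b", "Try chai by rhythm")

print(pincodes, rama_forms, a_names, ends_with_ji, not_at_end, vowel_words)
```

The "Ramesh" and "Arun" cases show that these patterns are unanchored: they match a piece of a longer word unless you add \b or use re.fullmatch.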
- Would you like me to continue with the next 5 examples? The next set will start exploring groups, more complex alternation, and Indian-specific formats like phone numbers and vehicle registration numbers.
- You said:
- Next 5
- ChatGPT said:
- Great! Here's the next batch.
- Set 3: Grouping, Alternation, and Indian-Specific Formats
- Focus: Using parentheses (), grouping with |, and applying to real-life Indian formats
- Match either “Kolkata” or “Chennai” or “Bengaluru”
- Regex: (Kolkata|Chennai|Bengaluru)
- Explanation: Grouping with | lets you match any one of these cities.
- Match Indian mobile numbers starting with 7, 8, or 9
- Regex: [789]\d{9}
- Explanation: Matches a 10-digit mobile number starting with 7, 8, or 9.
- Match vehicle registration numbers like KA01AB1234
- Regex: [A-Z]{2}\d{2}[A-Z]{2}\d{4}
- Explanation: Matches Indian vehicle plates (e.g., KA01AB1234, DL09CD6789).
- Match PAN card format (e.g., ABCDE1234F)
- Regex: [A-Z]{5}[0-9]{4}[A-Z]
- Explanation: Matches PAN structure – 5 uppercase letters, 4 digits, 1 uppercase letter.
- Match Aadhaar numbers (12-digit format, optional spaces)
- Regex: \d{4} ?\d{4} ?\d{4}
- Explanation: Matches Aadhaar numbers with or without spaces (e.g., 123456789012 or 1234 5678 9012).
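Here is a rough Python sketch of how the Set 3 patterns could be used for validation. The PATTERNS dictionary and the is_valid helper are hypothetical names for this example, and the sample values are dummies. Because the patterns have no anchors, the sketch uses re.fullmatch so the whole string has to match.

```python
import re

# Patterns copied from Set 3; the dictionary keys are illustrative labels.
PATTERNS = {
    "mobile": r"[789]\d{9}",
    "vehicle": r"[A-Z]{2}\d{2}[A-Z]{2}\d{4}",
    "pan": r"[A-Z]{5}[0-9]{4}[A-Z]",
    "aadhaar": r"\d{4} ?\d{4} ?\d{4}",
}


def is_valid(kind: str, value: str) -> bool:
    """Return True if the entire value matches the pattern for kind."""
    return re.fullmatch(PATTERNS[kind], value) is not None


# Grouped alternation: with a single group, findall returns the group text.
cities = re.findall(r"(Kolkata|Chennai|Bengaluru)", "Chennai to Bengaluru")

print(cities)
print(is_valid("mobile", "9876543210"))       # 10 digits starting with 9
print(is_valid("mobile", "98765432101"))      # 11 digits, rejected by fullmatch
print(is_valid("vehicle", "KA01AB1234"))
print(is_valid("pan", "ABCDE1234F"))
print(is_valid("aadhaar", "1234 5678 9012"))
print(is_valid("aadhaar", "1234-5678-9012"))  # hyphens are not allowed
```

These checks only test the shape of the value; they say nothing about whether the number was actually issued.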
- Would you like the next 5 examples to focus on email validation, name patterns, and Indian website formats? Or shall we go deeper into advanced regex concepts like lookaheads, backreferences, etc.?
- You said:
- Next 5
- ChatGPT said:
- Perfect! Here's the next set.
- Set 4: Indian Names, Emails, and Web Patterns
- Focus: Character classes, optional elements, and realistic Indian data
- Match typical Indian names starting with capital (e.g., “Amit”, “Suresh”, “Priya”)
- Regex: [A-Z][a-z]{2,}
- Explanation: Matches a capital letter followed by at least two lowercase letters – good for basic name detection.
- Match common Indian email addresses (e.g., [email protected])
- Regex: [a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.(com|in)
- Explanation: Matches Gmail, Yahoo, etc. ending in .com or .in – commonly used domains in India.
- Match Indian website URLs like www.irctc.co.in
- Regex: www\.[a-zA-Z0-9\-]+\.(com|in|co\.in)
- Explanation: Matches websites ending in common Indian domains like .in and .co.in.
- Match names with a title (e.g., Mr. Sharma, Mrs. Rani)
- Regex: (Mr|Mrs|Ms)\.?\s[A-Z][a-z]+
- Explanation: Matches a title (Mr, Mrs, or Ms) with an optional period, one whitespace character, then a capitalized surname. The title itself is required.
- Match hashtags used in Indian context (e.g., #India, #IPL2025)
- Regex: #\w+
- Explanation: Matches hashtags: a # followed by one or more word characters (letters, digits, or underscores).
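The Set 4 patterns can be run against a sample sentence as in the Python sketch below. The text, the email address, and the all_matches helper are made up for this example. One thing to watch: when a pattern contains parentheses, re.findall returns only the captured group, so the sketch uses re.finditer and group(0) to get the full match.

```python
import re

# Hypothetical sample text; the email address is a placeholder.
text = "Mr. Sharma and Mrs Rani emailed amit.k@example.in about www.irctc.co.in #IPL2025"


def all_matches(pattern: str, s: str) -> list[str]:
    """Return the full text of every match, ignoring capture groups."""
    return [m.group(0) for m in re.finditer(pattern, s)]


titled = all_matches(r"(Mr|Mrs|Ms)\.?\s[A-Z][a-z]+", text)
emails = all_matches(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.(com|in)", text)
sites = all_matches(r"www\.[a-zA-Z0-9\-]+\.(com|in|co\.in)", text)
hashtags = all_matches(r"#\w+", text)

# Capital letter plus at least two lowercase letters, so "Jo" is skipped.
names = all_matches(r"[A-Z][a-z]{2,}", "Amit met Priya and Jo")

print(titled, emails, sites, hashtags, names)
```

If you prefer re.findall, writing the groups as non-capturing, for example (?:com|in), gives the same full-match results.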
- Let me know if you'd like to continue to the next 5. We can start getting into lookaheads, backreferences, or custom validations like matching date formats (e.g., 13/04/2025), or GSTIN numbers.
- You said:
- Next
- ChatGPT said:
- Awesome, let’s keep it rolling!
- Set 5: Dates, Times, and Advanced Indian Format Validation
- Focus: Date/time patterns, character escapes, and real-world validation
- Match Indian date format DD/MM/YYYY (e.g., 13/04/2025)
- Regex: \b\d{2}/\d{2}/\d{4}\b
- Explanation: Matches the day/month/year shape with slashes. It does not check that the day and month are in range, so 99/99/9999 also matches.
- Match 24-hour time format (e.g., 14:35, 09:00)
- Regex: \b([01]\d|2[0-3]):[0-5]\d\b
- Explanation: Validates 00:00 to 23:59 format.
- Match Indian currency values (e.g., ₹500, ₹12,34,567.89)
- Regex: ₹[0-9,]+(\.\d{2})?
- Explanation: Matches ₹ followed by Indian-style numerals with optional paise.
- Match Indian GSTIN (Goods and Services Tax Identification Number)
- Regex: \d{2}[A-Z]{5}\d{4}[A-Z]{1}[A-Z\d]{1}[Z]{1}[A-Z\d]{1}
- Explanation: Matches the 15-character alphanumeric GSTIN format.
- Match Indian STD codes (e.g., 011, 022, 080)
- Regex: 0\d{2,3}
- Explanation: Matches STD codes starting with 0 and 2–3 digits.
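A quick Python sketch for the Set 5 patterns is shown below. It assumes the currency symbol in the pattern is the rupee sign ₹, and the GSTIN value is a dummy string that only has the right shape. The variable names are illustrative.

```python
import re

date_re = re.compile(r"\b\d{2}/\d{2}/\d{4}\b")
time_re = re.compile(r"\b([01]\d|2[0-3]):[0-5]\d\b")
rupee_re = re.compile(r"₹[0-9,]+(\.\d{2})?")  # assumes the symbol is ₹
gstin_re = re.compile(r"\d{2}[A-Z]{5}\d{4}[A-Z]{1}[A-Z\d]{1}[Z]{1}[A-Z\d]{1}")
std_re = re.compile(r"0\d{2,3}")

# The date pattern finds a date inside text but checks the shape only.
found_date = date_re.search("Due on 13/04/2025").group()
shape_only = date_re.fullmatch("99/99/9999") is not None

# The time pattern does restrict ranges: hours 00-23, minutes 00-59.
valid_time = time_re.fullmatch("14:35") is not None
invalid_time = time_re.fullmatch("24:00") is not None

# Currency with and without paise.
rupees_ok = all(rupee_re.fullmatch(v) for v in ["₹500", "₹12,34,567.89"])

# Dummy GSTIN-shaped value (15 characters), not a real registration.
gstin_ok = gstin_re.fullmatch("22AAAAA0000A1Z5") is not None

# STD codes: a leading 0 plus two or three digits.
std_ok = [bool(std_re.fullmatch(c)) for c in ["011", "022", "080", "11"]]

print(found_date, shape_only, valid_time, invalid_time, rupees_ok, gstin_ok, std_ok)
```

For real date validation you would usually parse the matched text afterwards, for example with datetime.strptime and the format "%d/%m/%Y".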
- We're halfway through!
- Ready for the next 5? Shall we continue with lookaheads, validations, or regional language elements (like Hindi text in Unicode)?
- You said:
- Next
- ChatGPT said:
- Love the momentum! Here's the next batch.
- Set 6: Lookaheads, Word Boundaries, and Unicode (Indian Language Basics)
- Focus: Lookaheads, word boundaries, and basic Hindi character matching
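- Match a word that starts with "Bharat" but doesn’t end with "Mata"
- Regex: \bBharat(?!\w*Mata\b)\w*
- Explanation: The negative lookahead (?!\w*Mata\b) rejects the word if it ends in "Mata"; otherwise \w* consumes the rest of the word. So "Bharat" and "Bharatiya" match, but "BharatMata" does not.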
- Match "India" only if it's followed by "is great"
- Regex: India(?= is great)
- Explanation: Positive lookahead ensures "India" is only matched when followed by " is great". The lookahead text is checked but is not part of the match.
- Match exact word "chai" as a standalone word
- Regex: \bchai\b
- Explanation: Word boundaries (\b) ensure it doesn’t match "chaibreak" or "masalachai".
- Match Hindi characters (Devanagari script)
- Regex: [\u0900-\u097F]+
- Explanation: Matches one or more characters in the Devanagari Unicode range.
- Match lines ending with "hai" (common in spoken Hindi)
- Regex: \b\w+\s+hai\.?$
- Explanation: Matches the word before "hai" plus "hai" itself, optionally followed by a period, at the end of the line. It returns those last two words, not the whole sentence.
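Some of the Set 6 ideas (positive lookahead, word boundaries, the Devanagari range, and the end-of-line anchor) can be tried with the Python sketch below. The sample strings are invented. The \uXXXX escape inside the character class works in Python's re module; other regex flavors may need a different syntax.

```python
import re

# Positive lookahead: only "India" is consumed, " is great" is just checked.
m = re.search(r"India(?= is great)", "India is great")
matched_text = m.group()
match_end = m.end()
no_match = re.search(r"India(?= is great)", "India is big")

# Word boundaries: "chaibreak" and "masalachai" are not matched.
chai = re.findall(r"\bchai\b", "chai chaibreak masalachai")

# Devanagari block U+0900 to U+097F: picks the Hindi word out of mixed text.
devanagari = re.findall(r"[\u0900-\u097F]+", "Hello नमस्ते India")

# Line ending in "hai": the match is the last two words, not the sentence.
hai = re.search(r"\b\w+\s+hai\.?$", "Yeh accha hai.")
hai_text = hai.group()

print(matched_text, match_end, no_match, chai, devanagari, hai_text)
```

Since match_end is 5, the lookahead text stays available for whatever pattern comes next, which is the main reason to use a lookahead instead of matching the words directly.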