Regex Library
Regular expression functions for pattern matching and text processing. The function signatures follow Python’s re module conventions.
Available Functions
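| Function | Description |
|---|---|
| match(pattern, string, flags=0) | Match pattern at the beginning of string |
| search(pattern, string, flags=0) | Search for pattern anywhere in string |
| findall(pattern, string, flags=0) | Find all non-overlapping matches |
| finditer(pattern, string, flags=0) | Find all matches as Match objects |
| sub(pattern, repl, string, count=0, flags=0) | Replace pattern matches with repl |
| split(pattern, string, maxsplit=0, flags=0) | Split string by pattern |
| compile(pattern, flags=0) | Compile pattern into a regex object |
| escape(string) | Escape special regex characters in string |
| fullmatch(pattern, string, flags=0) | Check whether pattern matches the entire string |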
Match Objects
The re.match() and re.search() functions return a Match object on success, or None if no match is found. Match objects provide the following methods:
| Method | Description |
|---|---|
| group(n=0) | Returns the nth matched group (0 = full match) |
| groups() | Returns a tuple of all capturing groups (excluding group 0) |
| start(n=0) | Returns the start position of the match |
| end(n=0) | Returns the end position of the match |
| span(n=0) | Returns a (start, end) tuple for the match |
Example:
import re
# Search with capturing groups
m = re.search(r'(\w+)@(\w+)\.(\w+)', 'Email: user@example.com')
if m:
    print(m.group(0)) # 'user@example.com' (full match)
    print(m.group(1)) # 'user' (first group)
    print(m.group(2)) # 'example' (second group)
    print(m.group(3)) # 'com' (third group)
    print(m.groups()) # ('user', 'example', 'com')
    print(m.start()) # 7 (position where match starts)
    print(m.end()) # 23 (position where match ends)
    print(m.span()) # (7, 23)
Constants (Flags)
The regex library provides the following flags that can be passed to functions:
| Flag | Shorthand | Value | Description |
|---|---|---|---|
| re.IGNORECASE | re.I | 2 | Case-insensitive matching |
| re.MULTILINE | re.M | 8 | ^ and $ match at line boundaries |
| re.DOTALL | re.S | 16 | . matches newlines |
Flags can be combined using the bitwise OR operator (|):
import re
# Combine IGNORECASE and MULTILINE
m = re.match("hello", "HELLO\nWORLD", re.I | re.M)
if m:
    print(m.group(0)) # "HELLO"
Functions
re.match(pattern, string, flags=0)
Checks if the pattern matches at the beginning of the string.
Parameters:
- pattern: Regular expression pattern
- string: String to search
- flags: Optional flags (default: 0)
Returns: Match object if pattern matches at start, or None if no match
Example:
import re
m = re.match("[0-9]+", "123abc")
if m:
    print("String starts with digits:", m.group(0)) # "123"
m = re.match("[0-9]+", "abc123")
if m == None:
    print("Pattern must match at start")
# Case-insensitive matching
m = re.match("hello", "HELLO world", re.I)
if m:
    print("Case-insensitive match:", m.group(0)) # "HELLO"
re.search(pattern, string, flags=0)
Searches for the first occurrence of the pattern anywhere in the string.
Parameters:
- pattern: Regular expression pattern
- string: String to search
- flags: Optional flags (default: 0)
Returns: Match object for the first match, or None if no match found
Example:
import re
m = re.search(r'\w+@\w+\.\w+', "Contact: user@example.com")
if m:
    print(m.group(0)) # "user@example.com"
result = re.search("[0-9]+", "no numbers")
print(result) # None
# Case-insensitive search
m = re.search("world", "HELLO WORLD", re.I)
if m:
    print(m.group(0)) # "WORLD"
# Using capturing groups
m = re.search(r'(\d+)-(\d+)', "Phone: 555-1234")
if m:
    print(m.group(0)) # "555-1234"
    print(m.group(1)) # "555"
    print(m.group(2)) # "1234"
    print(m.groups()) # ("555", "1234")
re.findall(pattern, string, flags=0)
Finds all occurrences of the pattern in the string.
Parameters:
- pattern: Regular expression pattern
- string: String to search
- flags: Optional flags (default: 0)
Returns: List of strings (all matches)
Example:
import re
phones = re.findall("[0-9]{3}-[0-9]{4}", "Call 555-1234 or 555-5678")
print(phones) # ["555-1234", "555-5678"]
# Case-insensitive findall
words = re.findall("a+", "aAbBaAa", re.I)
print(words) # ["aA", "aAa"]
re.finditer(pattern, string, flags=0)
Finds all occurrences of the pattern in the string and returns Match objects.
Parameters:
- pattern: Regular expression pattern
- string: String to search
- flags: Optional flags (default: 0)
Returns: List of Match objects (all matches)
Example:
import re
matches = re.finditer("[0-9]{3}-[0-9]{4}", "Call 555-1234 or 555-5678")
for match in matches:
    print(match.group(0)) # "555-1234", "555-5678"
    print(match.start()) # 5, 17
    print(match.end()) # 13, 25
# With capturing groups
matches = re.finditer(r'(\d+)-(\d+)', "555-1234, 888-9999")
for match in matches:
    print(match.group(0)) # "555-1234", "888-9999"
    print(match.group(1)) # "555", "888"
    print(match.group(2)) # "1234", "9999"
    print(match.groups()) # ("555", "1234"), ("888", "9999")
re.sub(pattern, repl, string, count=0, flags=0)
Replaces occurrences of the pattern in the string with the replacement. The replacement can be either a string or a function. This follows Python’s re.sub() function signature.
Parameters:
- pattern: Regular expression pattern
- repl: Replacement string or function that takes a Match object and returns a string
- string: String to modify
- count: Maximum number of replacements (0 = all, default: 0)
- flags: Optional flags (default: 0)
Returns: String (modified text)
Example:
import re
# String replacement
text = re.sub("[0-9]+", "XXX", "Price: 100")
print(text) # "Price: XXX"
# Replace multiple occurrences
result = re.sub("[0-9]+", "#", "a1b2c3")
print(result) # "a#b#c#"
# Limit replacements with count
result = re.sub("[0-9]+", "X", "a1b2c3", 2)
print(result) # "aXbXc3"
# Case-insensitive replacement
result = re.sub("hello", "hi", "Hello HELLO hello", 0, re.I)
print(result) # "hi hi hi"
# Function replacement - uppercase all words
result = re.sub(r'(\w+)', lambda m: m.group(1).upper(), "hello world")
print(result) # "HELLO WORLD"
# Function replacement - swap first and last name
result = re.sub(r'(\w+) (\w+)', lambda m: m.group(2) + " " + m.group(1), "John Doe")
print(result) # "Doe John"
# Function replacement - format inline code
backtick = chr(96)
result = re.sub(backtick + r'([^' + backtick + r']+)' + backtick,
                lambda m: "[" + m.group(1) + "]",
                "test `code` here")
print(result) # "test [code] here"
re.split(pattern, string, maxsplit=0, flags=0)
Splits the string by occurrences of the pattern.
Parameters:
- pattern: Regular expression pattern
- string: String to split
- maxsplit: Maximum number of splits (0 = all, default: 0)
- flags: Optional flags (default: 0)
Returns: List of strings (split parts)
Example:
import re
parts = re.split("[,;]", "one,two;three")
print(parts) # ["one", "two", "three"]
# Limit splits
parts = re.split("[,;]", "a,b;c;d", 2)
print(parts) # ["a", "b;c;d"]
re.compile(pattern, flags=0)
Compiles a regular expression pattern for validation and caching.
Parameters:
- pattern: Regular expression pattern
- flags: Optional flags (default: 0)
Returns: Regex object (compiled pattern) or error if invalid
Example:
import re
pattern = re.compile("[0-9]+") # Validates and caches the pattern
print(type(pattern)) # "Regex"
# Compile with flags
pattern = re.compile("hello", re.I)
print(type(pattern)) # "Regex"
# Compile with multiple flags
pattern = re.compile("hello", re.I | re.M)
print(type(pattern)) # "Regex"
Compiled Pattern Methods
The Regex object returned by re.compile() provides the following methods:
- `pattern.match(string)` - Match at start of string
- `pattern.search(string)` - Search anywhere in string
- `pattern.findall(string)` - Find all matches as strings
- `pattern.finditer(string)` - Find all matches as Match objects
Example:
import re
pattern = re.compile(r'\d+')
m = pattern.match("123abc") # Match at start
if m:
    print(m.group(0)) # "123"
matches = pattern.findall("a1b2c3") # ["1", "2", "3"]
match_objects = pattern.finditer("a1b2c3")
for match in match_objects:
    print(match.group(0), match.start(), match.end())
    # "1" 1 2
    # "2" 3 4
    # "3" 5 6
re.escape(string)
Escapes special regex characters in a string.
Parameters:
string: String to escape
Returns: String (escaped text)
Example:
import re
escaped = re.escape("a.b+c")
print(escaped) # "a\.b\+c"
re.fullmatch(pattern, string, flags=0)
Checks if the pattern matches the entire string.
Parameters:
- pattern: Regular expression pattern
- string: String to match
- flags: Optional flags (default: 0)
Returns: Boolean (True if entire string matches, False otherwise)
Example:
import re
if re.fullmatch("[0-9]+", "123"):
    print("Entire string is digits") # This prints
if re.fullmatch("[0-9]+", "123abc"):
    print("This won't print - doesn't match entire string")
# Case-insensitive fullmatch
if re.fullmatch("hello", "HELLO", re.I):
    print("Case-insensitive full match") # This prints
Regular Expression Syntax
Scriptling uses Go’s regexp syntax, which is similar to Perl/Python:
Basic Patterns
- `.` - Any character (newlines only with DOTALL flag)
- `\d` - Digit (0-9)
- `\D` - Non-digit
- `\w` - Word character (a-z, A-Z, 0-9, _)
- `\W` - Non-word character
- `\s` - Whitespace
- `\S` - Non-whitespace
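As a small illustration of the shorthand classes above, the sketch below runs under CPython's re module, whose signatures this library follows. It is an assumption, not something verified here, that Scriptling returns the same results for this ASCII input; the sample string is made up for the example.

```python
import re

# Hypothetical sample text containing digits, an underscore and a newline.
sample = "Order 42 shipped\nto bay_7"

digits = re.findall(r"\d+", sample)   # runs of digits
words = re.findall(r"\w+", sample)    # runs of word characters (underscore included)
spaces = re.findall(r"\s", sample)    # individual whitespace characters

# '.' stops at a newline unless DOTALL (re.S) is set.
dot_default = re.search(r"shipped.to", sample)
dot_dotall = re.search(r"shipped.to", sample, re.S)

print(digits)
print(words)
print(len(spaces))
print(dot_default is None, dot_dotall is not None)
```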
Quantifiers
- `*` - Zero or more
- `+` - One or more
- `?` - Zero or one
- `{n}` - Exactly n times
- `{n,}` - n or more times
- `{n,m}` - Between n and m times
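The quantifiers can be compared side by side on one input. This is a minimal sketch run against CPython's re module; the inputs are invented for illustration, and Scriptling is assumed (not confirmed here) to behave the same way for these patterns.

```python
import re

text = "a aa aaa aaaa"

# \b anchors each attempt to a whole word, so counts apply to the full run of a's.
exact = re.findall(r"\ba{2}\b", text)      # exactly two
at_least = re.findall(r"\ba{2,}\b", text)  # two or more
between = re.findall(r"\ba{2,3}\b", text)  # two or three

star = re.findall(r"ab*", "a ab abb")      # 'a' followed by zero or more 'b'
plus = re.findall(r"ab+", "a ab abb")      # 'a' followed by one or more 'b'
optional = re.findall(r"colou?r", "color colour")  # the 'u' may be absent

print(exact, at_least, between)
print(star, plus, optional)
```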
Character Classes
- `[abc]` - Any of a, b, or c
- `[^abc]` - Not a, b, or c
- `[a-z]` - Any lowercase letter
- `[A-Z]` - Any uppercase letter
- `[0-9]` - Any digit
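A short sketch of each character class applied to the same string may help. It uses CPython's re module as a stand-in, on the assumption that Scriptling gives identical results for these simple ASCII classes; the input string is hypothetical.

```python
import re

text = "Cab 9 xyz"

any_abc = re.findall(r"[abc]", text)    # single characters from the set
not_abc = re.findall(r"[^abc]+", text)  # runs of anything outside the set
lower = re.findall(r"[a-z]+", text)     # runs of lowercase letters
upper = re.findall(r"[A-Z]", text)      # single uppercase letters
digit = re.findall(r"[0-9]", text)      # single digits

print(any_abc, not_abc)
print(lower, upper, digit)
```

Note that a negated class such as `[^abc]` also matches spaces and digits, which is why the second result contains the run between the words.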
Anchors
- `^` - Start of string (or line with MULTILINE flag)
- `$` - End of string (or line with MULTILINE flag)
- `\b` - Word boundary
- `\B` - Not word boundary
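The effect of the anchors, with and without the MULTILINE flag, can be sketched as follows. The code runs under CPython's re module; that Scriptling matches it exactly is an assumption. The input deliberately has no trailing newline, because Python and Go treat `$` before a final newline differently.

```python
import re

text = "cat concat\ncat"

start_only = re.findall(r"^cat", text)        # only the very start of the string
each_line = re.findall(r"^cat", text, re.M)   # start of every line
end_only = re.findall(r"cat$", text)          # only the very end of the string
each_end = re.findall(r"cat$", text, re.M)    # end of every line

whole_word = re.findall(r"\bcat\b", text)     # 'cat' as a standalone word
inside_word = re.findall(r"\Bcat\b", text)    # 'cat' ending a longer word ('concat')

print(len(start_only), len(each_line), len(end_only), len(each_end))
print(len(whole_word), len(inside_word))
```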
Inline Flags
You can also use inline flag modifiers in your patterns:
- `(?i)` - Case-insensitive
- `(?m)` - Multiline mode
- `(?s)` - Dotall mode (. matches newlines)
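To show how the inline modifiers relate to the flag arguments, here is a hedged sketch using CPython's re module. The combined form `(?im)` is valid in both Python and Go's regexp syntax, but whether Scriptling produces exactly these results is assumed rather than verified.

```python
import re

text = "Hello\nWORLD"

# Each inline modifier is placed at the very start of the pattern.
inline_i = re.search(r"(?i)world", text)
inline_m = re.findall(r"(?m)^\w+", text)
inline_s = re.search(r"(?s)Hello.WORLD", text)
no_dotall = re.search(r"Hello.WORLD", text)   # '.' cannot cross the newline here

# Modifiers can be combined, mirroring re.I | re.M.
combined = re.findall(r"(?im)^[a-z]+$", text)
with_flags = re.findall(r"^[a-z]+$", text, re.I | re.M)

print(inline_i.group(0), inline_m)
print(inline_s is not None, no_dotall is None)
print(combined == with_flags)
```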
Usage Examples
import re
# Basic matching at start of string
m = re.match("[0-9]+", "123abc")
if m:
    print("String starts with:", m.group(0)) # "123"
# Search anywhere in string
m = re.search(r'\w+@\w+\.\w+', "Contact: user@example.com")
if m:
    print("Email:", m.group(0)) # "user@example.com"
# Search with groups
m = re.search(r'(\w+)@(\w+)\.(\w+)', "Contact: user@example.com")
if m:
    print("User:", m.group(1)) # "user"
    print("Domain:", m.group(2)) # "example"
    print("TLD:", m.group(3)) # "com"
    print("Groups:", m.groups()) # ("user", "example", "com")
# Find all matches
numbers = re.findall("[0-9]+", "abc123def456")
# ["123", "456"]
# Find all matches as Match objects
matches = re.finditer("[0-9]+", "abc123def456")
for match in matches:
    print(match.group(0), match.start(), match.end())
    # "123" 3 6
    # "456" 9 12
# Replace text
text = re.sub("[0-9]+", "XXX", "Price: 100")
# "Price: XXX"
# Replace with count limit
text = re.sub("[0-9]+", "X", "1 2 3 4 5", 3)
# "X X X 4 5"
# Split by pattern
parts = re.split("[,;]", "one,two;three")
# ["one", "two", "three"]
# Compile pattern (validates and caches)
pattern = re.compile("[0-9]+")
# Regex object
# Use compiled pattern
matches = pattern.finditer("abc123def456")
for match in matches:
    print(match.group(0)) # "123", "456"
# Escape special characters
escaped = re.escape("a.b+c*d?")
# "a\.b\+c\*d\?"
# Full match entire string
if re.fullmatch("[0-9]+", "123"):
    print("String contains only digits")
# Case-insensitive matching with flag
m = re.match("hello", "HELLO world", re.I)
if m:
    print("Case-insensitive match:", m.group(0))
# Case-insensitive matching with inline flag
m = re.match("(?i)hello", "HELLO world")
if m:
    print("Inline flag match:", m.group(0))
# Multiline matching
text = "line1\nline2\nline3"
matches = re.findall("^line", text, re.M)
# ["line", "line", "line"]
# Dotall - dot matches newlines
m = re.search("a.*b", "a\nb", re.S)
if m:
    print("Dotall match:", m.group(0)) # "a\nb"
Notes
- Patterns use Go’s regexp engine (RE2)
- `re.match()` and `re.search()` return Match objects (not strings), like Python
- All functions are case-sensitive by default
- Use the `re.I` or `re.IGNORECASE` flag for case-insensitive matching
- Alternatively, use `(?i)` at the start of the pattern for case-insensitive matching
- Backslashes in patterns need to be escaped in Scriptling strings, unless a raw string (`r'...'`) is used as in the examples above
- The `count` parameter in `re.sub()` limits the number of replacements (0 = replace all)
- The `maxsplit` parameter in `re.split()` limits the number of splits
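The notes on backslashes and case sensitivity can be sketched as below. The code is written for CPython's re module; it assumes Scriptling's string escaping and raw-string rules match Python's, which the examples in this document suggest but do not prove.

```python
import re

# The same pattern written two ways: doubled backslashes vs. a raw string.
escaped = "\\d+-\\d+"
raw = r"\d+-\d+"
same_pattern = escaped == raw

m = re.search(raw, "Phone: 555-1234")
number = m.group(0) if m else None

# Matching is case-sensitive unless a flag or inline modifier says otherwise.
default_case = re.findall("hello", "Hello hello")
ignore_case = re.findall("hello", "Hello hello", re.I)
inline_case = re.findall("(?i)hello", "Hello hello")

print(same_pattern, number)
print(default_case, ignore_case, inline_case)
```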