Regular expressions in Python

A regular expression, in Python as in other languages, is a sequence of characters that forms a search pattern. It can be used to check whether a string contains that pattern.


Python has a built-in module called re for working with regular expressions. Import it with import re to start using them. Its most commonly used functions are:

Function  Description
findall   Returns a list containing all matches
search    Returns a Match object if there is a match anywhere in the string
split     Returns a list where the string has been split on each match
sub       Replaces one or more matches with a string
match     Determines if the expression matches the beginning of the string
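As a quick illustration of the five functions listed above, here is a minimal sketch. The sample string and the patterns are made up for this example and are not part of the original article.

```python
import re

text = "The rain in Spain"  # hypothetical sample input

# findall: every non-overlapping match of "ai", as a list of strings
all_matches = re.findall(r"ai", text)

# search: the first match anywhere in the string (or None)
first = re.search(r"ai", text)

# split: break the string at each whitespace character
words = re.split(r"\s", text)

# sub: replace each whitespace character with a hyphen
hyphenated = re.sub(r"\s", "-", text)

# match: succeeds only if the pattern matches at the start of the string
at_start = re.match(r"The", text)
not_at_start = re.match(r"rain", text)

print(all_matches)         # ['ai', 'ai']
print(first.start())       # 5
print(words)               # ['The', 'rain', 'in', 'Spain']
print(hyphenated)          # The-rain-in-Spain
print(at_start is None)    # False
print(not_at_start is None)  # True
```

Note that search and match both return None when nothing matches, so it is safest to check the result before calling methods on it.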

Metacharacters

Metacharacters are characters with a special meaning inside a pattern.

Character  Description                                                                 Example
[]         Represents a set of characters                                              [a-c]
\          Signals a special sequence (can also be used to escape special characters)  \n
.          Any character (except the line break character)                             he..o
^          Starts with                                                                 ^hello
$          Ends with                                                                   end$
*          Zero or more occurrences                                                    eac*
+          One or more occurrences                                                     eac+
{}         Exactly the specified number of occurrences                                 eac{3}
|          Either or                                                                   one|other
()         Capture and group
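The metacharacters above are easier to follow when tried on small inputs. The following sketch uses invented strings and patterns purely for illustration.

```python
import re

# . matches any single character except a line break
dot = re.findall(r"he..o", "hello planet")

# ^ anchors the pattern at the start of the string, $ at the end
starts = re.search(r"^hello", "hello world") is not None
ends = re.search(r"world$", "hello world") is not None

# * allows zero or more repetitions of "c", + requires at least one
star = re.findall(r"ac*", "a ac acc")
plus = re.findall(r"ac+", "a ac acc")

# {3} requires exactly three repetitions of "c"
exactly_three = re.findall(r"ac{3}", "acc accc")

# | matches either alternative
either = re.findall(r"one|other", "one or the other")

# () captures the text matched inside the parentheses
groups = re.search(r"(\d+)-(\d+)", "range 10-20").groups()

print(dot)            # ['hello']
print(starts, ends)   # True True
print(star)           # ['a', 'ac', 'acc']
print(plus)           # ['ac', 'acc']
print(exactly_three)  # ['accc']
print(either)         # ['one', 'other']
print(groups)         # ('10', '20')
```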

Special sequences

A special sequence is specified with the \ character followed by one of the characters in the list below:

Character  Description                                                                       Example
\A         Matches if the specified characters are at the beginning of the string            \Atstart
\b         Matches where the specified characters are at the beginning or end of a word      r"\bwor", r"ord\b"
\B         Matches where the specified characters are NOT at the beginning (or end) of a word  r"\Bain", r"ain\B"
\d         Matches any digit (0-9)                                                           \d
\D         Matches any character that is not a digit                                         \D
\s         Matches any whitespace character                                                  \s
\S         Matches any character that is not whitespace                                      \S
\w         Matches any word character (letters a-z and A-Z, digits 0-9, and the underscore _)  \w
\W         Matches any character that is not a word character                                \W
\Z         Matches if the specified characters are at the end of the string                  String\Z
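To see a few of these special sequences in action, here is a short sketch. The sample sentence is an assumption chosen for the example. The patterns are written as raw strings (the r prefix) so that Python does not interpret the backslashes itself.

```python
import re

text = "Room 42 is worth 7 points"  # hypothetical sample input

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of word characters
spaces = re.findall(r"\s", text)    # single whitespace characters

# \b marks a word boundary: "wor" must be at the start of a word
starts_word = re.findall(r"\bwor\w*", text)

# \B is the opposite: "oi" must NOT be at the start of a word
inside_word = re.findall(r"\Boi", text)

# \A anchors at the start of the string, \Z at the end
at_beginning = re.search(r"\ARoom", text) is not None
at_end = re.search(r"points\Z", text) is not None

print(digits)       # ['42', '7']
print(words)        # ['Room', '42', 'is', 'worth', '7', 'points']
print(len(spaces))  # 5
print(starts_word)  # ['worth']
print(inside_word)  # ['oi']
print(at_beginning, at_end)  # True True
```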

Sets

A set is a group of characters inside a pair of square brackets [] with a special meaning.

Set         Description
[asd]       Returns a match where one of the specified characters (a, s, or d) is present
[a-d]       Returns a match for any lowercase character, alphabetically between a and d
[^asd]      Returns a match for any character except a, s, and d
[0123]      Returns a match where any of the specified digits (0, 1, 2, or 3) is present
[0-9]       Returns a match for any digit between 0 and 9
[0-5][0-9]  Returns a match for any two-digit number between 00 and 59
[a-zA-Z]    Returns a match for any character alphabetically between a and z, lowercase or uppercase
[+]         In sets, +, *, ., |, (), $, {} have no special meaning, so [+] means: return a match for any + character in the string
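The behaviour of square-bracket character classes can be checked with a few one-line experiments. This is a minimal sketch and the input strings are invented for the example.

```python
import re

# Any one of the listed characters
any_of = re.findall(r"[asd]", "sand")

# A range of lowercase letters
a_to_d = re.findall(r"[a-d]", "abcdef")

# ^ at the start of the brackets negates them: anything except a, s, d
none_of = re.findall(r"[^asd]", "sandy")

# Two classes in a row: a two-digit number between 00 and 59
minutes = re.findall(r"[0-5][0-9]", "18:45:99")

# Lowercase and uppercase letters only
letters = re.findall(r"[a-zA-Z]", "Py3!")

# Inside brackets, + is an ordinary character
plus_signs = re.findall(r"[+]", "1+1=2")

print(any_of)      # ['s', 'a', 'd']
print(a_to_d)      # ['a', 'b', 'c', 'd']
print(none_of)     # ['n', 'y']
print(minutes)     # ['18', '45']
print(letters)     # ['P', 'y']
print(plus_signs)  # ['+']
```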

With the metacharacters, special sequences, and character sets described above, you can build regular expressions. Let’s look at how some of the functions Python provides for evaluating them work.

findall() function

The findall function returns a list containing all matches.

The following example finds every run of digits in the string and returns them as a list.

# geekole.com
import re

p = re.compile(r'\d+')
x = p.findall('11 it is a number and 12 too')

print(x)

if x:
    print("Yes, there is at least one match!")
else:
    print("There are no matches")

Running the example we get the following:

['11', '12']
Yes, there is at least one match!

match() function

The match() function returns None if the pattern does not match at the beginning of the string. If successful, it returns a Match object containing information about the match: where it starts and ends, the substring that was matched, and more.

The following example uses a simple pattern to check whether a string looks like a valid email address.

# geekole.com
import re

ex = r"(^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+$)"
email = "email.test@domain.com"

if re.match(ex, email) is None:
    print("It is not a valid email address")
else:
    print("It is a valid email address")

Running the example we get the following output:

It is a valid email address

If you alter the email address, for example by adding a space, re.match() returns None, so the is None condition is true and the program reports that the address is not valid.
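The Match object described in this section can be inspected directly. The sketch below also contrasts match() with search(), since only search() scans the whole string. The sample text is an assumption made for this example.

```python
import re

text = "Order 66 shipped"  # hypothetical sample input

# search scans the whole string and returns the first match
found = re.search(r"\d+", text)

# match only tries at position 0, so it finds nothing here
at_start = re.match(r"\d+", text)

if found is not None:
    print(found.group())  # 66
    print(found.start())  # 6
    print(found.end())    # 8
    print(found.span())   # (6, 8)

print(at_start)           # None
```

Checking for None before calling group() or span() avoids an AttributeError when the pattern is not found.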

We hope these examples of how to use regular expressions in Python are useful to you.