In this chapter you will learn about regular expressions and the regular expression operations defined in the re module of the Python programming language. Regular expressions are also called regex or regexp.
In computer science, a regular expression defines a language, that is, a set of strings. Languages that can be defined this way are called regular languages. In Python, a regular expression is a sequence of characters, written in a formal syntax, that describes a pattern to match against a string. In other words, a regular expression is a text pattern used to check whether a given string, or part of it, has a particular form.
You can think of regular expressions as a small programming language embedded in Python. This language is made available in a Python program by importing the re module. re is a module in Python's standard library that supports regular expression matching operations.
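As noted above, a regular expression describes a set of strings. A minimal sketch of that idea follows. The pattern ab* and the list of candidate strings are made up for illustration, and re.fullmatch (a function not covered in this chapter) is used only because it succeeds when the whole string fits the pattern.

```python
import re

# Hypothetical pattern for illustration: "a" followed by zero or more "b".
pattern = r"ab*"

# Hypothetical candidate strings to test against the pattern.
candidates = ["a", "ab", "abbb", "ba", "abc"]

# re.fullmatch returns a match object only when the entire string fits
# the pattern, so it tests membership in the set the pattern describes.
accepted = [s for s in candidates if re.fullmatch(pattern, s)]

print(accepted)  # ['a', 'ab', 'abbb']
```

The strings "ba" and "abc" are rejected because they contain characters that the pattern does not allow in those positions.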
We use a regular expression to define rules that describe the set of possible strings we want to match. Regular expressions in Python are interpreted as a set of instructions. When you compile a regular expression, it is converted into bytecode, and this bytecode is then executed by a matching engine written in C.
One thing to note here is that not every string processing task can be done with regular expressions, because the regular expression language is small and restricted. In Python you import a standard library module to use regular expressions, whereas in some other programming languages, for example Perl, regular expression syntax is built directly into the language.
In Python, however, once the re module is imported, the regular expression syntax is similar to the one used in Perl.

The match function:
The match function tries to match the RE pattern at the beginning of the given string. It accepts optional flags that change the behavior of the regular expression.
The following is the syntax of the match function in Python:
re.match(pattern, string, flags=0)
In this syntax we have three arguments:
pattern: the regular expression to be matched
string: the given string to be matched against the regular expression
flags: optional values that change the behavior of the regular expression
If the match succeeds, a match object is returned; otherwise None is returned. The match object has two main methods, group(num) and groups(). group(num) returns the entire match (when num is 0 or omitted) or a specific subgroup, and groups() returns all the subgroups as a tuple.

Example 1: Using the match function
The following example demonstrates how you can use the match function:
CODE:
import re
text = "Hello Python Programming"
mobj = re.match(r"hello", text, re.I)
print(mobj.group())
In this code, the re module is imported first.
Then we have a string that will be compared with our RE pattern. The match function is called through the re module, and the value it returns is assigned to the match object mobj. Inside the parentheses, the first argument is the pattern to be matched, the second is the given string in which the pattern is looked for, and the third is a flag value. Here the flag is re.I, which stands for IGNORECASE, so differences between uppercase and lowercase letters in the pattern and the string are ignored.
At the end, the matched text is printed and you have the following output:
OUTPUT:
Hello
Note:
In this example we have used the prefix r, which marks the pattern as a raw string. In a raw string, Python does not treat the backslash as an escape character, so a backslash meant for the regular expression is written once, not doubled as it must be in a regular string. This is the only difference between a regular and a raw string.

Example 2: Using match function with regular String
Consider the example below in which we have used a regular string instead of a raw string:
CODE:
import re
text = "\\tHello Python Programming"
mobj = re.match("\\thello", text, re.I)  # no match
text = "\tHello Python Programming"
mobj = re.match("\\thello", text, re.I)  # \thello is matching
The regular string "\\thello" reaches the regular expression engine as \thello, which means a tab character followed by hello. The first string starts with a literal backslash and the letter t, so there is no match. The second string starts with a real tab character, so the match succeeds.

The Search function:
The search function searches for the RE pattern anywhere in the given string. It takes three arguments: the pattern, the given string, and optional flags, in that order.
The following is the syntax of the search function in Python:
re.search(pattern, string, flags=0)
Here pattern is the regular expression that we search for in the given string. This function returns a match object if it finds the pattern in the given string; otherwise None is returned. As with match, group(num) returns the entire match or a specific subgroup, and groups() returns all the subgroups as a tuple.

Example 3: Using search function
The following Python code demonstrates the use of the search function:
CODE:
import re
text = "Hello Python Programming"
sobj = re.search(r"programming", text, re.I)
print(sobj.group())
OUTPUT:
Programming
In this code we have searched for the word programming. The search function searches the entire string.
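The methods of a match object can be sketched on the same sentence. The pattern below, with its two parenthesized subgroups, is a hypothetical one chosen for illustration: group() returns the entire match, group(num) returns one numbered subgroup, and groups() returns all subgroups as a tuple.

```python
import re

text = "Hello Python Programming"

# Hypothetical pattern: two capturing subgroups, one per word.
sobj = re.search(r"(python) (programming)", text, re.I)

if sobj is None:
    print("no match")
else:
    print(sobj.group())   # Python Programming  (entire match)
    print(sobj.group(1))  # Python              (first subgroup)
    print(sobj.group(2))  # Programming         (second subgroup)
    print(sobj.groups())  # ('Python', 'Programming')
```

Checking for None before calling group() avoids an AttributeError when the pattern is not found.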
The difference between search and match is that the match function checks only at the beginning of the string, whereas search looks through the entire string.

Example 4: Searching at the beginning
If you want to search only at the beginning of the string, you can use ^. Consider the following example:
CODE:
import re
text = "Hello Python Programming"
sobj = re.search(r"^programming", text, re.I)
print(sobj)  # None: no match is found
sobj = re.search(r"^hello", text, re.I)
print(sobj.group())  # matching: Hello
Here ^ makes the search succeed only at the beginning of the string. The first result is printed directly because calling group() on None would raise an AttributeError.

Example 5: Searching at the end
You can also search at the end of the given string.
This can be done using $ at the end of the pattern. Consider the code below:
CODE:
import re
text = "Hello Python Programming"
sobj = re.search(r"programming$", text, re.I)
print(sobj.group())  # matching: Programming
sobj = re.search(r"hello$", text, re.I)
print(sobj)  # None: no match found

Compiling Regular Expressions:
Regular expressions in Python, when compiled, are converted into patterns.
These patterns are pattern objects, which provide methods for different tasks such as searching, matching, and replacing. Once you compile a pattern, it can be reused later in the program.

Example 6: Using Precompiled Patterns