Technology Software

How to Replace Regex on Python

    • 1). Open the IDLE text editor that comes bundled with the Python language by clicking on its icon. The IDLE text editor icon is located in the Python directory in your installed programs list (located under All Programs in the Windows Start Menu, and within the Applications folder in OSX). A blank source code file opens in the main editor window.

    • 2). Include the 're' module by writing this line at the top of the source code file:

      import re

    • 3). Declare a string and assign some email addresses to it, such as this:

      emailAddresses = 'William@amail.com, John@bmail.com, Bruce@cmail.com'

    • 4). Create a regular expression that searches for all possible text permutations in valid email addresses. Regular expressions work by searching for a pattern of characters in a string of text. The pattern you are interested in is any two words joined by an @ symbol. Since email addresses have many valid characters, you want to match all possible characters in each word before and after the @ symbol. This is accomplished with the regular expression [\w\.-], and by adding a + to the end of it, you can repeat this for all the characters. The completed regular expression can be saved to a string like this:

      regexPattern = r'([\w\.-]+)@([\w\.-]+)'

    • 5). Create a regular expression that replaces all of the domain names with "zmail.com." In this regular expression, the backreference character sequence \1 is used to replace the domain of the email addresses. The backreference refers to a location in a regular expression surrounded in parenthesis. By applying the regular expression to the first backreference, you save the email address but discard the old domain name. You can then add a new domain name, like '@zmail.com.' To save this second regular expression to a variable, you can write this:

      regexReplacement = r'\1@zmail.com'

    • 6). Apply the regular expressions to the string containing the email addresses like this:

      emailAddresses = re.sub(regexPattern, regexReplacement, emailAddresses)

    • 7). Print out the email addresses using this line of code. Python 3 uses this syntax for printing: print(emailAddresses), while Python 2 uses this syntax: print emailAddresses.

    • 8). Run the program by pressing the F5 key. The program output is:

      William@zmail.com, John@zmail.com, Bruce@zmail.com

Leave a reply