Holla Tech - Learn

Email Extraction
 

To demonstrate a sample use of regular expressions, let's create a program that extracts email addresses from a string.
Suppose we have a text that contains an email address: 

text = "Please contact info@htlearn.com for assistance"

 

Our goal is to extract the substring “info@htlearn.com”.
A basic email address consists of a word that may include dots or dashes, followed by the @ sign and the domain name (the name, a dot, and the domain name suffix).
This is the basis for building our regular expression.

pattern = r"([\w\.-]+)@([\w\.-]+)(\.[\w\.]+)"

 

[\w\.-]+ matches one or more word characters, dots, or dashes.
The regex above says that the string should contain a word (with dots and dashes allowed), followed by the @ sign, then another similar word, then a dot and another word.

NOTE!
Our regex contains three groups: 1 – first part of the email address. 2 – domain name without the suffix. 3 – the domain suffix.
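
As a small sketch of how the three groups described above can be read individually, the snippet below applies the lesson's pattern to the sample sentence and pulls out each group by number. The variable names (text, local_part, domain_name, suffix) are chosen here for illustration only.

```python
import re

pattern = r"([\w\.-]+)@([\w\.-]+)(\.[\w\.]+)"
text = "Please contact info@htlearn.com for assistance"

match = re.search(pattern, text)
if match:
    local_part = match.group(1)   # first part of the address: "info"
    domain_name = match.group(2)  # domain name without the suffix: "htlearn"
    suffix = match.group(3)       # domain suffix, including the dot: ".com"
    print(local_part, domain_name, suffix)
```

Calling match.groups() would return the same three values as a single tuple.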


Putting it all together:

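import re

pattern = r"([\w\.-]+)@([\w\.-]+)(\.[\w\.]+)"
text = "Please contact info@htlearn.com for assistance"

# re.search returns the first match, or None if nothing matches
match = re.search(pattern, text)
if match:
    # group() with no arguments returns the whole matched substring
    print(match.group())

 

Result: 

>>>
info@htlearn.com
>>>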

 

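In case the string contains multiple email addresses, we could use the re.findall method instead of re.search to extract all of them. Because our pattern contains groups, re.findall returns a tuple of the three groups for each match rather than the full address, so the parts have to be joined back together (or re.finditer can be used to get match objects instead).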
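
A minimal sketch of extracting several addresses follows. The second address in the sample string is made up for this illustration, and the variable names are assumptions, not part of the lesson. It shows both options: joining the group tuples returned by re.findall, and reading the full match from re.finditer.

```python
import re

pattern = r"([\w\.-]+)@([\w\.-]+)(\.[\w\.]+)"
# Hypothetical sample text with two addresses
text = "Contact info@htlearn.com or support@htlearn.com"

# With groups in the pattern, findall returns one tuple of groups per match
group_tuples = re.findall(pattern, text)

# Rebuild each address: local part + "@" + domain name + suffix
emails_from_findall = [local + "@" + name + suffix
                       for local, name, suffix in group_tuples]

# finditer yields match objects, so group() gives the full address directly
emails_from_finditer = [m.group() for m in re.finditer(pattern, text)]

print(emails_from_findall)
print(emails_from_finditer)
```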

NOTE!
The regex in this example is for demonstration purposes only. A much more complex regex is required to fully validate an email address.
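
To illustrate why the pattern is not suitable for validation, the sketch below feeds it a string that is clearly not a usable email address but still matches, because dots and dashes are accepted anywhere a word character is. The test string is made up for this illustration.

```python
import re

pattern = r"([\w\.-]+)@([\w\.-]+)(\.[\w\.]+)"

# Only punctuation around the @ sign, yet the pattern accepts it
odd = "..@-.."
odd_match = re.search(pattern, odd)

if odd_match:
    print("Matched:", odd_match.group())
```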



©️ License: All Rights Reserved 

