Google sheet is full of many features and especially regexes, commonly called regular expressions. Their fields of use are very varied and allow you to process large volumes of databases. Regexes are perfect for refining the information of your prospects before launching your prospecting campaigns. Let’s see how to use them to improve the quality of your copywriting or your lead or account targeting.
Regex: definition
Regular expressions or regexes allow you to differentiate one sequence of characters from another.
It is a syntax that allows, among other things, to clean large databases and to remove information that is not useful.
The best known use of regex is in forms. Here is an example in the contact form.
Let’s imagine that we have a contact form on our site asking :
- last name
- first name
All fields are required to submit the form.
Something very common is that a user adds an email address that is not in the “correct” format, for example :
- jonathan@gmail.
- jonathangmail.com
- jonathan@g.fr
Regexes can identify that the email is not properly formed, this is pure semantic analysis.
Now that you understand this, let’s see how to use it in the context of a lead or account database.
The list of possibilities is endless, we will talk about 3 elements that you can implement right now on your databases on Google Sheet.
If you use Derrick, it is easy to :
- Preparing regex formulas;
- Import a list with Derrick’s LinkedIn lead import;
- Directly process leads in 1 blink of an eye with regexes.
➡️ Let’s take a step back:
In the example below, you can see that you can easily find all the gmail, but you will have to do the same for yahoo, hotmail, etc. and replace by hand each time.
It can be very time consuming especially if you can do everything automatically 😉
Regular expressions are great for dealing with large databases. Let’s see in the next points the 3 formulas available on Google Sheet of regex to clean up your databases as much as possible.
Regex & Google Sheet : the main formulas
At first sight, it may seem a bit complex, but it’s a much easier language to understand than English, it’s actually quite basic.
I recommend this training if you want to learn.
There are 3 main forms you can use in Google Sheet.
REGEXMATCH
The REGEXMATCH function indicates if the result of your formula is TRUE. In other words, if the text on which you apply your regex meets the requirement of this one or not.
Example of using REGEXMATCH to validate that phone numbers are correct in our database.
1st step: define what a phone number looks like
Here are the possible cases in the case of French numbers:
- 06 47 47 47 47
- 06.47.47.47.47
- 0647474747
- +33647474747
2nd step: isolate the common patterns
- There are at least 8 digits and maximum 11 digits, but only if the character sequence starts with a “+” ;
- There can be spaces between the digits or dots;
- the sequence starts with a 0 or a +.
We have defined our patterns, it’s time to translate them into regex. I’ll skip the explanation but I’ll translate our patterns into regex.
Here is a nice and simple tool to check that your regexes are working: https: //regex101.com/
Here is the regex :
^(?:(?:\+|00)33[\s.-]{0,3}(?:\(0\)[\s.-]{0,3})?|0)[1-9](?:(?:[\s.-]?\d{2}){4}|\d{2}(?:[\s.-]?\d{3}){2})$
This is what I get when I use the RegexMatch function of Google Sheet:
In TRUE: Valid french phone numbers.
In FALSE : Invalid French telephone numbers.
Now imagine doing this on hundreds of leads, for emails, phone numbers or any other data.
REGEXEXTRACT
The REGEXEXTRACT formula allows you to extract a part of a cell’s text from a regular expression. With Derrick for example, you can retrieve the bio of your leads’ LinkedIn profiles. This bio is a gold mine of information, sometimes there are numbers or emails of information. Regexes will allow you to extract them automatically without having to read the bio of each of your leads and extract them by hand!
Example of using REGEXEXTRACT to find an email in a bio
1- Start by importing your leads with Derrick (here is a video explanation).
2- Same as for RegexMatch, let’s first define what an email looks like.
- contact@derrick-app.com
- jonathan@gmail.com
- jona123@gmail.fr
- m0nsUPERMail@1234.fr
3- The patterns :
4- Back to https://regex101.com/
There are more or less “perfect” regex, for the example I went to the essential.
Here is the regex:
^(?:(?:\+|00)33[\s.-]{0,3}(?:\(0\)[\s.-]{0,3})?|0)[1-9](?:(?:[\s.-]?\d{2}){4}|\d{2}(?:[\s.-]?\d{3}){2})$Here is what I get when I use the Regex Extract function of Google Sheet on the summary column of Derrick. Result on the “email in summary” column.
=iferror(REGEXEXTRACT({{cellule-Summary}};”[a-z0-9]+@[a-z]+\.[a-z]{2,3}”);””)REGEXREPLACE
The REGEXREPLACE function allows to replace a part of a text with a regular expression.
regexreplace S.A.R.L
“Hello Sir, I saw that you work at Super Enterprise”
➡️ Disqualified
What if we could clean this up in a split second. Here we go!
1- Start by importing your companies with Derrick (here’s an explanation video)
2- You have understood the following
- SARL
- SAS
- EURL
- S.A.S
- S.A.R.L
- ..
3- Our patterns
- sequence of characters
- the sequence of characters is in uppercase
- the sequence of characters is finished SAS,SARL,..
4- Here is the very simplified regex
(SARL|S.A.R.L|SAS|S.A.S|EURL|E.U.R.L)This is what it looks like when I use the Google Sheet Regex replace function on Derrick’s CompanyName column.
TUTO : Clean a Derrick database with regexes
Step 1: Download and install Derrick on Google SheetDerrick
is a free application that installs directly on your Chrome browser. It allows you to retrieve your leads’ data directly from LinkedIn Sales Navigator
Build Leads Lists in Seconds ! Install on Google Sheet : Import, Find & Enrich (with email) Leads & Companies in minutes. Start Now and get 200 leads for free !
Step 2 : Scrape leads or companies on Linkedin Sales Navigator to your Google Sheet document:
- Method 1: scrape leads
- Method 2: scrape companies
Step 3: Enriching enterprise data
Once all this work is done, you can start cleaning up your database using regexes.
I stretch my formula and get only the companies containing the regular expression “seo”. This is easily identified with the data TRUE or FALSE. Here, I will filter only on TRUE to have only the companies specialized in SEO.
The potential of these simple functions is enormous for database processing. There are also syntax rules with regexes that you can use to further refine your information
If you have any questions about the formulas, feel free to ask us in comments.
0 Comments