Intro to Regular Expressions, and How to Use Them

Do you ever wish you could search for multiple strings in a Google Analytics report? Or set up one trigger to fire on multiple pages in Google Tag Manager? Are you constantly having to create complex filters? Or running out of slots in your Segments or advanced searches?

If so, I’m here to introduce you to your new best friend: regular expressions.

What are Regular Expressions (RegEx)?

A regular expression (written as RegEx, regex, or regexp) is a text string, written in a special syntax, that describes a pattern or set of patterns to search for in text. Some variation of RegEx is built into most scripting languages, and you can also use it in all Google Marketing Platform products. It sounds complex and it might be a little intimidating at first, but once you get the hang of a few key characters, you’re on your way to reporting like a pro!

Basic Characters

RegEx uses special characters that each mean something different, and by combining them you can create very powerful patterns. The list below is not comprehensive; it covers the characters I use most often and recommend starting with.

Character: |
Definition: A bar (pipe) means “or”: the text on either side of it can match.
Example: infotrust|InfoTrust
Contains: infotrust or InfoTrust

Character: ( )
Definition: Parentheses group characters together so they can be treated as a single unit.
Example: (Info)Trust
Contains: InfoTrust

Character: [ ]
Definition: Square brackets match any single one of the characters inside them.
Example: b[aeiou]bble
Contains: babble, bebble, bibble, bobble, or bubble

Character: [a-b] or [0-9]
Definition: A hyphen between two letters or digits inside square brackets designates a range of letters or digits.
Example: [b-f]at
Contains: bat, cat, dat, eat, or fat

Character: ?
Definition: A question mark makes the previous character optional (zero or one occurrence).
Example: favou?rite
Contains: favorite or favourite

Character: *
Definition: An asterisk means the previous character can appear zero or more times.
Example: go*gle
Contains: ggle, gogle, google, gooogle, goooogle, etc.

Character: +
Definition: A plus sign means the previous character must appear one or more times.
Example: go+gle
Contains: gogle, google, gooogle, goooogle, etc.

Character: { }
Definition: Curly brackets with a number inside repeat the previous character (or group of characters) exactly that many times.
Example: b{3}
Contains: bbb

Character: { , }
Definition: Curly brackets with two numbers separated by a comma set a minimum and maximum number of repeats for the previous character (or group of characters).
Example: a{3,6}
Contains: aaa, aaaa, aaaaa, or aaaaaa

Character: \
Definition: A backslash escapes any of the special characters so that it is treated as plain text.
Example: InfoTrust is the Sh\*t
Contains: InfoTrust is the Sh*t

Character: \d
Definition: A single digit from 0 to 9 (shorthand for the range [0-9]).
Example: \d
Contains: 0, 1, 2, 3, 4, 5, 6, 7, 8, or 9

Character: \n
Definition: Designates a new line in the text.
Example: Hello\nWorld
Contains: Hello followed by World on the next line

Character: .
Definition: A period matches any single character (letter, digit, or symbol).
Example: inf.trust
Contains: infotrust, inf*trust, or inf8trust

Character: ^
Definition: A caret marks the beginning of a string.
Example: ^cat
Begins with: cat

Character: $
Definition: A dollar sign marks the end of a string.
Example: dog$
Ends with: dog
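
The characters above are easy to try out in a few lines of code. The following is a minimal sketch, assuming Python's built-in `re` module as a stand-in for whichever RegEx engine your tool uses (engines differ slightly, so treat the results as illustrative). The `contains` helper and the sample strings are made up for this example; `re.search` plays the role of a "contains" match.

```python
import re

# (pattern, text that should match, text that should not match)
CASES = [
    (r"infotrust|InfoTrust", "InfoTrust", "INFOTRUST"),
    (r"b[aeiou]bble", "bubble", "bybble"),
    (r"[b-f]at", "cat", "hat"),
    (r"favou?rite", "favourite", "favouurite"),
    (r"go*gle", "ggle", "gugle"),
    (r"go+gle", "google", "ggle"),
    (r"a{3,6}", "aaaa", "aa"),
    (r"Sh\*t", "Sh*t", "Shot"),
    (r"inf.trust", "inf8trust", "inftrust"),
    (r"^cat", "catalog", "bobcat"),
    (r"dog$", "hotdog", "dogma"),
]


def contains(pattern: str, text: str) -> bool:
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


for pattern, hit, miss in CASES:
    print(f"{pattern}: {hit} -> {contains(pattern, hit)}, {miss} -> {contains(pattern, miss)}")
```

Each line prints `True` for the first sample and `False` for the second. Note that matching here is case-sensitive, which is why `INFOTRUST` does not match the first pattern.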

Common Combinations

Using these common combinations, along with the basic characters, you can quickly start using RegEx in your reporting.

Combo: .*
Definition: A period (any character) followed by an asterisk (zero or more of the previous character) matches any run of characters, including none.
Example: .*\.example\.com
Contains: www.example.com, sub1.example.com, or sub2.example.com
Note: This counts every hostname that contains “.example.com”. The bare domain example.com does not match, because the pattern requires a literal period before “example”. Add $ to the end if the hostname must end in “.example.com”.

Combo: (( ))
Definition: Nested parentheses group different actions together, especially when you want another RegEx character to act on a whole set of characters. It always reminds me of PEMDAS in middle school!
Example: rege(x(es)?|xps?)
Contains: regex, regexes, regexp, or regexps

Combo: ( )? or ( )+
Definition: A character such as a question mark or plus sign placed after parentheses acts on the whole group inside the parentheses, not just on a single character.
Example: g(oog)+le
Contains: google, googoogle, googoogoogle, googoogoogoogle, etc.

Combo: \d{ }
Definition: \d with curly brackets helps with fixed-length number patterns such as phone numbers or SSNs.
Example: \d{5}(-\d{4})?
Contains: 03948 or 03948-4758 (a ZIP code, with or without the four-digit extension)

Combo: \d+
Definition: \d with a plus sign matches a whole number of any length, which helps when values range from 0 to 100000 and you want to account for all of them.
Example: \d+(\.\d\d)?
Contains: X, XX, X.XX, XX.XX, XXX.XX, XXXX.XX, etc.
Note: A whole number, optionally followed by a decimal point and exactly two digits. X is a digit [0-9].

Combo: \? \. \/
Definition: A backslash in front of any RegEx character turns it back into a regular character.
Example: www\.example\.com\/test\?p=xtest
Contains: www.example.com/test?p=xtest

Combo: ^ $
Definition: Using a caret and a dollar sign together says that the string is EXACTLY whatever is between the two characters.
Example: ^InfoTrust$
Is exactly: InfoTrust
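
The combinations above can be checked the same way. This is a rough sketch, again assuming Python's `re` module rather than the engine inside any particular Google product. The `is_exactly` helper is a name invented for this example; it uses `re.fullmatch`, which for these single-line inputs behaves like wrapping the pattern in `^` and `$`.

```python
import re

ZIP = r"\d{5}(-\d{4})?"
PRICE = r"\d+(\.\d\d)?"
SUBDOMAIN = r".*\.example\.com"
REGEX_WORD = r"rege(x(es)?|xps?)"
GROUPED = r"g(oog)+le"


def is_exactly(pattern: str, text: str) -> bool:
    """Return True only if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# The optional group makes the four-digit extension optional.
print(is_exactly(ZIP, "03948"), is_exactly(ZIP, "03948-4758"), is_exactly(ZIP, "0394"))

# Exactly two digits are required after the decimal point, if there is one.
print(is_exactly(PRICE, "100"), is_exactly(PRICE, "12.50"), is_exactly(PRICE, "12.5"))

# The escaped period before "example" means the bare domain is not matched.
print(is_exactly(SUBDOMAIN, "sub1.example.com"), is_exactly(SUBDOMAIN, "example.com"))

# Nested groups and a quantifier applied to a whole group.
print(is_exactly(REGEX_WORD, "regexps"), is_exactly(REGEX_WORD, "regexs"))
print(is_exactly(GROUPED, "googoogle"), is_exactly(GROUPED, "gogle"))
```

A "contains" style match would be looser: with `re.search`, the ZIP pattern would also be found inside a longer run of digits, which is the reason to add `^` and `$` when you need an exact match.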

Tools to Help

The only downside of RegEx is that it takes a little while to get the hang of, so you should always test your expressions, both to confirm you’re using the characters correctly when starting out and to check your work when you try a new combination of characters. To help me learn (and continually use) RegEx, I rely on two types of tools: one for pattern visualization and one for matching.

Pattern Visualization:

Pattern visualizers let you work through a RegEx to confirm you’re building the correct pattern (it’s easy to forget a parenthesis or character when writing exceptionally long expressions). There are multiple free tools, but I like the simplicity of regexper.com. I recommend a tool like this especially when you’re starting out and getting the hang of the new “language” of RegEx. I found it most helpful for visualizing nested groups, since a long string can quickly get messy.

Example RegEx: .*(This Is A Tool (That Helps (Visualize|Simplify) a Complex (|or nested )Expression)).*
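
One way to read a nested expression like this without a diagram is to write out every sentence its alternatives allow. The sketch below does that for the example RegEx, assuming Python's `re` module; the `VERBS` and `OPTIONAL` lists and the `expand` helper are written by hand for this illustration and are not produced by any tool.

```python
import re
from itertools import product

PATTERN = r".*(This Is A Tool (That Helps (Visualize|Simplify) a Complex (|or nested )Expression)).*"

# The two "or" groups in the pattern, spelled out by hand.
VERBS = ["Visualize", "Simplify"]
OPTIONAL = ["", "or nested "]  # the empty alternative makes this phrase optional


def expand() -> list[str]:
    """Build every sentence the two groups can produce."""
    return [
        f"This Is A Tool That Helps {verb} a Complex {extra}Expression"
        for verb, extra in product(VERBS, OPTIONAL)
    ]


for sentence in expand():
    print(bool(re.search(PATTERN, sentence)), sentence)
```

Two choices in the first group and two in the second give four sentences, and each one matches the pattern.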

Matching:

Matching tools are useful when you have specific strings that you want to match, but also have others that you want to avoid. I’ve used them most often when I have multiple URLs that I want to match while making sure to avoid other URLs. Regextester.com is another free tool worth trying out, specifically for matching. (Side note: Be aware that a lot of RegEx matching tools are meant for developers, so when you’re working with RegEx for Google Marketing Platform products, you will want to make sure the tool is set to JavaScript’s version of RegEx, as the syntax differs slightly between languages.)

Example RegEx: .*(This Is A Tool (That Helps (Visualize|Simplify) a Complex (|or nested )Expression)).*
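
The check a matching tool performs can also be sketched in code: list the strings you want to match, list the ones you want to avoid, and confirm the pattern gets both lists right. This is only an illustration, assuming Python's `re` module; the page paths, the pattern, and the `check` helper are all hypothetical.

```python
import re

# Hypothetical page paths, for illustration only.
PATTERN = r"^/(products|services)/.+"
SHOULD_MATCH = ["/products/shoes", "/services/consulting"]
SHOULD_AVOID = ["/products/", "/about/products/shoes", "/blog/services"]


def check(pattern: str, should_match: list[str], should_avoid: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the pattern behaves as intended."""
    compiled = re.compile(pattern)
    problems = []
    for text in should_match:
        if not compiled.search(text):
            problems.append(f"missed: {text}")
    for text in should_avoid:
        if compiled.search(text):
            problems.append(f"wrongly matched: {text}")
    return problems


print(check(PATTERN, SHOULD_MATCH, SHOULD_AVOID))  # prints []
```

Trying a looser pattern such as `products` against the same lists reports one missed path and two wrongly matched ones, which is the kind of mistake this test is meant to catch before the pattern goes into a filter or trigger.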

Conclusion

You now have a good idea of how to start using RegEx in your filters, segments, Data Studio reports, or even GTM triggers. There are literally millions of possibilities for what you need, and only you will know the patterns that suit your data set. Hopefully this is enough to get you started, but if you do have questions, feel free to reach out to the analytics consultants and engineers at InfoTrust.

Now, young padawan, go out into the world and start feeling like a technical reporting genius and impress all your coworkers with the power of RegEx!
