
REGEX FOR BEGINNERS

And bullies

Why use regex

  • speed up text manipulation
    • find stuff quickly
    • replace stuff quickly
  • make everyone jealous

/^(?=.*(\[Rep SG:[^\]]*\]))(?=.*(\[Rep CR:[^\]]*\]))(?=.*(\[Rep TY:[^\]]*\])).*$/


Standard text search

X => X


blastimir

blast you bleeding blastimir!


Problem

blast

blast you bleeding blastimir!
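
One reading of this slide, sketched in Python: a plain text search only compares characters, so the short word is also found inside the longer name. The helper `find_all` is a made-up name for this illustration, built on the standard `str.find`.

```python
def find_all(needle: str, haystack: str) -> list:
    """Return the start index of every occurrence of needle in haystack."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


text = "blast you bleeding blastimir!"

# The full name occurs exactly once.
blastimir_hits = find_all("blastimir", text)

# The shorter word is found as a word AND inside the name.
blast_hits = find_all("blast", text)

print(blastimir_hits, blast_hits)
```

Plain search has no way to say "only as a whole word"; that is the kind of thing regex syntax adds.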


Regex Syntax

X =/= X

[ ] \ ^ $ . | ? * + ( ) { }


[...] => one of the listed characters
\d    => any digit
A-Z   => any uppercase letter (inside [...])
a-z   => any lowercase letter (inside [...])
\t    => tab
\n    => line feed
...
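
A small sketch of what "X =/= X" means in practice, assuming Python's standard `re` module (the slides themselves do not name a regex engine). The sample strings are invented for illustration.

```python
import re

samples = ["a.c", "abc"]

# An unescaped dot is a metacharacter: it matches any character.
loose = [s for s in samples if re.fullmatch(r"a.c", s)]

# Escaped with a backslash, the dot only matches a literal dot.
strict = [s for s in samples if re.fullmatch(r"a\.c", s)]

# re.escape() adds the backslashes for you when the text is meant literally.
escaped = re.escape("1+1=2?")

# Character classes: one uppercase letter followed by one or more digits.
codes = re.findall(r"[A-Z]\d+", "A1 b22 C333")

print(loose, strict, escaped, codes)
```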

Simple example

Looking for:

drink drank drunk

Not interested in:

drinking drunkard antidrink


Standard search

  • multiple searches (drink, drank, drunk)
  • false positives

Regex

  • one search to rule them all


 drink
 drank
 drunk
 drek
 drinking
 drunkard
 antidrink

 


+drink
 drank
 drunk
 drek
+drinking
 drunkard
+antidrink

drink


+drink
+drank
 drunk
 drek
+drinking
 drunkard
+antidrink

dr[ia]nk


+drink
+drank
+drunk
 drek
+drinking
+drunkard
+antidrink

dr[iau]nk


+drink
+drank
+drunk
 drek
 drinking
 drunkard
+antidrink

dr[iau]nk\b


+drink
+drank
+drunk
 drek
 drinking
 drunkard
 antidrink

\bdr[iau]nk\b
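
The steps of this example can be replayed with a short Python sketch, assuming the standard `re` module; `matching` is a hypothetical helper that returns the words a pattern hits.

```python
import re

words = ["drink", "drank", "drunk", "drek", "drinking", "drunkard", "antidrink"]


def matching(pattern: str) -> list:
    """Return every word in which the pattern is found somewhere."""
    return [w for w in words if re.search(pattern, w)]


# Same progression as on the slides, from plain text to word boundaries.
for pattern in [r"drink", r"dr[ia]nk", r"dr[iau]nk", r"dr[iau]nk\b", r"\bdr[iau]nk\b"]:
    print(f"{pattern:16} {matching(pattern)}")
```

`\b` matches the position between a word character and a non-word character (or the start or end of the text), which is why it removes `drinking` on the right and `antidrink` on the left.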


Example 2

Filter files


 german.jpg
 midget.png
 bondage.jpeg
 definitelyajpg.rar
 nudes.jpg.zip
 rtfm.txt
 send_nudes.eml
 goaway.tar.gz

 


+german.jpg
 midget.png
 bondage.jpeg
+definitelyajpg.rar
+nudes.jpg.zip
 rtfm.txt
 send_nudes.eml
 goaway.tar.gz

jpg


+german.jpg
 midget.png
 bondage.jpeg
 definitelyajpg.rar
+nudes.jpg.zip
 rtfm.txt
 send_nudes.eml
 goaway.tar.gz

\.jpg


+german.jpg
 midget.png
 bondage.jpeg
 definitelyajpg.rar
 nudes.jpg.zip
 rtfm.txt
 send_nudes.eml
 goaway.tar.gz

\.jpg$


+german.jpg
 midget.png
+bondage.jpeg
 definitelyajpg.rar
 nudes.jpg.zip
 rtfm.txt
 send_nudes.eml
 goaway.tar.gz

\.jpe?g$


+german.jpg
+midget.png
+bondage.jpeg
 definitelyajpg.rar
 nudes.jpg.zip
 rtfm.txt
 send_nudes.eml
 goaway.tar.gz

\.(jpe?g|png)$
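
Where the parentheses sit around `|` matters. The sketch below, assuming Python's `re` module and a hypothetical `matching` helper, compares one group around the whole alternation with two separate groups.

```python
import re

files = [
    "german.jpg", "midget.png", "bondage.jpeg", "definitelyajpg.rar",
    "nudes.jpg.zip", "rtfm.txt", "send_nudes.eml", "goaway.tar.gz",
]


def matching(pattern: str) -> list:
    """Return every file name in which the pattern is found somewhere."""
    return [f for f in files if re.search(pattern, f)]


# One group around the alternation: the dot and the $ apply to both branches.
images = matching(r"\.(jpe?g|png)$")

# Two groups: this reads as "\.(jpe?g) anywhere" OR "(png) at the end".
loose = matching(r"\.(jpe?g)|(png)$")

print(images)
print(loose)
```

Alternation has the lowest precedence in a regex, so the second pattern lets `nudes.jpg.zip` back in.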


Example 3

Looking for drunk campaigns


 Campaign: drink
 Campaign: drink too much
 Campaign: #69
 Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
 Campaign: 666
 Campaign:

 


+Campaign: drink
+Campaign: drink too much
+Campaign: #69
+Campaign: #666
+Not a Campaign: oops
+Your mom is a Campaign
+Campaign: 666
+Campaign:

Campaign


+Campaign: drink
+Campaign: drink too much
+Campaign: #69
+Campaign: #666
+Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
+Campaign:

Campaign:


+Campaign: drink
+Campaign: drink too much
+Campaign: #69
+Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
+Campaign:

^Campaign:


 Campaign: drink
 Campaign: drink too much
 Campaign: #69
 Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
 Campaign: 666
+Campaign:

^Campaign:$


+Campaign: drink
+Campaign: drink too much
 Campaign: #69
 Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
 Campaign:

^Campaign: \w+


+Campaign: drink
 Campaign: drink too much
 Campaign: #69
 Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
 Campaign:

^Campaign: \w+$


+Campaign: drink
+Campaign: drink too much
 Campaign: #69
 Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
 Campaign:

^Campaign:( \w+){1}


+Campaign: drink
 Campaign: drink too much
 Campaign: #69
 Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
 Campaign:

^Campaign:( \w+){1}$


 Campaign: drink
+Campaign: drink too much
 Campaign: #69
 Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
 Campaign: 666
 Campaign:

^Campaign:( \w+){3}


+Campaign: drink
+Campaign: drink too much
 Campaign: #69
 Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
 Campaign:

^Campaign:( \w+)+


+Campaign: drink
+Campaign: drink too much
+Campaign: #69
+Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
+Campaign:

^Campaign:( \w+)*


 Campaign: drink
 Campaign: drink too much
 Campaign: #69
 Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
 Campaign:

^Campaign: \d+


 Campaign: drink
 Campaign: drink too much
+Campaign: #69
+Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
 Campaign:

^Campaign: .\d+


 Campaign: drink
 Campaign: drink too much
+Campaign: #69
+Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
 Campaign: 666
 Campaign:

^Campaign: #\d+


 Campaign: drink
 Campaign: drink too much
+Campaign: #69
+Campaign: #666
 Not a Campaign: oops
 Your mom is a Campaign
+Campaign: 666
 Campaign:

^Campaign: #?\d+
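
A few of the patterns above, replayed as a Python sketch with the standard `re` module. The `matching` helper and the variable names are made up for this illustration.

```python
import re

lines = [
    "Campaign: drink",
    "Campaign: drink too much",
    "Campaign: #69",
    "Campaign: #666",
    "Not a Campaign: oops",
    "Your mom is a Campaign",
    "Campaign: 666",
    "Campaign:",
]


def matching(pattern: str) -> list:
    """Return every line in which the pattern is found."""
    return [line for line in lines if re.search(pattern, line)]


one_word = matching(r"^Campaign: \w+$")        # exactly one word after the colon
three_words = matching(r"^Campaign:( \w+){3}")  # group repeated three times
numbered = matching(r"^Campaign: #?\d+")        # optional "#", then digits

print(one_word, three_words, numbered, sep="\n")
```

The quantifiers `{n}`, `+`, `*` and `?` always apply to the single item just before them, which is why the word pattern is wrapped in a group before repeating it.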


Example 4

Duplicate words


 That is not a test.
+That is not not a test.
+This is a test.
+This is is a test.

(\w+)\s\1


 That is not a test.
+That is not not a test.
 This is a test.
+This is is a test.

(\b\w+\b)\s\1
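
A backreference can also drive a replacement. The sketch below assumes Python's `re` module and uses a slight variant of the pattern, `\b(\w+)\s\1\b`, with the boundaries outside the group; the trailing `\b` is an addition not on the slide, meant to keep something like "the then" from counting as a duplicate.

```python
import re

sentences = [
    "That is not a test.",
    "That is not not a test.",
    "This is a test.",
    "This is is a test.",
]

# \1 refers back to whatever the first group captured.
# The leading \b keeps "This is" from matching as "is is".
doubled = re.compile(r"\b(\w+)\s\1\b")

# Sentences that contain a doubled word.
flagged = [s for s in sentences if doubled.search(s)]

# Replace each doubled word with a single copy of it.
fixed = [doubled.sub(r"\1", s) for s in sentences]

print(flagged)
print(fixed)
```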


Scary Example 5

Password validation


^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]{8,}$

^               <- start of string
(?=.*[A-Za-z])
(?=.*\d)
[A-Za-z\d]
{8,}
$

^
(?=.*[A-Za-z])  <- positive lookahead: find a letter
(?=.*\d)
[A-Za-z\d]
{8,}
$

^
(?=.*[A-Za-z])
(?=.*\d)        <- positive lookahead: find a digit
[A-Za-z\d]
{8,}
$

^
(?=.*[A-Za-z])
(?=.*\d)
[A-Za-z\d]      <- find a letter or a digit
{8,}
$

^
(?=.*[A-Za-z])
(?=.*\d)
[A-Za-z\d]
{8,}            <- repeat previous expression
$                     *at least* 8 times

^
(?=.*[A-Za-z])
(?=.*\d)
[A-Za-z\d]
{8,}
$               <- end of string
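
To see the pieces work together, here is a minimal sketch assuming Python's `re` module. `is_valid` and the sample passwords are invented for illustration.

```python
import re

PASSWORD = re.compile(r"^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]{8,}$")


def is_valid(password: str) -> bool:
    """True if the password has a letter, a digit, and 8+ letters/digits only."""
    return PASSWORD.search(password) is not None


checks = {
    "abcdefgh": is_valid("abcdefgh"),    # fails: no digit
    "12345678": is_valid("12345678"),    # fails: no letter
    "abc1234": is_valid("abc1234"),      # fails: only 7 characters
    "abcd1234": is_valid("abcd1234"),    # passes: letter, digit, 8 characters
    "abcd-1234": is_valid("abcd-1234"),  # fails: "-" is not in [A-Za-z\d]
}

for password, ok in checks.items():
    print(f"{password:10} {ok}")
```

The lookaheads only peek ahead without consuming anything, so both run from the start of the string before `[A-Za-z\d]{8,}` does the actual matching.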

Finite State Machines

  • Every regular expression (as long as it uses no backreferences) is equivalent to a finite state machine!

\.(jpe?g)|(png)

state-machine-images


^(a|b)c*d+$

state-machine2
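
The same machine can be written out as a transition table. This is a hand-built sketch in Python, not generated from the regex; the state names `start`, `head` and `tail` are made up for the illustration.

```python
# Transition table for ^(a|b)c*d+$ : (current state, input character) -> next state
TRANSITIONS = {
    ("start", "a"): "head",
    ("start", "b"): "head",
    ("head", "c"): "head",   # c* loops on the same state
    ("head", "d"): "tail",   # the first d is required
    ("tail", "d"): "tail",   # further d's loop
}
ACCEPTING = {"tail"}


def accepts(text: str) -> bool:
    """Walk the table one character at a time; reject on a missing transition."""
    state = "start"
    for char in text:
        state = TRANSITIONS.get((state, char))
        if state is None:
            return False
    return state in ACCEPTING


for candidate in ["acccd", "bd", "addd", "ac", "abd", ""]:
    print(f"{candidate!r:8} {accepts(candidate)}")
```

Each character moves the machine along exactly one arrow, and the input matches only if the walk ends in an accepting state.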


FSM graph for:

\bdr[iau]nk\b

?


Further points


T.Hanks
