Abnormal expressions (abnex) is an alternative to regular expressions (regex). This is a Python library but the abnex syntax could be ported to other languages.
- Regex: ([\w\._-]+)@([\w\.]+)
- Abnex: {[w"._-"]1++}"@"{[w"."]1++}
- Abnex (spaced): {[w "._-"]1++} "@" {[w "."]1++}
- Abnex (expanded): { [w "._-"]1++ } "@" { [w "."]1++ }
{{{[a-z '_']1++} {[a-z 0-9 '_-.']0++}} '@' {{[a-z 0-9]1++} '.' {[a-z 0-9]1++} {[a-z 0-9 '-_.']0++}} {[a-z 0-9]1++}}
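For comparison, here is a minimal sketch of how the regex form of the first example above is used with Python's standard `re` module. The helper name `split_address` is hypothetical and not part of abnex. The two groups capture the text before and after the "@".

```python
import re

# Regex copied from the comparison above; group 1 is the part before "@",
# group 2 is the part after it.
EMAIL = re.compile(r"([\w\._-]+)@([\w\.]+)")


def split_address(text):
    """Return (local, domain) for the first address-like match, or None."""
    found = EMAIL.search(text)
    return found.groups() if found else None


print(split_address("Contact: john.doe@example.com!"))
# prints ('john.doe', 'example.com')
```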
- It's easier to read, write and understand.
- You can use spaces inside the expression. You can also "expand" it, i.e. write it over multiple lines and use indentation.
- You don't have to use backslashes all the time.
- It uses more logical/common symbols, like ! for not, {} for groups, and 1++, 0++, 0+ for one or more, zero or more, zero or one.
- It's easier to see whether a symbol is an actual character you are searching for or a regex character, e.g.:
  - Regex: [\w-]+@[\w-_]+
  - Abnex: [w "-"]1++ "@" [w "-"]1++
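To illustrate the last point, the sketch below shows how position-dependent the hyphen is inside a regex character set, using Python's standard `re` module. The helper `compiles` is hypothetical. The behaviour shown is specific to Python's `re`; other regex flavours may treat the same sets differently.

```python
import re


def compiles(pattern):
    """Return True if Python's re module accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# A hyphen placed last in the set is taken as a literal hyphen.
print(compiles(r"[\w-]+"))    # prints True
# A hyphen between \w and _ is read as a range operator and rejected.
print(compiles(r"[\w-_]+"))   # prints False
# Moving the hyphen to the end makes it a literal again.
print(compiles(r"[\w_-]+"))   # prints True
```

In the abnex form the hyphen sits inside quotes, so it is always read as a literal character.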
Each entry lists the regex token first, then its abnex equivalent:
- Start of string, or start of line in multi-line pattern: ^ -> ->
- End of string, or end of line in multi-line pattern: $ -> <-
- Start of string: \A -> s>
- End of string: \Z -> <s
- Word boundary: \b -> :
- Not word boundary: \B -> !:
- Start of word: \< -> w>
- End of word: \> -> <w
- Control character: \c -> c
- White space: \s -> _
- Not white space: \S -> !_
- Digit: \d -> d
- Not digit: \D -> !d
- Word: \w -> w
- Not word: \W -> !w
- Hexadecimal digit: \x -> x
- Octal digit: \o -> o
- 0 or more: * -> 0++
- 1 or more: + -> 1++
- 0 or 1: ? -> 0+
- Any character except new line (\n): . -> *
- a or b: a|b -> "a"|"b"
- Group: (...) -> {...}
- Passive (non-capturing) group: (?:...) -> {#...}
- Range (a or b or c): [abc] -> ['abc'] or ["a" "b" "c"]
- Not in set: [^...] -> [!...]
- Lower case letter from a to q: [a-q] -> [a-q]
- Upper case letter from A to Q: [A-Q] -> [A-Q]
- Digit from 0 to 7: [0-7] -> [0-7]
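The mapping above can be sketched as a small translator. This is a toy illustration written for this README, not the abnex library's implementation. The names `TOKEN`, `SIMPLE` and `abnex_to_regex` are hypothetical, and only a subset of the symbols is handled (groups, sets, quoted literals, the three quantifiers, w/d/_ classes, * and |).

```python
import re

# Toy tokenizer for a small abnex subset. Alternatives are tried left to
# right, so longer tokens ("0++", "{#", "[!") come before their prefixes.
TOKEN = re.compile(r"""
    (?P<space>\s+)
  | (?P<literal>"[^"]*"|'[^']*')
  | (?P<quantifier>1\+\+|0\+\+|0\+)
  | (?P<range>[A-Za-z0-9]-[A-Za-z0-9])
  | (?P<open>\{\#|\{|\[!|\[)
  | (?P<close>\}|\])
  | (?P<klass>!?[wd_])
  | (?P<other>\*|\|)
""", re.VERBOSE)

# Direct abnex -> regex replacements taken from the list above.
SIMPLE = {
    "1++": "+", "0++": "*", "0+": "?",
    "{": "(", "{#": "(?:", "}": ")",
    "[": "[", "[!": "[^", "]": "]",
    "w": r"\w", "!w": r"\W", "d": r"\d", "!d": r"\D",
    "_": r"\s", "!_": r"\S",
    "*": ".", "|": "|",
}


def abnex_to_regex(expr):
    """Translate a subset of abnex syntax into a regex string."""
    out = []
    pos = 0
    while pos < len(expr):
        token = TOKEN.match(expr, pos)
        if token is None:
            raise ValueError(f"unsupported abnex syntax at position {pos}")
        text, kind = token.group(), token.lastgroup
        if kind == "literal":
            out.append(re.escape(text[1:-1]))  # quoted text is always literal
        elif kind == "range":
            out.append(text)                   # a-z, 0-9 pass through as-is
        elif kind != "space":                  # spaces are simply dropped
            out.append(SIMPLE[text])
        pos = token.end()
    return "".join(out)


pattern = abnex_to_regex('{[w "._-"]1++} "@" {[w "."]1++}')
print(re.fullmatch(pattern, "john.doe@example.com").groups())
# prints ('john.doe', 'example.com')
```

Anything outside this subset (for example s> or <w) raises ValueError rather than being guessed at.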
What is the recommended way to write abnexes?
- Use spaces between characters in character sets:
  - Correct: [w "_-"]
  - Incorrect: [w"_-"]
- Put multiple exact characters between the same quotes in character sets:
  - Correct: ["abc"]
  - Incorrect: ["a" "b" "c"], especially incorrect: ["a""b""c"]
- Put spaces between groups:
  - Correct: {w} "." {w}
  - Incorrect: {w}"."{w}
Match for an email address:
- Regex: [\w-\._]+@[\w-\.]+
- Abnex (following standards): {[w "-._"]1++} "@" {[w "-."]1++}
- Abnex (not following standards): {[w"-._"]1++}"@"{[w"-."]1++}
Abnex has most functions from the re library, but it also has some extra functionality like last() and contains().
- match() -> match()
- findall() -> all()
- split() -> split()
- sub() -> replace()
- subn() -> replace_count()
- search() -> first()
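As a reminder of what the `re` side of this list returns, here is a short sketch using only Python's standard library. The abnex names in the comments come from the list above. Their call signatures are not given here, so no abnex calls are shown.

```python
import re

text = "a1 b22 c333"
digits = re.compile(r"\d+")

at_start = digits.match(text)        # abnex counterpart: match()
every = digits.findall(text)         # abnex counterpart: all()
parts = digits.split(text)           # abnex counterpart: split()
replaced = digits.sub("#", text)     # abnex counterpart: replace()
counted = digits.subn("#", text)     # abnex counterpart: replace_count()
first = digits.search(text)          # abnex counterpart: first()

print(at_start)       # prints None (the text does not start with a digit)
print(every)          # prints ['1', '22', '333']
print(parts)          # prints ['a', ' b', ' c', '']
print(replaced)       # prints a# b# c#
print(counted)        # prints ('a# b# c#', 3)
print(first.group())  # prints 1
```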
- holds(): whether or not a string matches an expression (bool).
- contains(): whether or not a string contains a match (bool).
- last(): the last match in a string.
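The behaviour described for these three extras can be approximated with plain `re` calls. The functions below are hypothetical stand-ins written for illustration, not abnex's own code. The argument order is an assumption, as is reading "matches an expression" as "the whole string matches".

```python
import re


def holds(pattern, string):
    """Assumed meaning: the entire string matches the pattern."""
    return re.fullmatch(pattern, string) is not None


def contains(pattern, string):
    """True if the pattern matches anywhere in the string."""
    return re.search(pattern, string) is not None


def last(pattern, string):
    """Return the text of the last non-overlapping match, or None."""
    found = None
    for found in re.finditer(pattern, string):
        pass  # keep only the final match object
    return found.group() if found is not None else None


print(holds(r"\d+", "123"))         # prints True
print(holds(r"\d+", "a123"))        # prints False
print(contains(r"\d+", "a123"))     # prints True
print(last(r"\d+", "a1 b22 c333"))  # prints 333
```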