Commit 70f2077

refactor: regex sequence and min docs (#39)
1 parent 02cd0cc commit 70f2077

File tree

11 files changed

+133
-38
lines changed

README.md

Lines changed: 100 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -1,12 +1,12 @@
11
# TS Regex Builder
22

3-
User-friendly Regular Expression builder for TypeScript and JavaScript.
3+
A user-friendly regular expression builder for TypeScript and JavaScript.
44

55
## Goal
66

7-
Regular expressions are a powerful tool for matching complex text patterns, yet they are notorious for their hard-to-understand syntax.
7+
Regular expressions are a powerful tool for matching simple and complex text patterns, yet they are notorious for their hard-to-understand syntax.
88

9-
Inspired by Swift's Regex Builder, this library allows users to write easily and understand regular expressions.
9+
Inspired by Swift's Regex Builder, this library allows users to write and understand regular expressions easily.
1010

1111
```ts
1212
// Before
@@ -26,10 +26,10 @@ const hexColor = buildRegex(
2626
capture(
2727
choiceOf(
2828
repeat({ count: 6 }, hexDigit),
29-
repeat({ count: 3 }, hexDigit)
29+
repeat({ count: 3 }, hexDigit),
3030
)
3131
),
32-
endOfString
32+
endOfString,
3333
);
3434
```
3535

@@ -39,7 +39,13 @@ const hexColor = buildRegex(
3939
npm install ts-regex-builder
4040
```
4141

42-
## Usage
42+
or
43+
44+
```sh
45+
yarn add ts-regex-builder
46+
```
47+
48+
## Basic usage
4349

4450
```js
4551
import { buildRegex, capture, oneOrMore } from 'ts-regex-builder';
@@ -48,14 +54,101 @@ import { buildRegex, capture, oneOrMore } from 'ts-regex-builder';
4854
const regex = buildRegex(['Hello ', capture(oneOrMore(word))]);
4955
```
5056

57+
## Domain-specific language
58+
59+
TS Regex Builder allows you to build complex regular expressions using a domain-specific language made up of regex components.
60+
61+
Terminology:
62+
* regex component (e.g., `capture()`, `oneOrMore()`, `word`) - function or object representing a regex construct
63+
* regex element (`RegexElement`) - object returned by regex components
64+
* regex sequence (`RegexSequence`) - single regex element or string (`RegexElement | string`) or array of such elements and strings (`Array<RegexElement | string>`)
65+
66+
Most of the regex components accept a regex sequence. Examples of sequences:
67+
* single string: `'Hello World'` - note that all characters are automatically escaped in the resulting regex
68+
* single element: `capture('abc')`
69+
* array of elements and strings: `['$', oneOrMore(digit)]`
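
To make the note about automatic escaping concrete, here is a minimal sketch of what escaping a plain string means. `escapeText` is a hypothetical helper written for this illustration, not the library's actual implementation; it only shows why a string such as `'$1.50'` needs escaping before it can match literally.

```ts
// Hypothetical helper (an assumption, not the library's code): escape regex
// metacharacters so that a plain string is matched literally.
function escapeText(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// "$", ".", "(" and ")" are all metacharacters, so each gets a backslash.
const literal = new RegExp(escapeText('$1.50 (approx.)'));

const matchesLiteral = literal.test('Total: $1.50 (approx.)'); // true
const matchesLookalike = literal.test('Total: $1x50 (approx.)'); // false: "." is no longer a wildcard
```

Without the escaping step, `$` would act as an end-of-string anchor and `.` would match any character, so the pattern would not behave like the literal text.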
70+
71+
Regex components can be composed into a complex tree:
72+
73+
```ts
74+
const currencyAmount = buildRegex([
75+
choiceOf('$', '€', repeat({ count: 3 }, characterRange('A', 'Z'))),
76+
oneOrMore(digit),
77+
optionally([
78+
'.',
79+
repeat({ count: 2 }, digit),
80+
]),
81+
])
82+
```
83+
84+
85+
### Building regex
86+
87+
| Regex Component | Regex Pattern | Type | Description |
88+
| --------------------------------------- | ------------- | --------------------------------------------------- | ----------------------------------- |
89+
| `buildRegex(...)` | `/.../` | `(seq: RegexSequence) => RegExp` | Create `RegExp` instance |
90+
| `buildRegex({ ignoreCase: true }, ...)` | `/.../i` | `(flags: RegexFlags, seq: RegexSequence) => RegExp` | Create `RegExp` instance with flags |
91+
92+
### Components
93+
94+
| Regex Component | Regex Pattern | Type | Notes |
95+
| ------------------- | ------------- | ---------------------------------------------------- | --------------------------- |
96+
| `capture(...)` | `(...)` | `(seq: RegexSequence) => RegexElement` | Capture group |
97+
| `choiceOf(x, y, z)` | `x\|y\|z` | `(...alternatives: RegexSequence[]) => RegexElement` | Any one of the provided patterns |
98+
99+
Notes:
100+
* `choiceOf()` accepts a variable number of sequences.
101+
102+
103+
### Quantifiers
104+
105+
| Regex Component | Regex Pattern | Type | Description |
106+
| -------------------------------- | ------------- | -------------------------------------------------------------------- | ------------------------------------------------- |
107+
| `zeroOrMore(x)`                   | `x*`          | `(seq: RegexSequence) => RegexElement`                               | Zero or more occurrences of a pattern             |
108+
| `oneOrMore(x)`                    | `x+`          | `(seq: RegexSequence) => RegexElement`                               | One or more occurrences of a pattern              |
109+
| `optionally(x)`                   | `x?`          | `(seq: RegexSequence) => RegexElement`                               | Zero or one occurrence of a pattern               |
110+
| `repeat({ count: n }, x)`         | `x{n}`        | `({ count: number }, seq: RegexSequence) => RegexElement`            | Pattern repeats an exact number of times          |
111+
| `repeat({ min: n }, x)`           | `x{n,}`       | `({ min: number }, seq: RegexSequence) => RegexElement`              | Pattern repeats at least the given number of times |
112+
| `repeat({ min: n1, max: n2 }, x)` | `x{n1,n2}`    | `({ min: number, max: number }, seq: RegexSequence) => RegexElement` | Pattern repeats between n1 and n2 times (inclusive) |
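
The counted forms in the "Regex Pattern" column are standard JavaScript quantifiers. The sketch below uses plain `RegExp` literals rather than the builder, so it only illustrates how the patterns themselves behave; the variable names are made up for this example.

```ts
// Plain RegExp literals for the counted quantifier patterns in the table.
const exactlyThree = /^a{3}$/; // x{n}
const atLeastTwo = /^a{2,}$/; // x{n,}
const twoToFour = /^a{2,4}$/; // x{n1,n2}

const quantifierChecks = {
  exactlyThree: [exactlyThree.test('aaa'), exactlyThree.test('aaaa')], // [true, false]
  atLeastTwo: [atLeastTwo.test('a'), atLeastTwo.test('aaaaa')], // [false, true]
  twoToFour: [twoToFour.test('aaaa'), twoToFour.test('aaaaa')], // [true, false]
};
```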
113+
114+
### Character classes
115+
116+
| Regex Component | Regex Pattern | Type | Description |
117+
| -------------------------- | ------------- | ------------------------------------------------------ | ------------------------------------------- |
118+
| `any` | `.` | `CharacterClass` | Any character |
119+
| `word` | `\w` | `CharacterClass` | Word characters |
120+
| `digit` | `\d` | `CharacterClass` | Digit characters |
121+
| `whitespace` | `\s` | `CharacterClass` | Whitespace characters |
122+
| `anyOf('abc')` | `[abc]` | `(chars: string) => CharacterClass` | Any of supplied characters |
123+
| `characterRange('a', 'z')` | `[a-z]` | `(from: string, to: string) => CharacterClass` | Range of characters |
124+
| `characterClass(...)` | `[...]` | `(...charClasses: CharacterClass[]) => CharacterClass` | Concatenation of multiple character classes |
125+
| `inverted(...)` | `[^...]` | `(charClass: CharacterClass) => CharacterClass` | Inverts character class |
126+
127+
Notes:
128+
* `any`, `word`, `digit`, and `whitespace` are objects; there is no need to call them.
129+
* `anyOf` accepts a single string of characters to match.
130+
* `characterRange` accepts exactly **two single-character** strings representing the range start and end (inclusive).
131+
* `characterClass` accepts a variable number of character classes to join into a single class.
132+
* `inverted` accepts a single character class to be inverted.
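
As a rough illustration of the pattern shapes listed in the table, the following sketch uses plain `RegExp` literals. It does not call the library, and the combined class `[a-z\d]` is an assumption about what joining a range with `\d` looks like, not confirmed builder output.

```ts
// Plain RegExp literals mirroring the "Regex Pattern" column.
const lowercase = /[a-z]/; // range of characters
const vowels = /[aeiou]/; // any of the supplied characters
const lowerOrDigit = /[a-z\d]/; // two classes joined into one (assumed shape)
const notDigit = /[^\d]/; // inverted class

const classChecks = {
  lowercase: [lowercase.test('q'), lowercase.test('Q')], // [true, false]
  vowels: [vowels.test('sky'), vowels.test('sea')], // [false, true]
  lowerOrDigit: [lowerOrDigit.test('7'), lowerOrDigit.test('#')], // [true, false]
  notDigit: [notDigit.test('42'), notDigit.test('4x2')], // [false, true]
};
```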
133+
134+
135+
### Anchors
136+
137+
| Regex Component | Regex Pattern | Type | Notes |
138+
| --------------- | ------------- | -------- | ---------------------------------------------------------- |
139+
| `startOfString` | `^` | `Anchor` | Start of the string (or start of a line in multiline mode) |
140+
| `endOfString` | `$` | `Anchor` | End of the string (or end of a line in multiline mode) |
141+
142+
Notes:
143+
* `startOfString` and `endOfString` are objects; there is no need to call them.
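
The remark about multiline mode in the table can be checked with plain JavaScript regular expressions. This sketch uses `RegExp` literals and the standard `m` flag directly; how the builder exposes that flag is not shown here.

```ts
const twoLines = 'first line\nsecond line';

// Without the `m` flag, ^ and $ only match at the start and end of the whole string.
const startWithoutFlag = /^second/.test(twoLines); // false
const endWithoutFlag = /first line$/.test(twoLines); // false

// With the `m` flag, ^ and $ also match at line boundaries.
const startWithFlag = /^second/m.test(twoLines); // true
const endWithFlag = /first line$/m.test(twoLines); // true
```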
144+
51145
## Examples
52146

53147
See [Examples document](./docs/Examples.md).
54148

55149
## Contributing
56150

57151
See the [contributing guide](CONTRIBUTING.md) to learn how to contribute to the repository and the development workflow.
58-
59152
See the [project guidelines](GUIDELINES.md) to understand our core principles.
60153

61154
## License

src/builders.ts

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
import type { RegexNode } from './types';
1+
import type { RegexSequence } from './types';
22
import { encodeSequence } from './encoder/encoder';
33
import { asNodeArray } from './utils/nodes';
44
import { optionalFirstArg } from './utils/optional-arg';
@@ -26,7 +26,7 @@ export interface RegexFlags {
2626
* @param elements Single regex element or array of elements
2727
* @returns
2828
*/
29-
export function buildRegex(elements: RegexNode | RegexNode[]): RegExp;
29+
export function buildRegex(sequence: RegexSequence): RegExp;
3030

3131
/**
3232
* Generate RegExp object from elements with passed flags.
@@ -35,14 +35,14 @@ export function buildRegex(elements: RegexNode | RegexNode[]): RegExp;
3535
* @param flags RegExp flags object
3636
* @returns RegExp object
3737
*/
38-
export function buildRegex(flags: RegexFlags, elements: RegexNode | RegexNode[]): RegExp;
38+
export function buildRegex(flags: RegexFlags, sequence: RegexSequence): RegExp;
3939

4040
export function buildRegex(first: any, second?: any): RegExp {
4141
return _buildRegex(...optionalFirstArg(first, second));
4242
}
4343

44-
export function _buildRegex(flags: RegexFlags, elements: RegexNode | RegexNode[]): RegExp {
45-
const pattern = encodeSequence(asNodeArray(elements)).pattern;
44+
export function _buildRegex(flags: RegexFlags, sequence: RegexSequence): RegExp {
45+
const pattern = encodeSequence(asNodeArray(sequence)).pattern;
4646
const flagsString = encodeFlags(flags ?? {});
4747
return new RegExp(pattern, flagsString);
4848
}
@@ -52,8 +52,8 @@ export function _buildRegex(flags: RegexFlags, elements: RegexNode | RegexNode[]
5252
* @param elements Single regex element or array of elements
5353
* @returns regex pattern string
5454
*/
55-
export function buildPattern(elements: RegexNode | RegexNode[]): string {
56-
return encodeSequence(asNodeArray(elements)).pattern;
55+
export function buildPattern(sequence: RegexSequence): string {
56+
return encodeSequence(asNodeArray(sequence)).pattern;
5757
}
5858

5959
function encodeFlags(flags: RegexFlags): string {

src/components/capture.ts

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,17 +1,17 @@
11
import { encodeSequence } from '../encoder/encoder';
22
import type { EncodeOutput } from '../encoder/types';
33
import { asNodeArray } from '../utils/nodes';
4-
import type { RegexElement, RegexNode } from '../types';
4+
import type { RegexElement, RegexNode, RegexSequence } from '../types';
55

66
export interface Capture extends RegexElement {
77
type: 'capture';
88
children: RegexNode[];
99
}
1010

11-
export function capture(nodes: RegexNode | RegexNode[]): Capture {
11+
export function capture(sequence: RegexSequence): Capture {
1212
return {
1313
type: 'capture',
14-
children: asNodeArray(nodes),
14+
children: asNodeArray(sequence),
1515
encode: encodeCapture,
1616
};
1717
}

src/components/choice-of.ts

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,14 @@
11
import { encodeSequence } from '../encoder/encoder';
22
import type { EncodeOutput } from '../encoder/types';
33
import { asNodeArray } from '../utils/nodes';
4-
import type { RegexElement, RegexNode } from '../types';
4+
import type { RegexElement, RegexNode, RegexSequence } from '../types';
55

66
export interface ChoiceOf extends RegexElement {
77
type: 'choiceOf';
88
alternatives: RegexNode[][];
99
}
1010

11-
export function choiceOf(...alternatives: Array<RegexNode | RegexNode[]>): ChoiceOf {
11+
export function choiceOf(...alternatives: RegexSequence[]): ChoiceOf {
1212
if (alternatives.length === 0) {
1313
throw new Error('`choiceOf` should receive at least one alternative');
1414
}

src/components/quantifiers.ts

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
import { encodeAtom } from '../encoder/encoder';
22
import type { EncodeOutput } from '../encoder/types';
33
import { asNodeArray } from '../utils/nodes';
4-
import type { RegexElement, RegexNode } from '../types';
4+
import type { RegexElement, RegexNode, RegexSequence } from '../types';
55

66
export interface OneOrMore extends RegexElement {
77
type: 'oneOrMore';
@@ -18,26 +18,26 @@ export interface ZeroOrMore extends RegexElement {
1818
children: RegexNode[];
1919
}
2020

21-
export function oneOrMore(nodes: RegexNode | RegexNode[]): OneOrMore {
21+
export function oneOrMore(sequence: RegexSequence): OneOrMore {
2222
return {
2323
type: 'oneOrMore',
24-
children: asNodeArray(nodes),
24+
children: asNodeArray(sequence),
2525
encode: encodeOneOrMore,
2626
};
2727
}
2828

29-
export function optionally(nodes: RegexNode | RegexNode[]): Optionally {
29+
export function optionally(sequence: RegexSequence): Optionally {
3030
return {
3131
type: 'optionally',
32-
children: asNodeArray(nodes),
32+
children: asNodeArray(sequence),
3333
encode: encodeOptionally,
3434
};
3535
}
3636

37-
export function zeroOrMore(nodes: RegexNode | RegexNode[]): ZeroOrMore {
37+
export function zeroOrMore(sequence: RegexSequence): ZeroOrMore {
3838
return {
3939
type: 'zeroOrMore',
40-
children: asNodeArray(nodes),
40+
children: asNodeArray(sequence),
4141
encode: encodeZeroOrMore,
4242
};
4343
}

src/components/repeat.ts

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
11
import { encodeAtom } from '../encoder/encoder';
22
import type { EncodeOutput } from '../encoder/types';
33
import { asNodeArray } from '../utils/nodes';
4-
import type { RegexElement, RegexNode } from '../types';
4+
import type { RegexElement, RegexNode, RegexSequence } from '../types';
55

66
export interface Repeat extends RegexElement {
77
type: 'repeat';
@@ -11,8 +11,8 @@ export interface Repeat extends RegexElement {
1111

1212
export type RepeatOptions = { count: number } | { min: number; max?: number };
1313

14-
export function repeat(options: RepeatOptions, nodes: RegexNode | RegexNode[]): Repeat {
15-
const children = asNodeArray(nodes);
14+
export function repeat(options: RepeatOptions, sequence: RegexSequence): Repeat {
15+
const children = asNodeArray(sequence);
1616

1717
if (children.length === 0) {
1818
throw new Error('`repeat` should receive at least one element');

src/types.ts

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,7 @@
11
import type { EncodeOutput } from './encoder/types';
22

3+
export type RegexSequence = RegexNode | RegexNode[];
4+
35
export type RegexNode = RegexElement | string;
46

57
export interface RegexElement {

src/utils/nodes.ts

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
1-
import type { RegexNode } from '../types';
1+
import type { RegexNode, RegexSequence } from '../types';
22

3-
export function asNodeArray(nodeOrArray: RegexNode | RegexNode[]): RegexNode[] {
4-
return Array.isArray(nodeOrArray) ? nodeOrArray : [nodeOrArray];
3+
export function asNodeArray(sequence: RegexSequence): RegexNode[] {
4+
return Array.isArray(sequence) ? sequence : [sequence];
55
}
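
The effect of the new `RegexSequence` alias can be sketched in isolation. In the snippet below, `RegexNode`, `RegexSequence`, and `asNodeArray` follow the definitions in this diff, while `RegexElement` is a simplified stand-in (an assumption for illustration; the library's interface has more fields than `type`).

```ts
// Simplified stand-in for the library's RegexElement (assumption: only `type` is modeled).
interface RegexElement {
  type: string;
}

type RegexNode = RegexElement | string;
type RegexSequence = RegexNode | RegexNode[];

// Normalize a sequence so that callers always work with an array of nodes.
function asNodeArray(sequence: RegexSequence): RegexNode[] {
  return Array.isArray(sequence) ? sequence : [sequence];
}

const fromString = asNodeArray('Hello'); // single string wrapped in an array
const fromElement = asNodeArray({ type: 'capture' }); // single element wrapped in an array
const fromArray = asNodeArray(['$', { type: 'oneOrMore' }]); // array passed through as-is
```

Because every component normalizes its argument this way, a single `RegexSequence` parameter type can replace the repeated `RegexNode | RegexNode[]` annotations seen on the removed lines.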

test-utils/to-have-pattern.ts

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,9 @@
1-
import type { RegexNode } from '../src/types';
1+
import type { RegexSequence } from '../src/types';
22
import { asRegExp } from './utils';
33

44
export function toHavePattern(
55
this: jest.MatcherContext,
6-
received: RegExp | RegexNode | RegexNode[],
6+
received: RegExp | RegexSequence,
77
expected: RegExp
88
) {
99
const receivedPattern = asRegExp(received).source;

test-utils/to-match-groups.ts

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,9 @@
1-
import type { RegexNode } from '../src/types';
1+
import type { RegexSequence } from '../src/types';
22
import { asRegExp } from './utils';
33

44
export function toMatchGroups(
55
this: jest.MatcherContext,
6-
received: RegExp | RegexNode | RegexNode[],
6+
received: RegExp | RegexSequence,
77
expectedString: string,
88
expectedGroups: string[]
99
) {
