Implementation multi pattern regular expressions #2161

Open
biathlon3 wants to merge 31 commits into master from ag_Multi-pattern-regular-expressions

Contributor

@biathlon3 biathlon3 commented Jul 5, 2024

Compiling regular expressions requires hscollider to be installed. For simplicity, it is built as a Debian package.

Add repository

sudo install -m 0755 -d /etc/apt/keyrings
sudo wget -O /etc/apt/keyrings/tempesta.asc https://tempesta-tech.com:8081/repository/public-keys/keys/public.key
sudo chmod a+r /etc/apt/keyrings/tempesta.asc
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/tempesta.asc] \
https://tempesta-tech.com:8081/repository/tempesta/ focal main" | sudo tee /etc/apt/sources.list.d/tempesta.list
sudo apt update && sudo apt upgrade

Install the package

sudo apt install linux-regex

@biathlon3 biathlon3 linked an issue Jul 5, 2024 that may be closed by this pull request
@biathlon3 biathlon3 marked this pull request as draft July 5, 2024 14:08
@biathlon3 biathlon3 force-pushed the ag_Multi-pattern-regular-expressions branch 2 times, most recently from 5c02ff5 to 8c64bfa Compare July 8, 2024 17:09
@biathlon3 biathlon3 force-pushed the ag_Multi-pattern-regular-expressions branch 2 times, most recently from 4f4608e to d932ab5 Compare July 25, 2024 11:13
@biathlon3 biathlon3 marked this pull request as ready for review July 26, 2024 13:24
@biathlon3
Contributor Author

I squashed it into one commit so it can be reviewed.

@biathlon3 biathlon3 force-pushed the ag_Multi-pattern-regular-expressions branch 2 times, most recently from 5f2b864 to 09f70b3 Compare July 29, 2024 18:23
@krizhanovsky krizhanovsky requested review from EvgeniiMekhanik, const-t and krizhanovsky and removed request for EvgeniiMekhanik and krizhanovsky July 30, 2024 19:40
@const-t const-t marked this pull request as draft April 11, 2025 13:33
@const-t const-t force-pushed the ag_Multi-pattern-regular-expressions branch 2 times, most recently from 7f2013e to 88c3570 Compare October 27, 2025 16:07
@const-t const-t force-pushed the ag_Multi-pattern-regular-expressions branch from 88c3570 to ce3286e Compare November 7, 2025 12:35
@const-t const-t force-pushed the ag_Multi-pattern-regular-expressions branch from ce3286e to 5d5654c Compare November 14, 2025 13:20
@const-t const-t force-pushed the ag_Multi-pattern-regular-expressions branch from 5d5654c to ddea565 Compare December 19, 2025 16:20
@const-t const-t marked this pull request as ready for review December 19, 2025 16:21
@const-t const-t force-pushed the ag_Multi-pattern-regular-expressions branch from ddea565 to 37e1965 Compare December 22, 2025 09:14
@const-t const-t force-pushed the ag_Multi-pattern-regular-expressions branch from 4ca8b16 to ebc646e Compare January 7, 2026 13:13
Enabling AVX-512 causes compile-time warnings related to
string fortification. The compiler can't check the size of the data
because its length is calculated at runtime. Do an unsafe
copy to suppress the warnings.

The config can be included using the `!include` directive in the `http_chain`
section. Example:

```
http_chain {
    #some rules ...
    !include /etc/tempesta/ua_block_rules.conf
}
```
@const-t const-t force-pushed the ag_Multi-pattern-regular-expressions branch 2 times, most recently from f3147f6 to 3a1196e Compare February 10, 2026 13:46
Contributor

const-t commented Feb 10, 2026

I caught a KASAN BUG in
`tests.http_rules.test_http_tables.TestHttpTablesEmptyHdrPattern.test_empty_name`. Please run all tests with KASAN and kmemleak and check the results.

Fixed

Use strict matching so that unsuitable strings such as "!includeasd"
and "   #include" are not matched.
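
The strict pattern from the `templater()` change in this PR can be tried in isolation. The sketch below is illustrative only: the function name `is_include_line` and the sample lines are assumptions made for the demonstration, not part of the patch.

```shell
#!/usr/bin/env bash
# Strict pattern: optional leading whitespace, the literal "!include",
# at least one whitespace character, then a non-empty path.
include_re='^[[:space:]]*!include[[:space:]]+.+$'

# Hypothetical helper: returns 0 when the line is a well-formed
# !include directive, non-zero otherwise.
is_include_line() {
    [[ $1 =~ $include_re ]]
}

# Two well-formed directives followed by the two strings named in the
# commit message.
for line in '!include /etc/tempesta/ua_block_rules.conf' \
            '    !include /etc/tempesta/ua_block_rules.conf' \
            '!includeasd' \
            '   #include'; do
    if is_include_line "$line"; then
        printf 'match:    [%s]\n' "$line"
    else
        printf 'no match: [%s]\n' "$line"
    fi
done
```

Keeping the pattern in a variable and leaving it unquoted on the right-hand side of `=~` makes bash treat it as a regular expression rather than a literal string.
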
What has been done:
 - Added a user-space helper `regex_setup.sh` that passes expressions
   to hscollider and copies the compiled database to the rex module
   database. The path to the script can be specified with the environment
   variable REGEX_SETUP_SCRIPT_PATH, and the path to the regex databases
   with the REGEX_DIR_PATH variable. Implemented as a
   kernel module parameter.
 - Moved regular expression code to its own file.
 - Rex directory added to `rules` file.
In this patch we allocate only the amount of memory needed for the
regex ID, instead of allocating memory for the full string pattern.

Remove the outdated comment and change the minimal regex length. Currently
a regex may be as short as one character, but it must not be empty.
`cookie "cookie_name" ~ "cookie_value"` is not supported at the moment.
Such an option does not seem useful: matching by header
can be used instead to find all occurrences in a single regex call.
Example:
`hdr cookie ~ "/cookie_name=cookie_value/"`
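
To illustrate why a header-level regex can cover the cookie case, here is a minimal sketch that uses bash's own regex operator as a stand-in for the regex engine used by the module. The helper name `cookie_hdr_matches` and the sample header values are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 when the Cookie header value contains
# the pair "cookie_name=cookie_value" at any position.
cookie_hdr_matches() {
    local hdr_value=$1
    local re='cookie_name=cookie_value'
    [[ $hdr_value =~ $re ]]
}

# The pair is found regardless of where it sits in the header value.
cookie_hdr_matches 'a=b; cookie_name=cookie_value; c=d' && echo 'match'
cookie_hdr_matches 'a=b; c=d' || echo 'no match'
```

An unanchored pattern like this also matches longer names or values that merely contain the pair, so a real rule may need delimiters around it.
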
@const-t const-t force-pushed the ag_Multi-pattern-regular-expressions branch from 3a1196e to e3a81a4 Compare February 11, 2026 08:46
@@ -114,7 +118,7 @@ templater()
line=$(echo "$raw_line" | sed -e '/request /s/\\r\\n/\x0d\x0a/g')
if [[ ${line:0:1} = \# ]]; then
:
-elif [[ $line =~ '!include' ]]; then
+elif [[ $line =~ ^[[:space:]]*!include[[:space:]]+.+$ ]]; then
Contributor

Is it OK that any number of spaces is allowed before `!include`?

@@ -0,0 +1 @@
+../../regex.c
\ No newline at end of file
Contributor

No empty line at the end of the file?

Contributor

@EvgeniiMekhanik EvgeniiMekhanik left a comment

Please also add three stress tests that reconfigure with regexps:

  • Successful reconfiguration from a regexp config to another regexp config
  • Failed reconfiguration
  • Successful reconfiguration from a regexp config to a config without regexps

Development

Successfully merging this pull request may close these issues.

Multi-pattern regular expressions

3 participants