feat: add quantitative testing #355

Merged
merged 19 commits into from
Oct 11, 2024
Changes from 13 commits
147 changes: 147 additions & 0 deletions README.md
@@ -422,6 +422,153 @@ Now you can do that by passing the `--wait-for-host` flag. The value of this opt
- `--wait-for-no-redirect` Do not follow HTTP 3xx redirects.
- `--wait-for-timeout` Sets the timeout for all wait operations, 0 is unlimited. (default 10s)

## (EXPERIMENTAL) Quantitative testing

In the latest version of `go-ftw`, we have added a new feature that allows you to run quantitative tests.
This feature is still experimental and may change in the future.

### What is the idea behind quantitative tests?

Quantitative tests allow you to run tests using payloads to quantify the number of false positives you might get when running in production.
We use a well-known corpus of text to generate payloads that are sent to the WAF. The WAF should not block these payloads, as they are not malicious.

Anyone can create their own corpus of text and use it to test their WAF. A corpus is a list of strings that are sent to the WAF to check whether it blocks them.

The result of this test is a percentage of false positives. The lower the percentage, the better the WAF is at not blocking benign payloads.
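
As a rough illustration of how this figure is derived, the sketch below divides the number of blocked benign payloads by the total number sent. The helper name `falsePositiveRatio` is hypothetical and is not part of `go-ftw`; the numbers are taken from the example run shown in this README.

```go
package main

import "fmt"

// falsePositiveRatio is a hypothetical helper (not part of go-ftw) that
// returns the fraction of benign payloads that were blocked.
func falsePositiveRatio(falsePositives, total int) float64 {
	if total == 0 {
		return 0 // avoid dividing by zero when no payloads were sent
	}
	return float64(falsePositives) / float64(total)
}

func main() {
	// 408 false positives out of 10000 payloads, as in the example run.
	ratio := falsePositiveRatio(408, 10000)
	fmt.Printf("ratio: %.4f (%.2f%%)\n", ratio, ratio*100)
}
```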

### What is a corpus? Why do I need one?

A corpus is a collection of text that is used to generate payloads.
The text can be anything, from news articles to books. The idea is to have a large collection of text that can be used to generate payloads.

The default corpus is the [Leipzig Corpora Collection](https://wortschatz.uni-leipzig.de/en/download/), which is a collection of text from the web.

### How to create a corpus?

You can create your own corpus by collecting text from the web or using text from books, articles, etc.
You can even build one from your own website. To do so, implement the `corpus.Corpus` and `corpus.File` interfaces,
and, for iterating over the corpus, the `corpus.Iterator` and `corpus.Payload` interfaces.

You can see an example of how to implement the `corpus.Corpus` interface in the `corpus/leipzig` package.
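
To give a feel for the iteration part, here is a minimal, self-contained sketch of an in-memory corpus iterator. The `Payload` and `Iterator` interfaces below are simplified, hypothetical stand-ins defined only for this example; the real definitions live in the `corpus` package and their method sets may differ, so check the `corpus/leipzig` package before relying on any of these names.

```go
package main

import "fmt"

// Payload and Iterator are simplified, hypothetical stand-ins. The real
// interfaces in go-ftw's corpus package may have different methods.
type Payload interface {
	LineNumber() int
	Content() string
}

type Iterator interface {
	HasNext() bool
	Next() Payload
}

// linePayload is one line of text together with its 1-based position.
type linePayload struct {
	line    int
	content string
}

func (p linePayload) LineNumber() int { return p.line }
func (p linePayload) Content() string { return p.content }

// sliceIterator walks an in-memory list of strings.
type sliceIterator struct {
	lines []string
	pos   int
}

func (it *sliceIterator) HasNext() bool { return it.pos < len(it.lines) }

// Next returns the current line and advances; call HasNext first.
func (it *sliceIterator) Next() Payload {
	p := linePayload{line: it.pos + 1, content: it.lines[it.pos]}
	it.pos++
	return p
}

func newSliceIterator(lines []string) Iterator {
	return &sliceIterator{lines: lines}
}

func main() {
	it := newSliceIterator([]string{"first benign sentence", "second benign sentence"})
	for it.HasNext() {
		p := it.Next()
		fmt.Printf("%d: %s\n", p.LineNumber(), p.Content())
	}
}
```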

### How to run quantitative tests?

To run quantitative tests, run the `quantitative` subcommand of `go-ftw`.

The corpus will be downloaded and cached locally for future use. You can also specify the size of the corpus,
the language, the source, and the year of the corpus. At a minimum, point the `-d` flag at the
directory where the CRS rules are stored (it defaults to the current directory).

Here is the help for the `quantitative` command:

```bash
❯ ./go-ftw quantitative -h
Run all quantitative tests

Usage:
ftw quantitative [flags]

Flags:
-c, --corpus string Corpus to use for the quantitative tests (default "leipzig")
-L, --corpus-lang string Corpus language to use for the quantitative tests (default "eng")
-n, --corpus-line int Number is the payload line from the corpus to exclusively send
-s, --corpus-size string Corpus size to use for the quantitative tests. Most corpora will have sizes like "100K", "1M", etc. (default "100K")
-S, --corpus-source string Corpus source to use for the quantitative tests. Most corpus will have a source like "news", "web", "wikipedia", etc. (default "news")
-y, --corpus-year string Corpus year to use for the quantitative tests. Most corpus will have a year like "2023", "2022", etc. (default "2023")
-d, --directory string Directory where the CRS rules are stored (default ".")
-f, --file string Output file path for quantitative tests. Prints to standard output by default.
-h, --help help for quantitative
-l, --lines int Number of lines of input to process before stopping
-o, --output string Output type for quantitative tests. "normal" is the default. (default "normal")
-P, --paranoia-level int Paranoia level used to run the quantitative tests (default 1)
-p, --payload string Payload is a string you want to test using quantitative tests. Will not use the corpus.
-r, --rule int Rule ID of interest: only show false positives for specified rule ID

Global Flags:
--cloud cloud mode: rely only on HTTP status codes for determining test success or failure (will not process any logs)
--config string specify config file (default is $PWD/.ftw.yaml)
--debug debug output
--overrides string specify file with platform specific overrides
--trace trace output: really, really verbose
```



### Example of running quantitative tests

This will run with the default leipzig corpus and size of 10K payloads.
```bash
❯ ./go-ftw quantitative -d ../coreruleset -s 10K
Running quantitative tests
Run 10000 payloads in 18.482979709s
Total False positive ratio: 408/10000 = 0.0408
False positives per rule:
Rule 920220: 198 false positives
Rule 920221: 198 false positives
Rule 932235: 4 false positives
Rule 932270: 2 false positives
Rule 932380: 2 false positives
Rule 933160: 1 false positives
Rule 942100: 1 false positives
Rule 942230: 1 false positives
Rule 942360: 1 false positives
```

This will run with the default leipzig corpus and size of 10K payloads, but only for the rule 932270.
```bash
❯ ./go-ftw quantitative -d ../coreruleset -s 10K -r 932270
Running quantitative tests
Run 10000 payloads in 15.218343083s
Total False positive ratio: 2/10000 = 0.0002
False positives per rule:
Rule 932270: 2 false positives
```

If you add `--debug` to the command, you will see the payloads that cause false positives.
```bash
❯ ./go-ftw quantitative -d ../coreruleset -s 10K --debug
Running quantitative tests
12:32PM DBG Preparing download of corpus file from https://downloads.wortschatz-leipzig.de/corpora/eng_news_2023_10K.tar.gz
12:32PM DBG filename eng_news_2023_10K-sentences.txt already exists
12:32PM DBG Using paranoia level: 1

12:32PM DBG False positive with string: And finally: "I'd also say temp nurses make a lot.
12:32PM DBG **> rule 932290 => Matched Data: "I'd found within ARGS:payload: And finally: "I'd also say temp nurses make a lot.
12:32PM DBG False positive with string: But it was an experience Seguin said she "wouldn't trade for anything."
12:32PM DBG **> rule 932290 => Matched Data: "wouldn't found within ARGS:payload: But it was an experience Seguin said she "wouldn't trade for anything."
12:32PM DBG False positive with string: Consolidated Edison () last issued its earnings results on Thursday, November 3rd.
12:32PM DBG **> rule 932235 => Matched Data: () last found within ARGS:payload: Consolidated Edison () last issued its earnings results on Thursday, November 3rd.
```

The default language for the corpus is English, but you can change it, for example to German, using the `-L` flag.
```bash
❯ ./go-ftw quantitative -d ../coreruleset -s 10K -L deu
Running quantitative tests
4:18PM INF Downloading corpus file from https://downloads.wortschatz-leipzig.de/corpora/deu_news_2023_10K.tar.gz
Moved /Users/fzipitria/.ftw/extracted/deu_news_2023_10K/deu_news_2023_10K-sentences.txt to /Users/fzipitria/.ftw/deu_news_2023_10K-sentences.txt
Run 10000 payloads in 25.169846084s
Total False positive ratio: 44/10000 = 0.0044
False positives per rule:
Rule 920220: 19 false positives
Rule 920221: 19 false positives
Rule 932125: 1 false positives
Rule 932290: 5 false positives
```
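
The corpus file names in the log output above follow a `<lang>_<source>_<year>_<size>` pattern, which lines up with the `-L`, `-S`, `-y`, and `-s` flags. Below is a minimal sketch of that naming, assuming the pattern holds for other combinations. The helper `corpusArchiveName` is hypothetical and is not `go-ftw`'s actual code.

```go
package main

import "fmt"

// corpusArchiveName is a hypothetical helper, not go-ftw's own code. It
// mirrors the naming seen in the log output: <lang>_<source>_<year>_<size>.
func corpusArchiveName(lang, source, year, size string) string {
	return fmt.Sprintf("%s_%s_%s_%s.tar.gz", lang, source, year, size)
}

func main() {
	// Same values as the German example: -L deu, default source and year, -s 10K.
	fmt.Println(corpusArchiveName("deu", "news", "2023", "10K"))
	// prints "deu_news_2023_10K.tar.gz"
}
```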

Results can also be output in JSON format, so that they can be processed by other tools.
```bash
❯ ./go-ftw quantitative -d ../coreruleset -s 10K -o json

{"count":10000,"falsePositives":408,"falsePositivesPerRule":{"920220":198,"920221":198,"932235":4,"932270":2,"932380":2,"933160":1,"942100":1,"942230":1,"942360":1},"totalTime":15031086083}
```
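
Here is a short sketch of how another Go tool might consume this JSON. The struct fields mirror the keys visible in the sample output. Treating `totalTime` as nanoseconds is an assumption based on its magnitude, and the `stats` type and `parseStats` function are hypothetical names for this example only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// stats mirrors the keys in the sample JSON output. Interpreting totalTime
// as nanoseconds is an assumption, not something the output states.
type stats struct {
	Count                 int            `json:"count"`
	FalsePositives        int            `json:"falsePositives"`
	FalsePositivesPerRule map[string]int `json:"falsePositivesPerRule"`
	TotalTime             int64          `json:"totalTime"`
}

// parseStats decodes one JSON result document.
func parseStats(data []byte) (stats, error) {
	var s stats
	err := json.Unmarshal(data, &s)
	return s, err
}

func main() {
	raw := []byte(`{"count":10000,"falsePositives":408,"falsePositivesPerRule":{"920220":198,"920221":198,"932235":4,"932270":2,"932380":2,"933160":1,"942100":1,"942230":1,"942360":1},"totalTime":15031086083}`)
	s, err := parseStats(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("%d/%d false positives in %s\n", s.FalsePositives, s.Count, time.Duration(s.TotalTime))
	fmt.Println("rule 920220:", s.FalsePositivesPerRule["920220"])
}
```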

### Future work for quantitative tests

This feature will enable us to compare two different versions of CRS (or any two rules) and see, for example,
whether a modification to a rule has caused more false positives.

Integrating it into the CI/CD pipeline will allow us to check every PR for false positives before merging.
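
One possible shape for such a comparison, sketched under the assumption that each run yields a per-rule map of false positive counts like the one in the JSON output. The `regressions` helper is hypothetical, nothing like it exists in `go-ftw` yet, and the counts in `main` are illustrative values rather than measured results.

```go
package main

import (
	"fmt"
	"sort"
)

// regressions is a hypothetical helper (not part of go-ftw). Given per-rule
// false positive counts from two runs, it returns the rule IDs whose count
// increased, sorted for stable output.
func regressions(before, after map[string]int) []string {
	var worse []string
	for rule, count := range after {
		if count > before[rule] { // a rule missing from "before" counts as 0
			worse = append(worse, rule)
		}
	}
	sort.Strings(worse)
	return worse
}

func main() {
	// Illustrative counts only; these are not measured results.
	before := map[string]int{"920220": 198, "932270": 2}
	after := map[string]int{"920220": 198, "932270": 5, "942100": 1}
	fmt.Println(regressions(before, after))
}
```

A CI job could fail the build whenever the returned slice is non-empty.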

## Library usage

`go-ftw` can also be used as a library. Just include it in your project:
42 changes: 27 additions & 15 deletions cmd/quantitative.go
@@ -4,60 +4,65 @@
package cmd

import (
"fmt"
"os"

"github.com/spf13/cobra"

"github.com/coreruleset/go-ftw/experimental/corpus"
"github.com/coreruleset/go-ftw/internal/quantitative"
"github.com/coreruleset/go-ftw/output"
"github.com/spf13/cobra"
"os"
)

// NewQuantitativeCmd
// Returns a new cobra command for running quantitative tests
func NewQuantitativeCmd() *cobra.Command {
runCmd := &cobra.Command{
Use: "quantitative",
Short: "Run Quantitative Tests",
Short: "Run quantitative tests",
Long: `Run all quantitative tests`,
RunE: runQuantitativeE,
}

runCmd.Flags().BoolP("markdown", "m", false, "Markdown table output mode")
runCmd.Flags().IntP("fast", "x", 0, "Process 1 in every X lines of input ('fast run' mode)")
runCmd.Flags().IntP("lines", "l", 0, "Number of lines of input to process before stopping")
runCmd.Flags().IntP("paranoia-level", "P", 1, "Paranoia level used to run the quantitative tests")
runCmd.Flags().IntP("number", "n", 0, "Number is the payload line from the corpus to exclusively send")
runCmd.Flags().IntP("corpus-line", "n", 0, "Number is the payload line from the corpus to exclusively send")
runCmd.Flags().StringP("payload", "p", "", "Payload is a string you want to test using quantitative tests. Will not use the corpus.")
runCmd.Flags().IntP("rule", "r", 0, "Rule ID of interest: only show false positives for specified rule ID")
runCmd.Flags().StringP("corpus", "c", "leipzig", "Corpus to use for the quantitative tests")
runCmd.Flags().StringP("corpus-lang", "L", "eng", "Corpus language to use for the quantitative tests.")
runCmd.Flags().StringP("corpus-size", "s", "100K", "Corpus size to use for the quantitative tests. Most corpus will have a size like \"100K\", \"1M\", etc.")
runCmd.Flags().StringP("corpus-lang", "L", "eng", "Corpus language to use for the quantitative tests")
runCmd.Flags().StringP("corpus-size", "s", "100K", "Corpus size to use for the quantitative tests. Most corpora will have sizes like \"100K\", \"1M\", etc.")
runCmd.Flags().StringP("corpus-year", "y", "2023", "Corpus year to use for the quantitative tests. Most corpus will have a year like \"2023\", \"2022\", etc.")
runCmd.Flags().StringP("corpus-source", "S", "news", "Corpus source to use for the quantitative tests. Most corpus will have a source like \"news\", \"web\", \"wikipedia\", etc.")
runCmd.Flags().StringP("directory", "d", ".", "Directory where the CRS rules are stored")
runCmd.Flags().StringP("file", "f", "", "output file path for quantitative tests. Prints to standard output by default.")
runCmd.Flags().StringP("output", "o", "normal", "output type for quantitative tests. \"normal\" is the default.")
runCmd.Flags().StringP("file", "f", "", "Output file path for quantitative tests. Prints to standard output by default.")
runCmd.Flags().StringP("output", "o", "normal", "Output type for quantitative tests. \"normal\" is the default.")

return runCmd
}

func runQuantitativeE(cmd *cobra.Command, _ []string) error {
cmd.SilenceUsage = true

corpus, _ := cmd.Flags().GetString("corpus")
corpusTypeAsString, _ := cmd.Flags().GetString("corpus")
corpusSize, _ := cmd.Flags().GetString("corpus-size")
corpusLang, _ := cmd.Flags().GetString("corpus-lang")
corpusYear, _ := cmd.Flags().GetString("corpus-year")
corpusSource, _ := cmd.Flags().GetString("corpus-source")
directory, _ := cmd.Flags().GetString("directory")
fast, _ := cmd.Flags().GetInt("fast")
lines, _ := cmd.Flags().GetInt("lines")
markdown, _ := cmd.Flags().GetBool("markdown")
outputFilename, _ := cmd.Flags().GetString("file")
paranoiaLevel, _ := cmd.Flags().GetInt("paranoia-level")
payload, _ := cmd.Flags().GetString("payload")
number, _ := cmd.Flags().GetInt("number")
rule, _ := cmd.Flags().GetInt("rule")
wantedOutput, _ := cmd.Flags().GetString("output")

if paranoiaLevel > 1 && rule > 0 {
return fmt.Errorf("paranoia level and rule ID cannot be used together")
}

// use outputFile to write to file
var outputFile *os.File
var err error
@@ -71,16 +76,23 @@ func runQuantitativeE(cmd *cobra.Command, _ []string) error {
}
out := output.NewOutput(wantedOutput, outputFile)

params := quantitative.QuantitativeParams{
Corpus: corpus,
var corpusType corpus.Type
if corpusTypeAsString != "" {
err = corpusType.Set(corpusTypeAsString)
if err != nil {
return err
}
}

params := quantitative.Params{
Corpus: corpusType,
CorpusSize: corpusSize,
CorpusYear: corpusYear,
CorpusLang: corpusLang,
CorpusSource: corpusSource,
Directory: directory,
Fast: fast,
Lines: lines,
Markdown: markdown,
ParanoiaLevel: paranoiaLevel,
Number: number,
Payload: payload,
17 changes: 9 additions & 8 deletions cmd/quantitative_test.go
@@ -5,16 +5,17 @@ package cmd

import (
"context"
"github.com/spf13/cobra"
"github.com/stretchr/testify/suite"
"io/fs"
"os"
"path"
"testing"

"github.com/spf13/cobra"
"github.com/stretchr/testify/suite"
)

var crsSetupFileContents = `# CRS Setup Configuration File`
var emptyRulesFile = `# Empty Rules File`
var crsSetupFileContents = `# CRS Setup Configuration filename`
var emptyRulesFile = `# Empty Rules filename`

type quantitativeCmdTestSuite struct {
suite.Suite
@@ -32,12 +33,12 @@ func (s *quantitativeCmdTestSuite) SetupTest() {

err := os.MkdirAll(path.Join(s.tempDir, "rules"), fs.ModePerm)
s.Require().NoError(err)
fakeCRSSetupConf, err := os.Create(path.Join(s.tempDir, "crs-setup.conf.example"))
fakeCrsSetupConf, err := os.Create(path.Join(s.tempDir, "crs-setup.conf.example"))
s.Require().NoError(err)
n, err := fakeCRSSetupConf.WriteString(crsSetupFileContents)
n, err := fakeCrsSetupConf.WriteString(crsSetupFileContents)
s.Require().NoError(err)
s.Equal(len(crsSetupFileContents), n)
err = fakeCRSSetupConf.Close()
err = fakeCrsSetupConf.Close()
s.Require().NoError(err)
fakeRulesFile, err := os.Create(path.Join(s.tempDir, "rules", "Rules1.conf"))
s.Require().NoError(err)
@@ -55,7 +56,7 @@ func (s *quantitativeCmdTestSuite) TearDownTest() {
func (s *quantitativeCmdTestSuite) TestQuantitativeCommand() {
s.rootCmd.SetArgs([]string{"quantitative", "-d", s.tempDir})
cmd, err := s.rootCmd.ExecuteContextC(context.Background())
s.Require().NoError(err, "quantitative command should not return an error")
s.Require().NoError(err, "quantitative command should not return error")
s.Equal("quantitative", cmd.Name(), "quantitative command should have the name 'quantitative'")
s.Require().NoError(err)
}