Old Metric from securinbench #4

Draft: wants to merge 58 commits into base: develop
7376c77
Datastructures4 @servlet vuln_count = "0"
Jclavo Jan 27, 2025
ddca39b
Inter9 vuln_count = "2"
Jclavo Jan 27, 2025
b391497
remove Collections11b
Jclavo Jan 27, 2025
ecfa6d9
AliasingTest
Jclavo Jan 27, 2025
5fdd980
ArraysTest
Jclavo Jan 27, 2025
4f72d18
BasicTest
Jclavo Jan 27, 2025
bd395d8
CollectionTest
Jclavo Jan 27, 2025
4ae5ede
update metrics
Jclavo Jan 27, 2025
373bd80
DataStructureTest
Jclavo Jan 27, 2025
42c6e67
fix DataStructure4
Jclavo Jan 27, 2025
f3a8951
fix Collection9
Jclavo Jan 27, 2025
61d0839
FactoryTest
Jclavo Jan 27, 2025
bdcaa9a
InterTest
Jclavo Jan 27, 2025
22dbd33
fix expected values
Jclavo Jan 27, 2025
33f3a5f
fix expected value
Jclavo Jan 27, 2025
36d0033
SessionTest
Jclavo Jan 27, 2025
a0d8693
StrongUpdateTest
Jclavo Jan 27, 2025
629a1b2
JSVFA metrics
Jclavo Jan 27, 2025
5f3d746
metrics for JSVFA in tests
Jclavo Jan 27, 2025
c72d100
applicationClassPath
Jclavo Jan 27, 2025
1f79008
fix table
Jclavo Jan 27, 2025
aab0f98
remove indivisual tests
Jclavo Jan 27, 2025
e6ea99a
FlowdroidTest one in all
Jclavo Jan 27, 2025
bf361c3
compute metrics
Jclavo Jan 27, 2025
3c77221
enable all tests
Jclavo Jan 27, 2025
03352f0
use size == 0
Jclavo Jan 27, 2025
8b8e954
pass and failed
Jclavo Jan 27, 2025
6e16696
disable AliasingTest
Jclavo Jan 27, 2025
3be57fc
set expected value for Collection9
Jclavo Jan 27, 2025
fb9aee1
expected value for DataStructure4
Jclavo Jan 27, 2025
93b829c
expected value fpr Inter9
Jclavo Jan 27, 2025
abbd7d8
InterTest
Jclavo Jan 27, 2025
7b1270a
pass and fail
Jclavo Jan 27, 2025
15e8335
Test Basic6 is a flaky test.
Jclavo Jan 27, 2025
f4fbbb7
remove old metrics
Jclavo Jan 27, 2025
befd173
ARRAY TESTs
Jclavo Jan 27, 2025
cc86deb
Basic tests
Jclavo Jan 27, 2025
d0ca81f
ignore COLLECTION TESTs
Jclavo Jan 27, 2025
36db7c1
DATASTRUCTURE TESTs
Jclavo Jan 27, 2025
f1f50cd
FACTORY TESTs
Jclavo Jan 27, 2025
3caa40f
INTER TESTs
Jclavo Jan 27, 2025
9807912
InterTest metrics
Jclavo Jan 27, 2025
7abdc07
SESSION TESTs
Jclavo Jan 27, 2025
96dc949
STRONG UPDATE TESTs
Jclavo Jan 27, 2025
3a872ab
ignore tests
Jclavo Jan 27, 2025
0ad88fe
passed and failed
Jclavo Jan 27, 2025
592e204
Inter11 is flaky
Jclavo Jan 27, 2025
6580a71
Inter11 is flaky
Jclavo Jan 27, 2025
632733b
separe metric details in another file
Jclavo Feb 6, 2025
7cd2f08
set old version "0.2.9"
Jclavo Feb 6, 2025
9e737fa
DISCLAIMER
Jclavo Feb 6, 2025
df459be
typo
Jclavo Feb 6, 2025
9c69c83
fix basic
Jclavo Mar 16, 2025
f512861
metrics for Basic
Jclavo Mar 16, 2025
9ff0a83
total TP
Jclavo Mar 16, 2025
020d9d3
add test number
Jclavo Mar 16, 2025
426342b
Pass Rate
Jclavo Mar 16, 2025
b3f79d0
metrics
Jclavo Mar 16, 2025
74 changes: 36 additions & 38 deletions README.md
@@ -60,41 +60,39 @@ This project use some of the [FlowDroid](https://github.com/secure-software-engi


### Flowdroid
~~TTests failed: 34, passed: 64, ignored: 6 of 104 test~~T

Tests failed: 40, passed: 64, ignored: 0 of 104 test
Tests failed: 33, passed: 71, ignored: 0 of 104 test (original)

Tests failed: +17.5%, passed: +9.86, ignored: 0 of 104 test (original)

#### AliasingTest
Tests failed: 0, passed: 5, ignored: 1 of 6 test
#### ArraysTest
Tests failed: 9, passed: 1, ignored: 0 of 10 test
#### BasicTest
Tests failed: 0, passed: 37, ignored: 5 of 42 test

Fails:
17
36 (same)
38
42

#### CollectionTest
Tests failed: 14, passed: 1, ignored: 0 of 15 test
#### DataStructureTest ☑
Tests failed: 1, passed: 5, ignored: 0 of 6 test
#### FactoryTest ☑
Tests failed: 1, passed: 2, ignored: 0 of 3 test
#### InterTest
Tests failed: 7, passed: 7, ignored: 0 of 14 test
~~#### PredTest~~
~~Tests failed: 3, passed: 6, ignored: 0 of 9 test~~
~~#### ReflectionTest~~
~~Tests failed: 4, passed: 0, ignored: 0 of 4 test~~
~~#### SanitizerTest~~
~~Tests failed: 2, passed: 4, ignored: 0 of 6 test~~
#### SessionTest ☑
Tests failed: 3, passed: 0, ignored: 0 of 3 test
#### StrongUpdateTest ☑
Tests failed: 1, passed: 4, ignored: 0 of 5 test

## TEST METRICS

> failed: 0, passed: 61, ignored: 42 of 103 tests.

| Test | Σ | TP | FP |
|:---------------:|:-------:|:------:|:--:|
| Aliasing | 5/6 | 10/11 | 0 |
| Array | 1/10 | 0/9 | 0 |
| Basic | 35/42 | 56/61 | 2 |
| Collection | 2/14 | 2/14 | 1 |
| DataStructure | 4/6 | 5/5 | 2 |
| Factory | 2/3 | 3/3 | 1 |
| Inter | 8/14 | 10/16 | 0 |
| ~~Pred~~ | ~~0/9~~ | - | - |
| ~~Reflection~~ | ~~0/4~~ | - | - |
| ~~Sanitizers~~ | ~~0/6~~ | - | - |
| Session | 0/3 | 0/3 | 0 |
| StrongUpdate | 4/5 | 0/1 | 0 |
| **TOTAL** | 61/103 | 86/123 | 6 |

- **Precision:** 0.93
- **Recall:** 0.70
- **F-score:** 0.80
- **Pass Rate:** 59.22%
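
As a quick sanity check, the summary figures above can be recomputed from the TOTAL row of the table. The sketch below assumes the usual definitions (precision = TP / (TP + FP), recall = TP / expected, F-score as their harmonic mean) and takes the pass rate as passed tests over all tests. The class and method names are hypothetical and are not part of this project.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of svfa-scala. */
final class MetricsSketch {

    /** Precision = TP / (TP + FP); assumed to be 0 when nothing was reported. */
    static double precision(int tp, int fp) {
        return (tp + fp) == 0 ? 0.0 : (double) tp / (tp + fp);
    }

    /** Recall = TP / expected, where expected is the number of known leaks. */
    static double recall(int tp, int expected) {
        return expected == 0 ? 0.0 : (double) tp / expected;
    }

    /** F-score as the harmonic mean of precision and recall. */
    static double fScore(double p, double r) {
        return (p + r) == 0.0 ? 0.0 : 2 * p * r / (p + r);
    }

    /** Pass rate as a percentage of all tests in the table. */
    static double passRate(int passed, int total) {
        return total == 0 ? 0.0 : 100.0 * passed / total;
    }

    public static void main(String[] args) {
        // Values taken from the TOTAL row: 61/103 tests, 86/123 TP, 6 FP.
        int tp = 86, fp = 6, expected = 123, passed = 61, total = 103;
        double p = precision(tp, fp);
        double r = recall(tp, expected);
        System.out.printf(Locale.ROOT, "Precision: %.2f%n", p);            // Precision: 0.93
        System.out.printf(Locale.ROOT, "Recall: %.2f%n", r);               // Recall: 0.70
        System.out.printf(Locale.ROOT, "F-score: %.2f%n", fScore(p, r));   // F-score: 0.80
        System.out.printf(Locale.ROOT, "Pass Rate: %.2f%%%n", passRate(passed, total)); // Pass Rate: 59.22%
    }
}
```

Returning 0 when a denominator is 0 is a convention chosen for this sketch; it happens to match the all-zero rows reported for the Array and Session groups.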

For detailed results of each test group, [see here.](old-metrics)

**OBSERVATIONS**
- FlowDroid does not take into account the TP expected in StrongUpdate4;
- Test Basic40 is commented out in the test suite, so the number of TPs differs from the original FlowDroid run;
- There are two flaky tests: Basic6 and Inter11.


## DISCLAIMER
- The last code changes for this release were made in March 2023.
2 changes: 1 addition & 1 deletion build.sbt
@@ -3,7 +3,7 @@ scalaVersion := "2.12.8"
name := "svfa-scala"
organization := "br.unb.cic"

- version := "0.2.1-SNAPSHOT"
+ version := "0.2.9"

githubOwner := "rbonifacio"
githubRepository := "svfa-scala"
164 changes: 164 additions & 0 deletions old-metrics.md
@@ -0,0 +1,164 @@
#### JSVFA old metrics

- **AliasingTest** - failed: 0, passed: 5, ignored: 1 of 6 tests.

| Test | Expected | Actual | Status | TP | FP | Precision | Recall | F-score |
|:--------------:|:--------:|:------:|:------:|:--:|:--:|:---------:|:------:|:-------:|
| Aliasing1 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Aliasing2 | 0 | 0 | ✅ | 0 | 0 | - | - | - |
| Aliasing3 | 0 | 0 | ✅ | 0 | 0 | - | - | - |
| Aliasing4 | 2 | 2 | ✅ | 2 | 0 | - | - | - |
| Aliasing5 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Aliasing6 | 7 | 7 | ✅ | 7 | 0 | - | - | - |
| TOTAL | 11 | 10 | 5/6 | 10 | 0 | 1.00 | 0.91 | 0.95 |

- **ArraysTest** - failed: 0, passed: 1, ignored: 9 of 10 tests.

| Test | Expected | Actual | Status | TP | FP | Precision | Recall | F-score |
|:--------------:|:--------:|:------:|:------:|:--:|:--:|:---------:|:------:|:-------:|
| Array1 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Array2 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Array3 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Array4 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Array5 | 0 | 0 | ✅ | 0 | 0 | - | - | - |
| Array6 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Array7 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Array8 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Array9 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Array10 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| TOTAL | 9 | 0 | 1/10 | 0 | 0 | 0 | 0 | 0 |

- **BasicTest** - failed: 0, passed: 35, ignored: 7 of 42 tests.

| Test | Expected | Actual | Status | TP | FP | Precision | Recall | F-score |
|:-------:|:--------:|:------:|:------:|:---:|:---:|:---------:|:------:|:-------:|
| Basic1 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic2 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic3 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic4 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic5 | 3 | 3 | ✅ | 3 | 0 | - | - | - |
| Basic6* | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Basic7 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic8 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic9 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic10 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic11 | 2 | 2 | ✅ | 2 | 0 | - | - | - |
| Basic12 | 2 | 2 | ✅ | 2 | 0 | - | - | - |
| Basic13 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic14 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic15 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic16 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic17 | 1 | 2 | ❌ | 1 | 1 | - | - | - |
| Basic18 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic19 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic20 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic21 | 4 | 4 | ✅ | 4 | 0 | - | - | - |
| Basic22 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic23 | 3 | 3 | ✅ | 3 | 0 | - | - | - |
| Basic24 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic25 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic26 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic27 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic28 | 2 | 1 | ❌ | 1 | 0 | - | - | - |
| Basic29 | 2 | 2 | ✅ | 2 | 0 | - | - | - |
| Basic30 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic31 | 3 | 3 | ✅ | 3 | 0 | - | - | - |
| Basic32 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic33 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic34 | 2 | 2 | ✅ | 2 | 0 | - | - | - |
| Basic35 | 6 | 6 | ✅ | 6 | 0 | - | - | - |
| Basic36 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Basic37 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic38 | 1 | 2 | ❌ | 1 | 1 | - | - | - |
| Basic39 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic40 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Basic41 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Basic42 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| TOTAL | 61 | 58 | 35/42 | 56 | 2 | 0.97 | 0.92 | 0.94 |

- **CollectionTest** - failed: 0, passed: 2, ignored: 12 of 14 tests.

| Test | Expected | Actual | Status | TP | FP | Precision | Recall | F-score |
|:--------------:|:--------:|:------:|:------:|:--:|:--:|:---------:|:------:|:-------:|
| Collection1 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Collection2 | 1 | 2 | ❌ | 1 | 1 | - | - | - |
| Collection3 | 2 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection4 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection5 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection6 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection7 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection8 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection9 | 0 | 0 | ✅ | 0 | 0 | - | - | - |
| Collection10 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection11 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection12 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection13 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Collection14 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| TOTAL | 14 | 3 | 2/14 | 2 | 1 | 0.67 | 0.14 | 0.24 |

- **DataStructureTest** - failed: 0, passed: 4, ignored: 2 of 6 tests.

| Test | Expected | Actual | Status | TP | FP | Precision | Recall | F-score |
|:--------------:|:--------:|:------:|:------:|:--:|:--:|:---------:|:------:|:-------:|
| DataStructure1 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| DataStructure2 | 1 | 2 | ❌ | 1 | 1 | - | - | - |
| DataStructure3 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| DataStructure4 | 0 | 1 | ❌ | 0 | 1 | - | - | - |
| DataStructure5 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| DataStructure6 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| TOTAL | 5 | 7 | 4/6 | 5 | 2 | 0.71 | 1.00 | 0.83 |
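
The TP, FP, and Status columns in the DataStructureTest table above are consistent with a simple count-based rule: a test passes when the actual count equals the expected count, TP is the smaller of the two, and FP is whatever is reported beyond the expected count. This rule is inferred from the tables, not taken from the test harness, and it compares counts only, not individual findings. The names in the sketch are hypothetical.

```java
/** Hypothetical helper for illustration only; the rule is inferred from the tables. */
final class TestRowSketch {

    /** Assumed rule: reported findings up to the expected count are true positives. */
    static int truePositives(int expected, int actual) {
        return Math.min(expected, actual);
    }

    /** Assumed rule: findings beyond the expected count are false positives. */
    static int falsePositives(int expected, int actual) {
        return Math.max(0, actual - expected);
    }

    /** A test passes only when the counts match exactly. */
    static boolean passed(int expected, int actual) {
        return expected == actual;
    }

    /** Returns {passed, TP, FP} summed over rows of {expected, actual}. */
    static int[] summarize(int[][] rows) {
        int pass = 0, tp = 0, fp = 0;
        for (int[] row : rows) {
            if (passed(row[0], row[1])) {
                pass++;
            }
            tp += truePositives(row[0], row[1]);
            fp += falsePositives(row[0], row[1]);
        }
        return new int[] {pass, tp, fp};
    }

    public static void main(String[] args) {
        // {expected, actual} for DataStructure1..DataStructure6, copied from the table.
        int[][] rows = {{1, 1}, {1, 2}, {1, 1}, {0, 1}, {1, 1}, {1, 1}};
        int[] s = summarize(rows);
        // prints: passed 4/6, TP 5, FP 2
        System.out.println("passed " + s[0] + "/" + rows.length + ", TP " + s[1] + ", FP " + s[2]);
    }
}
```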


- **FactoryTest** - failed: 0, passed: 2, ignored: 1 of 3 tests.

| Test | Expected | Actual | Status | TP | FP | Precision | Recall | F-score |
|:--------------:|:--------:|:------:|:------:|:---:|:---:|:---------:|:------:|:-------:|
| Factory1 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Factory2 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Factory3 | 1 | 2 | ❌ | 1 | 1 | - | - | - |
| TOTAL | 3 | 4 | 2/3 | 3 | 1 | 0.75 | 1.00 | 0.86 |

- **InterTest** - failed: 0, passed: 8, ignored: 6 of 14 tests.

| Test | Expected | Actual | Status | TP | FP | Precision | Recall | F-score |
|:-------:|:--------:|:------:|:------:|:--:|:--:|:---------:|:------:|:-------:|
| Inter1 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Inter2 | 2 | 2 | ✅ | 2 | 0 | - | - | - |
| Inter3 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Inter4 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Inter5 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Inter6 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Inter7 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Inter8 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Inter9 | 2 | 1 | ❌ | 1 | 0 | - | - | - |
| Inter10 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Inter11 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Inter12 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Inter13 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| Inter14 | 1 | 1 | ✅ | 1 | 0 | - | - | - |
| TOTAL | 16 | 10 | 8/14 | 10 | 0 | 1.00 | 0.63 | 0.77 |

- **SessionTest** - failed: 0, passed: 0, ignored: 3 of 3 tests.

| Test | Expected | Actual | Status | TP | FP | Precision | Recall | F-score |
|:--------------:|:--------:|:------:|:------:|:---:|:---:|:---------:|:------:|:-------:|
| Session1 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Session2 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| Session3 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| TOTAL | 3 | 0 | 0/3 | 0 | 0 | 0 | 0 | 0 |

- **StrongUpdateTest** - failed: 0, passed: 4, ignored: 1 of 5 tests.

| Test | Expected | Actual | Status | TP | FP | Precision | Recall | F-score |
|:--------------:|:--------:|:------:|:------:|:--:|:--:|:---------:|:------:|:-------:|
| StrongUpdate1 | 0 | 0 | ✅ | 0 | 0 | - | - | - |
| StrongUpdate2 | 0 | 0 | ✅ | 0 | 0 | - | - | - |
| StrongUpdate3 | 0 | 0 | ✅ | 0 | 0 | - | - | - |
| StrongUpdate4 | 1 | 0 | ❌ | 0 | 0 | - | - | - |
| StrongUpdate5 | 0 | 0 | ✅ | 0 | 0 | - | - | - |
| TOTAL | 1 | 0 | 4/5 | 0 | 0 | 0 | 0 | 0 |

**OBSERVATIONS**
- FlowDroid does not take into account the TP expected in StrongUpdate4;
- Test Basic40 is commented out in the test suite, so the number of TPs differs from the original FlowDroid run;
- There are two flaky tests: Basic6 and Inter11.
@@ -41,6 +41,6 @@ public String getDescription() {
}

public int getVulnerabilityCount() {
- return 1;
+ return 0;
}
}
@@ -16,7 +16,7 @@

/**
* @servlet description="simple nexted data (false positive)"
- * @servlet vuln_count = "1"
+ * @servlet vuln_count = "0"
* */
public class Datastructures4 extends BasicTestCase implements MicroTestCase {
public class C {
@@ -50,6 +50,6 @@ public String getDescription() {
}

public int getVulnerabilityCount() {
- return 1;
+ return 0;
}
}
2 changes: 1 addition & 1 deletion src/test/java/securibench/micro/inter/Inter4.java
@@ -37,6 +37,6 @@ public String getDescription() {
}

public int getVulnerabilityCount() {
- return 2;
+ return 1;
}
}
2 changes: 1 addition & 1 deletion src/test/java/securibench/micro/inter/Inter5.java
@@ -39,6 +39,6 @@ public String getDescription() {
}

public int getVulnerabilityCount() {
- return 2;
+ return 1;
}
}
2 changes: 1 addition & 1 deletion src/test/java/securibench/micro/inter/Inter9.java
@@ -51,6 +51,6 @@ public String getDescription() {
}

public int getVulnerabilityCount() {
- return 1;
+ return 2;
}
}