The probability of picking the maximum length is very low #51

Open
vlsi opened this issue Sep 17, 2019 · 0 comments
vlsi commented Sep 17, 2019

Sample regex: [-_a-zA-Z0-9].
random(1, 70) produces the following strings:

CY
6--Y-
0I-_-R
f
O
--A77_
-
_-V
05F-43-4-
f-2-QF_w
d_-
_0
_P
762_
_
_i_t__
Y
-b
-1i
5_J
X_S
_t-
9
p8-_yqHp4-
-
H_u
_
HW
_J-S-

Almost all of the strings are short, which is not always desirable.

The current code picks the value with a probability of 66%.
That makes it practically impossible to produce 70 chars for random(minLength=1, maxLength=70), because Generex tries to stop at each string with a probability of 66%.
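
To put a number on that, here is a rough sketch in Python (not Generex's actual code). It assumes a simplified model: once minLength is reached, the generator stops after each character with probability 0.66 and otherwise appends one more, with a hard stop at maxLength. The names `prob_reaching` and `sample_length` are made up for illustration.

```python
import random

# 0.66 is the figure quoted above; the stopping model itself is an assumption.
STOP_PROBABILITY = 0.66


def prob_reaching(length, min_length=1, stop_probability=STOP_PROBABILITY):
    """Probability that a string grows to at least `length` characters."""
    # Every length from min_length up to length - 1 is one chance to stop,
    # so the string must "survive" that many independent stop checks.
    continue_steps = max(0, length - min_length)
    return (1.0 - stop_probability) ** continue_steps


def sample_length(rng, min_length=1, max_length=70,
                  stop_probability=STOP_PROBABILITY):
    """Draw one string length under the simplified stopping model."""
    length = min_length
    # Keep growing while the stop check fails and max_length is not reached.
    while length < max_length and rng.random() >= stop_probability:
        length += 1
    return length


if __name__ == "__main__":
    rng = random.Random(0)
    lengths = [sample_length(rng) for _ in range(10_000)]
    print("longest sampled length:", max(lengths))
    print("P(length >= 70):", prob_reaching(70))
```

Under this model the chance of reaching 70 characters is 0.34 to the power of 69, which is on the order of 1e-33. The real generator's length distribution may differ from this model, but the geometric decay is the point.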

What if Generex had a mode where it stops as soon as the length exceeds the given minLength?
Then random(minLength, maxLength) would pick a random V1 between min...max, then generate a string and stop as soon as the length exceeds V1.
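
A minimal sketch of that idea in Python, under two assumptions: it handles only repeated draws from the character class `[-_a-zA-Z0-9]` rather than walking a general regex automaton, and it stops when the length reaches V1 (instead of exceeding it) so the result never goes past maxLength. The function name `random_with_target_length` is hypothetical and is not part of Generex.

```python
import random
import string

# Characters matched by the sample class [-_a-zA-Z0-9].
ALPHABET = "-_" + string.ascii_letters + string.digits


def random_with_target_length(min_length, max_length, rng=None):
    """Pick a target length V1 uniformly, then generate exactly V1 chars."""
    if rng is None:
        rng = random.Random()
    # V1: uniform over min_length..max_length, both ends inclusive.
    target = rng.randint(min_length, max_length)
    chars = []
    # Keep appending until the string reaches the target length.
    while len(chars) < target:
        chars.append(rng.choice(ALPHABET))
    return "".join(chars)


if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(5):
        print(random_with_target_length(1, 70, rng))
```

With this approach every length between min and max is equally likely, so 70-char strings show up about once in 70 calls. A real implementation would also need to handle regexes that cannot match at the chosen length, which this sketch ignores.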
