Summary
Insufficient sanitization of dataset contents in the MarkovData#getNext method, which is used by Markov#generate and Markov#choose, allows a maliciously crafted string in the dataset to make the method throw, so generation stops instead of completing.
Details
https://github.com/xiboon/kurwov/blob/0d58dfa42135ab40e830e92622857282f980ca89/src/MarkovData.ts#L38-L44
If a string contains a forbidden substring (e.g. __proto__) followed by a space character, the second line removes the last character of the string and uses the result to access a special property on MarkovData#finalData. This bypasses the dataset sanitization, because the function assumes its input was already sanitized before it was called.
data is then defined as the special function found in its prototype instead of an array.
On the last line, data is indexed by a random number. This is supposed to return a string, but it returns undefined because data is a function rather than an array. Calling endsWith on that value then throws.
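The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the library's actual code: Table, getNextUnsafe and lookupOwn are hypothetical names, and the sketch assumes finalData is a plain object keyed by word and that the fallback lookup drops the last character of the key. The own-property guard at the end is one possible mitigation, shown as an assumption rather than as the project's fix.

```typescript
// Hypothetical stand-in for MarkovData#finalData: a plain object keyed by word.
type Table = Record<string, string[]>;

// Sketch of the lookup pattern described above (assumed, not the real source).
function getNextUnsafe(finalData: Table, word: string): boolean {
  let data = finalData[word];
  // Fallback: retry without the last character, so "__proto__ " becomes "__proto__".
  if (!data) data = finalData[word.slice(0, -1)];
  // data now comes from the prototype chain and is not an array of strings,
  // so indexing it with a random number yields undefined.
  const next = data[Math.floor(Math.random() * data.length)];
  // Calling a string method on undefined throws a TypeError.
  return next.endsWith(" ");
}

// One possible guard: only accept keys that the table itself owns.
function lookupOwn(finalData: Table, key: string): string[] | undefined {
  return Object.prototype.hasOwnProperty.call(finalData, key)
    ? finalData[key]
    : undefined;
}

// Usage: the unsafe lookup throws, the guarded lookup reports a miss.
const table: Table = { "hello ": ["world"] };
let unsafeThrew = false;
try {
  getNextUnsafe(table, "__proto__ ");
} catch (err) {
  unsafeThrew = err instanceof TypeError;
}
const guarded = lookupOwn(table, "__proto__");
```

With the guard in place, inherited properties such as __proto__ are treated as missing entries instead of being returned as data. A Map or an object created with Object.create(null) would avoid the prototype chain in a similar way.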
PoC
https://runkit.com/embed/m6uu40r5ja9b
Impact
Any dataset can be contaminated with such a substring, which in some cases leaves it unable to generate anything.