Using the Library: Custom Whitelists
The default whitelists for tags and attributes are based on the W3Schools HTML Element Reference page. Only tags and attributes considered dangerously prone to XSS vulnerabilities are excluded, for example the SCRIPT tag or the onclick attribute. If these lists are too broad for a specific need, you can define custom lists. There are two ways to accomplish this.
The first way is for special one-off needs. The SanitizeHtml() function has overloads that accept List<String> values containing your custom lists. When using custom whitelists in this way, both the tags and the attributes need to be defined.
The sample below uses the default whitelists.
String inputValue = "<a href=\"www.google.com\">Click Me</a>";
String cleanValue = inputValue.SanitizeHtml();
Console.WriteLine(cleanValue);
The output is
<a href="www.google.com">Click Me</a>
With the following custom whitelists (note the new arguments passed to the function)
var myTags = new List<String>() { "a", "strong", "p" };
var myAttributes = new List<String>() { "href", "src" };
String inputValue = "<a href=\"www.google.com\">Click Me</a>";
String cleanValue = inputValue.SanitizeHtml(myTags, myAttributes);
Console.WriteLine(cleanValue);
The output is still the same, because both the a tag and the href attribute are whitelisted.
<a href="www.google.com">Click Me</a>
If the custom attributes whitelist has no elements
var myTags = new List<String>() { "a", "strong", "p" };
var myAttributes = new List<String>();
String inputValue = "<a href=\"www.google.com\">Click Me</a>";
String cleanValue = inputValue.SanitizeHtml(myTags, myAttributes);
Console.WriteLine(cleanValue);
then the output becomes <a>Click Me</a> because, while the a tag is whitelisted, there are no whitelisted attributes, so in effect all attributes are rejected.
Given the modified sample below
var myTags = new List<String>() { "a", "strong", "p" };
var myAttributes = new List<String>() { "src" };
String inputValue = "<a href=\"www.google.com\">Click Me</a>";
String cleanValue = inputValue.SanitizeHtml(myTags, myAttributes);
Console.WriteLine(cleanValue);
the output is still <a>Click Me</a>, because although a custom attributes whitelist is given, href is not included in it.
Another overload lets you define the attributes to be inspected for known scripting patterns. For example, the href attribute typically holds the target URL of an a tag, but it can also contain JavaScript. The link is desirable, but the script is not. Passing MarkupSanity the list of attributes to inspect adds an additional check whenever these attributes are found.
var myTags = new List<String>() { "a", "strong", "p" };
var myAttributes = new List<String>() { "href", "src" };
var myScriptableAttributes = new List<String>() { "href", "src" };
String inputValue = "<a href=\"www.google.com\">Click Me</a><a href=\"javascript=\"alert('gotcha!');\"\">Click me too</a>";
String cleanValue = inputValue.SanitizeHtml(myTags, myAttributes, myScriptableAttributes);
Console.WriteLine(cleanValue);
The above example returns <a href="www.google.com">Click Me</a><a>Click me too</a>. The first a passes unchanged because its href is not a script, but the second a has its href removed because it contains a script.
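This page does not say which patterns MarkupSanity treats as scripting, so the following is only a rough sketch of what such a check could look like. ScriptPatternSketch and LooksLikeScript are hypothetical names that are not part of the library, and the pattern itself (a value that starts with a javascript or vbscript scheme) is an assumption.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of MarkupSanity: flags attribute values
// that appear to carry script rather than a plain URL.
public static class ScriptPatternSketch
{
    // Assumed pattern: the value begins with a javascript or vbscript
    // scheme name followed by ':' or '=', ignoring case and leading spaces.
    private static readonly Regex ScriptScheme = new Regex(
        @"^\s*(javascript|vbscript)\s*[:=]",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Returns true when the attribute value looks like inline script.
    public static bool LooksLikeScript(string attributeValue)
    {
        if (string.IsNullOrWhiteSpace(attributeValue))
        {
            return false;
        }
        return ScriptScheme.IsMatch(attributeValue);
    }
}
```

Under these assumptions, a sanitizer would call LooksLikeScript only for attributes named in the scriptable list and drop the attribute when it returns true. A real check would likely need to cover more cases, such as encoded or obfuscated scheme names.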
Custom whitelists can also bypass the default removal of tags like script and event attributes like onclick: if these are added to the corresponding whitelist, they are kept. This gives the flexibility to make exemptions for special cases where they are desired.
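The attribute whitelist behaviour shown in the samples above can be modelled in a few lines. This is a simplified sketch and not MarkupSanity's implementation: WhitelistSketch and FilterAttributes are hypothetical names, the element is represented as a plain dictionary of attributes rather than parsed HTML, and the case-insensitive name comparison is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of attribute whitelisting; not part of MarkupSanity.
public static class WhitelistSketch
{
    // Returns only the attributes whose names appear in the whitelist.
    // An empty whitelist therefore rejects every attribute.
    public static Dictionary<string, string> FilterAttributes(
        IDictionary<string, string> attributes,
        IEnumerable<string> whitelist)
    {
        // Case-insensitive lookup is an assumption made for this sketch.
        var allowed = new HashSet<string>(whitelist, StringComparer.OrdinalIgnoreCase);
        return attributes
            .Where(pair => allowed.Contains(pair.Key))
            .ToDictionary(pair => pair.Key, pair => pair.Value);
    }
}
```

In this model, an element carrying href and onclick comes back with no attributes when the whitelist is { "src" }, while adding "onclick" to the list lets that attribute through, which mirrors the exemption described above.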