HTML Minifier

Andrey Taritsyn edited this page Aug 10, 2015 · 31 revisions

HTML Minifier minifies HTML and XHTML code. The output of minification is valid HTML code.

Consider a simple example of using the HTML Minifier:

using System;
using System.Collections.Generic;

using WebMarkupMin.Core;

namespace WebMarkupMin.Sample.ConsoleApplication
{
	class Program
	{
		static void Main(string[] args)
		{
			const string htmlInput = @"<!DOCTYPE html>
<html>
	<head>
		<meta charset=""utf-8"" />
		<title>The test document</title>
		<link href=""favicon.ico"" rel=""shortcut icon"" type=""image/x-icon"" />
		<meta name=""viewport"" content=""width=device-width"" />
		<link rel=""stylesheet"" type=""text/css"" href=""/Content/Site.css"" />
	</head>
	<body>
		<p>Lorem ipsum dolor sit amet...</p>

		<script src=""http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.9.1.min.js""></script>
		<script>
			(window.jQuery) || document.write('<script src=""/Scripts/jquery-1.9.1.min.js""><\/script>');
		</script>
	</body>
</html>";

			var htmlMinifier = new HtmlMinifier();

			MarkupMinificationResult result = htmlMinifier.Minify(htmlInput,
				generateStatistics: true);
			if (result.Errors.Count == 0)
			{
				MinificationStatistics statistics = result.Statistics;
				if (statistics != null)
				{
					Console.WriteLine("Original size: {0:N0} Bytes",
						statistics.OriginalSize);
					Console.WriteLine("Minified size: {0:N0} Bytes",
						statistics.MinifiedSize);
					Console.WriteLine("Saved: {0:N2}%",
						statistics.SavedInPercent);
				}
				Console.WriteLine("Minified content:{0}{0}{1}",
					Environment.NewLine, result.MinifiedContent);
			}
			else
			{
				IList<MinificationErrorInfo> errors = result.Errors;

				Console.WriteLine("Found {0:N0} error(s):", errors.Count);
				Console.WriteLine();

				foreach (var error in errors)
				{
					Console.WriteLine("Line {0}, Column {1}: {2}",
						error.LineNumber, error.ColumnNumber, error.Message);
					Console.WriteLine();
				}
			}
		}
	}
}

First we create an instance of the HtmlMinifier class and then call its Minify method with the following parameters: the first contains the HTML code, and the second is a flag that enables generation of minification statistics (the default value is false, because generating statistics requires additional time and resources). The Minify method returns an object of the MarkupMinificationResult type, which has the following properties:

  • MinifiedContent - minified HTML code;
  • Errors - list of errors that occurred during minification;
  • Warnings - list of warnings about problems that were found during minification;
  • Statistics - statistical information about the minified code.

If the list of errors is empty, we print the minification statistics and the minified code to the console; otherwise, we print the error information.
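The Warnings property is listed above but not used in the example. The following is a minimal sketch of reporting warnings as well. It assumes that each warning exposes the same LineNumber, ColumnNumber and Message properties as the error objects; the page only says that Warnings is a list, so treat this as an assumption. The input string is an arbitrary illustration.

```csharp
using System;

using WebMarkupMin.Core;

namespace WebMarkupMin.Sample.ConsoleApplication
{
	class Program
	{
		static void Main(string[] args)
		{
			// Arbitrary sample input used only for illustration.
			const string htmlInput = "<p>Lorem ipsum   dolor sit amet...</p>";

			var htmlMinifier = new HtmlMinifier();
			MarkupMinificationResult result = htmlMinifier.Minify(htmlInput,
				generateStatistics: false);

			// Report warnings regardless of whether errors occurred.
			// Assumption: warnings have the same shape as errors.
			foreach (var warning in result.Warnings)
			{
				Console.WriteLine("Warning at line {0}, column {1}: {2}",
					warning.LineNumber, warning.ColumnNumber, warning.Message);
			}

			if (result.Errors.Count == 0)
			{
				Console.WriteLine(result.MinifiedContent);
			}
			else
			{
				Console.WriteLine("Found {0:N0} error(s).", result.Errors.Count);
			}
		}
	}
}
```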

Now consider a more advanced example of using the HTML Minifier:

using System;
using System.Collections.Generic;
using System.Text;

using WebMarkupMin.Core;
using WebMarkupMin.Core.Loggers;

namespace WebMarkupMin.Sample.ConsoleApplication
{
	class Program
	{
		static void Main(string[] args)
		{
			const string htmlInput = @"<!DOCTYPE html>
<html>
	<head>
		<meta charset=""utf-8"" />
		<title>The test document</title>
		<link href=""favicon.ico"" rel=""shortcut icon"" type=""image/x-icon"" />
		<meta name=""viewport"" content=""width=device-width"" />
		<link rel=""stylesheet"" type=""text/css"" href=""/Content/Site.css"" />
	</head>
	<body>
		<p>Lorem ipsum dolor sit amet...</p>

		<script src=""http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.9.1.min.js""></script>
		<script>
			(window.jQuery) || document.write('<script src=""/Scripts/jquery-1.9.1.min.js""><\/script>');
		</script>
	</body>
</html>";

			var settings = new HtmlMinificationSettings();
			var cssMinifier = new KristensenCssMinifier();
			var jsMinifier = new CrockfordJsMinifier();
			var logger = new NullLogger();

			var htmlMinifier = new HtmlMinifier(settings, cssMinifier,
				jsMinifier, logger);

			MarkupMinificationResult result = htmlMinifier.Minify(htmlInput,
				fileContext: string.Empty,
				encoding: Encoding.GetEncoding(0),
				generateStatistics: false);
			if (result.Errors.Count == 0)
			{
				Console.WriteLine("Minified content:{0}{0}{1}",
					Environment.NewLine, result.MinifiedContent);
			}
			else
			{
				IList<MinificationErrorInfo> errors = result.Errors;

				Console.WriteLine("Found {0:N0} error(s):", errors.Count);
				Console.WriteLine();

				foreach (var error in errors)
				{
					Console.WriteLine("Line {0}, Column {1}: {2}",
						error.LineNumber, error.ColumnNumber, error.Message);
					Console.WriteLine();
				}
			}
		}
	}
}

When creating an instance of the HtmlMinifier class, we pass the HTML minification settings, a CSS minifier, a JS minifier and a logger to the constructor. Two additional parameters are passed to the Minify method:

  • fileContext. Can contain a path to the file or the URL of the web page. The value of this parameter is used when logging.
  • encoding. Contains the text encoding that is used during minification and statistics generation.

The values of parameters in the above code correspond to the default values.
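As an illustration of non-default values for these two parameters, here is a minimal sketch that minifies a file read from disk, passing its path as fileContext and UTF-8 as the encoding. The file name index.html is hypothetical, and the choice of UTF-8 is an assumption about how the file was saved.

```csharp
using System;
using System.IO;
using System.Text;

using WebMarkupMin.Core;

namespace WebMarkupMin.Sample.ConsoleApplication
{
	class Program
	{
		static void Main(string[] args)
		{
			// Hypothetical file name; pass a real path as the first argument.
			string path = args.Length > 0 ? args[0] : "index.html";
			if (!File.Exists(path))
			{
				Console.WriteLine("File not found: {0}", path);
				return;
			}

			// Assumption: the file is stored as UTF-8.
			Encoding encoding = Encoding.UTF8;
			string htmlInput = File.ReadAllText(path, encoding);

			var htmlMinifier = new HtmlMinifier();

			// The path is passed as fileContext so that log entries can refer to it;
			// the same encoding is passed on for minification and statistics.
			MarkupMinificationResult result = htmlMinifier.Minify(htmlInput,
				fileContext: path,
				encoding: encoding,
				generateStatistics: true);

			if (result.Errors.Count == 0)
			{
				MinificationStatistics statistics = result.Statistics;
				if (statistics != null)
				{
					Console.WriteLine("Saved: {0:N2}%", statistics.SavedInPercent);
				}
				Console.WriteLine(result.MinifiedContent);
			}
			else
			{
				Console.WriteLine("Found {0:N0} error(s) in {1}.",
					result.Errors.Count, path);
			}
		}
	}
}
```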

Now let's consider the properties of the HtmlMinificationSettings class in detail:

Property name | Data type | Default value | Description
WhitespaceMinificationMode | Enumeration | Medium | Whitespace minification mode. Can take the following values:
  • None. Keep whitespace.
  • Safe. Safe whitespace minification: removes whitespace characters from the top and bottom of the HTML document; replaces multiple whitespace characters with a single space; removes all leading and trailing whitespace characters from the DOCTYPE declaration; removes all leading and trailing whitespace characters from the outer and inner contents of invisible tags (html, head, body, meta, link, script, etc.); removes all leading and trailing whitespace characters from the outer contents of non-independent tags (li, dt, dd, rt, rp, option, tr, td, th, etc.).
  • Medium. Medium whitespace minification: performs all operations of the safe whitespace minification and also removes all leading and trailing whitespace characters from the outer and inner contents of block-level tags.
  • Aggressive. Aggressive whitespace minification: performs all operations of the medium whitespace minification and also removes all leading and trailing whitespace characters from the inner contents of inline and inline-block tags.
RemoveHtmlComments | Boolean | true | Flag for whether to remove all HTML comments, except conditional, noindex, KnockoutJS containerless comments and AngularJS comment directives.
RemoveHtmlCommentsFromScriptsAndStyles | Boolean | true | Flag for whether to remove HTML comments from script and style tags.
RemoveCdataSectionsFromScriptsAndStyles | Boolean | true | Flag for whether to remove CDATA sections from script and style tags.
UseShortDoctype | Boolean | true | Flag for whether to replace the existing document type declaration with the short declaration - <!DOCTYPE html>.
UseMetaCharsetTag | Boolean | true | Flag for whether to replace the <meta http-equiv="content-type" content="text/html; charset=…"> tag with the <meta charset="…"> tag.
EmptyTagRenderMode | Enumeration | NoSlash | Render mode of HTML empty tags. Can take the following values:
  • NoSlash. Without slash (for example, <br>).
  • Slash. With slash (for example, <br/>).
  • SpaceAndSlash. With space and slash (for example, <br />).
RemoveOptionalEndTags | Boolean | true | Flag for whether to remove optional end tags (html, head, body, p, li, dt, dd, rt, rp, optgroup, option, colgroup, thead, tfoot, tbody, tr, th and td).
RemoveTagsWithoutContent | Boolean | false | Flag for whether to remove tags without content, except for textarea, tr, th and td tags, and tags with class, id, name, role, src and data-* attributes.
CollapseBooleanAttributes | Boolean | true | Flag for whether to remove values from boolean attributes (for example, checked="checked" is transformed to checked).
RemoveEmptyAttributes | Boolean | true | Flag for whether to remove attributes that have an empty value (valid attributes are: class, id, name, style, title, lang, dir, event attributes, the action attribute of the form tag and the value attribute of the input tag).
AttributeQuotesRemovalMode | Enumeration | Html5 | HTML attribute quotes removal mode. Can take the following values:
  • KeepQuotes. Keep quotes.
  • Html4. Removes quotes in accordance with the HTML 4.X standard.
  • Html5. Removes quotes in accordance with the HTML5 standard.
RemoveRedundantAttributes | Boolean | false | Flag for whether to remove redundant attributes:
  • <script language="javascript" …>
  • <script src="…" charset="…" …>
  • <link rel="stylesheet" charset="…" …>
  • <form method="get" …>
  • <input type="text" …>
  • <a id="…" name="…" …>
  • <area shape="rect" …>
RemoveJsTypeAttributes | Boolean | true | Flag for whether to remove type="text/javascript" attributes from script tags.
RemoveCssTypeAttributes | Boolean | true | Flag for whether to remove type="text/css" attributes from style and link tags.
RemoveHttpProtocolFromAttributes | Boolean | false | Flag for whether to remove the HTTP protocol portion (http:) from URI-based attributes (tags marked with rel="external" are skipped).
RemoveHttpsProtocolFromAttributes | Boolean | false | Flag for whether to remove the HTTPS protocol portion (https:) from URI-based attributes (tags marked with rel="external" are skipped).
RemoveJsProtocolFromAttributes | Boolean | true | Flag for whether to remove the javascript: pseudo-protocol portion from event attributes.
MinifyEmbeddedCssCode | Boolean | true | Flag for whether to minify CSS code in style tags.
MinifyInlineCssCode | Boolean | true | Flag for whether to minify CSS code in style attributes.
MinifyEmbeddedJsCode | Boolean | true | Flag for whether to minify JS code in script tags.
MinifyInlineJsCode | Boolean | true | Flag for whether to minify JS code in event attributes and hyperlinks with the javascript: pseudo-protocol.
ProcessableScriptTypeList | String | empty string | Comma-separated list of types of script tags that are processed by the minifier (e.g. "text/html, text/ng-template"). Currently only KnockoutJS, Kendo UI MVVM and AngularJS views are supported.
MinifyKnockoutBindingExpressions | Boolean | false | Flag for whether to minify the KnockoutJS binding expressions in data-bind attributes and containerless comments.
MinifyAngularBindingExpressions | Boolean | false | Flag for whether to minify the AngularJS binding expressions in Mustache-style tags ({{}}) and directives.
CustomAngularDirectiveList | String | empty string | Comma-separated list of names of custom AngularJS directives (e.g. "myDir, btfCarousel") that contain expressions. If the value of the MinifyAngularBindingExpressions property is true, then the expressions in custom directives will be minified.
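
To show how these properties might be combined, here is a minimal sketch of a more conservative configuration than the defaults. The property names come from the table above; the enumeration type name WhitespaceMinificationMode is an assumption, since the table gives only the member names. The particular choice of values is illustrative, not a recommendation.

```csharp
using System;

using WebMarkupMin.Core;
using WebMarkupMin.Core.Loggers;

namespace WebMarkupMin.Sample.ConsoleApplication
{
	class Program
	{
		static void Main(string[] args)
		{
			// Arbitrary sample input used only for illustration.
			const string htmlInput = @"<ul>
	<li>First</li>
	<li>Second</li>
</ul>";

			var settings = new HtmlMinificationSettings
			{
				// Assumption: the enumeration type has the same name as the property.
				WhitespaceMinificationMode = WhitespaceMinificationMode.Safe,
				// Keep optional end tags such as </li> in the output.
				RemoveOptionalEndTags = false,
				// Leave embedded scripts and styles untouched.
				MinifyEmbeddedJsCode = false,
				MinifyEmbeddedCssCode = false
			};

			// Same four-argument constructor as in the advanced example above.
			var htmlMinifier = new HtmlMinifier(settings,
				new KristensenCssMinifier(), new CrockfordJsMinifier(),
				new NullLogger());

			MarkupMinificationResult result = htmlMinifier.Minify(htmlInput,
				generateStatistics: false);
			if (result.Errors.Count == 0)
			{
				Console.WriteLine(result.MinifiedContent);
			}
			else
			{
				Console.WriteLine("Found {0:N0} error(s).", result.Errors.Count);
			}
		}
	}
}
```

Properties that are not set in the initializer keep the default values listed in the table.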