Ensure HTML is read as UTF-8

Tested on Windows, where Python could decode HTML files using a Windows code page rather than UTF-8. Since Sphinx appears to generate HTML in UTF-8, it should be safe to enforce that encoding here.

Signed-off-by: Jonatan "jaw" Wallmander <jonatan@vovoid.com>
boschglobal · Oct 12, 2023 · 30fe129
1 parent 66fe695
commit 30fe129
Showing 2 changed files with 3 additions and 1 deletion.
diff --git a/NOTICE.md b/NOTICE.md
@@ -42,3 +42,5 @@ Please keep the list sorted.
   * Stefan Schulz - <stefan.schulz@itemis.com>
 * Stream HPC B.V.
   * Gergely Meszaros <gergely@streamhpc.com>
+* Klarälvdalens Datakonsult AB (KDAB)
+  * Jonatan Wallmander <jonatan.wallmander@kdab.com>
diff --git a/doxysphinx/html_parser.py b/doxysphinx/html_parser.py
@@ -484,7 +484,7 @@ def parse(self, file: Path) -> HtmlParseResult:
         :return: The result of the parsing
         :rtype: ParseResult
         """
-        buffer = file.read_text()
+        buffer = file.read_text(encoding="utf-8")
         tree = etree.document_fromstring(buffer).getroottree()
 
         meta_title, project, title = self._read_project_and_title(buffer, file)
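
The effect of the change can be illustrated with a minimal sketch. It is not part of the commit: `read_html` is a hypothetical helper standing in for the patched call, and `cp1252` is assumed here as a typical Windows code page. The file is written as UTF-8 bytes, then decoded once with the explicit encoding and once with the code page.

```python
import tempfile
from pathlib import Path


def read_html(file: Path) -> str:
    """Hypothetical helper mirroring the patched call: always decode as UTF-8."""
    return file.read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "index.html"
    # Write the page as UTF-8 bytes, as Sphinx appears to do for its output.
    page.write_bytes("<html><body><p>Klarälvdalens</p></body></html>".encode("utf-8"))

    # Explicit encoding: the result does not depend on the platform default.
    decoded_utf8 = read_html(page)

    # Assumed Windows code page: the two UTF-8 bytes of "ä" are decoded
    # as two separate characters, so the text is silently corrupted.
    decoded_cp1252 = page.read_text(encoding="cp1252")

print(decoded_utf8)
print(decoded_cp1252)
```

Calling `read_text()` with no `encoding` argument falls back to a platform-dependent default, which is why the same file could decode differently on Windows than on other systems.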