===> identify the different characters (e.g., a start-tag character, an attribute, and so on) ===> tokenizer ===> tokens (DOCTYPE, start tag, end tag, comment, and so on) ===> DOM (a tree structure that captures the content and properties of the HTML and all the relationships between the nodes)
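
A rough sketch of the characters ===> tokens ===> tree flow, using Python's standard `html.parser` module as a stand-in for a browser tokenizer. The `Node` and `TreeBuilder` classes and the `parse` helper are made-up names for illustration only. The sketch assumes well-nested input with explicit end tags, so it does not handle void elements such as `<br>` or the error recovery a real browser performs.

```python
from html.parser import HTMLParser


class Node:
    """Hypothetical, simplified tree node (not the browser's real DOM Node)."""

    def __init__(self, name, attrs=None, parent=None, text=None):
        self.name = name
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Records each token it sees and grows a tree from those tokens."""

    def __init__(self):
        super().__init__()
        self.tokens = []                 # (kind, value) pairs, in source order
        self.root = Node("#document")
        self.current = self.root         # node that receives new children

    def _append(self, node):
        node.parent = self.current
        self.current.children.append(node)
        return node

    def handle_decl(self, decl):
        self.tokens.append(("DOCTYPE", decl))

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("start tag", tag))
        # A start tag opens a new element and makes it the current node.
        self.current = self._append(Node(tag, attrs))

    def handle_endtag(self, tag):
        self.tokens.append(("end tag", tag))
        # Assumes well-nested markup: an end tag closes the current node.
        if self.current.parent is not None:
            self.current = self.current.parent

    def handle_comment(self, data):
        self.tokens.append(("comment", data))
        self._append(Node("#comment", text=data))

    def handle_data(self, data):
        text = data.strip()
        if text:                         # skip whitespace-only runs
            self.tokens.append(("character", text))
            self._append(Node("#text", text=text))


def parse(html):
    """Feed a markup string through the tokenizer and return the builder."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder
```

Calling `parse('<p class="x">Hi</p>')` returns a builder whose `tokens` list holds the start tag, character, and end tag tokens in order, and whose `root` has a single `p` child carrying the `class` attribute and one `#text` child.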