Commit 06bf029

compulim and OEvgeny authored
Add HTML content transformer middleware (#5338)
* Add HTML content transformer
* Add entry
* Fix alt text
* Apply to fenced code blocks only
* Add breaking changes
* Update entry
* Add PR
* Fix tests
* Fix test
* Add xmlns and remove HTML content provider related attributes

---------

Co-authored-by: Eugene <EOlonov@gmail.com>
1 parent e5145f3 commit 06bf029

25 files changed, with 378 additions and 207 deletions

CHANGELOG.md

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -30,6 +30,8 @@ Notes: web developers are advised to use [`~` (tilde range)](https://github.com/
 - `styleOptions.bubbleMaxWidth`/`bubbleMinWidth` is being deprecated in favor of `styleOptions.bubbleAttachmentMaxWidth`/`bubbleAttachmentMinWidth` and `styleOptions.bubbleMessageMaxWidth`/`bubbleMessageMinWidth`. The option will be removed on or after 2026-10-08
 - Moved to `micromark` for rendering Markdown, instead of `markdown-it`
 - Please refer to PR [#5330](https://github.com/microsoft/BotFramework-WebChat/pull/5330) for details
+- HTML sanitizer is moved from `renderMarkdown` to HTML content transformer middleware, please refer to PR [#5338](https://github.com/microsoft/BotFramework-WebChat/pull/5338)
+- If you customized `renderMarkdown` with a custom HTML sanitizer, please move the HTML sanitizer to the new HTML content transformer middleware

 ### Added

@@ -64,6 +66,10 @@ Notes: web developers are advised to use [`~` (tilde range)](https://github.com/
 - Added code viewer dialog with syntax highlighting, in PR [#5335](https://github.com/microsoft/BotFramework-WebChat/pull/5335), by [@OEvgeny](https://github.com/OEvgeny)
 - Added copy button to code blocks, in PR [#5334](https://github.com/microsoft/BotFramework-WebChat/pull/5334), by [@compulim](https://github.com/compulim)
 - Added copy button to view code dialog, in PR [#5336](https://github.com/microsoft/BotFramework-WebChat/pull/5336), by [@compulim](https://github.com/compulim)
+- Added HTML content transformer middleware, in PR [#5338](https://github.com/microsoft/BotFramework-WebChat/pull/5338), by [@compulim](https://github.com/compulim)
+- HTML content transformer is used by `useRenderMarkdown` to transform the result from `renderMarkdown`
+- HTML sanitizer is moved from `renderMarkdown` into HTML content transformer for better coverage
+- Copy button is added to fenced code blocks (`<pre><code>`)

 ### Changed
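
The breaking-change note in this changelog asks integrators to move a custom HTML sanitizer out of `renderMarkdown` and into the new middleware. The following is a minimal sketch of that move, not the Web Chat API itself. It assumes a request reduced to an `html` string (the real request in this commit carries a `documentFragment`), and `myCustomSanitizer`, `myRenderMarkdown`, and the `Html*` types are hypothetical. Only the `htmlContentTransformMiddleware` and `renderMarkdown` prop names come from this commit; how Web Chat combines a custom middleware with its built-in ones is not shown in this diff.

```typescript
// Assumed minimal shapes; the real request carries a DocumentFragment and more fields.
type HtmlRequest = Readonly<{ html: string }>;
type HtmlHandler = (request: HtmlRequest) => string;
type HtmlMiddleware = () => (next: HtmlHandler) => HtmlHandler;

// Hypothetical sanitizer that previously ran at the end of a custom `renderMarkdown`.
// Toy rule only: drops inline event handler attributes. Not a real sanitizer.
function myCustomSanitizer(html: string): string {
  return html.replace(/ on\w+="[^"]*"/g, '');
}

// After the move, `renderMarkdown` only produces HTML and no longer sanitizes.
function myRenderMarkdown(markdown: string): string {
  return `<p>${markdown}</p>`;
}

// The sanitizer now lives in a middleware. Like the sanitize middleware in this
// commit, it is terminal: it returns a result and does not call `next`.
const createMySanitizeMiddleware = (): HtmlMiddleware => () => () => request =>
  myCustomSanitizer(request.html);

const mySanitizeMiddleware = createMySanitizeMiddleware();

// Props as they might be passed to Web Chat (prop names taken from this commit).
const webChatProps = {
  htmlContentTransformMiddleware: [mySanitizeMiddleware],
  renderMarkdown: myRenderMarkdown
};

// Simulate the pipeline: render first, then run the middleware over the result.
const rendered = webChatProps.renderMarkdown('<b onclick="x()">hi</b>');
const sanitized = mySanitizeMiddleware()(request => request.html)({ html: rendered });
// sanitized is '<p><b>hi</b></p>'
```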

__tests__/html2/markdown/codeBlockCopyButton/adaptiveCards/behavior.html

Lines changed: 2 additions & 3 deletions
@@ -75,11 +75,10 @@

 Ea sint elit anim enim voluptate aliquip aliqua nulla veniam.

-<pre>
-Ea et pariatur sint Lorem ex veniam adipisicing.
+<pre><code>Ea et pariatur sint Lorem ex veniam adipisicing.

 Aliqua magna aliquip nisi quis.
-</pre>
+</code></pre>

 Cupidatat nulla duis dolor nulla ut pariatur minim incididunt quis adipisicing velit id Lorem.`,
 wrap: true

__tests__/html2/markdown/codeBlockCopyButton/adaptiveCards/layout.html

Lines changed: 2 additions & 3 deletions
@@ -56,11 +56,10 @@

 Ea sint elit anim enim voluptate aliquip aliqua nulla veniam.

-<pre>
-Ea et pariatur sint Lorem ex veniam adipisicing.
+<pre><code>Ea et pariatur sint Lorem ex veniam adipisicing.

 Aliqua magna aliquip nisi quis.
-</pre>
+</code></pre>

 Cupidatat nulla duis dolor nulla ut pariatur minim incididunt quis adipisicing velit id Lorem.`,
 wrap: true

__tests__/html2/markdown/codeBlockCopyButton/behavior.denied.html

Lines changed: 2 additions & 3 deletions
@@ -61,11 +61,10 @@

 Ea sint elit anim enim voluptate aliquip aliqua nulla veniam.

-<pre>
-Ea et pariatur sint Lorem ex veniam adipisicing.
+<pre><code>Ea et pariatur sint Lorem ex veniam adipisicing.

 Aliqua magna aliquip nisi quis.
-</pre>
+</code></pre>

 Cupidatat nulla duis dolor nulla ut pariatur minim incididunt quis adipisicing velit id Lorem.`,
 type: 'message'

__tests__/html2/markdown/codeBlockCopyButton/behavior.html

Lines changed: 2 additions & 3 deletions
@@ -61,11 +61,10 @@

 Ea sint elit anim enim voluptate aliquip aliqua nulla veniam.

-<pre>
-Ea et pariatur sint Lorem ex veniam adipisicing.
+<pre><code>Ea et pariatur sint Lorem ex veniam adipisicing.

 Aliqua magna aliquip nisi quis.
-</pre>
+</code></pre>

 Cupidatat nulla duis dolor nulla ut pariatur minim incididunt quis adipisicing velit id Lorem.`,
 type: 'message'

__tests__/html2/markdown/codeBlockCopyButton/behavior.stringsChange.html

Lines changed: 2 additions & 3 deletions
@@ -36,11 +36,10 @@

 Ea sint elit anim enim voluptate aliquip aliqua nulla veniam.

-<pre>
-Ea et pariatur sint Lorem ex veniam adipisicing.
+<pre><code>Ea et pariatur sint Lorem ex veniam adipisicing.

 Aliqua magna aliquip nisi quis.
-</pre>
+</code></pre>

 Cupidatat nulla duis dolor nulla ut pariatur minim incididunt quis adipisicing velit id Lorem.`,
 type: 'message'

__tests__/html2/markdown/codeBlockCopyButton/fluent/layout.html

Lines changed: 2 additions & 3 deletions
@@ -54,11 +54,10 @@

 Ea sint elit anim enim voluptate aliquip aliqua nulla veniam.

-<pre>
-Ea et pariatur sint Lorem ex veniam adipisicing.
+<pre><code>Ea et pariatur sint Lorem ex veniam adipisicing.

 Aliqua magna aliquip nisi quis.
-</pre>
+</code></pre>

 Cupidatat nulla duis dolor nulla ut pariatur minim incididunt quis adipisicing velit id Lorem.`,
 type: 'message'

__tests__/html2/markdown/codeBlockCopyButton/layout.html

Lines changed: 2 additions & 3 deletions
@@ -42,11 +42,10 @@

 Ea sint elit anim enim voluptate aliquip aliqua nulla veniam.

-<pre>
-Ea et pariatur sint Lorem ex veniam adipisicing.
+<pre><code>Ea et pariatur sint Lorem ex veniam adipisicing.

 Aliqua magna aliquip nisi quis.
-</pre>
+</code></pre>

 Cupidatat nulla duis dolor nulla ut pariatur minim incididunt quis adipisicing velit id Lorem.`,
 type: 'message'

packages/bundle/src/AddFullBundle.tsx

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,7 @@ import {
 type AttachmentMiddleware,
 type StyleOptions
 } from 'botframework-webchat-api';
+import { type HTMLContentTransformMiddleware } from 'botframework-webchat-component';
 import { singleToArray, warnOnce, type OneOrMany } from 'botframework-webchat-core';
 import React, { type ReactNode } from 'react';

@@ -18,6 +19,7 @@ type AddFullBundleProps = Readonly<{
 attachmentForScreenReaderMiddleware?: OneOrMany<AttachmentForScreenReaderMiddleware>;
 attachmentMiddleware?: OneOrMany<AttachmentMiddleware>;
 children: ({ extraStyleSet }: { extraStyleSet: any }) => ReactNode;
+htmlContentTransformMiddleware?: HTMLContentTransformMiddleware[];
 renderMarkdown?: (
 markdown: string,
 newLineOptions: { markdownRespectCRLF: boolean },
@@ -41,6 +43,7 @@ const AddFullBundle = ({
 attachmentForScreenReaderMiddleware,
 attachmentMiddleware,
 children,
+htmlContentTransformMiddleware,
 renderMarkdown,
 styleOptions,
 styleSet
@@ -50,6 +53,7 @@ const AddFullBundle = ({
 const patchedProps = useComposerProps({
 attachmentForScreenReaderMiddleware: singleToArray(attachmentForScreenReaderMiddleware),
 attachmentMiddleware: singleToArray(attachmentMiddleware),
+htmlContentTransformMiddleware,
 renderMarkdown,
 styleOptions,
 styleSet

packages/bundle/src/__tests__/renderMarkdown.spec.js

Lines changed: 12 additions & 14 deletions
@@ -23,23 +23,23 @@ describe('renderMarkdown', () => {
 const styleOptions = { markdownRespectCRLF: true };

 expect(renderMarkdown('Same line.\nSame line. \n2nd line.', styleOptions, renderMarkdownOptions)).toBe(
-'<p>Same line.\nSame line.<br />\n2nd line.</p>'
+'<p xmlns="http://www.w3.org/1999/xhtml">Same line.\nSame line.<br />\n2nd line.</p>'
 );
 });

 it('should respect CRLF', () => {
 const styleOptions = { markdownRespectCRLF: true };

 expect(renderMarkdown('Same Line.\n\rSame Line.\r\n2nd line.', styleOptions, renderMarkdownOptions)).toBe(
-'<p>Same Line.\nSame Line.</p>\n<p>2nd line.</p>'
+'<p xmlns="http://www.w3.org/1999/xhtml">Same Line.\nSame Line.</p>\n<p xmlns="http://www.w3.org/1999/xhtml">2nd line.</p>'
 );
 });

 it('should respect LFCR', () => {
 const styleOptions = { markdownRespectCRLF: false };

 expect(renderMarkdown('Same Line.\r\nSame Line.\n\r2nd line.', styleOptions, renderMarkdownOptions)).toBe(
-'<p>Same Line.\nSame Line.</p>\n<p>2nd line.</p>'
+'<p xmlns="http://www.w3.org/1999/xhtml">Same Line.\nSame Line.</p>\n<p xmlns="http://www.w3.org/1999/xhtml">2nd line.</p>'
 );
 });

@@ -48,7 +48,9 @@ describe('renderMarkdown', () => {

 expect(
 renderMarkdown('**Message with Markdown**\r\nShould see bold text.', styleOptions, renderMarkdownOptions)
-).toBe('<p><strong>Message with Markdown</strong></p>\n<p>Should see bold text.</p>');
+).toBe(
+'<p xmlns="http://www.w3.org/1999/xhtml"><strong>Message with Markdown</strong></p>\n<p xmlns="http://www.w3.org/1999/xhtml">Should see bold text.</p>'
+);
 });

 it('should render code correctly', () => {
@@ -60,11 +62,7 @@ describe('renderMarkdown', () => {
 styleOptions,
 renderMarkdownOptions
 )
-)
-.toBe(`<pre class="webchat__render-markdown__code-block"><webchat--code-block-copy-button class="webchat__code-block-copy-button" data-alt-copied="Copied" data-alt-copy="Copy" data-value="{
-&quot;hello&quot;: &quot;World!&quot;
-}
-"></webchat--code-block-copy-button><code>{
+).toBe(`<pre xmlns="http://www.w3.org/1999/xhtml"><code>{
 "hello": "World!"
 }
 </code></pre>`);
@@ -74,7 +72,7 @@ describe('renderMarkdown', () => {
 const styleOptions = { markdownRespectCRLF: true };

 expect(renderMarkdown('[example](https://sample.com)', styleOptions, renderMarkdownOptions)).toBe(
-`<p>\u200B<a href="https://sample.com" aria-label="example " rel="noopener noreferrer" target="_blank">example<img src="" alt="" class="webchat__render-markdown__external-link-icon" /></a>\u200B</p>`
+`<p xmlns="http://www.w3.org/1999/xhtml">\u200B<a href="https://sample.com" aria-label="example " rel="noopener noreferrer" target="_blank">example<img src="" alt="" class="webchat__render-markdown__external-link-icon" /></a>\u200B</p>`
 );
 });

@@ -83,31 +81,31 @@ describe('renderMarkdown', () => {
 const options = { externalLinkAlt: 'Opens in a new window, external.' };

 expect(renderMarkdown('[example](https://sample.com)', styleOptions, options)).toBe(
-`<p>\u200B<a href="https://sample.com" aria-label="example Opens in a new window, external." rel="noopener noreferrer" target="_blank">example<img src="" alt="" class="webchat__render-markdown__external-link-icon" title="Opens in a new window, external." /></a>\u200B</p>`
+`<p xmlns="http://www.w3.org/1999/xhtml">\u200B<a href="https://sample.com" aria-label="example Opens in a new window, external." rel="noopener noreferrer" target="_blank">example<img src="" alt="" class="webchat__render-markdown__external-link-icon" title="Opens in a new window, external." /></a>\u200B</p>`
 );
 });

 it('should render sip protocol links correctly', () => {
 const styleOptions = { markdownRespectCRLF: true };

 expect(renderMarkdown(`[example@test.com](sip:example@test.com)`, styleOptions, renderMarkdownOptions)).toBe(
-'<p>\u200B<a href="sip:example@test.com" rel="noopener noreferrer" target="_blank">example@test.com</a>\u200B</p>'
+'<p xmlns="http://www.w3.org/1999/xhtml">\u200B<a href="sip:example@test.com" rel="noopener noreferrer" target="_blank">example@test.com</a>\u200B</p>'
 );
 });

 it('should render tel protocol links correctly', () => {
 const styleOptions = { markdownRespectCRLF: true };

 expect(renderMarkdown(`[(505)503-4455](tel:505-503-4455)`, styleOptions, renderMarkdownOptions)).toBe(
-'<p>\u200B<a href="tel:505-503-4455" rel="noopener noreferrer" target="_blank">(505)503-4455</a>\u200B</p>'
+'<p xmlns="http://www.w3.org/1999/xhtml">\u200B<a href="tel:505-503-4455" rel="noopener noreferrer" target="_blank">(505)503-4455</a>\u200B</p>'
 );
 });

 it('should render strikethrough text correctly', () => {
 const styleOptions = { markdownRespectCRLF: true };

 expect(renderMarkdown(`~~strike text~~`, styleOptions, renderMarkdownOptions)).toBe(
-'<p><del>strike text</del></p>'
+'<p xmlns="http://www.w3.org/1999/xhtml"><del>strike text</del></p>'
 );
 });
 });
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+import { type HTMLContentTransformMiddleware } from 'botframework-webchat-component';
+
+import createCodeBlockCopyButtonMiddleware from './middleware/createCodeBlockCopyButtonMiddleware';
+import createSanitizeMiddleware from './middleware/createSanitizeMiddleware';
+
+export default function createHTMLContentTransformMiddleware(): readonly HTMLContentTransformMiddleware[] {
+  return Object.freeze([createCodeBlockCopyButtonMiddleware(), createSanitizeMiddleware()]);
+}
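
The factory above returns an ordered array of middleware: the copy button middleware first, then the sanitizer. The sketch below shows one common way such an array can be folded into a single handler so that the first entry runs first. It is an assumption about the composition mechanics, not Web Chat's actual code. The request is reduced to an `html` string so the example runs without a DOM, and `composeMiddleware`, both middleware factories, and the `Transform*` types are hypothetical names.

```typescript
// Stand-in for the real request: a plain string replaces the DocumentFragment.
type TransformRequest = Readonly<{ html: string }>;
type TransformHandler = (request: TransformRequest) => string;
type TransformMiddleware = () => (next: TransformHandler) => TransformHandler;

// Hypothetical pass-through middleware: builds a new frozen request, then delegates to `next`.
const createTagCodeBlocksMiddleware = (): TransformMiddleware => () => next => request =>
  next(Object.freeze({ ...request, html: request.html.replace(/<pre>/g, '<pre class="code-block">') }));

// Hypothetical terminal middleware: produces the result and never calls `next`.
// Toy rule only, not a real sanitizer.
const createStripScriptMiddleware = (): TransformMiddleware => () => () => request =>
  request.html.replace(/<script[\s\S]*?<\/script>/gi, '');

// Folds the array from right to left so that the first entry becomes the outermost handler.
function composeMiddleware(
  middleware: readonly TransformMiddleware[],
  fallback: TransformHandler
): TransformHandler {
  return middleware.reduceRight<TransformHandler>((next, create) => create()(next), fallback);
}

const transform = composeMiddleware(
  Object.freeze([createTagCodeBlocksMiddleware(), createStripScriptMiddleware()]),
  request => request.html
);

const transformed = transform(Object.freeze({ html: '<pre>1</pre><script>alert(1)</script>' }));
// transformed is '<pre class="code-block">1</pre>'
```

Because the last middleware is terminal, anything placed after it in the array would never run, which is one reason the order of the array matters.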
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+import { type HTMLContentTransformMiddleware } from 'botframework-webchat-component';
+
+import codeBlockCopyButtonDocumentMod from '../private/codeBlockCopyButtonDocumentMod';
+
+export default function createCodeBlockCopyButtonMiddleware(): HTMLContentTransformMiddleware {
+  return () => next => request =>
+    next(
+      Object.freeze({
+        ...request,
+        documentFragment: codeBlockCopyButtonDocumentMod(request.documentFragment, {
+          codeBlockCopyButtonAltCopied: request.codeBlockCopyButtonAltCopied,
+          codeBlockCopyButtonAltCopy: request.codeBlockCopyButtonAltCopy,
+          codeBlockCopyButtonClassName: request.codeBlockCopyButtonClassName,
+          codeBlockCopyButtonTagName: request.codeBlockCopyButtonTagName
+        })
+      })
+    );
+}
Lines changed: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
+import {
+  parseDocumentFragmentFromString,
+  serializeDocumentFragmentIntoString
+} from 'botframework-webchat-component/internal';
+import sanitizeHTML from 'sanitize-html';
+
+const BASE_SANITIZE_HTML_OPTIONS = Object.freeze({
+  allowedAttributes: {
+    a: ['aria-label', 'class', 'href', 'name', 'rel', 'target'],
+    button: ['aria-label', 'class', 'type', 'value'],
+    img: ['alt', 'aria-label', 'class', 'src', 'title'],
+    pre: ['class'],
+    span: ['aria-label']
+  },
+  allowedSchemes: ['data', 'http', 'https', 'ftp', 'mailto', 'sip', 'tel'],
+  allowedTags: [
+    'a',
+    'b',
+    'blockquote',
+    'br',
+    'button',
+    'caption',
+    'code',
+    'del',
+    'div',
+    'em',
+    'h1',
+    'h2',
+    'h3',
+    'h4',
+    'h5',
+    'h6',
+    'hr',
+    'i',
+    'img',
+    'ins',
+    'li',
+    'nl',
+    'ol',
+    'p',
+    'pre',
+    's',
+    'span',
+    'strike',
+    'strong',
+    'table',
+    'tbody',
+    'td',
+    'tfoot',
+    'th',
+    'thead',
+    'tr',
+    'ul',
+
+    // Followings are for MathML elements, from https://developer.mozilla.org/en-US/docs/Web/MathML.
+    'annotation-xml',
+    'annotation',
+    'math',
+    'merror',
+    'mfrac',
+    'mi',
+    'mmultiscripts',
+    'mn',
+    'mo',
+    'mover',
+    'mpadded',
+    'mphantom',
+    'mprescripts',
+    'mroot',
+    'mrow',
+    'ms',
+    'mspace',
+    'msqrt',
+    'mstyle',
+    'msub',
+    'msubsup',
+    'msup',
+    'mtable',
+    'mtd',
+    'mtext',
+    'mtr',
+    'munder',
+    'munderover',
+    'semantics'
+  ],
+  // Bug of https://github.com/apostrophecms/sanitize-html/issues/633.
+  // They should not remove `alt=""` even though it is empty.
+  nonBooleanAttributes: []
+});
+
+export default function createSanitizeMiddleware() {
+  return () => () => request => {
+    const { codeBlockCopyButtonTagName, documentFragment } = request;
+    const sanitizeHTMLOptions = {
+      ...BASE_SANITIZE_HTML_OPTIONS,
+      allowedAttributes: {
+        ...BASE_SANITIZE_HTML_OPTIONS.allowedAttributes,
+        [codeBlockCopyButtonTagName]: ['class', 'data-alt-copy', 'data-alt-copied', 'data-testid', 'data-value']
+      },
+      allowedTags: [...BASE_SANITIZE_HTML_OPTIONS.allowedTags, codeBlockCopyButtonTagName]
+    };
+
+    const htmlAfterBetterLink = serializeDocumentFragmentIntoString(documentFragment);
+
+    const htmlAfterSanitization = sanitizeHTML(htmlAfterBetterLink, sanitizeHTMLOptions);
+
+    return parseDocumentFragmentFromString(htmlAfterSanitization);
+  };
+}
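
`createSanitizeMiddleware` builds its per-request options by spreading the frozen base options and adding the copy button tag name taken from the request. The following is a reduced sketch of that merge, runnable without `sanitize-html`. `BASE_OPTIONS` is a shortened stand-in for `BASE_SANITIZE_HTML_OPTIONS`, and `SanitizeOptions` and `buildSanitizeOptions` are names invented for illustration; the tag name `webchat--code-block-copy-button` is taken from the test expectation removed in this commit.

```typescript
type SanitizeOptions = Readonly<{
  allowedAttributes: Readonly<Record<string, readonly string[]>>;
  allowedTags: readonly string[];
}>;

// Shortened stand-in for BASE_SANITIZE_HTML_OPTIONS; the list in the commit is much longer.
const BASE_OPTIONS: SanitizeOptions = Object.freeze({
  allowedAttributes: { pre: ['class'], span: ['aria-label'] },
  allowedTags: ['code', 'pre', 'span']
});

// Returns a new options object that also allows the custom element.
// The base is never mutated: every nested container that changes is copied first.
function buildSanitizeOptions(copyButtonTagName: string): SanitizeOptions {
  return {
    ...BASE_OPTIONS,
    allowedAttributes: {
      ...BASE_OPTIONS.allowedAttributes,
      [copyButtonTagName]: ['class', 'data-alt-copy', 'data-alt-copied', 'data-testid', 'data-value']
    },
    allowedTags: [...BASE_OPTIONS.allowedTags, copyButtonTagName]
  };
}

const mergedOptions = buildSanitizeOptions('webchat--code-block-copy-button');
```

`Object.freeze` is shallow, so the nested `allowedAttributes` object and `allowedTags` array are not themselves frozen. Copying them with spread syntax, rather than pushing into them, is what keeps the shared base unchanged across requests.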
