Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added MathJax rendering #4

Merged
merged 25 commits into from
Jul 23, 2024
Merged

Added MathJax rendering #4

merged 25 commits into from
Jul 23, 2024

Conversation

dat-boris
Copy link
Collaborator

@dat-boris dat-boris commented Aug 11, 2023

Summary:

With the recent need for Math rendering in Khan Academy, we need to pick up the MathJax
rendering again.

This added MathJax rendering to the conversation whenever we detect any
conversation that contains MathJax inline in the conversation.

Note we need to also consider the slot filling features - we do this by setting an attribute
(data-skip-node) which when met this will allow dynamic MathJax rendering to happen
within the node, without affecting the highlight of the elements.

Issue: https://khanacademy.atlassian.net/browse/DI-533

Test plan:

yarn test

yarn start and see the equation rendered.

Note that we also added the slots to demonstrate different granularity of the tags

image

With the highlight slot (unfortunately we cannot highlight the MathJax expression)

image

And also test out dynamic rendering by using our local label studio repo with multiiple entries.

E2E tests

Although the official E2E test is broken (it seems like it depends on some external resources on some audio test which times out) - We can manaully run those E2E tests manually to confirm that the rich-text component is not borken.

% npx codeceptjs run tests/rich-text/*.js                              
CodeceptJS v3.3.3 #StandWithUkraine
Using test root "/Users/borislau/khan/label-studio-frontend/e2e"

Richtext basic functional --
Warning: Timeout was set to 2400secs.
Global timeout should be specified in seconds.
  ✔ Creating, removing and restoring regions in 8394ms
  ✔ Rich text content consistency in 25356ms

  OK  | 2 passed   // 35s

Previously although RichTextTable display properly, the annotaiton
didn't work right.

This PR is a result of a bunch of learning on how the registry works,
and realiase a few small things that we did wrong and need to hook up
properly.  These will be documented in our documentation [1].

[1] https://khanacademy.atlassian.net/wiki/spaces/INFRA/pages/2373714068/Label+Studio+Frontend+guide

Issue: https://khanacademy.atlassian.net/browse/DI-533

Test plan:
- `yarn start` and goes to https://localhost:3000, and try out the
  annotation.
@dat-boris dat-boris self-assigned this Aug 11, 2023
@dat-boris dat-boris marked this pull request as ready for review August 11, 2023 23:34
src/tags/object/RichText/table.js Outdated Show resolved Hide resolved
src/tags/object/RichText/table.js Outdated Show resolved Hide resolved
src/tags/object/RichText/table.js Outdated Show resolved Hide resolved
@dat-boris
Copy link
Collaborator Author

Ah ha, love the linter output - I will fix that next week (the import is already fixed in #3)

This added MathJax rendering to the conversation whenever we detect any
conversation that contains MathJax inline in the conversation.

Note this will interact with highlighting annotation.  Instead of making
the annotation code much more complex, we just display a seperate
formula rendering below the conversation (see screenshot).  This seems
to be the easiest soluation to annotate and without losing conversation context.

Issue: https://khanacademy.atlassian.net/browse/DI-533

Test plan:
`yarn start` and experiment with annotating the expressions
Copy link
Collaborator Author

@dat-boris dat-boris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The overall code should be set, but highlighting some of the to fix here.

src/tags/object/RichText/table.js Outdated Show resolved Hide resolved
};

return (
<MathJaxContext config={mathJaxConfig}>
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs fixing, looks like I need to set the dynamic flag in the config

https://github.com/fast-reflexes/better-react-mathjax#dynamic-boolean--undefined

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dynamic flag seems to be for MathJax content changing. However, it does not seems to trigger a new component to typeset (i.e. render into Latex), which is what we want. I added a hacky way to drill a function into ComponentDidMount, which works good enough for now.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/fast-reflexes/better-react-mathjax#rendermode-pre--post--undefined
I was skimming the docs for the mathjax stuff since I'm totally unfamiliar with it, would setting the rendermode to post fix this? sounds like it causes the typeset to process on every render, which would be what we want here, right? I barely speak React, so maybe i'm way off here, or just not understanding the problem space.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ha picking this up after almost a year!

Thanks for the suggestions - I agree it sounds promising, but tried the renderMode="pre" together with combination of <MathJax text="...."/>, and it didnt work. I didn't dig too deeply since the ComponentDidMount solution is working okay for now.

Copy link

@jimmykodes jimmykodes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To the extent I understand react, this looks good. one Q in the comments, but approving so as not to block

};

return (
<MathJaxContext config={mathJaxConfig}>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/fast-reflexes/better-react-mathjax#rendermode-pre--post--undefined
I was skimming the docs for the mathjax stuff since I'm totally unfamiliar with it, would setting the rendermode to post fix this? sounds like it causes the typeset to process on every render, which would be what we want here, right? I barely speak React, so maybe i'm way off here, or just not understanding the problem space.

Copy link

@wwells wwells left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to Jimmy's note around react as a foreign language.

Have a question about the benefit - forgive my lack of understanding. What triggers the rendering? Is it automatic or does the labeling coordinator have to specify something in the schema that will give the option of rendering. My limited understanding is that it's automatic.

If that's correct, I'm a bit worried that while this will be necessary and useful for workstreams like better-math, since we aren't scraping out the rendered math from the transcript, I worry a bit about how this changes the UX for a user of something like bad-links...

@dat-boris
Copy link
Collaborator Author

dat-boris commented Aug 15, 2023

Turn out that this actually interfere with Math annotation unfortunately. So I'm going to keep this open and fix this in future ticket (https://khanacademy.atlassian.net/browse/DI-660).

I was skimming the docs for the mathjax stuff since I'm totally unfamiliar with it, would setting the rendermode to post fix this? sounds like it causes the typeset to process on every render, which would be what we want here, right? I barely speak React, so maybe i'm way off here, or just not understanding the problem space.

Good suggestion! I will try that.... the issue I feel is less with react, but more with the MathJax interaction which I probably need to dig into how better-react-mathjax does that.

If that's correct, I'm a bit worried that while this will be necessary and useful for workstreams like better-math, since we aren't scraping out the rendered math from the transcript, I worry a bit about how this changes the UX for a user of something like bad-links...

Good question! The rendering should only happen with has_math flag is detected, so it should only affect conversation that has math that uses this component. But good idea that maybe we should explicitly enable this in the component to avoid noise.

src/tags/object/RichText/table.js Show resolved Hide resolved
src/utils/selection-tools.js Outdated Show resolved Hide resolved
src/utils/selection-tools.js Outdated Show resolved Hide resolved
src/utils/selection-tools.js Outdated Show resolved Hide resolved
src/utils/selection-tools.js Outdated Show resolved Hide resolved
src/utils/selection-tools.js Outdated Show resolved Hide resolved
src/utils/selection-tools.js Outdated Show resolved Hide resolved
src/utils/selection-tools.js Outdated Show resolved Hide resolved
src/utils/selection-tools.js Outdated Show resolved Hide resolved
src/utils/selection-tools.js Outdated Show resolved Hide resolved
dat-boris added a commit to Khan/label-studio that referenced this pull request Jul 17, 2024
Create a new FE build to include new mathjax.

Khan/label-studio-frontend#4
dat-boris added a commit to Khan/label-studio that referenced this pull request Jul 17, 2024
Create a new FE build to include new mathjax.

Khan/label-studio-frontend#4
@dat-boris
Copy link
Collaborator Author

dat-boris commented Jul 22, 2024

Okay, this is now ready for review. Internally, this PR is now deployed on our staging server, which can be tested to confirm that it is rendering.

Note that the E2E test is broken on Github - that is due to some external depdnency on resource 404 that makes the E2E test hang and timeout. I'm manually run the E2E test (see test plan) and confrm it is working.

(cc @lizfaubell )

Copy link

@wwells wwells left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for working on these improvements! I'm definitely not qualified to review React/JS, but what I can grok seems reasonable.

Skimming some of the test projects, it seems the rendering integrity is hit or miss?

Copy link

@lizfaubell lizfaubell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!!! This makes math so much easier to read in threads.

// However we also want to keep the original \(expressions\), so we swap into
// a differnet marker instead.
// We use a different marker than what is used in Khanmigo.
const MAJX_MARKER = '$';

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: should this be MATHJX_MARKER

Comment on lines +75 to +77
if (i % 2 === 0) {
// Non math
return <span key={`eq=${i}`}>{convo}</span>;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I think I'm missing something - how come every other one is not math?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should explain this better in comment - parseConvoWithMath return a list of Math / non-math expression alternatively, which we wraps around here.

@dat-boris dat-boris merged commit ab88304 into master Jul 23, 2024
4 of 5 checks passed
khan-actions-bot pushed a commit to Khan/label-studio that referenced this pull request Jul 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants