Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MDX language #6339

Merged
merged 3 commits into from
May 30, 2023
Merged

Add MDX language #6339

merged 3 commits into from
May 30, 2023

Conversation

wooorm
Copy link
Contributor

@wooorm wooorm commented Mar 22, 2023

Description

MDX is a literate programming language that combines markdown with several JavaScript language features, such as JSX, expressions, and ESM.

Previously, there was some temporary duct-tape in place, to map .mdx extensions to markdown: GH-4416.
This improved the situation from nothing to something, but has issues, because the syntax that MDX brings wasn’t highlighted.

With this grammar, those can be properly highlighted.

Related-to: GH-4416.

Checklist:


P.S. the URL in the issue template doesn’t work with the new GH search:

https://github.com/search?utf8=%E2%9C%93&type=Code&ref=searchresults&q=extension%3AMDX+KEYWORDS+NOT+nothack

@wooorm wooorm requested a review from a team as a code owner March 22, 2023 12:08
@wooorm wooorm mentioned this pull request Mar 22, 2023
4 tasks
@lildude
Copy link
Member

lildude commented Mar 22, 2023

P.S. the URL in the issue template doesn’t work with the new GH search:

Yup. The new search still in beta and I'd rather keep things simple for now, especially as the new search does give a clue on how to correct things if you use the old syntax. The template will be changed when it becomes the default search for all users.

@wooorm
Copy link
Contributor Author

wooorm commented Mar 22, 2023

Good point! ID added.


Ah, it’s still in beta, I forgot that I toggled it on!


I have some further Qs btw, hoping someone can help me!

  1. The grammar in question uses \g<name> and later (?<name>...){0} patterns. That way, recursion can be supported: markdown has a couple cases where matching parens are fine, but unmatched ones aren’t. These are oniguruma extensions (https://github.com/kkos/oniguruma/blob/e35581d6568a47a981a57806f239f9de401c6511/doc/RE#LL460C14-L460C14). From what I can read on PCRE, they should also be supported here by GH: http://www.pcre.org/original/doc/html/pcrepattern.html#SEC25. I also find a couple of such uses in grammars that already exist here. But as I can’t easily test GHs internal highlighting, I wanted to make sure: are they supported?

  2. As MDX, the language in question here, is literate programming, github/markup can’t really render (it would need to evaluate JavaScript to do so). So it’s important that github/markup doesn’t think it’s markdown. Can someone confirm whether this is enough (I think so?), or whether that’s done based on type: markup or ace_mode: markdown or so?

  3. In the grammar repo, I also have a text.md grammar, for plain markdown. It’s a lot better than other grammars that I checked, including atom/language-gfm. My grammar has less bugs (take the sample in this PR for an example of lists and fenced code with tildes going haywire), supports more GFM features (tables, footnotes, plain URLs), and is maintained (atom/language-gfm is archived). I understand markdown is quite important to GitHub. Would it be of interest to swap out atom/language-gfm for this new repo? If there is potential interest, I can open a new discussion for this.

wooorm added 2 commits March 22, 2023 14:35
MDX is a literate programming language that combines markdown with
several JavaScript language features, such as JSX, expressions, and ESM.

*   markdown: <https://commonmark.org/>
*   JavaScript: <https://tc39.es/ecma262/>
*   JSX: <https://facebook.github.io/jsx/>

Previously, there was some temporary duct-tape in place, to map `.mdx`
extensions to markdown: github-linguistGH-4416.
This improved the situation from nothing to something, but has issues,
because the syntax that MDX brings wasn’t highlighted.

With this grammar, those can be properly highlighted.

Related-to: github-linguistGH-4416.
@lildude
Copy link
Member

lildude commented Mar 22, 2023

1. The grammar in question uses \g<name> and later (?<name>...){0} patterns. That way, recursion can be supported: markdown has a couple cases where matching parens are fine, but unmatched ones aren’t. These are oniguruma extensions (https://github.com/kkos/oniguruma/blob/e35581d6568a47a981a57806f239f9de401c6511/doc/RE#LL460C14-L460C14). From what I can read on PCRE, they should also be supported here by GH: http://www.pcre.org/original/doc/html/pcrepattern.html#SEC25. I also find a couple of such uses in grammars that already exist here. But as I can’t easily test GHs internal highlighting, I wanted to make sure: are they supported?

GitHub uses PCRE for performance reasons, so anything that PCRE supports, GitHub supports. The grammar compiler will flag anything that isn't PCRE compatible and any regex errors. You can see examples of this in #3924

2. As MDX, the language in question here, is literate programming, github/markup can’t really render (it would need to evaluate JavaScript to do so). So it’s important that github/markup doesn’t think it’s markdown. Can someone confirm whether this is enough (I think so?), or whether that’s done based on type: markup or ace_mode: markdown or so?

It won't. github/markup will only render whole markup files as HTML if they have one of these extensions. All ```mdx codeblocks within markdown files or GitHub comments, issues and PRs will be rendered as the raw code with syntax highlighting applied.

3. In the grammar repo, I also have a text.md grammar, for plain markdown. It’s a lot better than other grammars that I checked, including atom/language-gfm. My grammar has less bugs (take the sample in this PR for an example of lists and fenced code with tildes going haywire), supports more GFM features (tables, footnotes, plain URLs), and is maintained (atom/language-gfm is archived). I understand markdown is quite important to GitHub. Would it be of interest to swap out atom/language-gfm for this new repo? If there is potential interest, I can open a new discussion for this.

Most definitely! We're always on the lookout for actively maintained grammars, especially if they're better than anything else we're using. Feel free to skip the discussion and go straight to submitting a PR. Include examples where your grammar does things better than the current grammar too.

@wooorm
Copy link
Contributor Author

wooorm commented Mar 23, 2023

github/markup will only render whole markup files as HTML if they have one of these extensions

Hmm, that source seems incorrect: MDX files, since the introduction of .mdx here, also started being rendered. For example: https://github.com/mdx-js/mdx-analyzer/blob/main/fixtures/node16/mixed.mdx. That’s why I wanted to be extra sure!

Feel free to skip the discussion and go straight to submitting a PR. Include examples where your grammar does things better than the current grammar too.

Awesome, will do!
I thought about waiting for this PR to land, as there might be a conflict.
But I think you wait with merging PRs like this until the next release cycle?
So I’ll just go ahead and make a PR :)

@lildude
Copy link
Member

lildude commented May 17, 2023

Hmm, that source seems incorrect: MDX files, since the introduction of .mdx here, also started being rendered. For example: https://github.com/mdx-js/mdx-analyzer/blob/main/fixtures/node16/mixed.mdx.

I've looked into this and it this would be our fault when we merged #4416 😁. Markup uses the language Linguist tells it to and if it matches one of the expected "markups", it'll attempt to render it using the appropriate rendered. Of course, this means when this PR is merged that rendering will disappear.

Copy link
Member

@lildude lildude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks.

Note: this PR will not be merged until close to when the next release is made. See here for more details.

@lildude lildude requested a review from a team as a code owner May 30, 2023 09:16
@lildude lildude added this pull request to the merge queue May 30, 2023
Merged via the queue into github-linguist:master with commit 2c4898a May 30, 2023
@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants