URL punycode differs from nodejs / chrome behaviour #1223

Open
amardeep opened this issue Mar 17, 2023 · 6 comments

Comments

@amardeep

Consider the following URL, whose hostname contains non-ASCII characters: https://𝚍𝚒𝚜𝚌𝚘𝚛𝚍.gg

When parsing this URL for its hostname, both Node.js and Chrome return the ASCII string discord.gg, but core-js returns xn--ci2hbbs5ase.gg.

Here is the code:

import configurator from 'core-js-pure/configurator.js';

configurator({
    // By default polyfills are not used if they are available natively.
    usePolyfill: ['URL'], // Override that behaviour for URL.
});

import URL from 'core-js-pure/web/url.js'; // For URL

const url = new URL('https://𝚍𝚒𝚜𝚌𝚘𝚛𝚍.gg');
console.log(url.hostname); // core-js: 'xn--ci2hbbs5ase.gg'; Node.js and Chrome: 'discord.gg'
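
To illustrate why the native parsers end up with discord.gg: the URL Standard's domain-to-ASCII step applies the UTS #46 mapping before any Punycode encoding, and that mapping folds compatibility characters such as the mathematical monospace letters above into plain ASCII. The sketch below only approximates that mapping with `String.prototype.normalize('NFKC')` plus lowercasing. That approximation is an assumption made for illustration, not the full UTS #46 table, and `mapHostname` and `isAscii` are hypothetical helper names.

```javascript
// Rough stand-in for the UTS #46 mapping step (assumption: NFKC + lowercase).
// The real mapping table has additional rules that this does not cover.
function mapHostname(hostname) {
  return hostname.normalize('NFKC').toLowerCase();
}

// True when every code unit is in the ASCII range, so no Punycode is needed.
function isAscii(text) {
  return /^[\x00-\x7F]*$/.test(text);
}

const mapped = mapHostname('𝚍𝚒𝚜𝚌𝚘𝚛𝚍.gg');
console.log(mapped, isAscii(mapped));
```

If the mapped hostname is already pure ASCII, as it is here, there is nothing left to Punycode-encode, which would explain the difference from the xn-- form.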
@zloirock
Owner

zloirock commented Mar 17, 2023

Yes, I can confirm it. The core-js URL punycode logic is not perfect (and I'm not sure that a completely acceptable fix for it is possible).

I can only work on this issue in a few days, so if someone wants to work on it before then, feel free.

@tasawar-hussain

@zloirock It looks interesting. I can start looking into it if you haven't already.

@zloirock
Owner

@tasawar-hussain 👍

@ehoogeveen-medweb

I don't know if it would be useful (as it is written in C++), but Node.js recently switched to ada for URL parsing, which uses idna for converting between Unicode and ASCII.

Maybe some inspiration could be taken from their utf32_to_punycode implementation, which seems relatively short and free of dependencies (though obviously JS doesn't start from UTF-32).
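
For comparison, here is a minimal JavaScript sketch of the Punycode encoding step itself, written from the pseudocode in RFC 3492 rather than ported from ada or core-js. `punycodeEncode`, `adapt`, and `digitToChar` are names chosen for this example. It iterates by code point via `Array.from`, which sidesteps the UTF-16 versus UTF-32 difference, and it omits the overflow checks the RFC asks for.

```javascript
// Punycode parameters from RFC 3492, section 5.
const BASE = 36, T_MIN = 1, T_MAX = 26, SKEW = 38, DAMP = 700;
const INITIAL_BIAS = 72, INITIAL_N = 128;

// Map a digit value 0..35 to 'a'..'z' (0..25) or '0'..'9' (26..35).
const digitToChar = (d) => String.fromCharCode(d < 26 ? 97 + d : 22 + d);

// Bias adaptation, RFC 3492 section 6.1.
function adapt(delta, numPoints, firstTime) {
  delta = Math.floor(delta / (firstTime ? DAMP : 2));
  delta += Math.floor(delta / numPoints);
  let k = 0;
  while (delta > Math.floor(((BASE - T_MIN) * T_MAX) / 2)) {
    delta = Math.floor(delta / (BASE - T_MIN));
    k += BASE;
  }
  return k + Math.floor(((BASE - T_MIN + 1) * delta) / (delta + SKEW));
}

// Encode one label (without the "xn--" prefix). No overflow handling.
function punycodeEncode(label) {
  const codePoints = Array.from(label, (ch) => ch.codePointAt(0));
  const basic = codePoints.filter((cp) => cp < 0x80);
  let output = String.fromCharCode(...basic);
  let handled = basic.length;
  if (handled > 0) output += '-';

  let n = INITIAL_N, delta = 0, bias = INITIAL_BIAS;
  while (handled < codePoints.length) {
    // Smallest not-yet-handled code point.
    const m = Math.min(...codePoints.filter((cp) => cp >= n));
    delta += (m - n) * (handled + 1);
    n = m;
    for (const cp of codePoints) {
      if (cp < n) delta++;
      if (cp === n) {
        // Emit delta as a generalized variable-length integer.
        let q = delta;
        for (let k = BASE; ; k += BASE) {
          const t = k <= bias ? T_MIN : k >= bias + T_MAX ? T_MAX : k - bias;
          if (q < t) break;
          output += digitToChar(t + ((q - t) % (BASE - t)));
          q = Math.floor((q - t) / (BASE - t));
        }
        output += digitToChar(q);
        bias = adapt(delta, handled + 1, handled === basic.length);
        delta = 0;
        handled++;
      }
    }
    delta++;
    n++;
  }
  return output;
}

console.log(punycodeEncode('𝚍𝚒𝚜𝚌𝚘𝚛𝚍')); // 'ci2hbbs5ase', the label reported above
```

Encoding the unmapped label reproduces the ci2hbbs5ase output from the report, which suggests the encoding step is not where the results diverge; the difference lies in whether the hostname is mapped to ASCII before this step runs.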

@iTsingchen

I'm trying to use pdf.js on an older version of Chrome. It contains a piece of code that determines whether the worker src has the same origin as the base URL. When a blob URL is used as the worker src, the check returns false. Here is an example:

https://github.com/mozilla/pdf.js/blob/63371eaed8326f1ba4d4cdf6a1360a9333bd0bcf/src/display/api.js#L2029-L2041

      this._isSameOrigin = (baseUrl, otherUrl) => {
        let base;
        try {
          base = new URL(baseUrl);
          if (!base.origin || base.origin === "null") {
            return false; // non-HTTP url
          }
        } catch {
          return false;
        }
        const other = new URL(otherUrl, base);
        return base.origin === other.origin;
      };

https://stackblitz.com/edit/vitejs-vite-kekzvf?embed=1&file=url.js
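
For reference, the URL Standard derives the origin of a blob: URL from the URL embedded in its path when that inner URL is http or https. The sketch below spells that rule out as a fallback for an implementation that reports "null" for blob URLs. `effectiveOrigin` and `isSameOrigin` are hypothetical helpers written for this example, not pdf.js or core-js APIs, and the sketch assumes that `pathname` of a blob URL returns the inner URL string.

```javascript
// Compute an origin, unwrapping blob: URLs by parsing the URL in their path.
function effectiveOrigin(url) {
  if (url.protocol !== 'blob:') return url.origin;
  try {
    const inner = new URL(url.pathname);
    if (inner.protocol === 'http:' || inner.protocol === 'https:') {
      return inner.origin;
    }
  } catch {
    // The path is not a parseable URL; fall through to an opaque origin.
  }
  return 'null';
}

// Same-origin check in the spirit of the pdf.js snippet above.
function isSameOrigin(baseUrl, otherUrl) {
  try {
    const base = new URL(baseUrl);
    const baseOrigin = effectiveOrigin(base);
    if (!baseOrigin || baseOrigin === 'null') return false; // non-HTTP URL
    return baseOrigin === effectiveOrigin(new URL(otherUrl, base));
  } catch {
    return false;
  }
}

console.log(isSameOrigin('https://example.com/app/', 'blob:https://example.com/1234'));
```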


@zloirock
Owner

zloirock commented Aug 1, 2024

@iTsingchen could you create a separate issue?
