feat: migrate to ESM #12

Open
wants to merge 2 commits into
base: staging

CMCDragonkai
Member

@CMCDragonkai CMCDragonkai commented Aug 13, 2023

Description

Migrating to ESM.

The QueuedTask should be exported now: andywer/threads.js@7ef6eed

Also this is broken currently due to andywer/threads.js#470

Issues Fixed

Tasks

  • 1. Fix threads.js types
  • 2. Test the usage of async-init
  • 3. Create docs

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squashed and rebased
  • Sanity check the final build

@CMCDragonkai
Member Author

Not sure if we need to fork this: andywer/threads.js#470

@CMCDragonkai CMCDragonkai self-assigned this Aug 13, 2023
@CMCDragonkai
Member Author

Might be able to use:

Use a Package Alias: Some package managers allow aliasing a package to a local directory or version. You could then modify the local copy to your needs.

To bypass it.

I think this would require a special "imports" alias combined with tsconfig paths to work around the incorrect "exports" key. If it could be done, that would be better than forking and maintaining the package.

@CMCDragonkai
Member Author

Actually I think I need to do a git submodule. That would be the easiest.

@CMCDragonkai
Member Author

OK, so using a git submodule is complicated because the submodule's dependencies would not be installed under src/threads.js. It could also require a different set of compilation tools.

Going back to attempting with an import path.

@CMCDragonkai
Member Author

Actually, even subpath imports do not work because the node_modules wouldn't exist at the relevant location.

The only solution now is to either entirely fork the project or just provide overrides on the types, meaning we type out what ModuleMethods is likely to be.

The problem is none of the types work anymore.

  • QueuedTask
  • ModuleThread
  • ModuleMethods

Because of errors in how threads.js exposes the types.

So we have to define all these types.
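
A minimal sketch of what such local definitions could look like without falling back to any everywhere. These are stand-ins and not the types threads.js ships: the assumption is only that every worker method comes back as a Promise when called across the thread boundary. wrapLocal is a hypothetical in-process helper that exists purely to exercise the types.

```typescript
// Local stand-in types. These are assumptions about the likely shape,
// NOT the definitions exported by threads.js.
type ModuleMethods = {
  [methodName: string]: (...args: any[]) => any;
};

// Assumption: each method of a worker module is exposed on the thread with
// the same parameters, but with its result wrapped in a Promise.
type ModuleThread<Methods extends ModuleMethods = ModuleMethods> = {
  [K in keyof Methods]: (
    ...args: Parameters<Methods[K]>
  ) => Promise<Awaited<ReturnType<Methods[K]>>>;
};

// Hypothetical helper: runs the methods in the current thread while
// presenting the same promise-returning surface a real thread would have.
function wrapLocal<M extends ModuleMethods>(methods: M): ModuleThread<M> {
  const source: ModuleMethods = methods;
  const thread: Record<string, unknown> = {};
  for (const name of Object.keys(source)) {
    thread[name] = async (...args: unknown[]) => source[name](...args);
  }
  return thread as unknown as ModuleThread<M>;
}
```

With this, wrapLocal({ add: (a: number, b: number) => a + b }).add(2, 3) is typed as Promise<number>, so callers keep type checking even though the upstream types are unusable.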

@CMCDragonkai
Member Author

CMCDragonkai commented Aug 14, 2023

OK, after overriding the types to any, we still end up with a problem. Creating a worker from a file written in TS requires ts-node, so threads.js effectively needs ts-node to execute the file.

type ModuleMethods = {
  [methodName: string]: (...args: any) => any;
};
type ModuleThread<Methods = any> = any;
type QueuedTask<ThreadType, Return> = any;

export type {
  ModuleMethods,
  ModuleThread,
  QueuedTask
};

Of course if I change to just using regular js, it can load the .js file without any transpilation thus avoiding the ts-node requirement.

But then it's not possible to do import threads from 'threads';... probably because it's now running them like CJS code? I'm not entirely sure.

import * as threads from 'threads';

const { Transfer, isWorkerRuntime } = threads;

The namespace import above is necessary to actually get the required constructs, but the workers no longer have any corresponding types.

I think though one could use annotations.

But generally speaking, it's just not a good idea to use typescript based workers atm since we shouldn't be tied to ts-node anyway (even if it is only during development), because after compilation it would all be JS files anyway.

For some reason isWorkerRuntime no longer exists either.


All in all, I don't think js-workers can be converted to ESM as of now, simply because threads.js is not exporting its types properly. Needing to convert to .ts workers is not great either, since that makes ts-node a necessity.

I might be forced to keep js-workers as CJS, and just import the CJS package into ESM using the trick of importing the default export and then destructuring what we need out of it.

@CMCDragonkai
Member Author

CMCDragonkai commented Aug 14, 2023

Going to try keeping js-workers as CJS and, longer term, look into removing threads.js in favour of something in our own flavour.

Can see https://github.com/piscinajs/piscina for inspiration.

@CMCDragonkai CMCDragonkai mentioned this pull request Aug 14, 2023
9 tasks
@CMCDragonkai
Member Author

I think we just remove browser support for the moment, and focus on nodejs worker threads, similar to our project in js-ws, and then slowly add back in webworker (browser support) afterwards. This can radically simplify this project and give us ESM support too. This could be assigned to @addievo.

@CMCDragonkai
Member Author

Going over https://nodejs.org/api/worker_threads.html shows that a worker threads implementation will be quite complex. Here's a brief overview of things that need to be considered:

  1. How MessagePort works. This is the communication mechanism between the parent thread and all the worker threads. You have to use it to communicate which functions you want to execute, as well as all the results of execution. Remember that the worker threads are like mini-servers, receiving messages asynchronously and handling them. Because execution is potentially asynchronous, you also have to manage the results asynchronously and send them back. In short, you have a message-passing API between the main thread and the worker threads.
  2. Creating a worker involves calling new Worker, which is provided by node:worker_threads. This call creates a thread running its own Node.js runtime within the existing process. Worker threads are real threads, so they do share the process's memory, but data is passed between them either by copying or by transferring ownership. There's also SharedArrayBuffer, which is a truly mutable multithreaded buffer, but it is no longer easy to use in browsers anyway, so transferable ArrayBuffers are easier to work with. (Note that in the case of js-quic, if we were using Node threads, shared array buffers would work, or at the very least we would need to be able to transfer a buffer to a worker and transfer it back out.)
  3. The code of a worker thread is ESM based when running ESM Node.js. You pass a file path or a URL, and Node can either treat the file path as native ESM or treat the URL as actually embedding the worker code. There's no native support for TS, so any TS should be precompiled to JS, but this does impact the new Worker() file path, which might need to load the .js version. It's possible to use interface types to expose typesafe functionality.
  4. We should be able to take advantage of the latest nodejs capabilities... but also have the common denominator with WebWorker.
  5. There's also a broadcast system that enables one-to-many communication.
  6. There may need to be asynchronous initialisation on the worker threads. Generally they can start receiving messages on the message port immediately, but we may need to do some async setup in the worker first. One could imagine "worker" script hooks, like threads.js has done, that make it possible to pass in async setup code that needs to run first.
  7. Since worker threads are just Node.js runtimes, you can run arbitrary code in them, but this is easier to reason about if the workers instead expose a flat record of functions to call. The problem with allowing arbitrary function calls is serialising closures, which is not a solved problem atm (I know this was complicated in Haskell). So instead of attempting that, we just say that workers must expose a fixed set of operations and that data is transferred over. You'd have to mark certain things as transferable; otherwise, by default, things get copied over (when serialised).
  8. There are a lot of edge cases that threads.js covers right now, such as webpack bundling, and even electron usage where things are bundled into an .asar file.
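
To make points 1, 2, 6 and 7 a bit more concrete, here is a minimal sketch of a request/response layer over node:worker_threads. It is not how threads.js or piscina do it. WorkerClient, the message shape ({ id, method, args }) and the add/sum methods are all hypothetical names made up for illustration. The worker source is inlined with eval: true only to keep the sketch self-contained; a real project would pass the path or URL of a precompiled .js file. Error and exit events on the worker are not handled here.

```typescript
import { Worker } from 'node:worker_threads';

// Inlined worker source (run as a CommonJS script because of `eval: true`).
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  // Placeholder for asynchronous setup; requests wait until it is done.
  const ready = new Promise((resolve) => setImmediate(resolve));
  // Fixed, flat record of operations: no closures cross the thread boundary.
  const methods = {
    add: (a, b) => a + b,
    sum: (buffer) => new Uint8Array(buffer).reduce((acc, x) => acc + x, 0),
  };
  parentPort.on('message', async ({ id, method, args }) => {
    try {
      await ready;
      const result = await methods[method](...args);
      parentPort.postMessage({ id, result });
    } catch (e) {
      parentPort.postMessage({ id, error: String(e) });
    }
  });
`;

type WorkerRequest = { id: number; method: string; args: unknown[] };
type WorkerResponse = { id: number; result?: unknown; error?: string };
type Pending = { resolve: (v: unknown) => void; reject: (e: Error) => void };

class WorkerClient {
  protected worker = new Worker(workerSource, { eval: true });
  protected pending = new Map<number, Pending>();
  protected nextId = 0;

  constructor() {
    // Match each response to the call that is waiting on it, by id.
    this.worker.on('message', (res: WorkerResponse) => {
      const entry = this.pending.get(res.id);
      if (entry == null) return;
      this.pending.delete(res.id);
      if (res.error != null) entry.reject(new Error(res.error));
      else entry.resolve(res.result);
    });
  }

  // Buffers listed in `transferList` are moved to the worker, not copied.
  public call(
    method: string,
    args: unknown[],
    transferList: ArrayBuffer[] = [],
  ): Promise<unknown> {
    const request: WorkerRequest = { id: this.nextId++, method, args };
    return new Promise((resolve, reject) => {
      this.pending.set(request.id, { resolve, reject });
      this.worker.postMessage(request, transferList);
    });
  }

  public async destroy(): Promise<void> {
    await this.worker.terminate();
  }
}
```

Calling client.call('sum', [buf], [buf]) moves buf into the worker, after which buf.byteLength on the calling side is 0. Leaving the transfer list empty copies the data instead. A real implementation would still need pooling, cancellation, crash handling and the WebWorker common denominator from point 4.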

Point is, fixing up this worker ecosystem is extremely complicated. The threads.js code is actually complex and difficult to untangle. The fastest solution right now is for upstream to fix their type exports so we can just continue using it. Without that, ESM migration won't really work for us, unless we just switch to using piscina.

This would be a significant undertaking. Estimated work would be 2 to 4 months to build a robust worker system that abides by the rest of PK's principles (I'm comparing it to how complicated js-quic became, but it should be simpler). Will need to schedule this for later, after testnet 7.
