Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clean up workers #1005

Merged
merged 17 commits into from
Apr 29, 2020
Merged

Conversation

andyscott
Copy link
Contributor

@andyscott andyscott commented Feb 19, 2020

Description

Simplifies and cleans up the workers and the shared worker infrastructure.

First of all, I believe the diff makes the changes look complicated when it's mostly just refactoring and some signature changes.

A hopefully exhaustive list of changes:

Updates to the worker base:

  • GenericWorker and Processor are collapsed into a single Worker base.
  • Basic tests are added for the new Worker base
  • All stdout/stderr/stdin wrapping is moved to the worker base. Previously some of this logic was handled by each worker implementation.
  • The appendToString and merge methods were only used by the Scalac worker implementation, so they were moved out of the worker base.
  • Exit trap handling was added. This makes it safer to wrap other libraries that may attempt to exit and inadvertently shut down a worker process.
  • The main interface was changed from passing a java.util.List<String> to a String array to make Scala and Java both feel like first class citizens when writing worker implementations.

Updates to worker implementations:

  • ScalacProcessor and ScalaCInvoker are collapsed into ScalacWorker. Scalac worker specific methods were moved out of the shared worker base and into the Scalac worker.
  • ScalafmtWorker and ScalaformatRunner are collapsed into ScalafmtWorker. Exception handling is now managed by the worker base.
  • ScroogeWorker and ScroogeGenerator are collapsed into ScroogeWorker. Exception handling is now managed by the worker base.
  • ScalaPBWorker and ScalaPGBenerator are collapsed into ScalaPBWorker. Exception handling and stdout/stderr setup are now managed by the worker base.
  • JacocoInstrumenter was mostly unchanged as the worker implementation already aligned well with the simplified worker base.

Motivation

This isn't critical or necessary work, but it does provide a few niceties:

  • Scala and Java are now both first class citizens for writing workers (due to use of String arrays instead of Java lists).
  • Previously each worker leveraged the worker base a bit differently. Now it's all the same, so there's no ambiguity about which pattern to follow when writing a new worker implementation.
  • Cleanup! Less code and fewer classes if you exclude the introduction of new of tests. Shared code should go in the shared placed, and worker specific code belongs with that worker.

I know I said I would hold off on any cleanup. This is perhaps my only exception, as it's code split out from the existing PR #959 and leverages some learnings from when I implemented the Jacoco worker.

@andyscott andyscott requested a review from ittaiz as a code owner February 19, 2020 21:36
@andyscott andyscott changed the title clean up worker clean up workers Feb 19, 2020
@andyscott andyscott force-pushed the andyscott/clean-up-worker branch from 99a08a3 to 2cc952c Compare February 19, 2020 23:28
@andyscott andyscott changed the title clean up workers [wip] clean up workers Feb 19, 2020
@andyscott andyscott changed the title [wip] clean up workers clean up workers Feb 19, 2020
@ittaiz
Copy link
Member

ittaiz commented Feb 20, 2020

I’ll take a look, can you share a bit more info on what you changed in the design and why it’s easier?
There are a lot of details in the PR and from a quick glance I didn’t see yet why it’s easier

@andyscott
Copy link
Contributor Author

Yep! I'll update the original message.

Copy link
Contributor

@borkaehw borkaehw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it's simpler to merge GenericWorker and Processor, but I will defer to Andy for more explanation.

Just came to my mind. If we are refactoring (if it counts) worker, do we consider to rewrite it in Scala?

@andyscott
Copy link
Contributor Author

andyscott commented Feb 20, 2020

Just came to my mind. If we are refactoring (if it counts) worker, do we consider to rewrite it in Scala?

You'd need a way to bootstrap a Scala compiler to compile the worker base itself.

In higherkindness/rules_scala there was a bootstrap compiler backend that shelled out to scalac directly. We don't have that here-- and I'm not convinced that it's worth it-- so there has to be a worker base in Java.

@borkaehw
Copy link
Contributor

You'd need a way to bootstrap a Scala compiler to compile the worker base itself.

It makes sense and thanks for the explanation.

@ittaiz
Copy link
Member

ittaiz commented Feb 21, 2020

Thanks Andy for the detailed description!
It definitely makes my work easier...

Copy link
Member

@ittaiz ittaiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.
I added some comments and questions.
I hope I didn't miss any nuances in the move between the GenericWorker and the Worker because I found some changes and maybe missed others.
In tests we trust ;)

src/java/io/bazel/rulesscala/worker/Worker.java Outdated Show resolved Hide resolved
ByteArrayOutputStream outStream = new ByteArrayOutputStream();
PrintStream out = new PrintStream(outStream);

System.setIn(new ByteArrayInputStream(new byte[0]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this needed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And also how does it work with you not calling System.in later on to get this input stream?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a comment, but basically: there is no stdin. So we attach stdin to an empty buffer and call it a day.

If I ever get around to finishing the standalone worker tests it would make this more obvious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that Worker.Interface implementations shouldn't access stdin because that's the transport layer of the persistent worker itself? And so you "null it out" so they won't mess things up?
If so I'd love for the comment to reflect that because I didn't understand what "can't support" means

@ittaiz ittaiz mentioned this pull request Feb 21, 2020
@borkaehw
Copy link
Contributor

borkaehw commented Mar 5, 2020

Hi I am wondering what's the status of this PR? I am interested in moving forward on the discussion of #959.

@ittaiz
Copy link
Member

ittaiz commented Mar 5, 2020 via email

@borkaehw
Copy link
Contributor

borkaehw commented Mar 5, 2020

Lucid is still on the way to migrate to bazelbuild/rules_scala, the major blocker now is the scala_test framework since we use specs2. #959 should fix it then we can start the migration process step by step.

I would say this PR is not blocking us but #959 is. But I suppose #959 is blocked by this PR.

@ittaiz
Copy link
Member

ittaiz commented Mar 5, 2020

Just to make sure- you’re not using specs2 junit runner, right?

@borkaehw
Copy link
Contributor

borkaehw commented Mar 6, 2020

I don't think so.

ByteArrayOutputStream outStream = new ByteArrayOutputStream();
PrintStream out = new PrintStream(outStream);

System.setIn(new ByteArrayInputStream(new byte[0]));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is that Worker.Interface implementations shouldn't access stdin because that's the transport layer of the persistent worker itself? And so you "null it out" so they won't mess things up?
If so I'd love for the comment to reflect that because I didn't understand what "can't support" means

src/java/io/bazel/rulesscala/worker/WorkerTest.java Outdated Show resolved Hide resolved
Comment on lines +96 to +97
out.flush();
outStream.reset();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I'm jerking your chain a bit but I was actually talking only about the gc line as that is what the old implementation did.
doing flush and reset in a finally block feels a bit adventurous :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually think this is reasonable! This way we guarantee each new work request doesn't get any leftover output if a previous work request caused a crash.

@borkaehw
Copy link
Contributor

@andyscott Friendly ping here, do you get a chance to take another look?

@borkaehw
Copy link
Contributor

@andyscott Friendly ping here again (sorry for bothering). We would like to see this moving forward, we are also happy to help if you are busy. Thanks.

@ittaiz
Copy link
Member

ittaiz commented Apr 13, 2020 via email

@andyscott
Copy link
Contributor Author

andyscott commented Apr 29, 2020

I've merged master and addressed all recent (😆) feedback. Hopefully I didn't miss anything!

Copy link
Member

@ittaiz ittaiz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 merging

@ittaiz ittaiz merged commit 73c0dbb into bazelbuild:master Apr 29, 2020
@andyscott andyscott deleted the andyscott/clean-up-worker branch April 29, 2020 16:16
SocksDevil pushed a commit to SocksDevil/rules_scala that referenced this pull request Jul 6, 2020
* clean up workers

* whoops!

* git grep --name-only 'String args\[\]' | xargs sed -i 's|String args\[\]|String[] args|'

* subclass ByteArrayOutputStream to allow shrinking of internal buffer

* gc after each work request

* split workerMain implementation into two private methods for readability

* add comment about stdin

* add a test for the buffer

* improve tests

* wrap gc and friends in the finally clause

* comments

* make the worker public

* cleanup

* comments

* add stdin test

* clean up test
gergelyfabian pushed a commit to gergelyfabian/rules_scala that referenced this pull request May 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants