Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ruby SDK - Phase 1 - Initial Comment Period #92

Closed
wants to merge 4 commits into from

Conversation

cretz
Copy link
Member

@cretz cretz commented May 24, 2024

⚠️ Next PR round opened at #93

Summary

  • This is the proposal for the first phase of the Ruby SDK.
  • See it rendered.
  • Much of this is inspired by the current Temporal Ruby SDK (and its proposals which were moved to a sub directory) and our other SDKs.
  • Use this PR for comments/discussion. Don't be afraid of making too many comments; if it gets too noisy, I'll just create another PR.
  • Any points of contention will be discussed in-person by the SDK team and a decision will be made.

@cretz cretz requested a review from a team May 24, 2024 15:48

These are the general requirements of the project

* Ruby >= 3.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering what we've done to validate this is acceptable.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All older versions of Ruby are EOL; I don't think we need to support EOL software: https://www.ruby-lang.org/en/downloads/

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct: https://en.wikipedia.org/wiki/History_of_Ruby. But all of this is still pending user feedback and so we'll likely be surveying the community on how acceptable they consider this.

such thing as a `Metric` (i.e. base class for counter, gauge, and histogram). So it could be
`Temporalio::SearchAttributes` (the collection) and `Temporalio::Metric` (the base class), and they are not modules
but classes with the nested pieces underneath like we do elsewhere.
* No compatibility with any existing Ruby SDK
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* No compatibility with any existing Ruby SDK
* No compatibility with _any_ existing Ruby SDK

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure the emphasis is needed

* 💭 Why only a single thread? We expect callbacks to Ruby from Rust to be short (hopefully only a block invocation
that is a `Queue.push`) so we don't expect making all of these run serial is an issue. However if it does become an
issue, we can use multiple threads and a Rust approach which allows multiple channel consumers.
* Users can create their own runtime with their own telemetry options and such and pass that to clients.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never seen the need to create more than a single runtime in our SDKs. Seems like we can avoid this extra API surface to expose accepting a runtime and just have all constructs reference a singleton.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even for things we never think we'll need more than one of, I personally strongly dislike creating singletons. In my experience, it's much easier to take something that's not-a-singleton and turn it into a singleton than it is to do the inverse.

At worst, that extra API surface becomes noise (e.g. passing Runtime::default or whatever everywhere), but if you ultimately DO need multiple runtimes later, you're going to be really grateful you have it—adding it later would be a really painful breaking change.

Copy link
Member Author

@cretz cretz May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They will reference a lazy global which is basically a singleton. This is deemed acceptable in other core-based SDKs and so should be here too, no need to make Ruby special in this case.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm fine with it being a singleton/lazy global under the hood. I just don't want us to design ourselves into a corner.


```ruby
# Connect client
client = Client.connect('localhost:7233')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namespace defaults to default?

Copy link
Member Author

@cretz cretz May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes (same with most other SDKs, granted we are requiring target host because we found defaulting to localhost to never be what prod code would do)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could argue for the same reason that we should just require namespace, since prod is very unlikely to use that, but not a huge deal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can definitely be argued, but I think enough of our SDKs default to "default" and some prod clusters do use "default" that yeah it's not a big deal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are going to update this to require the namespace. The default of default is unreasonable for cloud and many others.

50/50 call, but most users will probably add an encoding impl instead of override at this level. This is admittedly
a slight deviation from other SDKs.
* `Temporalio::Converter::PayloadConverter::Encoding` - Base unimplemented encoding converter
* ❓ Would we prefer `Temporalio::Converter::EncodingPayloadConverter`?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't know what the Ruby standard is here but as you know in our other SDKs we've chosen the longer name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Ruby standard here is to keep it shorter if already in the module name. Ruby is the only language I can think of that does this commonly. While you can require a file, there is no "importing" a module, it must always be qualified (but people often alias it).

* It has instance methods for info, heartbeat, etc.
* It has a class method for `current` that returns from thread/fiber-local and class method shortcuts for all instance
methods.
* 💭 Why class methods equivalents? `Temporalio::Activity::Context.heartbeat` is easier to read than
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other SDKs, heartbeat, log, info, etc... is on the activity module.
Why not do that here too? E.g:

Temporalio::Activity::Context.current

Temporalio::Activity::heartbeat
Temporalio::Activity::info
Temporalio::Activity::log

Copy link
Member Author

@cretz cretz May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Temporalio::Activity is the class that is being extended by the activity implementation and we don't want people to think that this is only usable in that extension nor do we want to clutter the inherited class with methods like these when they are part of the context (whether those are instance methods or class methods and we will have both of the same name).

SDKs are expected to be able to pass around a context to other functions if needed in the language's preferred context approach (so in Python we tell users to copy the contextvars to get this but other languages often have a context object)

Comment on lines +304 to +305
* Ruby has a `JSON` module in the standard library, but by default it's only primitives, arrays, hashes, etc and not
classes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about JSON Protobuf, I'm assuming not supported...

Also, what are the implications for cross-language integration. For example, an activity in python that returns a json serialized object called from a Ruby workflow. Is the workaround to manually create the ruby object from json?

Copy link
Member Author

@cretz cretz May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about JSON Protobuf, I'm assuming not supported...

JSON protobuf will be supported out of the box

Also, what are the implications for cross-language integration. For example, an activity in python that returns a json serialized object called from a Ruby workflow. Is the workaround to manually create the ruby object from json?

Yes, that will be a Ruby hash (just like it would be a JS object). If you want to convert it to a Ruby class you would have to do that yourself. We can consider more advanced JSON-to-class conversion mechanisms later, but by default Ruby doesn't really have any without the "addition" approach which is not very cross-language compatible.

ruby/sdk-phase-1.md Show resolved Hide resolved
Comment on lines 358 to 359
* Activity classes are instantiated at registration time, not for each invocation.
* 💭 Why? Existing Temporal Ruby SDK does instantiate for each activity task, but this discourages use of shared

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The temptation for programmers is to add state (in the activity object) that lingers across multiple activity invocations, something really fragile... Is there a way to freeze that activity object so that it can only be used in a pure functional style?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's my concern too. I think instantiating them per-invocation is probably safer, and sharing things like dependencies could be done w/ either static fields or simply by passing them in each time.

Copy link
Member Author

@cretz cretz May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically in .NET if you use DI w/ scoped activities it does instantiate them per activity task, but overall in our SDKs you are allowed to share object/outside-of-function state across activity invocations without using globals. I am a hesitant to make Ruby the special case of forcing globals to share state (or some homemade DI).

But we can probably have ways to let the user choose. For instance, we can accept two ways of providing an activity: providing an instance and providing a class. I wouldn't want the activity class form to instantiate per attempt by default because it'd be confusing to have whether you did instance vs class to have different lifetimes. But we could support some setting on the activity class to say they want it per attempt and hope they have no constructor parameters. Or we'd have to devise some DI because in general it is difficult for Temporal to control lifetimes of user activity objects due to statefulness needs.

The current Temporal Ruby SDK does instantiate the activity for every task and never took into account the need to do things like share a database client.

Copy link
Member

@Sushisource Sushisource left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💎

ruby/sdk-phase-1.md Outdated Show resolved Hide resolved
* 💭 Why not top-level? Cluttering top-level module makes discoverability difficult and API docs unclear.
* 💭 Why no `Temporalio::Common`? Ruby expects to have common things where they make sense instead of under a `Common`
module.
* ❓ To we want plural, e.g. `Temporalio::SearchAttributes` and `Temporalio::Metrics`? While there is no such thing as
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
*To we want plural, e.g. `Temporalio::SearchAttributes` and `Temporalio::Metrics`? While there is no such thing as
*Do we want plural, e.g. `Temporalio::SearchAttributes` and `Temporalio::Metrics`? While there is no such thing as

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since they are not modules but classes, I think SearchAttributes and Metric make sense

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in the case of SearchAttributes (plural) and Metric (singular), they may in fact be classes (the former being the search attribute collection and the latter being the base metric class).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think you maybe misread me, I said because they are classes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did, sorry!


```ruby
# Connect client
client = Client.connect('localhost:7233')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could argue for the same reason that we should just require namespace, since prod is very unlikely to use that, but not a huge deal.

included in conversion and most regular users won't ever use them.
* 💭 Why duplicate words instead of `Temporalio::Converter::Data`? `Temporalio::Converter` is not a base class and a
`DataConverter` is a completely separate thing from, say, a `PayloadConverter`.
* ❓ Would we prefer `Temporalio::Conversion` as the module name?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might call it Serialization but, that might be too much of a departure from the others. Otherwise I think probably just stick with Converter or Converters since it's the most similar to existing naming.

classes, they are more easily referenced, a bit more easily typed in RBS, and metadata can be applied more easily at
the class level. This comes at a cost of not being able to easily share state across classes, but users can provide
something in the constructor multiple activities use. Open to considering activity methods instead.
* ❓ Would we prefer activities as methods knowing they may not be able to be typed/referenced well? We can't really
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think classes makes good sense for Ruby. The typing is good and Ruby is all about the OOP nonsense. The biggest downside is users thinking that storing class state is somehow magically going to be synchronized across workers or something, but I'm not hugely concerned about that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 FWIW the more research I do the less I am convinced that any good typing will occur, but still there are enough reasons for activities as classes

Comment on lines 358 to 359
* Activity classes are instantiated at registration time, not for each invocation.
* 💭 Why? Existing Temporal Ruby SDK does instantiate for each activity task, but this discourages use of shared
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's my concern too. I think instantiating them per-invocation is probably safer, and sharing things like dependencies could be done w/ either static fields or simply by passing them in each time.

ruby/sdk-phase-1.md Show resolved Hide resolved
activities reference a `Fiber` executor.
* 💭 Why wait until worker run to fail? Workers can be created outside of an async environment, it's just important
that they are run within an async environment if there are async activities.
* ❓ Or should we just enforce this at worker instantiation time too?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's best if doable

Copy link
Member Author

@cretz cretz May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is doable, it just means you have to instantiate the worker in the async context as well (which is probably not too much to ask)

Comment on lines +43 to +46
* Activities need to work in both ways and _do_ have to make a distinction (discussed later).
* 💭 Why async-capable? Core and all SDKs are very parallel and so we want to give that benefit to users if they want
it.
* 💭 Why thread-blocking? That's the most common way Ruby is used.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Does either sync or async depend on the other? Or are the implementations independent?
  • Does it make sense to do one first, release it, get feedback, etc. and then do the other later? (Thinking here about how we did async in Python first and then had to go back and do sync later.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does either sync or async depend on the other? Or are the implementations independent?

Not following what is meant by "depend on the other", but they are independent for the most part. Basically it's just the difference of starting a new thread or a new fiber when invoking the activity, not much else changes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not following what is meant by "depend on the other",

Is sync implemented in terms of—or does it reuse code written for—async, or vice versa? I think you've answered my first question here though.

For my second question, I'm still trying to understand what is MVP-shaped and what isn't—e.g. would we be comfortable doing a first/early release with one but not the other?

Copy link
Member Author

@cretz cretz Jul 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, missed this question. This is integral behavior and most of the same code will work async/sync so there is no real separating it. I do think there is separation on activity type, but that is not a heavy burden worth separating out.

ruby/sdk-phase-1.md Outdated Show resolved Hide resolved
ruby/sdk-phase-1.md Show resolved Hide resolved
ruby/sdk-phase-1.md Outdated Show resolved Hide resolved
* 💭 Why only a single thread? We expect callbacks to Ruby from Rust to be short (hopefully only a block invocation
that is a `Queue.push`) so we don't expect making all of these run serial is an issue. However if it does become an
issue, we can use multiple threads and a Rust approach which allows multiple channel consumers.
* Users can create their own runtime with their own telemetry options and such and pass that to clients.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even for things we never think we'll need more than one of, I personally strongly dislike creating singletons. In my experience, it's much easier to take something that's not-a-singleton and turn it into a singleton than it is to do the inverse.

At worst, that extra API surface becomes noise (e.g. passing Runtime::default or whatever everywhere), but if you ultimately DO need multiple runtimes later, you're going to be really grateful you have it—adding it later would be a really painful breaking change.

included in conversion and most regular users won't ever use them.
* 💭 Why duplicate words instead of `Temporalio::Converter::Data`? `Temporalio::Converter` is not a base class and a
`DataConverter` is a completely separate thing from, say, a `PayloadConverter`.
* ❓ Would we prefer `Temporalio::Conversion` as the module name?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a convention for such names in Ruby? Wonder if Convert would also work here. (I do like avoiding nouns for module names; I have no strong preferences here though.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is not a strict convention. Most module and class names are nouns.

Comment on lines +351 to +354
* 💭 Why classes instead of methods/functions like every other SDK? There are tradeoffs to both approaches. As
classes, they are more easily referenced, a bit more easily typed in RBS, and metadata can be applied more easily at
the class level. This comes at a cost of not being able to easily share state across classes, but users can provide
something in the constructor multiple activities use. Open to considering activity methods instead.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wouldn't be for an MVP, but could we eventually provide both if we wanted to? It doesn't seem like there's an obvious trade-off winner here.

Another plus in favor of functions is they are lighter-weight to define; you don't have to make a class just to wrap what is effectively a function anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can technically do this, and could technically provide activities as classes/interfaces/structs in every other SDK but we have tried to stick with a single way that is best with the language. I think we should only do one here.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah if we want to pick a single way, then I'm in favor of doing the most-flexible thing. Long as we have room to add the other way later if we learn that users are really opinionated about it.

ruby/sdk-phase-1.md Outdated Show resolved Hide resolved
ruby/sdk-phase-1.md Outdated Show resolved Hide resolved
* Following the API of the current Temporal Ruby SDK, workers can be run individually, but we will also provide a class
method to run several at once.
* 💭 Why is something needed to run multiple workers for the user? Because in Ruby they don't have easy ways of
running multiple blocking things at once without external libraries or forcing threads on users.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in Ruby they don't have easy ways of running multiple blocking things at once without external libraries or forcing threads on users.

What do you have in mind here that this is as opposed to? I.e. can you expand a little to clarify this, what Ruby is missing that other languages have.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In most of our SDK languages there are better ways to run multiple workers at once without eating a thread per worker. JS is always single threaded, Python has asyncio.gather, Go has goroutines, and .NET has tasks. In Java we have a WorkerFactory that lets you run multiple workers at once because it lacks good language-level reuse-thread capabilities. Ruby has the same problem. So I was thinking a WorkerPool or similar that can run many workers in a single blocking run call.

else if we can help it.
* 💭 Why? Our dependencies as a library have transitive effects on our users, so we have a responsibility to be
lightweight.
* Must be both thread-blocking/thread-safe and async-capable

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean exactly by "async-capable"? (The term is used in a few places)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Ruby there are threads and fibers (basically coroutines). By async-capable I mean fiber-capable. So you can use https://github.com/socketry/async and, say, use the client to wait on 100 workflow results without using a thread each. But many Ruby users are totally fine using a thread each and we need to support that too.

This is very much like Python, we just chose not to provide a synchronous version of the Python client (though we still could if we wanted).

Comment on lines +43 to +46
* Activities need to work in both ways and _do_ have to make a distinction (discussed later).
* 💭 Why async-capable? Core and all SDKs are very parallel and so we want to give that benefit to users if they want
it.
* 💭 Why thread-blocking? That's the most common way Ruby is used.
Copy link

@drewhoskins-temporal drewhoskins-temporal May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context related to this topic, ruby @ Stripe was single-threaded by default. Multithreaded carve-outs did happen but were relatively unusual.
This meant some libraries weren't designed to be thread-safe and used things like class variables.
Default when transitioning to temporal was therefore to have many workflow pollers but only one activity poller per process, which was inefficient, but we added a config flag folks could turn on to make their workers run multiple activity pollers. I really wanted to make the multi-poller version the default but didn't do so before I left.

Copy link
Member Author

@cretz cretz May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why we plan to also support single-threaded/coroutine-based implementations. It's important to us at Temporal that people be able to do a lot of concurrent things with the SDK in just a single thread. Granted the current design is queue-pop-based which makes it work with fiber-based async. With activities we will support custom executors, but I wasn't planning on supporting other async approaches for clients (e.g. event machine) but if there is enough desire for pluggable-non-fiber-based async for clients, we can look into that.

Multithreaded carve-outs did happen but were relatively unusual.

These carve-outs were likely required for to use the existing Ruby SDKs for Temporal. We aim not to force that kind of thing on users.

Copy link

@drewhoskins-temporal drewhoskins-temporal Jun 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the carve-outs, I'm referring to the "pre-temporal world." Later, with the old SDK, we simply limited to one activity poller by default, which fixed any race conditions. It was a cautious approach, as most customers probably would have been fine with more pollers (once we fixed our mongo library to be fiber-safe), but there was FUD.

@cretz
Copy link
Member Author

cretz commented Jul 10, 2024

I have updated the proposal with the following:

  • Clarify why we need typing at all
  • Use Temporalio::SearchAttributes as both the collection and class under which all nested search attribute things reside
  • Use Temporalio::Metric as both the base metric and class under which all nested metric things reside
  • Clarified use of Options structs on client/connection
  • Clarified there will be a CloudOperationsClient
  • Made namespace a required field instead of defaulting to default
  • Changed Temporalio::Converter module to Temporalio::Converters
  • Changed Temporalio::Converters::PayloadConverter to be a simple base class instead of having it impl composite, and created Temporalio::Converters::PayloadConverter::Composite
  • Updated activity lifecycle to say we will instantiate activities on each attempt if registered as a class, or we will just call method on existing instance if registered as an instance
  • Clarified that activity executors can validate activities at registration time instead of runtime, which means that workers have to be created in an async context if they have any async-context-needed activities (as opposed to just run in an async context)

@cretz
Copy link
Member Author

cretz commented Jul 10, 2024

Closing this PR in favor of #93 for a fresh round of comments. Will link back to this one for posterity, but unaware of any unresolved issues here. Anyone here can feel free to comment on #93.

@cretz cretz closed this Jul 10, 2024
@cretz cretz mentioned this pull request Jul 10, 2024
@cretz cretz deleted the ruby-phase-1 branch October 7, 2024 19:34
@cretz cretz mentioned this pull request Oct 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants