-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ruby SDK - Phase 1 - Initial Comment Period #92
Conversation
|
||
These are the general requirements of the project | ||
|
||
* Ruby >= 3.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering what we've done to validate this is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All older versions of Ruby are EOL; I don't think we need to support EOL software: https://www.ruby-lang.org/en/downloads/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct: https://en.wikipedia.org/wiki/History_of_Ruby. But all of this is still pending user feedback and so we'll likely be surveying the community on how acceptable they consider this.
such thing as a `Metric` (i.e. base class for counter, gauge, and histogram). So it could be | ||
`Temporalio::SearchAttributes` (the collection) and `Temporalio::Metric` (the base class), and they are not modules | ||
but classes with the nested pieces underneath like we do elsewhere. | ||
* No compatibility with any existing Ruby SDK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* No compatibility with any existing Ruby SDK | |
* No compatibility with _any_ existing Ruby SDK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure the emphasis is needed
* 💭 Why only a single thread? We expect callbacks to Ruby from Rust to be short (hopefully only a block invocation | ||
that is a `Queue.push`) so we don't expect making all of these run serial is an issue. However if it does become an | ||
issue, we can use multiple threads and a Rust approach which allows multiple channel consumers. | ||
* Users can create their own runtime with their own telemetry options and such and pass that to clients. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never seen the need to create more than a single runtime in our SDKs. Seems like we can avoid this extra API surface to expose accepting a runtime and just have all constructs reference a singleton.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even for things we never think we'll need more than one of, I personally strongly dislike creating singletons. In my experience, it's much easier to take something that's not-a-singleton and turn it into a singleton than it is to do the inverse.
At worst, that extra API surface becomes noise (e.g. passing Runtime::default
or whatever everywhere), but if you ultimately DO need multiple runtimes later, you're going to be really grateful you have it—adding it later would be a really painful breaking change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They will reference a lazy global which is basically a singleton. This is deemed acceptable in other core-based SDKs and so should be here too, no need to make Ruby special in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I'm fine with it being a singleton/lazy global under the hood. I just don't want us to design ourselves into a corner.
ruby/sdk-phase-1.md
Outdated
|
||
```ruby | ||
# Connect client | ||
client = Client.connect('localhost:7233') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
namespace defaults to default
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes (same with most other SDKs, granted we are requiring target host because we found defaulting to localhost to never be what prod code would do)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could argue for the same reason that we should just require namespace, since prod is very unlikely to use that, but not a huge deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can definitely be argued, but I think enough of our SDKs default to "default" and some prod clusters do use "default" that yeah it's not a big deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are going to update this to require the namespace. The default of default
is unreasonable for cloud and many others.
ruby/sdk-phase-1.md
Outdated
50/50 call, but most users will probably add an encoding impl instead of override at this level. This is admittedly | ||
a slight deviation from other SDKs. | ||
* `Temporalio::Converter::PayloadConverter::Encoding` - Base unimplemented encoding converter | ||
* ❓ Would we prefer `Temporalio::Converter::EncodingPayloadConverter`? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't know what the Ruby standard is here but as you know in our other SDKs we've chosen the longer name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Ruby standard here is to keep it shorter if already in the module name. Ruby is the only language I can think of that does this commonly. While you can require a file, there is no "importing" a module, it must always be qualified (but people often alias it).
* It has instance methods for info, heartbeat, etc. | ||
* It has a class method for `current` that returns from thread/fiber-local and class method shortcuts for all instance | ||
methods. | ||
* 💭 Why class methods equivalents? `Temporalio::Activity::Context.heartbeat` is easier to read than |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other SDKs, heartbeat
, log
, info
, etc... is on the activity
module.
Why not do that here too? E.g:
Temporalio::Activity::Context.current
Temporalio::Activity::heartbeat
Temporalio::Activity::info
Temporalio::Activity::log
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because Temporalio::Activity
is the class that is being extended by the activity implementation and we don't want people to think that this is only usable in that extension nor do we want to clutter the inherited class with methods like these when they are part of the context (whether those are instance methods or class methods and we will have both of the same name).
SDKs are expected to be able to pass around a context to other functions if needed in the language's preferred context approach (so in Python we tell users to copy the contextvars
to get this but other languages often have a context object)
* Ruby has a `JSON` module in the standard library, but by default it's only primitives, arrays, hashes, etc and not | ||
classes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about JSON Protobuf, I'm assuming not supported...
Also, what are the implications for cross-language integration. For example, an activity in python that returns a json serialized object called from a Ruby workflow. Is the workaround to manually create the ruby object from json?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about JSON Protobuf, I'm assuming not supported...
JSON protobuf will be supported out of the box
Also, what are the implications for cross-language integration. For example, an activity in python that returns a json serialized object called from a Ruby workflow. Is the workaround to manually create the ruby object from json?
Yes, that will be a Ruby hash (just like it would be a JS object). If you want to convert it to a Ruby class you would have to do that yourself. We can consider more advanced JSON-to-class conversion mechanisms later, but by default Ruby doesn't really have any without the "addition" approach which is not very cross-language compatible.
ruby/sdk-phase-1.md
Outdated
* Activity classes are instantiated at registration time, not for each invocation. | ||
* 💭 Why? Existing Temporal Ruby SDK does instantiate for each activity task, but this discourages use of shared |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The temptation for programmers is to add state (in the activity object) that lingers across multiple activity invocations, something really fragile... Is there a way to freeze that activity object so that it can only be used in a pure functional style?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's my concern too. I think instantiating them per-invocation is probably safer, and sharing things like dependencies could be done w/ either static fields or simply by passing them in each time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically in .NET if you use DI w/ scoped activities it does instantiate them per activity task, but overall in our SDKs you are allowed to share object/outside-of-function state across activity invocations without using globals. I am a hesitant to make Ruby the special case of forcing globals to share state (or some homemade DI).
But we can probably have ways to let the user choose. For instance, we can accept two ways of providing an activity: providing an instance and providing a class. I wouldn't want the activity class form to instantiate per attempt by default because it'd be confusing to have whether you did instance vs class to have different lifetimes. But we could support some setting on the activity class to say they want it per attempt and hope they have no constructor parameters. Or we'd have to devise some DI because in general it is difficult for Temporal to control lifetimes of user activity objects due to statefulness needs.
The current Temporal Ruby SDK does instantiate the activity for every task and never took into account the need to do things like share a database client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💎
ruby/sdk-phase-1.md
Outdated
* 💭 Why not top-level? Cluttering top-level module makes discoverability difficult and API docs unclear. | ||
* 💭 Why no `Temporalio::Common`? Ruby expects to have common things where they make sense instead of under a `Common` | ||
module. | ||
* ❓ To we want plural, e.g. `Temporalio::SearchAttributes` and `Temporalio::Metrics`? While there is no such thing as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* ❓ To we want plural, e.g. `Temporalio::SearchAttributes` and `Temporalio::Metrics`? While there is no such thing as | |
* ❓ Do we want plural, e.g. `Temporalio::SearchAttributes` and `Temporalio::Metrics`? While there is no such thing as |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since they are not modules but classes, I think SearchAttributes
and Metric
make sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But in the case of SearchAttributes
(plural) and Metric
(singular), they may in fact be classes (the former being the search attribute collection and the latter being the base metric class).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think you maybe misread me, I said because they are classes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did, sorry!
ruby/sdk-phase-1.md
Outdated
|
||
```ruby | ||
# Connect client | ||
client = Client.connect('localhost:7233') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could argue for the same reason that we should just require namespace, since prod is very unlikely to use that, but not a huge deal.
ruby/sdk-phase-1.md
Outdated
included in conversion and most regular users won't ever use them. | ||
* 💭 Why duplicate words instead of `Temporalio::Converter::Data`? `Temporalio::Converter` is not a base class and a | ||
`DataConverter` is a completely separate thing from, say, a `PayloadConverter`. | ||
* ❓ Would we prefer `Temporalio::Conversion` as the module name? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might call it Serialization
but, that might be too much of a departure from the others. Otherwise I think probably just stick with Converter
or Converters
since it's the most similar to existing naming.
classes, they are more easily referenced, a bit more easily typed in RBS, and metadata can be applied more easily at | ||
the class level. This comes at a cost of not being able to easily share state across classes, but users can provide | ||
something in the constructor multiple activities use. Open to considering activity methods instead. | ||
* ❓ Would we prefer activities as methods knowing they may not be able to be typed/referenced well? We can't really |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think classes makes good sense for Ruby. The typing is good and Ruby is all about the OOP nonsense. The biggest downside is users thinking that storing class state is somehow magically going to be synchronized across workers or something, but I'm not hugely concerned about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 FWIW the more research I do the less I am convinced that any good typing will occur, but still there are enough reasons for activities as classes
ruby/sdk-phase-1.md
Outdated
* Activity classes are instantiated at registration time, not for each invocation. | ||
* 💭 Why? Existing Temporal Ruby SDK does instantiate for each activity task, but this discourages use of shared |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's my concern too. I think instantiating them per-invocation is probably safer, and sharing things like dependencies could be done w/ either static fields or simply by passing them in each time.
ruby/sdk-phase-1.md
Outdated
activities reference a `Fiber` executor. | ||
* 💭 Why wait until worker run to fail? Workers can be created outside of an async environment, it's just important | ||
that they are run within an async environment if there are async activities. | ||
* ❓ Or should we just enforce this at worker instantiation time too? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that's best if doable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is doable, it just means you have to instantiate the worker in the async context as well (which is probably not too much to ask)
* Activities need to work in both ways and _do_ have to make a distinction (discussed later). | ||
* 💭 Why async-capable? Core and all SDKs are very parallel and so we want to give that benefit to users if they want | ||
it. | ||
* 💭 Why thread-blocking? That's the most common way Ruby is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Does either sync or async depend on the other? Or are the implementations independent?
- Does it make sense to do one first, release it, get feedback, etc. and then do the other later? (Thinking here about how we did async in Python first and then had to go back and do sync later.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does either sync or async depend on the other? Or are the implementations independent?
Not following what is meant by "depend on the other", but they are independent for the most part. Basically it's just the difference of starting a new thread or a new fiber when invoking the activity, not much else changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not following what is meant by "depend on the other",
Is sync implemented in terms of—or does it reuse code written for—async, or vice versa? I think you've answered my first question here though.
For my second question, I'm still trying to understand what is MVP-shaped and what isn't—e.g. would we be comfortable doing a first/early release with one but not the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, missed this question. This is integral behavior and most of the same code will work async/sync so there is no real separating it. I do think there is separation on activity type, but that is not a heavy burden worth separating out.
* 💭 Why only a single thread? We expect callbacks to Ruby from Rust to be short (hopefully only a block invocation | ||
that is a `Queue.push`) so we don't expect making all of these run serial is an issue. However if it does become an | ||
issue, we can use multiple threads and a Rust approach which allows multiple channel consumers. | ||
* Users can create their own runtime with their own telemetry options and such and pass that to clients. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even for things we never think we'll need more than one of, I personally strongly dislike creating singletons. In my experience, it's much easier to take something that's not-a-singleton and turn it into a singleton than it is to do the inverse.
At worst, that extra API surface becomes noise (e.g. passing Runtime::default
or whatever everywhere), but if you ultimately DO need multiple runtimes later, you're going to be really grateful you have it—adding it later would be a really painful breaking change.
ruby/sdk-phase-1.md
Outdated
included in conversion and most regular users won't ever use them. | ||
* 💭 Why duplicate words instead of `Temporalio::Converter::Data`? `Temporalio::Converter` is not a base class and a | ||
`DataConverter` is a completely separate thing from, say, a `PayloadConverter`. | ||
* ❓ Would we prefer `Temporalio::Conversion` as the module name? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a convention for such names in Ruby? Wonder if Convert
would also work here. (I do like avoiding nouns for module names; I have no strong preferences here though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is not a strict convention. Most module and class names are nouns.
* 💭 Why classes instead of methods/functions like every other SDK? There are tradeoffs to both approaches. As | ||
classes, they are more easily referenced, a bit more easily typed in RBS, and metadata can be applied more easily at | ||
the class level. This comes at a cost of not being able to easily share state across classes, but users can provide | ||
something in the constructor multiple activities use. Open to considering activity methods instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This wouldn't be for an MVP, but could we eventually provide both if we wanted to? It doesn't seem like there's an obvious trade-off winner here.
Another plus in favor of functions is they are lighter-weight to define; you don't have to make a class just to wrap what is effectively a function anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can technically do this, and could technically provide activities as classes/interfaces/structs in every other SDK but we have tried to stick with a single way that is best with the language. I think we should only do one here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah if we want to pick a single way, then I'm in favor of doing the most-flexible thing. Long as we have room to add the other way later if we learn that users are really opinionated about it.
* Following the API of the current Temporal Ruby SDK, workers can be run individually, but we will also provide a class | ||
method to run several at once. | ||
* 💭 Why is something needed to run multiple workers for the user? Because in Ruby they don't have easy ways of | ||
running multiple blocking things at once without external libraries or forcing threads on users. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because in Ruby they don't have easy ways of running multiple blocking things at once without external libraries or forcing threads on users.
What do you have in mind here that this is as opposed to? I.e. can you expand a little to clarify this, what Ruby is missing that other languages have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In most of our SDK languages there are better ways to run multiple workers at once without eating a thread per worker. JS is always single threaded, Python has asyncio.gather
, Go has goroutines, and .NET has tasks. In Java we have a WorkerFactory
that lets you run multiple workers at once because it lacks good language-level reuse-thread capabilities. Ruby has the same problem. So I was thinking a WorkerPool
or similar that can run many workers in a single blocking run call.
else if we can help it. | ||
* 💭 Why? Our dependencies as a library have transitive effects on our users, so we have a responsibility to be | ||
lightweight. | ||
* Must be both thread-blocking/thread-safe and async-capable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean exactly by "async-capable"? (The term is used in a few places)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Ruby there are threads and fibers (basically coroutines). By async-capable I mean fiber-capable. So you can use https://github.com/socketry/async and, say, use the client to wait on 100 workflow results without using a thread each. But many Ruby users are totally fine using a thread each and we need to support that too.
This is very much like Python, we just chose not to provide a synchronous version of the Python client (though we still could if we wanted).
* Activities need to work in both ways and _do_ have to make a distinction (discussed later). | ||
* 💭 Why async-capable? Core and all SDKs are very parallel and so we want to give that benefit to users if they want | ||
it. | ||
* 💭 Why thread-blocking? That's the most common way Ruby is used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Context related to this topic, ruby @ Stripe was single-threaded by default. Multithreaded carve-outs did happen but were relatively unusual.
This meant some libraries weren't designed to be thread-safe and used things like class variables.
Default when transitioning to temporal was therefore to have many workflow pollers but only one activity poller per process, which was inefficient, but we added a config flag folks could turn on to make their workers run multiple activity pollers. I really wanted to make the multi-poller version the default but didn't do so before I left.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why we plan to also support single-threaded/coroutine-based implementations. It's important to us at Temporal that people be able to do a lot of concurrent things with the SDK in just a single thread. Granted the current design is queue-pop-based which makes it work with fiber-based async. With activities we will support custom executors, but I wasn't planning on supporting other async approaches for clients (e.g. event machine) but if there is enough desire for pluggable-non-fiber-based async for clients, we can look into that.
Multithreaded carve-outs did happen but were relatively unusual.
These carve-outs were likely required for to use the existing Ruby SDKs for Temporal. We aim not to force that kind of thing on users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the carve-outs, I'm referring to the "pre-temporal world." Later, with the old SDK, we simply limited to one activity poller by default, which fixed any race conditions. It was a cautious approach, as most customers probably would have been fine with more pollers (once we fixed our mongo library to be fiber-safe), but there was FUD.
I have updated the proposal with the following:
|
Summary