Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevents single split searches from different leaf_search from interleaving #5509

Merged
merged 2 commits into from
Oct 23, 2024

Conversation

fulmicoton
Copy link
Contributor

@fulmicoton fulmicoton commented Oct 22, 2024

Before this PR, we just used a semaphore to acquire a permit and start a new tokio task to run our single split search.

In pseudo code, a leaf_search would look like:

for split in splits {
    let permit = semaphore.acquire().await;
    tokio::spawn(async move {
       single_split_search(split);
       drop(permit)
    })
}

The problem with this is that it allows interleaving split search from one search request with another one.

This interleaving strongly impacts search latency. For instance, one can imagine two queries A and B with 3 splits each. Executing as follows
| A | A | A | B | B | B |
offers a much short latency for A than
| A | B | B | A | B | A |

This PR also adds two metrics to monitor the number of queue single split search.

@fulmicoton fulmicoton changed the title Prevents single split searches from different leaf_search from Prevents single split searches from different leaf_search from interleaving Oct 22, 2024
@fulmicoton fulmicoton force-pushed the prevent-split-search-interleaving branch 5 times, most recently from 5567ee9 to e708950 Compare October 22, 2024 07:55
interleaving.

Before this PR, we just used a semaphore to acquire a permit and
start a new tokio task to run our single split search.

In pseudo code, a leaf_search would look like:
```
for split in splits {
    let permit = semaphore.acquire().await;
    tokio::spawn(async move {
       single_split_search(split);
       drop(permit)
    })
}
```

The problem with this is that it allows interleaving split search from
one search request with another one.

This interleaving strongly impacts search latency.
For instance, one can imagine two queries A and B with 3 splits each.
Executing as follows
| A | A | A | B | B | B |
offers a much short latency for A than
| A | B | B | A | B | A |

This PR also adds two metrics to monitor the number of queue single
split search.
@fulmicoton fulmicoton force-pushed the prevent-split-search-interleaving branch from e708950 to 61483b9 Compare October 22, 2024 08:09
@fulmicoton fulmicoton marked this pull request as ready for review October 22, 2024 08:38
Copy link

github-actions bot commented Oct 22, 2024

On SSD:

Average search latency is 0.99x that of the reference (lower is better).
Ref run id: 3940, ref commit: 5e5b360
Link

On GCS:

Average search latency is 1.02x that of the reference (lower is better).
Ref run id: 3941, ref commit: 5e5b360
Link

Copy link
Contributor

@trinity-1686a trinity-1686a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the changes make sense, but some struct/params names make the code more confusing than it could be

fn get_permits(
&mut self,
num_permits: usize,
inner: &Arc<Mutex<InnerSearchPermits>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think inner is a very confusing name, the InnerSearchPermits is already "the inner thing", so an Arc<Mutex<_>> should outer, or a handle

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite get the confusion/comment.
The "outter" is SearchPermits.

I have not found a common practise on how to name the object of this pattern in Rust (in C++ pimpl is common, even if this is not the common usage of the pimpl pattern).

What would be a better name? would inner_arc help a bit?

///
/// Requests are served in order.
#[derive(Clone)]
pub struct SearchPermits {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a misnomer, it sounds like this is a akin to a tokio SemaphorePermit with num_permits > 1, but this is actually a semaphore, or a PermitProvider, or something of the like

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah you are right.


pub struct SearchPermit {
_ongoing_gauge_guard: GaugeGuard<'static>,
inner_opt: Option<Weak<Mutex<InnerSearchPermits>>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe there is need for Weak here: InnerSearchPermits doesn't hold any SearchPermit (a oneshot sender doesn't hold the thing to transmit, the oneshot receiver do)

i find this type, Option<Weak<Mutex<InnerSearchPermits>>> rather confusing, in part because SearchPermits isn't the right name imo, in part because the Option<> very much looks like a way to store a bool alongside the rest of the structure to reduce the size of the struct by one usize. I'm not against using an Option that way, but if that's the reason, it ought to be written down, and if it isn't the goal, then i'm missing what the goal of this Option is, rather than a disable_drop: bool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the Option. Setting it to None seems easier to "proofread". I agree that it makes the code less readable however, as the intent is way less clearer than with the boolean solution.

I'll add some comments.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you ended up changing it to a bool. If you think an Option is better, you can put it back, just add a 1-2 line comment on the purpose of None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought after packing everything as drop_without_recycling_permit it was fine like that.

@fulmicoton fulmicoton force-pushed the prevent-split-search-interleaving branch 4 times, most recently from ded67a9 to a3d81f0 Compare October 23, 2024 01:48
@fulmicoton fulmicoton force-pushed the prevent-split-search-interleaving branch from a3d81f0 to e028746 Compare October 23, 2024 05:48
@fulmicoton fulmicoton merged commit b045483 into main Oct 23, 2024
5 checks passed
@fulmicoton fulmicoton deleted the prevent-split-search-interleaving branch October 23, 2024 08:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants