Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(sentryapp): new SentryAppWebhookRequestsEndpoint control endpoint #81676

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

ameliahsu
Copy link
Member

We need a new control endpoint in order to fetch sentry app requests from all regions.

Note:
Since installation requests are stored in control, we also have to fetch requests from the control buffer. We specifically query the region buffers for non-installation event requests to avoid duplicate requests (in case a region shares the same buffer as control).

@ameliahsu ameliahsu requested review from a team as code owners December 4, 2024 18:14
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Dec 4, 2024
Copy link

codecov bot commented Dec 4, 2024

Codecov Report

Attention: Patch coverage is 99.40120% with 1 line in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
...c/sentry/sentry_apps/api/utils/webhook_requests.py 98.03% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #81676      +/-   ##
==========================================
+ Coverage   80.41%   81.24%   +0.82%     
==========================================
  Files        7241     7280      +39     
  Lines      321573   338135   +16562     
  Branches    20830    20830              
==========================================
+ Hits       258604   274723   +16119     
- Misses      62574    63017     +443     
  Partials      395      395              

Copy link
Member

@cathteng cathteng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think there is some opportunity to condense code here

publish_status = {
"GET": ApiPublishStatus.UNKNOWN,
}
permission_classes = (SentryAppStatsPermission,)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be SentryAppPermission?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SentryAppStatsPermission is the one to use for this endpoint, since we only want the integrator org to be able to get the requests/stats.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there seems to only be a GET for this endpoint, so only the GET scopes would apply. the PARANOID_GET scopes from SentryAppPermission applies to anyone with member and above permissions which also seems appropriate

Comment on lines 71 to 75
event_type = request.GET.get("eventType")
errors_only = request.GET.get("errorsOnly")
org_slug = request.GET.get("organizationSlug")
start_parameter = request.GET.get("start", None)
end_parameter = request.GET.get("end", None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can use a serializer class to validate these rather than have it done inside the endpoint, like this

serializer = ApiTokenSerializer(data=request.data)
if serializer.is_valid():
result = serializer.validated_data

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is for this entire section before the "major changes" part

Comment on lines 117 to 150
if event_type == "installation.created" or event_type == "installation.deleted":

requests.extend(
[
serialize_rpc_sentry_app_request(req)
for req in buffer.get_requests(
event=event_type, errors_only=control_errors_only
)
]
)
elif event_type is None:
control_event_type = [
"installation.created",
"installation.deleted",
]
requests.extend(
[
serialize_rpc_sentry_app_request(req)
for req in buffer.get_requests(
event=control_event_type, errors_only=control_errors_only
)
]
)

# If event type is installation.created or installation.deleted, we don't need to fetch requests from other regions
if event_type != "installation.created" and event_type != "installation.deleted":
for region_name in find_all_region_names():
buffer_requests = app_request_service.get_buffer_requests_for_region(
sentry_app_id=sentry_app.id,
region_name=region_name,
filter=filter,
)
if buffer_requests is not None:
requests.extend(buffer_requests)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a better way to write this? i think there's a lot of repeated code where you extend the list of requests. the first two ifs are REALLY similar other than changing the event

class SentryAppRequestsEndpointV2(SentryAppBaseEndpoint):
owner = ApiOwner.INTEGRATIONS
publish_status = {
"GET": ApiPublishStatus.UNKNOWN,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we plan to publish this in the future, this can be ApiPublishStatus.EXPERIMENTAL

Comment on lines 154 to 158
for i, req in enumerate(requests):
if filter_by_date(req, start_time, end_time) and filter_by_organization(
req, organization
):
filtered_requests.append(BufferedRequest(id=i, data=req))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to filter the requests before we even add them to the list, so we don't have to iterate over them again? this is where i think a helper function would be useful, for this part and the repeated logic above

@@ -2906,6 +2907,11 @@ def create_group_urls(name_prefix: str) -> list[URLPattern | URLResolver]:
SentryAppPublishRequestEndpoint.as_view(),
name="sentry-api-0-sentry-app-publish-request",
),
re_path(
r"^(?P<sentry_app_id_or_slug>[^\/]+)/requests-v2/$",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of encoding requests-v2 in the path, could we instead name this something different? Like request-logs or webhook-responses?

choices=EXTENDED_VALID_EVENTS,
required=False,
)
errors_only = serializers.BooleanField(required=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting a default value here would be nice.

Comment on lines 44 to 45
if "start" in data and "end" in data and data["start"] > data["end"]:
raise serializers.ValidationError("Invalid timestamp (start must be before end).")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we combine this with the validation for validate_end? This seems like a field error on end vs a general exception.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to use validate in order to access multiple fields (in this case start and end) 😔

Comment on lines 120 to 144
def filter_and_append_requests(unfiltered_requests: list[RpcSentryAppRequest]) -> None:
for i, req in enumerate(unfiltered_requests):
if filter_by_date(req, start_time, end_time) and filter_by_organization(
req, organization
):
requests.append(BufferedRequest(id=i, data=req))

def get_buffer_requests_for_control(event_type) -> list[RpcSentryAppRequest]:
control_requests = [
serialize_rpc_sentry_app_request(req)
for req in control_buffer.get_requests(
event=event_type, errors_only=control_errors_only
)
]
filter_and_append_requests(control_requests)

def get_buffer_requests_for_regions() -> list[RpcSentryAppRequest]:
for region_name in find_all_region_names():
buffer_requests = app_request_service.get_buffer_requests_for_region(
sentry_app_id=sentry_app.id,
region_name=region_name,
filter=region_filter,
)
if buffer_requests is not None:
filter_and_append_requests(buffer_requests)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of nesting these closures, could these live in a separate class/service that handles retrieving buffer requests? I could see a case where we use these outside of this endpoint.

Comment on lines 170 to 178
def data_fn(offset, limit):
page_offset = offset * limit
return requests[page_offset : page_offset + limit]

return self.paginate(
request=request,
paginator=GenericOffsetPaginator(data_fn),
on_results=lambda x: serialize(x, request.user, RequestSerializer(sentry_app)),
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this paginate correctly? I'm concerned the results will be weird if an org has a high volume of webhooks and older request buffer responses are constantly rotating off.

Also, do we need to be paginating if our limit is ~100 buffers per region?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just updating here that we decided to remove pagination

from sentry.sentry_apps.models.sentry_app import SentryApp


class RequestSerializer(Serializer):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is a bit generic, could we make it more specific to SentryApp Webhook Requests?

self.sentry_app = sentry_app

def get_attrs(
self, item_list: Sequence[Any], user: Any, **kwargs: Any
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is is possible to add more narrow typing to item_list and user?

@github-actions github-actions bot added the Scope: Frontend Automatically applied to PRs that change frontend components label Dec 6, 2024
Copy link
Contributor

github-actions bot commented Dec 6, 2024

🚨 Warning: This pull request contains Frontend and Backend changes!

It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent of each other, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently.

Have questions? Please ask in the #discuss-dev-infra channel.

Copy link

codecov bot commented Dec 6, 2024

Bundle Report

Changes will increase total bundle size by 136.86kB (0.43%) ⬆️. This is within the configured threshold ✅

Detailed changes
Bundle name Size Change
app-webpack-bundle-array-push 32.11MB 136.86kB (0.43%) ⬆️

@ameliahsu ameliahsu changed the title chore(sentry app): new SentryAppRequestsEndpointV2 control endpoint chore(sentryapp): new SentryAppWebhookRequestsEndpoint control endpoint Dec 6, 2024
Copy link
Member

@iamrajjoshi iamrajjoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, good job!

@iamrajjoshi iamrajjoshi requested a review from a team December 18, 2024 17:05
Copy link
Contributor

@Christinarlong Christinarlong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UHHH I FORGOT TO CLICK SUBMIT. I did this on monday so it may be out of date

Comment on lines +50 to +53
def validate_end(self, end):
if end is None:
end = datetime.now(tz=timezone.utc)
return end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: where does this get used ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called when .is_valid() is called (same as validate above). See docs here

organizations_by_id.get(item.data.organization_id)
if item.data.organization_id
else None
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we know the shape of the data we could TypedDict this instead of Mapping[Any,Any]


def serialize(
self, obj: Any, attrs: Mapping[Any, Any], user: Any, **kwargs: Any
) -> Mapping[str, Any]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above 🥺

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Avoid using Any typing unless absolutely necessary.

def _get_requests(self, event: str | None = None, error: bool = False) -> list[dict[str, Any]]:
def _get_requests(
self, event: str | list[str] | None = None, error: bool = False
) -> list[dict[str, Any]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the list composed of RpcSentryAppRequest or SentryAppRequests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to translate the results of this function call into an RpcSentryAppRequest, so I don't think we can type this?

serialize_rpc_sentry_app_request(req)
for req in control_buffer.get_requests(event=event, errors_only=errors_only)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can ? It looks like we know what's going to be in the request data for that serializer so we could type the return here.


response2 = self.client.get(f"{url}?eventType=issue.assigned", format="json")
assert response2.status_code == 200
assert len(response2.data) == 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it would be good to test the shape of the data is what we want

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add some tests to for cross region functionality
e.g two different orgs in diff regions do something that adds to buffer

  • validate that the requests are not in the same buffer
  • validate that when we get the requests both reqs show up

etc. ( I can add more cases in a bit)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I talked to Gabe about this and we said we wouldn't test cross region, since it's not easy to mock multiple Redis instances

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, unfortunately I don't think we have the testing infrastructure for this unless the redis key we're generating in each silo is distinct somehow. Worth verifying at least, but I'm not sure it's worth the extra effort when we can manually validate this behavior ourselves using feature flags.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sees RIP

Copy link
Member

@sentaur-athena sentaur-athena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started reviewing but then I realized there are multiple open questions that I also have too. Can you address them and ping the channel again?


@control_silo_endpoint
class SentryAppWebhookRequestsEndpoint(SentryAppBaseEndpoint):
owner = ApiOwner.INTEGRATIONS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can update this to ecosystem. Integrations is the old stuff.


def filter_and_append_requests(unfiltered_requests: list[RpcSentryAppRequest]) -> None:
for i, req in enumerate(unfiltered_requests):
if self._filter_by_date(req, start_time, end_time) and self._filter_by_organization(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic living here is a bit unusual. Could we move this date filtering behavior into the get_buffer_requests_from_control and get_buffer_requests_from_regions helper functions instead? I could also see this being a standalone utility method for general RpcSentryAppRequest filtering.

Pulling more of this business logic out of the controller and putting it into shared modules will help with reusability and testing.


def serialize(
self, obj: Any, attrs: Mapping[Any, Any], user: Any, **kwargs: Any
) -> Mapping[str, Any]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Avoid using Any typing unless absolutely necessary.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, unfortunately I don't think we have the testing infrastructure for this unless the redis key we're generating in each silo is distinct somehow. Worth verifying at least, but I'm not sure it's worth the extra effort when we can manually validate this behavior ourselves using feature flags.

:qparam string start: Optionally specify a date to begin at. Format must be YYYY-MM-DD HH:MM:SS
:qparam string end: Optionally specify a date to end at. Format must be YYYY-MM-DD HH:MM:SS
"""
serializer = IncomingRequestSerializer(data=request.GET)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to gate access here with a feature flag maybe? We may want to validate the behavior of this endpoint before releasing it to the public.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would marking the endpoint as experimental be enough? I think we could update the callee side to use this new endpoint to test then roll the cutover out to few customers before going public.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update: We decided to feature flag this endpoint while we test fetching requests from multiple regions

def _get_requests(self, event: str | None = None, error: bool = False) -> list[dict[str, Any]]:
def _get_requests(
self, event: str | list[str] | None = None, error: bool = False
) -> list[dict[str, Any]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can ? It looks like we know what's going to be in the request data for that serializer so we could type the return here.

Comment on lines +57 to +72
def serialize(
self, obj: BufferedRequest, attrs: Mapping[Any, Any], user: Any, **kwargs: Any
) -> SentryAppWebhookRequestSerializerResponse:
organization = attrs.get("organization")
response_code = obj.data.response_code

data: SentryAppWebhookRequestSerializerResponse = {
"webhookUrl": obj.data.webhook_url,
"sentryAppSlug": self.sentry_app.slug,
"eventType": obj.data.event_type,
"date": obj.data.date,
"responseCode": response_code,
}

if organization:
data["organization"] = {"name": organization.name, "slug": organization.slug}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also we're potentially missing the error response & request bodies from error webhook requests here

Same in the RpcModel

Copy link
Member Author

@ameliahsu ameliahsu Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to remove some properties for now since they're not shown on the dashboard, we can add them back later if needed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uh can you link me to this discussion ? We shouldn't or we should log them instead since we & customers use that data a lot as a debugging resource

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, do we have customer stories where they inspect the response we are sending rather than the dashboard?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the conversation was on Slack but #81267 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, for this customer issue we needed to see our req bodies to validate our webhooks were working. It's also a customer request in this issue. It's currently not a priority to add the UI to the dashboard but without returning at all to the frontend there is no visibility into the buffer for both customers and us.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general guidance I've been giving @ameliahsu is: exclude it until we actually implement this feature. It should be trivial to add to the response body in the future, but I don't want to be exposing this data right now if we aren't using it. It also gives us an opportunity to analyze the overall performance of this endpoint with smaller payloads to start before adding a bunch of additional data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarification q, is 'this feature' == the endpoint ? or the UI ? Either way though, I still think it's very important to expose that data for debugging. The comment 2 up, sums up the reasons and related issues/asks. Without this data being exposed debugging webhooks & buffer issues becomes incredibly hard without setting up an integration on that sentry app's org.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I missed that. Fair point, alright let's go ahead and add it then (sorry @ameliahsu)

@getsantry
Copy link
Contributor

getsantry bot commented Jan 11, 2025

This pull request has gone three weeks without activity. In another week, I will close it.

But! If you comment or otherwise update it, I will reset the clock, and if you add the label WIP, I will leave it alone unless WIP is removed ... forever!


"A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀

@getsantry getsantry bot added Stale and removed Stale labels Jan 11, 2025
@ameliahsu ameliahsu added the WIP label Jan 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Scope: Backend Automatically applied to PRs that change backend components Scope: Frontend Automatically applied to PRs that change frontend components WIP
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants