[Integration][GCP] Improved quota handling implementation and performance #1362
base: main
Conversation
@@ -12,7 +11,7 @@
import asyncio

-_DEFAULT_RATE_LIMIT_TIME_PERIOD: float = 60.0
+_DEFAULT_RATE_LIMIT_TIME_PERIOD: float = 61.0
We already utilize 80% of the quota. Is there a need to increase the time period, and why 61?
This is based purely on observation; I do not have any documentation to back it up, but when I run with the 60 s time period, I hit the rate limit. With the extra second, I do not hit a 429 at all, even with larger batches of requests. I tried removing it and hit the limit again.
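For context, a minimal sketch of how such a limiter is typically wired up with aiolimiter; the quota value and the 80% utilization factor below are placeholder assumptions for illustration, not values from this PR:

from aiolimiter import AsyncLimiter

# Placeholder values for illustration only.
_DEFAULT_RATE_LIMIT_QUOTA: int = 400            # hypothetical per-minute quota
_PERCENTAGE_OF_QUOTA: float = 0.8               # consume only 80% of the allowed quota
_DEFAULT_RATE_LIMIT_TIME_PERIOD: float = 61.0   # 60 s window plus a one-second safety margin

limiter = AsyncLimiter(
    max_rate=_DEFAULT_RATE_LIMIT_QUOTA * _PERCENTAGE_OF_QUOTA,
    time_period=_DEFAULT_RATE_LIMIT_TIME_PERIOD,
)

async def call_with_quota(make_request):
    # Each acquisition consumes one slot of the rolling window, so bursts near the
    # window boundary are less likely to trip GCP's per-minute quota enforcement.
    async with limiter:
        return await make_request()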
async def persistent_rate_limiter(
    self, container_id: str
) -> PersistentAsyncLimiter | AsyncLimiter:
Suggested change:
-) -> PersistentAsyncLimiter | AsyncLimiter:
+) -> PersistentAsyncLimiter:
"fetches the rate limiter for the given container" | ||
async def _get_limiter( | ||
self, container_id: str, persistent: bool = False | ||
) -> AsyncLimiter | PersistentAsyncLimiter: |
Suggested change:
-) -> AsyncLimiter | PersistentAsyncLimiter:
+) -> AsyncLimiter:
    return await self._get_limiter(container_id, persistent=True)
cast to PersistentAsyncLimiter
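A possible shape for that suggestion, assuming the method names from the diff above; this is a sketch of the reviewer's idea rather than the final implementation:

from typing import cast

async def persistent_rate_limiter(self, container_id: str) -> "PersistentAsyncLimiter":
    "fetches the persistent rate limiter for the given container"
    # typing.cast has no runtime cost; it only tells the type checker that the
    # persistent path always yields a PersistentAsyncLimiter.
    limiter = await self._get_limiter(container_id, persistent=True)
    return cast("PersistentAsyncLimiter", limiter)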
integrations/gcp/main.py
Outdated
@@ -38,7 +40,8 @@
    resolve_request_controllers,
)

-PROJECT_V3_GET_REQUESTS_RATE_LIMITER: AsyncLimiter
+PROJECT_V3_GET_REQUESTS_RATE_LIMITER: PersistentAsyncLimiter | AsyncLimiter
Let's be explicit: PROJECT_V3_GET_REQUESTS_RATE_LIMITER should never be an AsyncLimiter.
integrations/gcp/main.py
Outdated
@@ -38,7 +40,8 @@
    resolve_request_controllers,
)

-PROJECT_V3_GET_REQUESTS_RATE_LIMITER: AsyncLimiter
+PROJECT_V3_GET_REQUESTS_RATE_LIMITER: PersistentAsyncLimiter | AsyncLimiter
Suggested change:
-PROJECT_V3_GET_REQUESTS_RATE_LIMITER: PersistentAsyncLimiter | AsyncLimiter
+PROJECT_V3_GET_REQUESTS_RATE_LIMITER: PersistentAsyncLimiter
left comments
integrations/gcp/main.py
Outdated
    logger.debug(
        "Background processing threshold reached. Closing incoming real-time event"
    )
    return Response(status_code=http.HTTPStatus.SERVICE_UNAVAILABLE)
Shouldn't we return a 4XX status code and not a 5XX? I fear this might cause the feed flow to stop if we send 5XXs.
@oiadebayo Would a 429 do? Can you test that?
I experimented with request timeout (408), but I will check 429 too.
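For reference, a hedged sketch of the rejection path being discussed, using the 429 status the reviewers propose testing; the threshold name and value are assumptions for illustration, not taken from this PR:

import http
from fastapi import Response

MAX_ALLOWED_BACKGROUND_TASKS = 500  # hypothetical threshold

def reject_if_saturated(queued_tasks: int) -> Response | None:
    # 429 signals a retryable, client-side rejection to the Pub/Sub push subscription,
    # whereas repeated 5XXs risk the feed flow backing off or stopping.
    if queued_tasks >= MAX_ALLOWED_BACKGROUND_TASKS:
        return Response(status_code=http.HTTPStatus.TOO_MANY_REQUESTS)
    return None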
left another comment
Co-authored-by: Matan <51418643+matan84@users.noreply.github.com>
""" | ||
|
||
_global_event_loop: Optional[asyncio.AbstractEventLoop] = None | ||
_limiter_instances: dict[tuple[float, float], "PersistentAsyncLimiter"] = {} |
Why do we need to maintain instances keyed by max rate and time period?
    key: tuple[float, float] = (max_rate, time_period)
    if key not in cls._limiter_instances:
        logger.info(
            f"Creating new persistent limiter for {max_rate} requests per {time_period} sec"
        )
        cls._limiter_instances[key] = cls(max_rate, time_period)
Why are we maintaining multiple instances of the limiter within a single class?
You're right, maintaining multiple instances isn't necessary here. I'll refactor the code to use a single global PersistentAsyncLimiter instance instead.
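A minimal sketch of that direction, assuming PersistentAsyncLimiter subclasses aiolimiter's AsyncLimiter as in the diffs above; the classmethod name and default period here are illustrative:

from typing import Optional
from aiolimiter import AsyncLimiter

class PersistentAsyncLimiter(AsyncLimiter):
    # A single shared limiter instead of a dict keyed by (max_rate, time_period).
    _instance: Optional["PersistentAsyncLimiter"] = None

    @classmethod
    def get_limiter(cls, max_rate: float, time_period: float = 61.0) -> "PersistentAsyncLimiter":
        # Create the limiter once; every later caller reuses the same token bucket,
        # so rate-limit state persists across real-time event requests.
        if cls._instance is None:
            cls._instance = cls(max_rate, time_period)
        return cls._instance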
Description
What - This PR addresses the memory spike observed in version 0.1.98.
Why - Queuing tasks in the background while the rate limiter suspends the processor leads to a spike in memory usage.
How - Improved performance by setting a threshold on the number of background tasks in the queue, so new requests time out when the queue exceeds the threshold; the retry mechanism in Pub/Sub will retry the request.
Type of change
Please leave one option from the following and delete the rest:
All tests should be run against the Port production environment (using a testing org).
Core testing checklist
Integration testing checklist
examples folder in the integration directory.
Preflight checklist
Screenshots
Include screenshots from your environment showing how the resources of the integration will look.
API Documentation
Provide links to the API documentation used for this integration.