Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(BA-615): Collect metrics for the RPC server #3555

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

HyeockJinKim
Copy link
Collaborator

resolves #3549 (BA-615)

Checklist: (if applicable)

  • Milestone metadata specifying the target backport version
  • Mention to the original issue
  • Installer updates including:
    • Fixtures for db schema changes
    • New mandatory config options
  • Update of end-to-end CLI integration tests in ai.backend.test
  • API server-client counterparts (e.g., manager API -> client SDK)
  • Test case(s) to:
    • Demonstrate the difference of before/after
    • Demonstrate the flow of abstract/conceptual models with a concrete implementation
  • Documentation
    • Contents in the docs directory
    • docstrings in public interfaces and type annotations

@HyeockJinKim HyeockJinKim self-assigned this Jan 30, 2025
@github-actions github-actions bot added comp:agent Related to Agent component size:M 30~100 LoC labels Jan 30, 2025
@HyeockJinKim HyeockJinKim force-pushed the feat/add-rpc-metrics branch 2 times, most recently from 6d265ad to f605742 Compare January 31, 2025 06:49
duration=duration,
)
return res
except Exception as e:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about handling BaseException here?


def __init__(self) -> None:
self.functions = set()
self._metric_observer = RPCMetricObserver.instance()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it used as a singleton object? Is it expected to use this observer in other context?

Comment on lines +203 to +206
def _collect_metrics(observer: RPCMetricObserver) -> Callable:
def decorator(meth: Callable) -> Callable[[AgentRPCServer, RPCMessage], Any]:
@functools.wraps(meth)
async def _inner(self: AgentRPCServer, *args, **kwargs) -> Any:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can apply more strict type hints here but then we have to update the type hints of RPCFunctionRegistry.__call__().
Let's leave it as a minor issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
comp:agent Related to Agent component size:M 30~100 LoC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Prometheus metrics for RPC server in agent component
2 participants