mloginfo: Improving mloginfo to include displaying pattern for aggregate operation #861

Open
PrasannaSM opened this issue Apr 27, 2022 · 2 comments

Comments

@PrasannaSM

Running mloginfo on a mongo log file with the --queries option returns None as the pattern for aggregate operations.
The pattern property is defined as follows:

    @property
    def pattern(self):
        """Extract query pattern from operations."""
        if not self._pattern:

            # trigger evaluation of operation
            if (self.operation in ['query', 'getmore', 'update', 'remove'] or
                    self.command in ['count', 'findandmodify']):
                self._pattern = self._find_pattern('query: ')
                # Fallback check for q: variation (eg "remove" command in 3.6+)
                if self._pattern is None:
                    self._pattern = self._find_pattern('q: ')
            elif self.command == 'find':
                self._pattern = self._find_pattern('filter: ')
        return self._pattern

The snippet above has no case for the aggregate command. As a result, mloginfo cannot serve as a single place where the complete summary (in table form) is available.

Expected behavior

namespace                  operation    pattern         count    min (ms)    max (ms)    95%-ile (ms)    sum (ms)    mean (ms)    allowDiskUse
test_db.test_coll1         find        {"field1": 1, "field2": 1, "field3": 1, "field4": 1, "field5": 1}          1         470         470           470.0         470        470.0    None
test_db.test_coll2         aggregate         [{"$match": {"field1": 1}}, {"$unwind": 1}, {"$match": {"field2": {"$ne": 1}, "field3": 1, "field4": 1}}, {"$group": {"Count": {"$sum": 1}, "_id": 1}}]         1         252         252           252.0         252        252.0    None
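
For illustration, the aggregate pattern shown in the expected output could be derived by replacing every leaf value in the pipeline with 1 while keeping stage names and field names. The following is only a minimal sketch under that assumption: `pipeline_pattern` is a hypothetical helper, not part of mtools, and it assumes the pipeline has already been parsed from the log line into Python lists and dicts.

```python
import json


def pipeline_pattern(pipeline):
    """Reduce an aggregation pipeline to a value-free pattern string.

    Hypothetical helper for illustration; not an mtools function.
    """
    def reduce_value(value):
        # Keep the shape of nested documents (stage and field names),
        # but collapse every leaf value, including arrays, to 1.
        if isinstance(value, dict):
            return {key: reduce_value(val) for key, val in value.items()}
        return 1

    # sort_keys makes the pattern independent of field order in the log.
    return json.dumps([reduce_value(stage) for stage in pipeline],
                      sort_keys=True)


# Example pipeline with made-up values, shaped like the one in this issue.
example = [
    {"$match": {"field1": "abc"}},
    {"$unwind": "$items"},
    {"$match": {"field2": {"$ne": None}, "field3": 5, "field4": True}},
    {"$group": {"_id": "$field1", "Count": {"$sum": 1}}},
]
print(pipeline_pattern(example))
```

Two pipelines that differ only in their literal values would then be grouped under the same pattern, as find filters already are.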

Actual/current behavior

namespace                  operation    pattern         count    min (ms)    max (ms)    95%-ile (ms)    sum (ms)    mean (ms)    allowDiskUse
test_db.test_coll1         find         {"field1": 1, "field2": 1, "field3": 1, "field4": 1, "field5": 1}         1         470         470           470.0         470        470.0    None
test_db.test_coll2         aggregate         None         1         252         252           252.0         252        252.0    None
@stennie
Collaborator

stennie commented Apr 27, 2022

Hi @PrasannaSM,

I looked into this previously and, unfortunately, logged aggregation pipelines didn't seem well suited to a concise summary of the query patterns being executed, per #338 (comment). That comment also includes some suggestions on how to investigate slow aggregation queries.

Using None for the aggregation pattern was intentional, as the output becomes extremely difficult to reduce and read with longer aggregations.

Aside from index usage in initial pipeline stages that fetch data, most of the processing time for an aggregation pipeline will typically be spent on data manipulation rather than queries.

Regards,
Stennie

@PrasannaSM
Author

Thanks @stennie

I get where you're coming from. In that case, could we add an opt-in argument to show the aggregate pattern? For example:
mloginfo mongo.log --queries --show-aggregate-pattern

The pattern would be displayed only if --show-aggregate-pattern is provided; otherwise it would be shown as None (the current behavior).

The readability issue can be addressed if the user writes the output to a file instead of viewing it as a table.
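
A rough sketch of the opt-in behavior, using plain argparse. Note that `--show-aggregate-pattern` is only the flag proposed in this comment and does not exist in mloginfo today; `build_parser` and `displayed_pattern` are hypothetical names used for illustration and do not reflect how mtools is structured.

```python
import argparse


def build_parser():
    """Stand-in parser; only mirrors the options discussed in this issue."""
    parser = argparse.ArgumentParser(prog="mloginfo-sketch")
    parser.add_argument("logfile")
    parser.add_argument("--queries", action="store_true")
    # Proposed flag (assumption): off by default, so output is unchanged.
    parser.add_argument("--show-aggregate-pattern", action="store_true")
    return parser


def displayed_pattern(operation, pattern, show_aggregate_pattern):
    """Return the pattern to show in the --queries table."""
    # Keep the current behavior (None) unless the user opts in.
    if operation == "aggregate" and not show_aggregate_pattern:
        return None
    return pattern


args = build_parser().parse_args(
    ["mongo.log", "--queries", "--show-aggregate-pattern"])
print(displayed_pattern("aggregate", '[{"$match": {"field1": 1}}]',
                        args.show_aggregate_pattern))
```

Without the flag, the aggregate row would still show None, so existing output and any scripts that parse it would not change.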
