[Discover] Presentation of document and row counts #196444

ryankeairns · 2024-10-15T20:56:35Z

Related: #177156 #166219 #195787

Problem

Users want to know the total number of 'hits' (aka documents, rows, etc.) returned by their search/query
Relatedly, users want to know what the histogram reflects and why/how its count may differ from the number of hits returned

User experience considerations

The presentation of counts is inconsistent between data view and ES|QL modes; they should behave as similarly as possible
The term 'Documents' has less relevance in ES|QL mode; we might instead consider a term like 'Rows'
It is unclear what the various values presented in the UI mean (value in the tab above the data grid; count value when hovering over the histogram; LIMIT value in the ES|QL editor, etc.)
For experienced users, the value has moved around, for better or worse. It used to be in the histogram section; now it is in the tab
Relatedly, there can be multiple queries happening: one for the histogram, one for the data grid

Technical considerations

(@kertal's findings below)
In data view mode, we can have the following hit counts:

x documents actually returned from ES
x documents matching the query in ES
x downsampled documents were used for the histogram

It’s complicated. ES|QL is simpler for now since it doesn’t seem to be aware of downsampled data, but once it is, it will face the same challenge
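To make the three counts listed above concrete, here is a minimal Python sketch that reads them from a response dict. It assumes the dict follows the general shape of an Elasticsearch search response (`hits.hits`, `hits.total.value`, `hits.total.relation`, and `doc_count` per aggregation bucket). The helper name `summarize_counts`, the aggregation name `histogram`, and the example values are hypothetical and not taken from Discover's code. In practice the histogram may come from a separate request, as noted above.

```python
from typing import Any


def summarize_counts(response: dict[str, Any], histogram_agg: str = "histogram") -> dict[str, Any]:
    """Extract the three counts from a search-style response dict."""
    hits = response["hits"]
    total = hits["total"]
    buckets = response.get("aggregations", {}).get(histogram_agg, {}).get("buckets", [])
    return {
        # Documents actually returned (bounded by the request size).
        "returned": len(hits["hits"]),
        # Documents matching the query; "gte" means the value is only a lower bound.
        "matching": total["value"],
        "matching_is_lower_bound": total["relation"] == "gte",
        # What the histogram adds up to; this can exceed "matching"
        # when documents are pre-aggregated or downsampled.
        "histogram_total": sum(bucket["doc_count"] for bucket in buckets),
    }


# Hypothetical response: 2 documents returned and matching,
# while the histogram buckets add up to 50.
example = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [{"_id": "1"}, {"_id": "2"}],
    },
    "aggregations": {
        "histogram": {"buckets": [{"doc_count": 25}, {"doc_count": 25}]}
    },
}

counts = summarize_counts(example)
print(counts)
```

Keeping the three values separate, rather than collapsing them into one "total", is what would let the UI explain each number on its own.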

Current UI

Total hits in tab

Old placement in histogram area (pre tabs)

Values in ES|QL mode

elasticmachine · 2024-10-15T20:56:37Z

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)

kertal · 2024-10-16T06:41:44Z

I'm adding test data for the downsampled data use case (this only works in data view mode)

```
PUT stats-index
{
  "mappings": {
    "properties": {
      "agg_metric": {
        "type": "aggregate_metric_double",
        "metrics": [ "min", "max", "sum", "value_count" ],
        "default_metric": "max"
      }
    }
  }
}

PUT stats-index/_doc/1
{
"@timestamp": "2024-10-15T08:00:00Z",
"agg_metric": {
"min": -302.50,
"max": 702.30,
"sum": 200.0,
"value_count": 25
}
}

PUT stats-index/_doc/2
{
"@timestamp": "2024-10-15T09:00:00Z",
"agg_metric": {
"min": -93.00,
"max": 1702.30,
"sum": 300.00,
"value_count": 25
}
}

PUT my_index
{
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
},
"my_histogram": {
"type": "histogram"
},
"my_text": {
"type": "keyword"
}
}
}
}

PUT my_index/_doc/1
{
"@timestamp": "2023-11-23T17:47:23Z",
"my_text": "histogram_1",
"my_histogram": {
"values": [0.1, 0.2, 0.3, 0.4, 0.5],
"counts": [3, 7, 23, 12, 6]
},
"_doc_count": 45
}

PUT my_index/_doc/2
{
"@timestamp": "2023-11-23T17:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 62
}

PUT my_index/_doc/3
{
"@timestamp": "2023-11-25T17:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 62
}

PUT my_index/_doc/4
{
"@timestamp": "2023-11-25T18:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 12
}

PUT my_index/_doc/5
{
"@timestamp": "2023-11-26T18:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 1
}

PUT my_index/_doc/6
{
"@timestamp": "2023-11-27T18:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 1
}

PUT my_index/_doc/7
{
"@timestamp": "2023-11-28T18:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 2323
}

PUT my_index/_doc/8
{
"@timestamp": "2023-11-28T18:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 23234
}

PUT my_index/_doc/9
{
"@timestamp": "2023-11-28T18:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 2
}

PUT my_index/_doc/10
{
"@timestamp": "2023-11-28T18:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 2
}

PUT my_index/_doc/11
{
"@timestamp": "2023-11-29T18:49:23Z",
"my_text": "histogram_2",
"my_histogram": {
"values": [0.1, 0.25, 0.35, 0.4, 0.45, 0.5],
"counts": [8, 17, 8, 7, 6, 2]
},
"_doc_count": 12
}

GET my_index/_search
{
"aggs": {
"histogram_titles": {
"terms": { "field": "my_text" }
}
}
}

```
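As a rough worked example of why the histogram count can differ from the hit count, the Python sketch below adds up the `_doc_count` values of the eleven `my_index` documents above, grouped by `my_text` as in the final terms aggregation. It assumes bucket aggregations add each document's `_doc_count` to the bucket's `doc_count`, while the hit total counts stored documents. It only reproduces the arithmetic and does not query Elasticsearch.

```python
# (my_text, _doc_count) for documents 1-11 of my_index, copied from the requests above.
docs = [
    ("histogram_1", 45),
    ("histogram_2", 62),
    ("histogram_2", 62),
    ("histogram_2", 12),
    ("histogram_2", 1),
    ("histogram_2", 1),
    ("histogram_2", 2323),
    ("histogram_2", 23234),
    ("histogram_2", 2),
    ("histogram_2", 2),
    ("histogram_2", 12),
]

# Number of stored documents: what a hit total would count.
stored_documents = len(docs)

# Per-term bucket totals: what a terms aggregation would report
# if each document contributes its _doc_count to the bucket.
bucket_totals: dict[str, int] = {}
for term, doc_count in docs:
    bucket_totals[term] = bucket_totals.get(term, 0) + doc_count

aggregated_total = sum(bucket_totals.values())

print(stored_documents)   # 11
print(bucket_totals)      # {'histogram_1': 45, 'histogram_2': 25711}
print(aggregated_total)   # 25756
```

Under that assumption, the data grid would list 11 documents while the bucket counts add up to 25756, which is the kind of gap the UI needs to explain.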

florent-leborgne · 2024-10-18T08:54:59Z

👋 I gathered some thoughts about the wording, depth of information and location of information. I may have missed some occurrences or cases where the UI would look a bit different, but feel free to check it here: https://www.figma.com/design/uHUXd5MGcJWGTwpHwWRpYY/Scratchpad?node-id=370-187&t=ORIHkTsVvYhn2MW6-1

Here is a summary of my thinking, based on the background and the comments I could read and understand:

Use "Results" as a generic term instead of "Documents". Other options like records, hits, or keeping documents seem less accurate. I'm also not a fan of "rows": to me it refers to the container rather than the content.
Show numbers that correspond to what the user is looking at:
- Looking at the histogram, it feels valuable to see a "Total count" of what's currently represented in the visualization. I don't think we need a name for (h)it, as just "Count" or "Total count" feels simple enough. If we really need one, "results" again?
- To me, the number in the tab name doesn't make sense. As a user, I'm opening what is basically a big table, so I'd expect to see the number of entries, but in this case it's not obvious. For that reason, I think it would be good to attach this information to the pagination, either right above the table or below it. I always found Gmail to do this clearly, for example "Results 101-200 of 1888". If for some reason (a setting, a LIMIT parameter) the number of results loaded into the table differs from the total number of results, the current behaviour with the information on the last page looks like a good option to me. ES|QL mode doesn't seem to have pagination by default, so the same example with the default limit could just be "Results 1-1000 of 1888" on the first (which is also the last) page.
Add context where there can be confusion. It was noted a few times that users can get confused about what these numbers mean exactly. We could add tooltips or popovers to provide that information, or at least some context. Based on my suggestions above, the two important ones would sit next to the total count in the histogram and next to the "1-100 of 1888" label. In each, we could explain where the numbers come from and what can influence them, such as downsampling or aggregations.

For details and nuances, please check my Figma scratchpad linked earlier.
I'm probably missing some things and my understanding isn't very deep yet, so please keep me honest here. These were the things that struck me most as a newbie looking at the UI.
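The pagination label suggested in the comment above could be sketched roughly as follows. This is a hypothetical Python helper, not Discover's implementation: the name `format_result_range` and its parameters are assumptions. `loaded` stands for the number of results actually loaded into the table (for example an ES|QL LIMIT) and `total` for the number of matching results.

```python
from typing import Optional


def format_result_range(page: int, page_size: int, total: int, loaded: Optional[int] = None) -> str:
    """Build a label such as 'Results 101-200 of 1888' for a 1-based page number."""
    if page < 1 or page_size < 1 or total < 0:
        raise ValueError("page and page_size must be >= 1 and total must be >= 0")

    # Only the loaded results can be paged through; default to all of them.
    available = total if loaded is None else min(loaded, total)
    if available == 0:
        return f"Results 0 of {total}"

    start = (page - 1) * page_size + 1
    if start > available:
        raise ValueError("page is beyond the last available result")
    end = min(page * page_size, available)
    return f"Results {start}-{end} of {total}"


print(format_result_range(page=2, page_size=100, total=1888))
print(format_result_range(page=1, page_size=1000, total=1888, loaded=1000))
```

The first call covers the Gmail-style example, and the second covers the ES|QL case where a limit of 1000 loads fewer rows than the 1888 total.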

kertal · 2024-10-25T08:59:04Z

Let's start by renaming the "Documents" tab; it's a good first step:
#197779

ryankeairns added the Feature:Discover, Feature:ES|QL, and Team:DataDiscovery labels Oct 15, 2024

ryankeairns assigned l-suarez Oct 15, 2024

kertal added the impact:medium label Oct 16, 2024

kertal mentioned this issue Oct 16, 2024

[ES|QL] Should calculate true value for total number of docs for given date range #195787

Open
