3033 Introduced V4 RECAP Search API #3975
Conversation
- Serializers to support the V4 Search API.
- Handling of the highlight parameter to enable or disable highlighting.
HERE WE GOOOO!
… highlight param
- If the highlight parameter is not passed to the API request, highlighting will be disabled by default.
- To enable highlighting in the Search API, the "highlight=on" parameter must be sent.
- Fixed highlight for other V3 search types.
- Added tests to test consistency of results across pagination.
- Refactored tests to reuse code.
I don't think so. If the backend can handle it then the consumer can too.
Yes. Seems fine.
Yeah, that makes sense. Do we have the docket_entry_id in the recap_docket object in ES also?
Hm, we can do fast count queries that lose accuracy after a certain point, right? I'd say we do that instead. We can just provide the approximate count, and then document that it's only accurate for result sets smaller than XXX (whatever it was we discussed before, if we made a decision about it).
That's not a big deal, but we should open an issue to have it on our backlog as a "someday" issue. Seems easy to fix on the front end, right?
How does this compare to the front end?
This all sounds good. My one concern is that the backwards pagination sounds like a pretty big hack. Is it something of a best practice or something you came up with to solve the problem?
I just did a very quick skim of the code. @ERosendo, if you can do a full review, that would be great, and I'll do a longer review after that.
Correct, I'll add the
Sure, so that means we'd only provide the "count" key containing the accurate result if it's below a threshold, or the approximate count if it's greater. The threshold we defined in #3926 was 2,000 documents, considering up to 100 pages of 20 documents each. We can perform the same cardinality count with accurate results up to that threshold, but since we're currently getting an accurate count of up to 10,000 items from the main query, we can use that count if the results are fewer than 10,000 and use the approximate count returned by the cardinality aggregation if they exceed 10,000. Does that sound good?
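For illustration, the count-selection logic described above might look roughly like this (the function and constant names are hypothetical, not the actual implementation):

```python
CARDINALITY_THRESHOLD = 10_000  # cap on the accurate count from the main query

def choose_result_count(main_total: int, cardinality_count: int) -> int:
    """Use the accurate main-query count below the cap; otherwise fall back
    to the approximate count from the cardinality aggregation."""
    if main_total < CARDINALITY_THRESHOLD:
        return main_total
    return cardinality_count
```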
Sure, here is the issue: #3999
The HL fields in the front end are the same as in the "r" search type (docket + RD fields) when
Initially, it was just a brief idea I came up with in #3645 (comment) when assessing the
Everything above sounds great, thanks Alberto. My last comment is in response to this:
That sounds right, but to be certain I understand correctly: are there fields that are shown on both the RD and the R search type that are not highlighted in RD even though they are highlighted in the R search type (or the same question for the D result type)?
All the fields HL in the

Let me explain it with a response example for each type.

"r" type:
"d" type:
"rd" type:
Perfect, that's what I was expecting, but just wanted to be sure! Thank you!
Thanks Alberto. All sounds good, and we await Eduardo's review. One other thought: Is it hard to add a second count to the
It's not difficult to add a count for the recap documents. I didn't include it because it requires an additional ES query for every API request, and the current V3 only displays one count. However, if it's beneficial for users to have the recap document count in

We might also use this for the frontend in the future. Alternatively, if the document count is only useful for the frontend, we could add an extra parameter to the request, something like
I think it's OK to add it to all responses, particularly if we do the cardinality thing that should make it pretty performant?
Yeah! I'll add the secondary count for the
cl/search/api_utils.py
```python
def limit_api_results_to_page(
    results: Response | AttrList, cursor: ESCursor
) -> Response | AttrList:
```
The `cursor` argument in this method seems to be optional. Let's update the type hint to reflect this by using `| None`.
Suggested change:

```diff
 def limit_api_results_to_page(
-    results: Response | AttrList, cursor: ESCursor
+    results: Response | AttrList, cursor: ESCursor | None
 ) -> Response | AttrList:
```
You're right, I've added the missing `None` to the type hint.
cl/search/api_utils.py
```python
reverse = False
if cursor is not None:
    search_after, reverse, _ = cursor
```
We're not using the `search_after` variable here. I think we can simplify this using a ternary if statement:

```python
reverse = cursor.reverse if cursor else False
```
Yeah, this is better, changed it!
cl/search/types.py
```diff
@@ -106,3 +107,6 @@ class EventTable(StrEnum):
     DOCKET_ENTRY = "search.DocketEntry"
     RECAP_DOCUMENT = "search.RECAPDocument"
     UNDEFINED = ""
+
+
+ESCursor = namedtuple("ESCursor", ["search_after", "reverse", "search_type"])
```
is there a reason to use a `namedtuple` instead of a `dataclass`? Dataclasses provide the same data storage capabilities with the added benefit of built-in type hints and cleaner syntax. This can improve code readability and maintainability.
Thanks for the suggestion. Using a `dataclass` can improve readability, so I've made the change:

```python
@dataclass(frozen=True)
class ESCursor:
    search_after: AttrList | None
    reverse: bool
    search_type: str
```

`frozen=True` makes it immutable, as the `namedtuple` was.
cl/lib/elasticsearch_utils.py
```python
main_query = s
if cd["highlight"]:
    main_query = add_es_highlighting(s, cd)
```
We can use a ternary if statement here.
Yeah, changed it!
cl/lib/document_serializer.py
```python
class DateField(serializers.Field):
    """Handles date objects."""
```
Let's update this docstring to improve clarity. Initially, I struggled to differentiate between this class and the `DateOrDateTimeField` class. However, I realized that this new class uses only the `serializers.DateField()` class to handle dates. In contrast, the older class offers more flexibility, using either the `DateTimeField` or the `DateField` serializer.
Sure, I've updated the docstring and also renamed the class to `CoerceDateField` to avoid confusion.
cl/api/pagination.py
```python
def __init__(self):
    self.page_size = settings.SEARCH_API_PAGE_SIZE
    self.request = None
    self.es_list_instance = None
    self.results_count_exact = None
    self.results_count_approximate = None
    self.child_results_count = None
    self.results_in_page = None
    self.base_url = None
    self.cursor = None
    self.search_type = None
    self.cursor_query_param = "cursor"
    self.invalid_cursor_message = "Invalid cursor"
```
Since the `__init__` method doesn't accept any arguments and only assigns values to variables, I believe we can simplify the code by moving these assignments directly to the class level as class variables.
Yeah, you're right. I have moved them to class variables. The only one that needed to be within `__init__` was `self.page_size`, for testing purposes. However, it's better to use `settings.SEARCH_API_PAGE_SIZE` directly in the code and remove `__init__`.
cl/lib/elasticsearch_utils.py
```python
if toggle_sorting and order_by:
    sort_components = order_by.split(",")
    toggle_sort_components = []
    for component in sort_components:
        component = component.strip()
        if "desc" in component:
            toggle_sort_components.append(component.replace("desc", "asc"))
        elif "asc" in component:
            toggle_sort_components.append(component.replace("asc", "desc"))
        else:
            toggle_sort_components.append(component)

    return ", ".join(toggle_sort_components)

return order_by
```
We can simplify the code structure and improve readability by using an early return to reduce the size of the nested block.
Sure, updated it.
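For reference, a sketch of the early-return version as a standalone helper (the name `toggle_sort_order` and the signature are illustrative; the real code lives in `cl/lib/elasticsearch_utils.py`):

```python
def toggle_sort_order(order_by: str, toggle_sorting: bool) -> str:
    """Flip asc/desc in each sort component; the early return keeps the
    main loop out of a nested block."""
    if not toggle_sorting or not order_by:
        return order_by

    toggled = []
    for component in order_by.split(","):
        component = component.strip()
        if "desc" in component:
            toggled.append(component.replace("desc", "asc"))
        elif "asc" in component:
            toggled.append(component.replace("asc", "desc"))
        else:
            toggled.append(component)
    return ", ".join(toggled)
```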
The code looks good overall 👍 . My comments are mostly minor nits, so I think we can merge after they are addressed.
Thank you @ERosendo
@albertisfu LGTM 👍
@blancoramiro, this has a noop migration, so it needs a little hand-holding. Do you think you can get it deployed, please?
Migrations applied and merge deployed! |
This PR introduces version 4 of the RECAP Search API as outlined in #3033, which will serve as the base for other version 4 API search types.
Here are the main features of this endpoint:
Search types:

The v4 RECAP Search API supports three different search types. All of them operate similarly in terms of queries (they support the same queries as the frontend), but they differ in how results are displayed. For example, the Docket type `"d"` can match dockets by RD fields, although RD fields are not displayed. The RECAP_DOCUMENT type `"rd"` can match RDs by docket fields that are indexed within each RD.

RECAP `"r"`

This is the main type and it mimics the RECAP Search results in the frontend. The object structure looks like this:
- The `recap_documents` field displays up to 5 RECAPDocuments that matched the query.
- `more_docs` will be true, indicating there are additional RDs matched by the query. Otherwise, it is false.
- Timestamps are `datetime` in UTC.
in UTC.DOCKETS "d"
This search type only display docket fields without the "recap_documents" or
more_docs
fields.Regarding the
d
andr
types, I noticed that the parties and attorneys fields can be massive in some cases, which can make some responses quite large.Should we consider displaying a maximum number of parties and attorneys?
RECAP_DOCUMENT `"rd"`

This search type only displays RECAPDocument fields.

I realized that this document type can be useful to accomplish the same results as the `docket_id` query in the frontend. For instance, a query like this:

`?order_by=score+desc&type=rd&q=docket_id:1`

will return all RDs that belong to that `docket_id`. So I did not add the `docket_id` query to the `r` type, which would increase the number of nested documents from 5 to 100; maybe the `rd` type is just better for the same objective?

One question here: would it be necessary to add some Docket fields to this serializer so users can easily identify the parent docket? Perhaps adding the `docket_id`?

Results Count
Since the total count of results is no longer required for computing pagination, an additional count query is not needed. Instead, the count is taken from the main query. With the count limit of 10,000, we can show the actual number of results when there are fewer than 10,000.

So the count key looks like this:

- `exact` is the number of results matched by the query, up to 10,000 items.
- `more` indicates whether there are more than 10,000 documents; if so, it will be `true`.

Suggestions are welcome if you have a better way to display the exact count and indicate that there are more results.
Cursor pagination and sorting

As requested in #3645, V4 now uses cursor pagination. The cursor paginator is custom-made to work alongside the ES `search_after` parameter, enhancing performance during deep pagination. However, the architecture of our `ESCursorPagination` class follows the standards of `CursorPaginator`.

For the cursor paginator to function, it is mandatory to set a sorting key used as the "cursor" in the ES request. The supported sortings are:
Additionally, to avoid pagination inconsistencies due to repeated values like scores or dates, a secondary sorting key, which must be unique for each result, is required. For the 'r' and 'd' types, this key is `docket_id`, and for the 'rd' type, it is the RD `id`. It's important to note that using two sorting keys can lead to discrepancies between sorting in the frontend and in the API v4, even though the primary sorting field remains the same. The introduction of the secondary sorting key will sort results with duplicated sorting values in `docket_id desc` or `id desc` order, acting as a tiebreaker to ensure consistent results across pages. In the frontend, the order of documents with the same sorting values is displayed arbitrarily.

For date sorting such as 'dateFiled', a workaround was necessary regarding sorting and the `search_after` request. The issue arises when Dockets with a `None` 'dateFiled' are indexed as null in ES. However, when using this field for sorting, the null fields are represented as `-9,223,372,036,854,775,808`, which is the `long.min_value` in Java, causing an illegal value error when sent as part of the `search_after` parameter.

The workaround applied was to use the function score (with a few tweaks) we are currently using for the `entry_date_filed` sorting in the frontend. Thus, when a document with a 'dateFiled' of None is in the results, its sorting value is 0 instead of `long.min_value`.
So when sorting by either:

Results where the sorting field is None will be shown at the end, regardless of the order (`desc` or `asc`).

By default, ES does not provide a "search_before" parameter for backward pagination, so a different approach was required to implement cursor-based backward pagination. It uses the same `search_after` approach, but when going backwards, the sorting keys are inverted and the item selected as the "search_after" is the first item on the page, allowing it to go back to the previous page. A final step is required, as the results for this backward query are returned in the reverse order of the original one: it is necessary to invert the results on that page to achieve the original order.

Results per page are controlled by the setting `SEARCH_API_PAGE_SIZE`, which defaults to 20.

Random sorting
The random sorting key is being omitted from this PR, as it currently uses a sorting script instead of a function score, so it would require a change in approach to work alongside the `search_after` parameter and might not function properly due to the randomness of the `search_after` parameter. Therefore, as we agreed, it will not be implemented at this time.

In general, cursor pagination is working as expected, and results are consistent across pages, even in scenarios where new items are indexed or removed. However, there are some corner cases where cursor pagination can lead to inconsistent results: for instance, if the last or first document on the current page (used as the cursor) is updated before moving to the next page, and the update affects the field used as the sorting key (like 'dateFiled') or, if sorting by relevance, the document score. This can lead to inconsistencies when moving to the next or previous page, where the updated document or documents can be displayed again. To solve this issue, the ES documentation recommends using a Point in Time. I'll open an issue to describe how it works in detail in case it is required to implement in the future.
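The backward-pagination trick described above can be modeled with a toy example over integer sort keys (an illustration of the idea, not the real ES code):

```python
def es_like_search(items, descending, after, size):
    """Toy stand-in for an ES query: sort the corpus, apply a search_after
    filter relative to the sort direction, and take one page."""
    ordered = sorted(items, reverse=descending)
    if after is not None:
        ordered = [x for x in ordered if (x < after if descending else x > after)]
    return ordered[:size]

items = [10, 20, 30, 40, 50, 60]
# Forward: descending order, pages of 3.
page1 = es_like_search(items, descending=True, after=None, size=3)
page2 = es_like_search(items, descending=True, after=page1[-1], size=3)
# Backward from page2: invert the sort, use the FIRST item of the current
# page as search_after, then reverse the returned page to restore the order.
prev = es_like_search(items, descending=False, after=page2[0], size=3)
prev.reverse()
```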
The cursor is a base64-encoded string that looks like:

`cursor=cz0yLjIwNzg0ODUmcz0xNjYmdD1k`

It contains the following parameters:

- the `search_after` parameter.

If an invalid cursor string is sent, the response will contain the following body:
Also, if a user switches to a different search type, for instance, if the original request was performed for the `"r"` type and then it's changed to `"d"` without cleaning the current cursor, the Invalid cursor error will be shown to avoid pagination inconsistencies due to the current cursor not matching the sorting values of the new search type.

On every ES request, 'page_size + 1' documents are requested to check whether there are more results and to determine whether to display a next or previous page.
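Decoding the sample token above yields a querystring-like payload, so a rough sketch of the encoding could look like this (the exact layout, including how the reverse flag is packed, is an assumption inferred from the sample token):

```python
import base64
from urllib.parse import parse_qs, urlencode

def encode_cursor(search_after_values, search_type):
    """Pack sort values and the search type into a base64 token.
    One "s" entry per sort value plus "t" for the search type; this
    mirrors the decoded sample token but is only an approximation."""
    pairs = [("s", v) for v in search_after_values] + [("t", search_type)]
    return base64.urlsafe_b64encode(urlencode(pairs).encode()).decode()

def decode_cursor(token):
    """Reverse of encode_cursor: base64-decode, then parse the querystring."""
    params = parse_qs(base64.urlsafe_b64decode(token.encode()).decode())
    return params.get("s", []), params.get("t", [""])[0]

token = encode_cursor([2.2078485, 166], "d")
sort_values, search_type = decode_cursor(token)
```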
The next and previous links look like these:
Highlighting on demand

By default, highlighting in the v4 Search API is disabled, providing a performance boost to requests. When highlighting is disabled in the RECAP results, the plain-text snippet (first 500 characters) is extracted directly from the database for the results on a page. This is because highlighting would otherwise be required to retrieve the 'no_match' fragment while avoiding retrieval of the entire plain text, which can be expensive. Thus, to fully benefit from disabling highlighting, data extraction from the database is necessary.

To enable highlighting, users should pass the `highlight=on` parameter in the request.
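In other words, the parameter check amounts to something like this (an illustrative helper, not the actual view code):

```python
def highlight_enabled(query_params) -> bool:
    """Highlighting is off unless the request explicitly passes highlight=on."""
    return query_params.get("highlight", "") == "on"
```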
Highlighted fields include:
Dockets
RDs
Let me know what you think.