<h1>OpenSearch Project Roadmap 2024–2025</h1>
<p><em>Published September 12, 2024</em></p>
<p>OpenSearch is an open-source product suite comprising a search engine, an ingestion system, language clients, and a user interface for analytics. Our goal at the <a href="https://github.com/opensearch-project">OpenSearch Project</a> is to make OpenSearch the preferred open-source solution for search, vector databases, log analytics, and security analytics, and to establish it as the preferred backend for generative AI applications. OpenSearch contributors and maintainers are innovating in all these areas at a fast pace. With <a href="https://metrics.opensearch.org/_dashboards/app/dashboards#/view/f1ad21c0-e323-11ee-9a74-07cd3b4ff414">more than 1,400 unique contributors</a> working across <a href="https://github.com/orgs/opensearch-project/repositories?q=visibility%3Apublic+archived%3Afalse">110+ public GitHub repositories</a> on a daily basis, OpenSearch is a rapidly growing open-source project.</p>
<p>To steer the project’s development effectively, we have revamped the project roadmap to provide better transparency into both short- and long-term enhancements. This will help the community provide feedback more easily, assist with prioritization, foster collaboration, and ensure that contributor efforts align with the community’s needs. To achieve this, the OpenSearch Project recently <a href="https://github.com/opensearch-project/.github/issues/196">introduced a new public process</a> for developing a <strong>theme-based, community-driven</strong> <a href="https://github.com/orgs/opensearch-project/projects/206"><strong>OpenSearch roadmap board</strong></a>, which we are excited to share today. The roadmap board will provide the community with visibility into the project’s high-level technological direction and will facilitate the sharing of feedback.</p>
<p>In this blog post, we will outline the OpenSearch roadmap for 2024–2025, focusing on the key areas that foster innovation among OpenSearch contributors. These innovation areas are categorized into the following nine main themes:</p>
<ol>
<li><strong><a href="#roadmap-theme-1-vector-database-and-generative-ai">Vector Database and Generative AI</a></strong></li>
<li><strong><a href="#roadmap-theme-2-search">Search</a></strong></li>
<li><strong><a href="#roadmap-theme-3-ease-of-use">Ease of Use</a></strong></li>
<li><strong><a href="#roadmap-theme-4-observability-log-analytics-and-security-analytics">Observability, Log Analytics, and Security Analytics</a></strong></li>
<li><strong><a href="#roadmap-theme-5-cost-performance-and-scalability">Cost, Performance, and Scalability</a></strong></li>
<li><strong><a href="#roadmap-theme-6-stability-availability-and-resiliency">Stability, Availability, and Resiliency</a></strong></li>
<li><strong><a href="#roadmap-theme-7-security">Security</a></strong></li>
<li><strong><a href="#roadmap-theme-8-modular-architecture">Modular Architecture</a></strong></li>
<li><strong><a href="#roadmap-theme-9-releases-and-project-health">Releases and Project Health</a></strong></li>
</ol>
<p>In the rest of this post, we will first <a href="#roadmap-summary">summarize the key innovation areas</a> in the context of the roadmap themes. For readers interested in a comprehensive understanding, we have a <a href="#roadmap-details">section dedicated to each theme</a> containing information about key innovations and links to the relevant GitHub RFCs/METAs for the features.</p>
<h2 id="roadmap-summary">Roadmap summary</h2>
<p>As a technology, OpenSearch innovates in three main areas: search, streaming data, and vectors. Search use cases employ lexical and semantic means to match end user queries to the catalog of information, stored in indexes, that drives your application. <em>Streaming data</em> includes a wide range of real-time data types, such as raw log data, observability trace data, security event data, metric data, and other event data like Internet of Things (IoT) events. Vector data includes the outputs of embedding-generating large language models (LLMs), vectors produced by machine learning (ML) models, and encodings of media like audio and video.</p>
<p>OpenSearch’s roadmap is aligned vertically in some cases and horizontally in others, depending on the workloads it supports. Features relevant to <strong>search workloads</strong> are described in <a href="#roadmap-theme-1-vector-database-and-generative-ai">theme 1</a> and <a href="#roadmap-theme-2-search">theme 2</a>. Features relevant to <strong>vector workloads</strong> are described in <a href="#roadmap-theme-1-vector-database-and-generative-ai">theme 1</a>. Features relevant to <strong>streaming data workloads</strong> are described in <a href="#roadmap-theme-4-observability-log-analytics-and-security-analytics">theme 4</a>. Features relevant to <strong>all three workload types</strong> are described in <a href="#roadmap-theme-3-ease-of-use">theme 3</a> and <a href="#roadmap-theme-5-cost-performance-and-scalability">themes 5–9</a>.</p>
<p><strong>Theme 1 (Vector Database and Generative AI)</strong> is centered on price performance and ease of use for vector workloads, creating new features that help reduce costs through quantization, disk storage, and GPU utilization. Ease-of-use features will make it easier to get started with and use embedding vectors to improve search results. <strong>Theme 2 (Search)</strong> focuses on enhancing the query capabilities of core search, building a new query engine with query planning, tight integrations with Lucene innovations, improving search relevance, and searching across external data sources with Data Prepper. <strong>Theme 3 (Ease of Use)</strong> encompasses building a richer dashboard experience and serverless dashboards that feature simplified installation, migration, and multi-data-source support. <strong>Theme 4 (Observability, Log Analytics, and Security Analytics)</strong> emphasizes integrating with industry standards, such as OpenTelemetry, to unify workflows across metrics, logs, and traces; providing a richer SQL-PPL experience; positioning Discover as the main entry point for analytical workflows; improving Data Prepper for various analytics use cases; and developing well-integrated security analytics workflows. <strong>Theme 5 (Cost, Performance, and Scalability)</strong> includes improving core search engine performance, scaling shard management, providing context-aware templates for different workloads, moving to remote-store-backed tiered storage, and scaling cluster management. <strong>Theme 6 (Stability, Availability, and Resiliency)</strong> includes features involving query visibility, query resiliency, workload management, and cluster management resilience. <strong>Theme 7 (Security)</strong> centers on providing constructs that are secure by default and adopting a streamlined plugin security model as the plugin ecosystem grows. <strong>Theme 8 (Modular Architecture)</strong> involves modularizing the OpenSearch codebase to suit different deployments and moving to a decoupled, service-oriented architecture. <strong>Theme 9 (Releases and Project Health)</strong> dives into initiatives for faster automated releases, with streamlined continuous integration/continuous delivery (CI/CD) and metrics dashboards to measure community health and operations.</p>
<h2 id="roadmap-details">Roadmap details</h2>
<p>In the following sections, we cover each theme in detail. You can find the associated RFCs and METAs on the <a href="https://github.com/orgs/opensearch-project/projects/206/views/11">new roadmap board</a>. We would love for you to get involved with the OpenSearch community by contributing to innovation in these areas or by providing your feedback.</p>
<h3 id="roadmap-theme-1-vector-database-and-generative-ai">Roadmap Theme 1: Vector Database and Generative AI</h3>
<p>The OpenSearch roadmap includes several innovations to OpenSearch’s vector database and ML functionality. These innovations focus on enhancing vector search and making ML-powered applications and integrations more flexible and easier to build. AI advancements are transforming the search experience for end users of all skill levels. By integrating AI models, OpenSearch delivers more relevant search results to all users. Experienced builders can apply additional techniques such as query rewriting, result reranking, personalization, semantic search, summarization, and retrieval-augmented generation (RAG) in order to further enhance search result accuracy. Many of these techniques rely on a vector database. With the current rise of generative AI, OpenSearch is gaining traction as a vector database solution powered by k-NN indexes. Our planned innovations will make OpenSearch vector database features easy to use and more efficient while lowering operational costs.</p>
<p><strong>Vector search price performance</strong>: To further improve the price performance of vector search, we are planning several key initiatives, such as offering a <a href="https://github.com/opensearch-project/k-NN/issues/1779">disk-optimized approximate nearest neighbor (ANN) solution</a> that uses quantized vectors to provide up to 32x compression and a 70% cost reduction, while still maintaining recall and requiring no pretraining. We are reducing memory footprint using techniques like iterative product quantization (PQ) and data types like <a href="https://github.com/opensearch-project/k-NN/issues/1764">binary vectors</a>. Additionally, we are implementing smart routing capabilities that organize indexes by semantic similarity to double query throughput, enabling multi-tenancy and smart filtering for high-recall ANN search at the tenant level, and using GPUs to significantly accelerate index build times for k-NN indexes, with a 10–40x better price/performance ratio compared to CPU-based infrastructure. We also plan to further lower costs by storing full-precision vectors on cold storage systems like Amazon Simple Storage Service (Amazon S3). The smart routing capabilities will place neighboring embeddings on the same node, improving query efficiency. The multi-tenancy and smart filtering features will cater to use cases requiring granular filtering of large datasets with stringent recall targets, enhancing efficiency and cost effectiveness. OpenSearch already provides memory footprint reduction techniques, such as PQ (using HNSWPQ and IVFPQ) and scalar quantization (SQ) in byte and fp16 formats. We are now investing in additional techniques to further compress vectors while maintaining recall similar to that provided when using full-precision vectors. The upcoming innovations are expected to significantly improve the price performance of vector search, making it more accessible and cost effective for a wide range of applications.</p>
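<p>As a concrete reference point for the quantization techniques mentioned above, the following is a minimal sketch of creating a k-NN index that uses the existing faiss fp16 scalar quantization encoder. The index name, field name, dimension, and connection details are illustrative assumptions, and the disk-optimized and binary-vector modes remain roadmap items.</p>
<pre><code class="language-python">
# Minimal sketch: a k-NN index with fp16 scalar quantization (roughly halves vector memory).
# Index/field names, dimension, and connection details are illustrative assumptions.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

client.indices.create(
    index="product-embeddings",          # hypothetical index name
    body={
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": 768,     # must match the embedding model's output size
                    "method": {
                        "name": "hnsw",
                        "engine": "faiss",
                        "space_type": "l2",
                        "parameters": {
                            # fp16 scalar quantization: smaller vectors at similar recall
                            "encoder": {"name": "sq", "parameters": {"type": "fp16"}}
                        },
                    },
                }
            }
        },
    },
)
</code></pre>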
<p><strong>Out-of-the-box (OOB) experience</strong>: OpenSearch aims to enhance the OOB experience of vector search. While the community appreciates the wide variety of tools and algorithms provided for tuning clusters according to workloads, having too many options can make it challenging for users to choose the right configuration. To address this, OpenSearch’s AutoTune feature will recommend the optimal hyperparameter values for a given workload based on metrics such as recall, latency, and throughput. Additionally, we plan to introduce smarter defaults to automatically tune indexing threads and enable <a href="https://github.com/opensearch-project/OpenSearch/issues/6798">concurrent segment search</a> based on traffic patterns and hardware resources. By simplifying the tuning process and providing intelligent defaults, OpenSearch will make it easier for users to achieve optimal performance without the need for extensive manual configuration.</p>
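<p>Until AutoTune and smarter defaults land, concurrent segment search is enabled explicitly. The following is a minimal sketch, assuming a local cluster and a recent OpenSearch version in which the dynamic cluster setting is available.</p>
<pre><code class="language-python">
# Minimal sketch: enable concurrent segment search cluster-wide via its dynamic setting.
# The AutoTune behavior described above is a roadmap item, not part of this call.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

client.cluster.put_settings(
    body={"persistent": {"search.concurrent_segment_search.enabled": True}}
)
</code></pre>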
<p><strong>Neural search</strong>: Ingestion performance has been a significant barrier to the adoption of neural search, especially for users who work with large-scale datasets. To address this, in version 2.16 we introduced online batch inference support that reduces communication overhead. We will further enhance ingestion performance by supporting <a href="https://github.com/opensearch-project/ml-commons/issues/2891">offline batch inference</a>. By using the offline batch processing capabilities of inference services like Amazon SageMaker, Amazon Bedrock, OpenAI, and Cohere, users will be able to directly process batch requests from preferred storage locations such as Amazon S3. This will significantly boost ingestion throughput while simultaneously reducing costs. Offline batch inference eliminates real-time communication with remote services, unlocking the full potential of neural search. We want to allow users to efficiently process large datasets and use advanced search capabilities at scale without compromising performance or incurring excessive costs.</p>
<p><strong>Neural sparse search</strong>: Neural sparse search provides yet another semantic search option for builders. Sparse encoding models create a reduced token set in which related tokens have semantically similar weights. A neural sparse index uses Lucene’s inverted index to store tokens and weights, providing fast, token-based recall and fast scoring through dot products. The OpenSearch 2.13 release included self-pretrained <a href="https://huggingface.co/opensearch-project/opensearch-neural-sparse-encoding-v2-distill">sparse encoders</a> on Hugging Face. Further optimizations will enhance both model effectiveness and efficiency:</p>
<ul>
<li><strong>More powerful models</strong>: OpenSearch will continue tuning neural sparse models to boost both relevance and efficiency.</li>
<li><strong>Weight quantization</strong>: Compressing the payload of sparse term weights will considerably reduce index sizes, providing an economic solution comparable to BM25.</li>
<li><strong>Multilingual support</strong>: In addition to English, neural sparse models will support at least three more languages.</li>
</ul>
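<p>To make the mechanism described above concrete, here is a minimal sketch of querying a neural sparse field. The index name, field name, and model ID are placeholders, and the sparse encoding model and ingest pipeline are assumed to be deployed already.</p>
<pre><code class="language-python">
# Minimal sketch: a neural_sparse query against a rank_features field populated at ingest time.
# All names and IDs below are placeholders.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

response = client.search(
    index="articles-sparse",                      # hypothetical index
    body={
        "query": {
            "neural_sparse": {
                "passage_embedding": {            # field holding sparse token weights
                    "query_text": "how to reduce index storage costs",
                    "model_id": "YOUR_SPARSE_ENCODER_MODEL_ID",  # placeholder
                }
            }
        }
    },
)
print(response["hits"]["total"])
</code></pre>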
<p><strong>Development process for ML-powered search</strong>: Enhancing the builder experience and streamlining the development process for ML-powered search is our top priority. To achieve this, we will introduce a <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/4755">low-code search flow builder</a> within OpenSearch Dashboards, enabling the creation and customization of AI-enhanced search capabilities with minimal coding effort. Additionally, we will extend both the model-serving framework in ML Commons and its search pipeline functionality, allowing users to seamlessly integrate various third-party models, such as OpenAI or Cohere embedding models. This will provide greater flexibility and enable builders to use the most suitable solution for their specific use case.</p>
<p><strong>ML connector certification program</strong>: To keep up with the rapid evolution of ML and the emergence of new inference services, we are launching a self-service certification program through which the community and service providers can contribute blueprints for their preferred inference models. OpenSearch already provides OOB blueprints for popular services such as <a href="https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/cohere_connector_embedding_blueprint.md">Cohere</a> and <a href="https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/openai_connector_embedding_blueprint.md">OpenAI</a>. However, adding a new blueprint requires a manual code review and merging process, as shown in <a href="https://github.com/opensearch-project/ml-commons/pull/1991">this pull request for adding a blueprint for the Cohere chat model</a>. The new certification program encourages users to submit blueprints for their favorite models and have them verified and approved through automated pipelines. Once approved, these blueprints will be distributed alongside OpenSearch version releases, benefiting the entire community and ensuring that OpenSearch remains current with the latest advancements in the field.</p>
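<p>For readers unfamiliar with connector blueprints, the sketch below shows what registering a connector from a blueprint-style definition looks like through the ML Commons API. The body loosely follows the OpenAI embedding blueprint linked above, but treat the exact fields as assumptions and use the published blueprint as the authoritative reference.</p>
<pre><code class="language-python">
# Hedged sketch: create a remote-model connector from a blueprint-style definition.
# The API key value and model choice are placeholders.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

connector = client.transport.perform_request(
    "POST",
    "/_plugins/_ml/connectors/_create",
    body={
        "name": "openai-embeddings",
        "description": "Connector for the OpenAI embedding API",
        "version": 1,
        "protocol": "http",
        "parameters": {"model": "text-embedding-ada-002"},
        "credential": {"openAI_key": "YOUR_API_KEY"},   # placeholder secret
        "actions": [
            {
                "action_type": "predict",
                "method": "POST",
                "url": "https://api.openai.com/v1/embeddings",
                "headers": {"Authorization": "Bearer ${credential.openAI_key}"},
                "request_body": '{ "input": ${parameters.input}, "model": "${parameters.model}" }',
            }
        ],
    },
)
print(connector)  # the returned connector_id is used when registering the remote model
</code></pre>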
<p><strong>OpenSearch Assistant Toolkit</strong>: The <a href="https://github.com/opensearch-project/dashboards-assistant/issues/18">OpenSearch Assistant Toolkit</a> helps create AI-powered assistants for OpenSearch Dashboards. Its main goal is to simplify interactions with OpenSearch features and enhance their accessibility. For example, using natural language queries allows for interaction with OpenSearch without the need to learn a custom query language. The toolkit empowers OpenSearch users to build their own AI-powered applications tailored to their customized use cases. It contains built-in skills that will allow builders to use LLMs to create new visualizations based on their data, summarize their data, and help configure anomaly detectors. The OpenSearch Assistant will guide both novice and experienced users, simplifying complex tasks and making it easier to effectively navigate OpenSearch. For more information, see <a href="https://www.youtube.com/watch?v=VTiJtGI2Sr4">this video</a>.</p>
<h3 id="roadmap-theme-2-search">Roadmap Theme 2: Search</h3>
<p>OpenSearch is designed to offer a highly scalable, reliable, and fast search experience, built to handle large-scale data environments while delivering accurate and relevant results. The community is committed to evolving OpenSearch’s core search capabilities to meet modern workload standards and business needs. As part of our ongoing investments in the core search engine, the roadmap focuses on the following key advancements.</p>
<p><strong>Enhanced query capabilities</strong>: The OpenSearch community continues to push the boundaries of query capabilities. Features like <a href="https://github.com/opensearch-project/OpenSearch/issues/1133">derived fields</a>, <a href="https://github.com/opensearch-project/OpenSearch/issues/5639">wildcard fields</a>, and <a href="https://github.com/opensearch-project/OpenSearch/pull/14774">bitmap filtering</a> offer greater flexibility in search queries, allowing users to extract more precise insights from their data. The adoption of new ranking techniques and algorithms such as <a href="https://github.com/opensearch-project/OpenSearch/issues/3996">combined_fields (BM25F)</a> improves search result relevance, contributing to a more refined search experience. We plan to introduce query <a href="https://github.com/opensearch-project/OpenSearch/issues/10250">categorization</a> and <a href="https://github.com/opensearch-project/OpenSearch/issues/11429">insights</a>, providing fine-grained monitoring to identify problematic queries, diagnose bottlenecks, and optimize performance.</p>
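<p>As a small illustration of one of these capabilities, the sketch below creates an index with a wildcard field and runs a pattern query against it. The index and field names are illustrative, and availability depends on the OpenSearch version in use.</p>
<pre><code class="language-python">
# Minimal sketch: a wildcard field keeps wildcard/pattern queries efficient on
# high-cardinality strings such as URIs. Names are illustrative assumptions.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

client.indices.create(
    index="http-logs",
    body={"mappings": {"properties": {"request_uri": {"type": "wildcard"}}}},
)

hits = client.search(
    index="http-logs",
    body={"query": {"wildcard": {"request_uri": {"value": "*/checkout/*"}}}},
)
</code></pre>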
<p><strong>Sophisticated query engine</strong>: We are committed to further enhancing the core query engine, with plans to integrate advanced capabilities from the <a href="https://github.com/opensearch-project/sql">SQL plugin</a> directly into OpenSearch. This effort is aimed at unifying query planning and distributed execution across different query languages, bringing OpenSearch query domain-specific language (DSL), SQL, and <a href="https://github.com/opensearch-project/sql/tree/main/ppl">Piped Processing Language (PPL)</a> into closer parity. This integration will support more sophisticated query optimizations and distributed executions, unlocking more efficient data processing at scale. The introduction of <a href="https://github.com/opensearch-project/OpenSearch/issues/15185">join support</a> in the core engine will offer users a powerful method of combining and analyzing datasets. These capabilities are crucial for those dealing with relational-style data, enabling greater query complexity without sacrificing performance. A key step in improving the query engine is separating the search coordinator logic from the shard-level Lucene search logic. This separation will allow the search coordinator to focus on complex distributed logic (including joins) and process results from a variety of data sources (including future support for non-Lucene data sources like relational databases and Parquet files).</p>
<p><strong>Query performance</strong>: In terms of broader query engine speed and scale, OpenSearch is moving toward <a href="https://github.com/opensearch-project/OpenSearch/issues/15237">writer/searcher separation</a>, which will provide a more modular and adaptable framework for managing indexing and search processes. Efforts like <a href="https://github.com/opensearch-project/OpenSearch/issues/15257">Star Tree index</a> and the introduction of <a href="https://github.com/opensearch-project/OpenSearch/issues/10684">Protobuf</a> for search execution and communication further reduce costs and improve performance, enabling the platform to efficiently handle even larger data volumes. The roadmap includes several key advancements in query processing, such as improving <a href="https://github.com/opensearch-project/OpenSearch/issues/13566">range query performance</a> through <a href="https://github.com/opensearch-project/OpenSearch/pull/13788">approximation</a> techniques, accelerating aggregations such as date histograms, <a href="https://github.com/opensearch-project/OpenSearch/issues/15136">enhancing concurrent segment search</a>, developing multi-level request caching with <a href="https://github.com/opensearch-project/OpenSearch/issues/13566">tiered caching</a>, and integrating Rust and SIMD operations.</p>
<p><strong>Contributions to core dependencies</strong>: As part of our community-driven effort to optimize OpenSearch’s underlying architecture, we continue to contribute to the Lucene search library. A notable example includes ongoing work on <a href="https://github.com/apache/lucene/pull/13521">BKD doc ID encoding</a>, which will improve indexing and query performance. These contributions ensure that OpenSearch remains on the cutting edge of search technology, benefiting from the latest Lucene advancements.</p>
<p><strong>Hybrid search enhancements</strong>: OpenSearch continues to enhance search relevance through hybrid search, which combines text and vector queries. In addition to the existing score-based normalization and combination techniques, OpenSearch plans to launch a rank-based approach called <a href="https://github.com/opensearch-project/neural-search/issues/865"><em>reciprocal rank fusion</em></a>. This approach will combine search results based on their rank, allowing users to make informed choices by considering the score distribution. Moreover, hybrid search will be augmented with <a href="https://github.com/opensearch-project/neural-search/issues/280">pagination and profiling capabilities</a>, enabling users to debug scores at different stages of score normalization and combination. These enhancements will further improve the search experience, providing more accurate and insightful results while offering greater transparency into the ranking process.</p>
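<p>For context, the sketch below shows the existing score-based approach that reciprocal rank fusion will complement: a search pipeline with the normalization processor combines a lexical query and a neural query. The pipeline name, index, field names, weights, and model ID are illustrative assumptions.</p>
<pre><code class="language-python">
# Hedged sketch: score-based hybrid search with min-max normalization and a weighted
# arithmetic mean. Reciprocal rank fusion, described above, is a roadmap item.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# Define a search pipeline that normalizes and combines sub-query scores.
client.transport.perform_request(
    "PUT",
    "/_search/pipeline/hybrid-minmax",
    body={
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {
                        "technique": "arithmetic_mean",
                        "parameters": {"weights": [0.3, 0.7]},
                    },
                }
            }
        ]
    },
)

# Run a hybrid query (lexical + neural) through the pipeline.
results = client.transport.perform_request(
    "POST",
    "/products/_search?search_pipeline=hybrid-minmax",
    body={
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {"title": {"query": "waterproof hiking boots"}}},
                    {
                        "neural": {
                            "title_embedding": {
                                "query_text": "waterproof hiking boots",
                                "model_id": "YOUR_EMBEDDING_MODEL_ID",  # placeholder
                                "k": 50,
                            }
                        }
                    },
                ]
            }
        }
    },
)
</code></pre>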
<p><strong>User behavior insights</strong>: Search users are turning to AI to improve search relevance and reduce manual effort. However, it is challenging to train and tune opaque models without a data feedback loop. To help users gain search insights and build a tuning feedback loop, we are launching <a href="https://github.com/opensearch-project/OpenSearch/issues/12084">User Behavior Insights</a> (UBI). UBI consists of a standard data schema, server-side collection components, query-side collection components, and analytics dashboards. This will provide a standard way for users to record and analyze search behavior and train and fine-tune models.</p>
<p><strong>Ingestion from other databases</strong>: OpenSearch can ingest data from Amazon DynamoDB and Amazon DocumentDB databases using Data Prepper, which enables using OpenSearch as a search engine for these sources. Data Prepper is continuing to add support for new database types, with the immediate goal of supporting SQL databases. With this new source type, the community can search even more databases, <a href="https://github.com/opensearch-project/data-prepper/issues/4561">including Amazon Aurora and Amazon Relational Database Service (Amazon RDS)/MySQL databases</a>.</p>
<h3 id="roadmap-theme-3-ease-of-use">Roadmap Theme 3: Ease of Use</h3>
<p>OpenSearch Dashboards provides an intuitive interface and powerful visualization and analytics tools for OpenSearch users. Additionally, OpenSearch Dashboards contains a rich set of features and tools that enable advanced analytics use cases. These easy-to-use tools simplify data exploration, monitoring, and management for both OpenSearch administrators and end users.</p>
<p><strong>Richer dashboard experience</strong>: We are planning dynamic and interactive features to make data visualization more intuitive and powerful. Additionally, we aim to enable <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/1388">multiple data sources</a>, allowing seamless integration and operations, such as cross-source alerting, within a unified interface. As part of this effort, we plan to introduce a <em>dataset</em> concept, which extends the index pattern concept in OpenSearch Dashboards and enables working with different types of data sources, such as relational databases or Prometheus. This will allow users to seamlessly access and visualize data from a variety of sources within the OpenSearch Dashboards interface. We are also introducing a <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/4615"><em>workspace</em></a> concept in OpenSearch Dashboards. Workspaces will streamline user workflows by providing curated vertical experiences for search, observability, and security analytics. Additionally, workspaces will enhance collaboration on workspace assets and improve data connections.</p>
<p><strong>Serverless dashboards and migration</strong>: Our strategy for OpenSearch Dashboards also includes <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/5804">decoupling release and distribution</a> from the OpenSearch engine. We are aiming to allow OpenSearch Dashboards to run as a standalone application, independent from the OpenSearch installation. OpenSearch Dashboards will have its own authentication and access control based on workspaces, and we’ll provide options for using a dedicated database for OpenSearch Dashboards saved objects. To simplify configuration and customization, we envision implementing <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/7111">one-click installation and setup</a>, allowing users to get started quickly. We also plan to streamline <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/5877">plugin management</a> to enable users to extend OpenSearch Dashboards without restarting the application. We aim to develop a <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/5757">migration toolkit</a> to assist users in seamlessly transitioning data from older versions of OpenSearch Dashboards or other tools like Grafana. We’ll also implement an <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/7035">interactive onboarding experience</a> to guide new users through key features and setup steps. Additionally, we plan to integrate live help powered by generative AI, which will offer real-time assistance within the platform, and to enhance the platform’s resilience with improved health and status monitoring. We also plan to focus on improving the overall performance of OpenSearch Dashboards. This will include <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/4630">optimizing the loading times</a> of the application and visualizations, ensuring a smooth and responsive user experience. We will analyze the current performance bottlenecks and implement targeted optimizations to reduce latency and improve the responsiveness of OpenSearch Dashboards, especially when working with large or complex datasets.</p>
<h3 id="roadmap-theme-4-observability-log-analytics-and-security-analytics">Roadmap Theme 4: Observability, Log Analytics, and Security Analytics</h3>
<p>The OpenSearch Project continues to enhance its observability and security analytics capabilities. We are dedicated to creating a more cohesive and user-friendly experience while expanding functionality and improving performance. Our roadmap for 2024–2025 focuses on delivering a more unified, powerful, and intuitive experience while maintaining the cost effectiveness and scalability our users expect.</p>
<p><strong>OpenTelemetry support</strong>: OpenSearch has enhanced its observability features by incorporating support for the OpenTelemetry Protocol (OTLP), enabling the ingestion of metrics, logs, and traces. OTLP, a vendor-neutral protocol, standardizes telemetry data transmission, making it easier to send various types of observability data (traces, metrics, and logs) directly to OpenSearch. This integration with OpenTelemetry allows developers and operations teams to seamlessly ingest traces, metrics, and logs within a unified workflow, promoting a more efficient and standardized approach to collecting and analyzing observability data across complex, distributed systems. With robust support for OpenTelemetry and OTLP, OpenSearch offers a powerful platform for storing, analyzing, and visualizing essential observability data, simplifying system performance monitoring and issue troubleshooting across your entire infrastructure. To address the challenges of managing, monitoring, and analyzing traces, metrics, and logs, OpenSearch introduced a new <a href="https://github.com/opensearch-project/simple-schema">schema</a> compatible with OpenTelemetry. This schema supports predefined dashboards through an <a href="https://github.com/opensearch-project/opensearch-catalog/releases">OpenSearch catalog</a> for common systems like NGINX, HAProxy, and Kubernetes. Additionally, it enables cross-index querying of data containing shared structures from different telemetry data producers. OpenSearch is dedicated to continuously enhancing its schema to support emerging observability use cases and to develop more advanced correlation and alerting solutions. To further explore OpenSearch capabilities, see <a href="https://github.com/opensearch-project/opentelemetry-demo?tab=readme-ov-file#running-this-demo">this demo</a>.</p>
<p><strong>Cost-effective, scalable analytics using Apache Spark</strong>: Many community members are opting to store data on cost-optimized cloud storage outside of OpenSearch, either because it is cost prohibitive to store in OpenSearch or because the amount of data raises scalability concerns. To analyze data outside of OpenSearch, users have been forced to switch between tools or create one-off ingestion pipelines. <a href="https://github.com/opensearch-project/OpenSearch/issues/14524">OpenSearch’s integration with Apache Spark</a> allows you to analyze data outside of OpenSearch, potentially reducing storage costs by up to 90%. OpenSearch has added support for <a href="https://github.com/opensearch-project/opensearch-spark">indexing data on cloud storage using Spark Streaming</a>. Naturally, analysts want to join data across OpenSearch indexes and the cloud. Our upcoming Iceberg-compatible <a href="https://github.com/opensearch-project/OpenSearch/issues/8639">table format</a> will enable complex joins between OpenSearch indexes and cloud storage, enhancing your ability to analyze data across platforms. Additionally, this table format enhances Iceberg by incorporating index capabilities, enabling the creation of search indexes on text fields, vector indexes, and geographical indexes. During query execution, these indexes will be automatically used to optimize full-text, neural, and geographical searches. Initially, this feature may be based on a customized Iceberg version, named <em>OpenSearch Table</em>, that remains fully compatible with Iceberg. As this functionality is integrated into Iceberg itself, it will become available to all query engines.</p>
<p><strong>Unified query experience—bridging PPL and SQL</strong>: By the end of 2024, we’ll consolidate SQL and PPL into a common interface within Discover. This unification will allow analysts to work more efficiently, using their preferred language without switching between tools. We’re also including autocomplete and auto-suggest functionality to make query building easier. Looking ahead to 2025, we’re planning to significantly enhance both OpenSearch’s PPL and SQL capabilities. For <a href="https://github.com/orgs/opensearch-project/projects/214/views/2">PPL</a>, we’re introducing over 30 new PPL commands and functions, including <a href="https://github.com/opensearch-project/sql/issues/2913">JOINs</a>, lookups, and JSON search capabilities. These additions will empower you to perform more sophisticated analyses, especially in observability and security contexts. Our SQL engine is also undergoing a <a href="https://github.com/opensearch-project/sql/issues/2674">major upgrade</a>, with a focus on standardization and interoperability. You can look forward to support for vector search, geographical search, and advanced SQL queries, unlocking even more powerful analytics possibilities.</p>
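<p>As a point of reference for the PPL work described above, the following is a minimal sketch of running a PPL query through the SQL/PPL plugin endpoint. The index and field names are illustrative, and the query sticks to commands available today rather than the upcoming JOIN and lookup commands.</p>
<pre><code class="language-python">
# Minimal sketch: run a PPL query via the _plugins/_ppl endpoint.
# Index and field names are illustrative assumptions.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

result = client.transport.perform_request(
    "POST",
    "/_plugins/_ppl",
    body={
        "query": (
            "source = http-logs "
            "| where status >= 500 "
            "| stats count() as errors by host "
            "| sort - errors | head 10"
        )
    },
)
print(result["datarows"])
</code></pre>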
<p><strong>Discover—your central hub for analytics</strong>: We’re positioning <a href="https://github.com/opensearch-project/OpenSearch-Dashboards/issues/8069">Discover as the primary entry point</a> to your analytics workflows. Soon, you’ll be able to seamlessly transition from refining queries to creating visualizations, performing trace analytics, generating reports, or setting up alerts—all without leaving the Discover interface. This interconnected approach will streamline your workflow, saving time and reducing context switching. While we know the community is interested in the workflows we highlighted, we will build the functionality generically so that the community can easily plug in custom workflows that meet their needs.</p>
<p><strong>Enhanced observability tools</strong>: OpenSearch is working on several new observability features to enhance the existing capabilities and user experience. These include the development of a <a href="https://github.com/opensearch-project/opensearch-catalog/issues/123">correlation zones framework</a>, which aims to simplify and automate site reliability engineers’ (SREs) daily tasks by identifying critical issues more efficiently. The framework will categorize anomalies and incidents into correlation zones, reducing the need for constant monitoring and allowing SREs to focus on significant segments. Additionally, OpenSearch is optimizing its <a href="https://github.com/opensearch-project/dashboards-observability/issues/2141">Trace Analytics</a> plugin by adding improved storage capabilities, UI enhancements, better query performance, and seamless integration with other OpenSearch Dashboards plugins. This includes the ability to store configurations, support for custom indexes and cross-cluster queries, and better correlation between logs, traces, and metrics. OpenSearch is also working on adding support for <a href="https://github.com/opensearch-project/dashboards-observability/issues/2139">PromQL</a> in dashboards, enabling users to query Prometheus data sources directly and further expanding its observability capabilities and data integration options.</p>
<p><strong>Data Prepper</strong>: Data Prepper allows the community to ingest traces, logs, and metrics into OpenSearch. Currently, the primary means for ingesting these signals are through OpenTelemetry over gRPC, HTTP, and Apache Kafka and through loading from Amazon S3. The community has looked for other ways to ingest data into OpenSearch, and Data Prepper is planning to support those. First, an <a href="https://github.com/opensearch-project/data-prepper/issues/1082">Amazon Kinesis source</a> will allow the community to pull data from Amazon Kinesis, which is popular for streaming data. Second, Data Prepper is planning to <a href="https://github.com/opensearch-project/data-prepper/issues/4180">provide a new OpenSearch API source</a> for ingesting data using existing OpenSearch APIs. This API will initially accept requests made using the <a href="https://github.com/opensearch-project/data-prepper/issues/248">OpenSearch Bulk API</a> and will support other document update APIs in the future. Third, Data Prepper will <a href="https://github.com/opensearch-project/data-prepper/issues/1986">support Apache Kafka</a> as a sink. While users can currently read from Apache Kafka using Data Prepper, there is growing interest in using Data Prepper as an ingestion tool for Kafka clusters. One of Data Prepper’s major use cases is observability and analytics, and both the maintainers and community continue to improve upon Data Prepper capabilities for these important use cases.</p>
<p><strong>Security analytics</strong>: Our mission is to empower security and operations teams to quickly discover and isolate threats or operational issues, minimizing the impact on business operations and protecting confidential data. OpenSearch users ingest security and operations data into their clusters for real-time security threat detection and correlation, security event investigation, and operational trend visualization to generate meaningful insights. <a href="https://github.com/opensearch-project/security-analytics">Security Analytics</a> provides a prebuilt library of over 3,300 threat detection rules for common security event logs, a threat intelligence framework, a real-time detection rules engine, alerting capabilities for notifying incident response teams, and a correlation rules engine for identifying associations across events. In the coming year, we will create a unified experience so that users can move faster to find and address threats. We will support security insights without creating detectors, expand support for new security log types, add new threat intelligence feed integrations, and simplify the data mapping workflows. We will integrate generative AI features into existing workflows to enable users of all skill levels to easily configure threat detection, create security rules, and obtain security insights and remediation steps. In addition, we will improve investigation workflows that will enable users to query and analyze historical logs for compliance and investigation purposes. Native integrations with incident response and case management systems, such as ServiceNow and PagerDuty, will help users monitor updates from a centralized location.</p>
<h3 id="roadmap-theme-5-cost-performance-and-scalability">Roadmap Theme 5: Cost, Performance, and Scalability</h3>
<p><strong>Search performance and a new query engine</strong>: As data volumes increase in size and workloads become more complex, price performance remains a top priority for OpenSearch users. OpenSearch recently implemented significant engine performance enhancements, as highlighted in a <a href="https://opensearch.org/blog/opensearch-performance-2.14/">previous blog post</a>. Compared to OpenSearch 1.0, recent OpenSearch versions demonstrate a 50% improvement for text queries, a 40% improvement for multi-term queries, a 100x boost for term queries, and a 50x boost for date histograms. These advancements stem from the engine performance optimizations outlined in our <a href="https://github.com/orgs/opensearch-project/projects/153">performance roadmap</a>. The roadmap also includes future initiatives such as <a href="https://github.com/opensearch-project/OpenSearch/issues/12257">document reordering</a>, <a href="https://github.com/opensearch-project/OpenSearch/issues/12390">query rewriting</a>, <a href="https://github.com/opensearch-project/OpenSearch/issues/11959">dynamic pruning</a>, and count-only caching. Additionally, the OpenSearch community is now taking the initiative to evolve the core engine in order to embrace new technologies like custom engines, parallelization, and composable architectures—all within an open-source framework. This includes rearchitecting the engine toward <a href="https://github.com/opensearch-project/OpenSearch/issues/14596">indexing and search separation</a> and offering a more modular and adaptable system. Additionally, faster interconnections using an efficient binary format for client-server communication, such as <a href="https://github.com/opensearch-project/OpenSearch/issues/15190">gRPC</a>, and node-to-node messaging through <a href="https://github.com/opensearch-project/OpenSearch/issues/6844">Protobuf</a>, have yielded promising early results. While actively contributing to core Lucene, we’re also focused on building a <a href="https://github.com/opensearch-project/OpenSearch/issues/14637">cloud-native architecture</a> to further enhance engine performance at scale.</p>
<p><strong>Application-based context templates</strong>: <a href="https://github.com/opensearch-project/OpenSearch/issues/12683">Application-based context templates</a> provide predefined, use-case-specific templates that package the right configuration for the specific use case. For example, an index created based on the <a href="https://github.com/opensearch-project/opensearch-system-templates/blob/main/src/main/resources/org/opensearch/system/applicationtemplates/v1/logs.json">logs template</a> is configured with the Zstd compression codec and <code class="language-plaintext highlighter-rouge">log_byte_size</code> merge policy. This configuration helps reduce disk utilization and enhances overall performance. Multi-field indexes aim to provide constant query latency when a query searches across multiple fields. The first implementation of a multi-field index is available as a <a href="https://github.com/opensearch-project/OpenSearch/issues/12498">Star Tree index</a>. The roadmap includes plans to introduce additional context-specific templates, such as those for metrics, traces, and events. It also aims to enhance existing templates with specialized optimizations, including the Star Tree index.</p>
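<p>Assuming the experimental top-level <code class="language-plaintext highlighter-rouge">context</code> parameter from the linked proposal, creating an index from the logs template could look like the sketch below. The index name is illustrative, and the exact API shape and availability depend on the OpenSearch version and feature flags, so treat this as a sketch rather than a definitive reference.</p>
<pre><code class="language-python">
# Hedged sketch: create an index that attaches the predefined "logs" context, which
# applies settings such as the zstd codec and log_byte_size merge policy without
# listing them manually. The context parameter is experimental.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

client.indices.create(
    index="app-logs-2024.09",            # hypothetical index name
    body={"context": {"name": "logs"}},
)
</code></pre>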
<p><strong>Scaling shard management</strong>: <a href="https://github.com/opensearch-project/OpenSearch/issues/12918">Shard splitting</a> aims to provide the capability to scale shards based on size or throughput with zero downtime for read and write traffic. In search use cases, it can be difficult to predict the number of primary shards in advance. As a result, the OpenSearch cluster can become “hot,” impacting performance. A hot shard can exhaust resources on the node hosting it and can eventually hit Lucene’s hard limit of approximately 2 billion documents per Lucene index. Today, there are two options available to solve this problem: document reindexing or index splitting. With document reindexing, the entire index is reindexed into a new index with a larger number of primary shards. This is a very slow process that requires additional compute and I/O. With index splitting, the index is first marked as read-only, and then all its shards are split, causing write downtime for users. Additionally, the Split API does not provide the granularity of splitting at the shard level, so a single hot shard cannot be scaled independently. In-place shard splitting will address these limitations and provide a more holistic way to scale shards. One challenge of running a bigger cluster is optimally allocating a large number of shards while honoring a set of placement constraints. Because all placement decisions are executed sequentially, the cluster manager is unable to prioritize other critical operations, such as index creation and settings updates, which can eventually time out. To address this issue, all placement decisions are <a href="https://github.com/opensearch-project/OpenSearch/issues/15872">optimized</a> and <a href="https://github.com/opensearch-project/OpenSearch/pull/14848">bounded</a> so they finish early, preventing starvation of critical tasks.</p>
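<p>For comparison with the planned in-place shard splitting, the sketch below walks through today's index splitting workflow, including the write block that causes the downtime described above. Index names and shard counts are illustrative.</p>
<pre><code class="language-python">
# Minimal sketch: the current Split API workflow, which requires a write block.
# In-place shard splitting on the roadmap aims to remove this downtime.
from opensearchpy import OpenSearch

client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])

# 1. Block writes on the source index (this is the downtime the roadmap item removes).
client.indices.put_settings(index="events-v1", body={"index.blocks.write": True})

# 2. Split into a new index; the target shard count must be a multiple of the source count.
client.indices.split(
    index="events-v1",
    target="events-v2",
    body={"settings": {"index": {"number_of_shards": 6}}},
)

# 3. The write block is copied to the target, so clear it once the split completes.
client.indices.put_settings(index="events-v2", body={"index.blocks.write": None})
</code></pre>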
<p><strong>Remote-backed storage and automatic storage tiering</strong>: OpenSearch already offers remote store indexes, which improve durability and indexing performance. Building on this architecture, we plan to deliver an end-to-end <a href="https://github.com/opensearch-project/OpenSearch/issues/3739">multi-tier storage experience</a>, which will provide users with an optimal balance of cost and performance. The <a href="https://github.com/opensearch-project/OpenSearch/issues/12809">warm tier</a> will handle more storage per compute while maintaining the interactive <a href="https://github.com/opensearch-project/OpenSearch/issues/13806">search experience on the warm data</a> without requiring all data to be locally available. The on-demand cold tier experience will provide compute and storage separation, allowing users to store large amounts of data that can be made searchable when needed. Additionally, we’ll introduce new use-case-specific index templates to simplify index configuration for users.</p>
<p><strong>Pull-based ingestion</strong>: Native <a href="https://github.com/opensearch-project/OpenSearch/issues/10610">pull-based ingestion</a> that pulls events from an external event stream provides further benefits compared to the current push-based model. These benefits include better handling of ingestion throughput spikes and removing the need for the translog in the indexing nodes. OpenSearch can be extended to support pull-based indexing, which can also present the possibility of priority-based ingestion. Time-sensitive and critical updates can be isolated from lower-priority events, and ingestion spikes can be handled by throttling low-priority events.</p>
<p><strong>Next-generation snapshots for remote-backed clusters</strong>: <a href="https://github.com/opensearch-project/OpenSearch/issues/15057">Snapshots v2</a> aims to enhance the scalability of snapshots for remote-backed clusters and reduce dependence on per-shard state updates in the cluster manager. The new snapshots rely on a timestamp-based pinning strategy, where instead of resolving shard-level files at snapshot time, the timestamp for the snapshot is pinned and the resolution is deferred until restore time. This approach makes the snapshot process much faster, allowing snapshot operations to finish within a couple of minutes, even for larger clusters, while significantly reducing the computational load associated with data backup. Timestamp pinning serves as the fundamental building block for future features, such as <a href="https://github.com/opensearch-project/OpenSearch/issues/1147">Point-In-Time-Restore (PITR)</a>.</p>
<p><strong>Scaling admin APIs</strong>: For large cluster configurations, cluster manager nodes become scaling bottlenecks as multiple admin APIs obtain the cluster state from the active cluster manager node, even if the latest state is present locally or present in a remote store. With the ongoing optimizations, the coordinator node can <a href="https://github.com/opensearch-project/OpenSearch/pull/12252/">serve the admin APIs without relaying the request</a> to the cluster manager node in most cases. Also, for APIs like CAT Shards and CAT Snapshots, the response size increases as the cluster expands to 100K shards or more. We plan to introduce <a href="https://github.com/opensearch-project/OpenSearch/issues/14258">pagination</a> and <a href="https://github.com/opensearch-project/OpenSearch/issues/13908">cancellation</a> for these APIs to ensure that they continue to operate efficiently regardless of the metadata size. We are implementing multiple optimizations to the Stats and Cluster APIs that will eliminate redundant processing and perform <a href="https://github.com/opensearch-project/OpenSearch/pull/14426">pre-aggregation</a> on the data node before responding to the coordinator node receiving the user request.</p>
<h3 id="roadmap-theme-6-stability-availability-and-resiliency">Roadmap Theme 6: Stability, Availability, and Resiliency</h3>
<p>OpenSearch is designed to provide capabilities for search and analytics at scale by using the underlying Lucene search engine that also powers other distributed systems. The OpenSearch Project has dedicated time and effort to improving stability and resiliency and making the service highly available. The following are some of the planned key efforts.</p>
<p><a href="https://github.com/opensearch-project/OpenSearch/issues/7334"><strong>Coordinator-level latency visibility</strong></a>: This initiative provides users visibility into the different phases of search request execution in OpenSearch. This is particularly useful for statistically identifying possible changes in a workload by monitoring latency metrics across different phases. Coordinator slow logs were recently introduced to give users the ability to capture “slow” requests along with a breakdown of time spent in different search phases, something that was otherwise only available for the query and fetch phases.</p>
<p><a href="https://github.com/opensearch-project/OpenSearch/issues/11429"><strong>Query insights</strong></a>: We recently introduced the ability for users to access computationally expensive queries (top N queries). We plan to integrate OpenSearch with external metrics collectors, like OpenTelemetry, to deliver more comprehensive analytics. Currently, queries can be analyzed according to various metrics, such as latency, CPU, memory utilization, and even query structure. Support for visualizing the execution profile will help users easily identify bottlenecks in their workload execution. Given a sufficient level of insight data, we will use AI/ML to build recommendation systems, which will eventually be able to automatically manage cluster settings for users with minimal intervention on their part.</p>
<p><a href="https://github.com/opensearch-project/OpenSearch/issues/1329"><strong>Query resiliency</strong></a>: One significant risk to cluster stability is runaway queries that continuously consume memory, leading to out-of-memory states and potentially catastrophic outcomes. Search backpressure introduces a mechanism to automatically identify and terminate such problematic queries when an OpenSearch host is low on memory or CPU. Existing mechanisms like circuit breakers and thread pool size thresholds provide a generic solution, but they do not specifically target the problematic queries. New search backpressure and hard cancellation techniques are designed to address these limitations.</p>
<p><a href="https://github.com/opensearch-project/OpenSearch/issues/11061"><strong>Workload management</strong></a>: An OpenSearch installation often contains a large number of tenants, all of which experience the same quality of service (QoS). However, this potentially means that an inexperienced tenant can consume more than the desired amount of cluster resources, which can lead to a degraded experience for other tenants. Admission control and search backpressure provide a best-effort assurance for cluster stability but do not guarantee a consistent QoS. With the introduction of query groups, system administrators of OpenSearch clusters will be able to provide tenant-based performance isolation for search workloads, manage tenant-based query groups, and enforce resource-based limits on tenant workloads. This enhancement will allow system administrators to prioritize execution of some workloads over others, thereby further improving QoS guarantee levels.</p>
<p><a href="https://github.com/opensearch-project/OpenSearch/issues/13257"><strong>Cluster state management</strong></a>: The cluster manager node manages all admin operations in a cluster. These operations include creating and deleting indexes, updating fields in an existing index, taking snapshots, and adding and removing nodes. The metadata about indexes and other entities—data streams, templates, aliases, snapshots, and custom entities stored by plugins—is stored in a data structure called the cluster state. Any change to the cluster state is processed by the cluster manager node and persisted to that node’s local disk. Starting with version 2.12, OpenSearch added support for storing the cluster state remotely in order to provide durability guarantees. With the introduction of a remote cluster state, replacing all cluster manager nodes will not result in any data loss in remote store clusters. The cluster manager node processes any cluster state updates and then sends the updated state to all the follower nodes in the cluster. As the state and number of follower nodes grow, the overhead on the cluster manager node increases significantly because the cluster manager node is responsible for publishing the updated state to every node in the cluster. This impacts the cluster’s stability and availability. To reduce strain on the cluster manager node, we are proposing to use the remote store for cluster state publication. The cluster manager node will publish the entire cluster state to the remote store to be downloaded by each follower node. The published cluster state will include ephemeral entities like the <a href="https://github.com/opensearch-project/OpenSearch/issues/14164">shard routing table</a>, which stores the mapping of the shards assigned to each data node in the cluster. The cluster manager node will only communicate that a new state is available and provide the remote location of the new state, instead of publishing the entire cluster state. Publishing the state remotely will reduce memory, CPU, and transport thread overhead on the cluster manager node during cluster state changes. This approach will also allow on-demand downloading of entities on the data or coordinator nodes instead of requiring all nodes to maintain the full cluster state. This will align with our vision of a more cloud-native architecture. Remote publication will be generally available in OpenSearch 2.17 and is planned to be further enhanced in future version releases.</p>
<p><a href="https://github.com/opensearch-project/data-prepper/issues/3857"><strong>Data Prepper pipeline DLQ</strong></a>: Data Prepper provides resilience when OpenSearch is down by buffering data and eventually writing to a dead-letter queue (DLQ) if the cluster remains unavailable. Currently supported DLQ targets are local files and Amazon S3. One current limitation is that data is only sent to the DLQ if it fails to write to the sink. Other failures, such as during processing in the pipeline, do not case data to be sent to the DLQ. With the proposed pipeline DLQ, Data Prepper will be able to send failed events to the DLQ or continue to send them downstream, allowing the pipeline author to decide. This will improve the resiliency of data throughout the pipeline. Additionally, the pipeline DLQ will be a pipeline just like any other and will be able to write to any supported Data Prepper sink, such as Apache Kafka.</p>
<h3 id="roadmap-theme-7-security">Roadmap Theme 7: Security</h3>
<p>Security is a Tier 0 prerequisite for modern workloads. In OpenSearch, security features are primarily implemented by the Security plugin, which offers a rich set of capabilities. These include various authentication backends (SAML, JWT, LDAP), authorization primitives, fine-grained access control (document-level and field-level security, or DLS/FLS), and encryption in transit. OpenSearch has <a href="https://github.com/orgs/opensearch-project/projects/206/views/11?sliceBy%5Bvalue%5D=Security">rapidly developed new plugin capabilities</a>, attracting increased interest from the community. This growth also carries critical security implications. Importantly, security should not come at the cost of performance. To address these challenges, OpenSearch is focusing on the following initiatives to strengthen its security posture.</p>
<p><a href="https://github.com/opensearch-project/security/issues/4500"><strong>Plugin resource permissions</strong></a>: We are developing a mechanism for sharing plugin resources that supports existing use cases while allowing more granular control over resource sharing. Examples include model groups in the ML Commons plugin, anomaly detectors in the Time Series Analytics plugin, and detectors in the Alerting plugin.</p>
<p><a href="https://github.com/opensearch-project/security/issues/4439"><strong>Plugin isolation</strong></a>: OpenSearch is moving toward a zero-trust model for plugins. Cluster administrators will have full <a href="https://github.com/opensearch-project/security/issues/2860">visibility into all permissions</a> requested by a plugin before installation.</p>
<p><a href="https://github.com/opensearch-project/security/issues/3870"><strong>Optimized privilege evaluation</strong></a>: Performance is a key focus for OpenSearch. We’ve identified areas within the Security plugin that can yield significant performance improvements, especially for clusters with numerous indexes or roles mapped to users.</p>
<p><a href="https://github.com/opensearch-project/security/issues/4009"><strong>API tokens</strong></a>: API tokens introduce a new way to interact with OpenSearch clusters by associating permissions directly with a token. Cluster administrators will have full visibility into and control over the issued tokens and their usage.</p>
<p><a href="https://github.com/opensearch-project/security-dashboards-plugin/issues/2070"><strong>Ease of use</strong></a>: We aim to simplify security setup for cluster administrators. Many useful security features remain underused because they are not exposed through OpenSearch Dashboards. To address this, we will add security dashboard pages where administrators can configure rate limiters to protect clusters from unauthenticated actors.</p>
<p>Looking ahead, security primitives like <a href="https://github.com/opensearch-project/security/issues/4702">authorization could be extracted and made pluggable</a>, allowing integration with newer open standards for policy evaluation, such as Open Policy Agent (OPA) or Cedar.</p>
<h3 id="roadmap-theme-8-modular-architecture">Roadmap Theme 8: Modular Architecture</h3>
<p>OpenSearch is working toward well-supported modularity in order to enable <strong>rapid development of properly encapsulated features and flexible deployment architectures</strong> for cloud-native use cases. Historically, OpenSearch has been deployed and operated using a cluster model, in which all functions (such as replication and durability) were implemented within the cluster. While the project has grown organically, offering many extension points through plugins, it still relies on a monolithic server module at its core, with tight coupling across the architecture. As the project grows within a globally distributed community, this monolithic architecture will become an unsustainable bottleneck. Innovations such as the next-generation query engine are not possible with tightly coupled components. Additionally, the Java Security Manager is pending deprecation and removal from the Java runtime, and the recommended replacement technique (<a href="https://inside.java/2021/04/23/security-and-sandboxing-post-securitymanager/">shallow sandboxing</a>) relies on using <a href="https://github.com/opensearch-project/OpenSearch/issues/1588">newer language features that require properly modularized code</a>. The overall goal of the <a href="https://github.com/opensearch-project/OpenSearch/issues/5910">modularity effort</a> is to allow the same core OpenSearch code to run across all variants (for example, on-premises clusters and large managed serverless offerings) while providing strong encapsulation of cluster functions. This will facilitate more independent development and innovation across the project.</p>
<h3 id="roadmap-theme-9-releases-and-project-health">Roadmap Theme 9: Releases and Project Health</h3>
<p>With contributions ranging from code enhancements to feature requests across all roadmap themes, the OpenSearch community is working together to maintain the stability of the codebase while ensuring that CI/CD pipelines remain green across all active branches. This provides a reliable foundation for both new and existing contributors, reduces bugs, and safeguards feature integrity. Key repository health metrics are publicly available on the <a href="https://metrics.opensearch.org/_dashboards/app/dashboards#/view/f1ad21c0-e323-11ee-9a74-07cd3b4ff414?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-4y,to:now))&amp;_a=(description:'OpenSearch%20Ops%20Metrics',filters:!(),fullScreenMode:!f,options:(hidePanelTitles:!f,useMargins:!t),query:(language:kuery,query:''),timeRestore:!t,title:'OpenSearch%20Ops%20Metrics',viewMode:view)">Ops Dashboard</a>.</p>
<p>The OpenSearch <a href="https://github.com/opensearch-project/.github/blob/main/RELEASING.md">release process</a> is fully automated, including a one-click release system for products such as OpenSearch Benchmark. Each product adheres to <a href="https://opensearch.org/blog/what-is-semver/">semantic versioning (semver)</a>, ensuring that breaking changes only occur in major versions. Releases follow a structured <a href="https://opensearch.org/releases.html#release-schedule">schedule</a>, starting with a code freeze and release candidate generation, and are driven by automated workflows that eliminate the need for manual sign-offs. We’re also building a <a href="https://opensearch.org/release-dashboard">Central Release Dashboard</a> to streamline and provide visibility into the release pipeline from beginning to end.</p>
<h2 id="get-involved">Get involved</h2>
<p>We recognize that community engagement is crucial to the success of all the innovations mentioned in this post. We invite the open-source community to review our roadmap, provide feedback, and contribute to the OpenSearch Project. Your insights and contributions will be invaluable in helping us to achieve these goals and continue improving OpenSearch.</p>
<p>You can <a href="https://github.com/opensearch-project/.github/blob/main/FEATURES.md">propose new ideas and features</a> at any time by creating a GitHub issue and following our <a href="https://github.com/opensearch-project/.github/blob/main/.github/ISSUE_TEMPLATE/FEATURE_REQUEST_TEMPLATE.md">feature request template</a>. Once proposed, the feature can be included in the <a href="https://github.com/orgs/opensearch-project/projects/206/views/11">public roadmap</a> by adding corresponding labels (such as Meta, RFC, or Roadmap), which are automatically populated for all the repositories and are categorized by themes for clarity. If you have any questions or suggestions for improving our processes, please feel free to reach out or contribute directly through <a href="https://github.com/opensearch-project">GitHub</a>.</p>
<p>We encourage you to actively participate in our project because your involvement will help shape the future of OpenSearch. By engaging with our community, sharing your ideas, and contributing to development, you’ll play a crucial role in driving innovation and improving the project. Thank you for your continued support and commitment to open source!</p></content><author><name>pallp</name></author><category term="community-updates" /><summary type="html">OpenSearch is a rapidly growing open-source product suite comprising a search engine, an ingestion system, language clients, and a user interface for analytics. OpenSearch contributors and maintainers are innovating in all these areas at a fast pace. To steer the project's development effectively, we have revamped the project roadmap to provide better transparency into both short- and long-term enhancements. In this blog post, we are excited to share the new theme-based, community-driven OpenSearch Project Roadmap for 2024–2025.</summary></entry><entry><title type="html">Data Prepper 2.9.0 is ready for download</title><link href="https://kolchfa-aws.github.io/blog/Data-Prepper-2.9.0-is-ready-for-download/" rel="alternate" type="text/html" title="Data Prepper 2.9.0 is ready for download" /><published>2024-08-29T18:30:00+00:00</published><updated>2024-09-05T16:18:17+00:00</updated><id>https://kolchfa-aws.github.io/blog/Data-Prepper-2.9.0-is-ready-for-download</id><content type="html" xml:base="https://kolchfa-aws.github.io/blog/Data-Prepper-2.9.0-is-ready-for-download/"><h2 id="introduction">Introduction</h2>
<p>You can download Data Prepper 2.9.0 today.
This release includes a number of core improvements as well as improvements to many popular processors.</p>
<h2 id="expression-improvements">Expression improvements</h2>
<p>Data Prepper continues to improve support for expressions to allow you more control over conditions that you use for routing and conditional processing.
In this release, Data Prepper adds support for set operations.
These operations allow you to write conditions that check whether a value is in a set of possible values.
This can be especially useful for routing, where you need to route data depending on the originating system.</p>
<p>Additionally, Data Prepper has a new <code class="language-plaintext highlighter-rouge">startsWith</code> function that determines whether a string value starts with another string.</p>
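<p>As a rough illustration, the routes below combine the new set operator and the <code class="language-plaintext highlighter-rouge">startsWith</code> function. The field names and values are hypothetical, and the exact condition syntax (especially the function argument style) should be checked against the Data Prepper expression documentation; treat this as a sketch rather than a verified configuration:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>routes:
  # Match events whose service name is one of a fixed set of values.
  - backend-services: '/service in {"inventory", "billing", "orders"}'
  # Match events whose host name begins with a given prefix
  # (the startsWith argument style shown here is illustrative).
  - internal-hosts: 'startsWith("/host/name", "internal-")'
</code></pre></div></div>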
<h2 id="default-route">Default route</h2>
<p>Data Prepper has offered sink routing since version 2.0.
With this capability, pipeline authors can use Data Prepper expressions to route events to different sinks in order to meet their requirements.
One challenge experienced by pipeline authors has been how to handle events that do not match any existing routes.
A common solution to this challenge has been to create a route that is the inverse of other routes.
However, this required copying and inverting the other conditions, which could be difficult to handle and even more difficult to maintain.</p>
<p>Now Data Prepper supports a special route named <code class="language-plaintext highlighter-rouge">_default</code>.
By applying this route to a sink, pipeline authors can ensure that events that do not match any other routes will be sent to a default sink of their choosing.</p>
<p>For example, consider a simple situation in which you want to route frontend and backend events to different sinks.
You can define two sinks for these events and then define your routes.
But what if you receive events that do not match?
The following sample pipeline shows an approach to handling events that do not match either the frontend or backend routes:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>routes:
- frontend: '/service == "front-end"`
- backend: '/service == "back-end"`
sink:
- opensearch:
routes:
- front-end
- opensearch:
routes:
- back-end
- opensearch:
routes:
- _default
</code></pre></div></div>
<h2 id="performance">Performance</h2>
<p>The Data Prepper maintainers have been working toward improving the performance of Data Prepper.
This release includes a number of internal improvements that speed up processing for many processors.
You don’t need to do anything other than update your version to experience these improvements.</p>
<p>Data Prepper 2.9 also offers some new features that you can use to help reduce out-of-memory errors or circuit breaker trips.
Many pipelines involve extracting source data from a string into a structure.
Some examples are <code class="language-plaintext highlighter-rouge">grok</code> and <code class="language-plaintext highlighter-rouge">parse_json</code>.
When you use these processors, you can more than double the size of each event that you process.
Because the events flowing through the system consume the largest portion of memory usage, this will greatly increase your memory requirements.</p>
<p>Many pipeline authors may use these processors and then remove the source data in a second processor.
This is a good approach when you don’t need to store the original string in your sink.
But it doesn’t always make the memory used by the string available for garbage collection when you need it.
The reason for this is that Data Prepper pipelines operate on batches of data.
As these batches of data move through the pipeline, the pipeline will expand the memory usage in one processor and then attempt to reduce it in the next.
Because the memory expansion happens in batches, Data Prepper may expand many thousands of events before starting to remove the source data.</p>
<p>See the following example pipeline, which runs <code class="language-plaintext highlighter-rouge">grok</code> and then <code class="language-plaintext highlighter-rouge">delete_entries</code>.
With a configured <code class="language-plaintext highlighter-rouge">batch_size</code> of 100,000, Data Prepper will expand 100,000 events before deleting the messages.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>my-pipeline:
buffer:
bounded_blocking:
batch_size: 100000
processor:
- grok:
match:
message: ["..."]
- delete_entries:
with_keys: ["message"]
</code></pre></div></div>
<p>To help with this memory usage issue, Data Prepper now provides a <code class="language-plaintext highlighter-rouge">delete_source</code> flag on some of these processors, including <code class="language-plaintext highlighter-rouge">grok</code> and <code class="language-plaintext highlighter-rouge">parse_json</code>.</p>
<p>Returning to the preceding example, you could both simplify the pipeline and reduce the amount of memory used in between processors:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>my-pipeline:
buffer:
bounded_blocking:
batch_size: 100000
processor:
- grok:
match:
message: ["..."]
delete_source: true
</code></pre></div></div>
<p>If you observe this pattern of the source being deleted in a separate processor, configure your pipeline to use <code class="language-plaintext highlighter-rouge">delete_source</code> in order to improve your overall memory usage.</p>
<h2 id="getting-started">Getting started</h2>
<ul>
<li>To download Data Prepper, visit the <a href="https://opensearch.org/downloads.html">OpenSearch downloads</a> page.</li>
<li>For instructions on how to get started with Data Prepper, see <a href="https://opensearch.org/docs/latest/data-prepper/getting-started/">Getting started with Data Prepper</a>.</li>
<li>To learn more about the work in progress for Data Prepper 2.10 and other releases, see the <a href="https://github.com/orgs/opensearch-project/projects/221">Data Prepper roadmap</a>.</li>
</ul>
<h2 id="thanks-to-our-contributors">Thanks to our contributors!</h2>
<p>The following community members contributed to this release. Thank you!</p>
<ul>
<li><a href="https://github.com/chenqi0805">chenqi0805</a> – Qi Chen</li>
<li><a href="https://github.com/danhli">danhli</a> – Daniel Li</li>
<li><a href="https://github.com/dinujoh">dinujoh</a> – Dinu John</li>
<li><a href="https://github.com/dlvenable">dlvenable</a> – David Venable</li>
<li><a href="https://github.com/graytaylor0">graytaylor0</a> – Taylor Gray</li>
<li><a href="https://github.com/ivan-tse">ivan-tse</a> – Ivan Tse</li>
<li><a href="https://github.com/jayeshjeh">jayeshjeh</a> – Jayesh Parmar</li>
<li><a href="https://github.com/joelmarty">joelmarty</a> – Joël Marty</li>
<li><a href="https://github.com/kkondaka">kkondaka</a> – Krishna Kondaka</li>
<li><a href="https://github.com/mishavay-aws">mishavay-aws</a></li>
<li><a href="https://github.com/oeyh">oeyh</a> – Hai Yan</li>
<li><a href="https://github.com/san81">san81</a> – Santhosh Gandhe</li>
<li><a href="https://github.com/sb2k16">sb2k16</a> – Souvik Bose</li>
<li><a href="https://github.com/shenkw1">shenkw1</a> – Katherine Shen</li>
<li><a href="https://github.com/srikanthjg">srikanthjg</a> – Srikanth Govindarajan</li>
<li><a href="https://github.com/timo-mue">timo-mue</a></li>
</ul></content><author><name>dvenable</name></author><category term="releases" /><summary type="html">Data Prepper 2.9.0 contains core improvements to expressions, routing, performance, and more.</summary></entry><entry><title type="html">Boosting vector search performance with concurrent segment search</title><link href="https://kolchfa-aws.github.io/blog/boost-vector-search-with-css/" rel="alternate" type="text/html" title="Boosting vector search performance with concurrent segment search" /><published>2024-08-27T00:00:00+00:00</published><updated>2024-08-27T22:38:15+00:00</updated><id>https://kolchfa-aws.github.io/blog/boost-vector-search-with-css</id><content type="html" xml:base="https://kolchfa-aws.github.io/blog/boost-vector-search-with-css/"><p>In OpenSearch, data is stored in shards, which are further divided into segments. When you execute a search query, it runs sequentially across all segments of each shard involved in the query. As the number of segments increases, this sequential execution can increase <em>query latency</em> (the time it takes to retrieve the results) because the query has to wait for each segment run to complete before moving on to the next one. This delay becomes especially noticeable if some segments take longer to process queries than others.</p>
<style>
table {
font-size: 16px;
}
h3 {
font-size: 22px;
}
h4 {
font-size: 20px;
}
th {
background-color: #f5f7f7;
}
</style>
<p>Introduced in OpenSearch version 2.12, <em>concurrent segment search</em> addresses this issue by enabling parallel execution of queries across multiple segments within a shard. By using available computing resources, this feature reduces overall query latency, particularly for larger datasets with many segments. Concurrent segment search is designed to provide more consistent and predictable latencies. It achieves this consistency by reducing the impact of variations in segment performance or the number of segments on query execution time.</p>
<p>In this blog post, we’ll explore the impact of concurrent segment search on vector search workloads.</p>
<h2 id="enabling-concurrent-segment-search">Enabling concurrent segment search</h2>
<p>By default, concurrent segment search is disabled in OpenSearch. For our experiments, we enabled it for all indexes in the cluster by using the following dynamic cluster setting:</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">PUT</span><span class="w"> </span><span class="err">_cluster/settings</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="nl">"persistent"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="nl">"search.concurrent_segment_search.enabled"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<p>To achieve concurrent segment searches, OpenSearch divides the segments within each shard into multiple slices, with each slice processed in parallel on a separate thread. The number of slices determines the degree of parallelism that OpenSearch can provide. You can either use Lucene’s default slicing mechanism or set the maximum slice count manually. For detailed instructions on updating the slice count, see <a href="https://opensearch.org/docs/latest/search-plugins/concurrent-segment-search/#slicing-mechanisms">Slicing mechanisms</a>.</p>
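<p>For example, to set the maximum slice count manually, you can update the dynamic cluster setting shown below. This is a minimal sketch based on the linked slicing documentation for OpenSearch 2.x; confirm the setting name and supported values for your version before applying it:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>PUT _cluster/settings
{
  "persistent": {
    "search.concurrent.max_slice_count": 2
  }
}
</code></pre></div></div>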
<h2 id="performance-results">Performance results</h2>
<p>We performed our tests on an <a href="https://opensearch.org/versions/opensearch-2-15-0.html">OpenSearch 2.15</a> cluster using the OpenSearch Benchmark <a href="https://github.com/opensearch-project/opensearch-benchmark-workloads/tree/main/vectorsearch">vector search workload</a>. We used the Cohere dataset with two different configurations to evaluate the performance improvements of vector search queries when running the workload with concurrent segment search disabled, enabled with default settings, and enabled with different max slice counts.</p>
<h3 id="cluster-setup">Cluster setup</h3>
<ul>
<li>3 data nodes (r5.4xlarge: 128 GB RAM, 16 vCPUs, 250 GB disk space)</li>
<li>3 cluster manager nodes (r5.xlarge: 32 GB RAM, 4 vCPUs, 50 GB disk space)</li>
<li>1 OpenSearch workload client (c5.4xlarge: 32 GB RAM, 16 vCPUs)</li>
<li>1 and 4 search clients</li>
<li><code class="language-plaintext highlighter-rouge">index_searcher</code> thread pool size: 32</li>
</ul>
<h4 id="index-settings">Index settings</h4>
<table>
<thead>
<tr>
<th><code class="language-plaintext highlighter-rouge">m</code></th>
<th><code class="language-plaintext highlighter-rouge">ef_construction</code></th>
<th><code class="language-plaintext highlighter-rouge">ef_search</code></th>
<th>Number of shards</th>
<th>Replica count</th>
<th>Space type</th>
</tr>
</thead>
<tbody>
<tr>
<td>16</td>
<td>100</td>
<td>100</td>
<td>6</td>
<td>1</td>
<td>inner product</td>
</tr>
</tbody>
</table>
<h4 id="configuration">Configuration</h4>
<table>
<thead>
<tr>
<th style="text-align: left">Dimension</th>
<th style="text-align: left">Vector count</th>
<th style="text-align: left">Search query count</th>
<th style="text-align: left">Refresh interval</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left">768</td>
<td style="text-align: left">10M</td>
<td style="text-align: left">10K</td>
<td style="text-align: left">1s (default)</td>
</tr>
</tbody>
</table>
<h3 id="service-time-comparison">Service time comparison</h3>
<p>We conducted the following experiments:</p>
<ol>
<li><a href="#experiment-1-concurrent-search-disabled">Concurrent search disabled</a></li>
<li>Concurrent search enabled:
<ul>
<li><a href="#experiment-2-concurrent-search-enabled-max-slice-count--0-default">Max slice count = 0 (default)</a></li>
<li><a href="#experiment-3-concurrent-search-enabled-max-slice-count--2">Max slice count = 2</a></li>
<li><a href="#experiment-4-concurrent-search-enabled-max-slice-count--4">Max slice count = 4</a></li>
<li><a href="#experiment-5-concurrent-search-enabled-max-slice-count--8">Max slice count = 8</a></li>
</ul>
</li>
</ol>
<p>The following sections present the results of these experiments.</p>
<h4 id="experiment-1-concurrent-search-disabled">Experiment 1: Concurrent search disabled</h4>
<table border="1">
<thead>
<tr>
<th>k-NN engine</th>
<th>Segment count</th>
<th>Num search clients</th>
<th colspan="3">Service time (ms)</th>
<th>Max CPU %</th>
<th>% JVM heap used</th>
<th>Recall</th>
</tr>
<tr>
<th></th>
<th></th>
<th></th>
<th>p50</th>
<th>p90</th>
<th>p99</th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">Lucene</td>
<td rowspan="2">381</td>
<td>1</td>
<td>30</td>
<td>37</td>
<td>45</td>
<td>11</td>
<td>53.48</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>36</td>
<td>43</td>
<td>51</td>
<td>38</td>
<td>42</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">NMSLIB</td>
<td rowspan="2">383</td>
<td>1</td>
<td>28</td>
<td>35</td>
<td>41</td>
<td>10</td>
<td>47.5</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>35</td>
<td>41</td>
<td>46</td>
<td>36</td>
<td>48.06</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">Faiss</td>
<td rowspan="2">381</td>
<td>1</td>
<td>29</td>
<td>37</td>
<td>42</td>
<td>10</td>
<td>47.85</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>36</td>
<td>40</td>
<td>44</td>
<td>38</td>
<td>46.38</td>
<td>0.97</td>
</tr>
</tbody>
</table>
<h4 id="experiment-2-concurrent-search-enabled-max-slice-count--0-default">Experiment 2: Concurrent search enabled, max slice count = 0 (default)</h4>
<table border="1">
<thead>
<tr>
<th>k-NN engine</th>
<th>Segment count</th>
<th>Num search clients</th>
<th colspan="3">Service time (ms)</th>
<th>Max CPU %</th>
<th>% JVM heap used</th>
<th>Recall</th>
</tr>
<tr>
<th></th>
<th></th>
<th></th>
<th>p50</th>
<th>p90</th>
<th>p99</th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">Lucene</td>
<td rowspan="2">381</td>
<td>1</td>
<td>13</td>
<td>15</td>
<td>17</td>
<td>47</td>
<td>47.99</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>27</td>
<td>32</td>
<td>37</td>
<td>81</td>
<td>45.95</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">NMSLIB</td>
<td rowspan="2">383</td>
<td>1</td>
<td>13</td>
<td>14</td>
<td>16</td>
<td>38</td>
<td>47.28</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>24</td>
<td>27</td>
<td>32</td>
<td>75</td>
<td>44.76</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">Faiss</td>
<td rowspan="2">381</td>
<td>1</td>
<td>13</td>
<td>14</td>
<td>16</td>
<td>34</td>
<td>46.04</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>25</td>
<td>28</td>
<td>33</td>
<td>76</td>
<td>47.72</td>
<td>0.97</td>
</tr>
</tbody>
</table>
<h4 id="experiment-3-concurrent-search-enabled-max-slice-count--2">Experiment 3: Concurrent search enabled, max slice count = 2</h4>
<table border="1">
<thead>
<tr>
<th>k-NN engine</th>
<th>Segment count</th>
<th>Num search clients</th>
<th colspan="3">Service time (ms)</th>
<th>Max CPU %</th>
<th>% JVM heap used</th>
<th>Recall</th>
</tr>
<tr>
<th></th>
<th></th>
<th></th>
<th>p50</th>
<th>p90</th>
<th>p99</th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">Lucene</td>
<td rowspan="2">381</td>
<td>1</td>
<td>14</td>
<td>16</td>
<td>19</td>
<td>41</td>
<td>52.91</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>28</td>
<td>34</td>
<td>42</td>
<td>88</td>
<td>51.65</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">NMSLIB</td>
<td rowspan="2">383</td>
<td>1</td>
<td>20</td>
<td>23</td>
<td>25</td>
<td>16</td>
<td>44.97</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>23</td>
<td>27</td>
<td>33</td>
<td>60</td>
<td>41.06</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">Faiss</td>
<td rowspan="2">381</td>
<td>1</td>
<td>20</td>
<td>22</td>
<td>24</td>
<td>19</td>
<td>46.42</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>23</td>
<td>26</td>
<td>32</td>
<td>67</td>
<td>37.23</td>
<td>0.97</td>
</tr>
</tbody>
</table>
<h4 id="experiment-4-concurrent-search-enabled-max-slice-count--4">Experiment 4: Concurrent search enabled, max slice count = 4</h4>
<table border="1">
<thead>
<tr>
<th>k-NN engine</th>
<th>Segment count</th>
<th>Num search clients</th>
<th colspan="3">Service time (ms)</th>
<th>Max CPU %</th>
<th>% JVM heap used</th>
<th>Recall</th>
</tr>
<tr>
<th></th>
<th></th>
<th></th>
<th>p50</th>
<th>p90</th>
<th>p99</th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">Lucene</td>
<td rowspan="2">381</td>
<td>1</td>
<td>13.6</td>
<td>15.9</td>
<td>17.6</td>
<td>49</td>
<td>53.37</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>28</td>
<td>33</td>
<td>41</td>
<td>86</td>
<td>50.12</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">NMSLIB</td>
<td rowspan="2">383</td>
<td>1</td>
<td>14</td>
<td>15</td>
<td>16</td>
<td>29</td>
<td>51.12</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>21</td>
<td>25</td>
<td>31</td>
<td>72</td>
<td>42.63</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">Faiss</td>
<td rowspan="2">381</td>
<td>1</td>
<td>14</td>
<td>15</td>
<td>17</td>
<td>30</td>
<td>41.1</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>23</td>
<td>28</td>
<td>37</td>
<td>77</td>
<td>47.19</td>
<td>0.97</td>
</tr>
</tbody>
</table>
<h4 id="experiment-5-concurrent-search-enabled-max-slice-count--8">Experiment 5: Concurrent search enabled, max slice count = 8</h4>
<table border="1">
<thead>
<tr>
<th>k-NN engine</th>
<th>Segment count</th>
<th>Num search clients</th>
<th colspan="3">Service time (ms)</th>
<th>Max CPU %</th>
<th>% JVM heap used</th>
<th>Recall</th>
</tr>
<tr>
<th></th>
<th></th>
<th></th>
<th>p50</th>
<th>p90</th>
<th>p99</th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="2">Lucene</td>
<td rowspan="2">381</td>
<td>1</td>
<td>14</td>
<td>16</td>
<td>18</td>
<td>43</td>
<td>45.37</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>28</td>
<td>34</td>
<td>43</td>
<td>87</td>
<td>48.79</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">NMSLIB</td>
<td rowspan="2">383</td>
<td>1</td>
<td>10</td>
<td>12</td>
<td>14</td>
<td>41</td>
<td>45.21</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>23</td>
<td>25</td>
<td>29</td>
<td>75</td>
<td>45.87</td>
<td>0.97</td>
</tr>
<tr>
<td rowspan="2">Faiss</td>
<td rowspan="2">381</td>
<td>1</td>
<td>15</td>
<td>16</td>
<td>17</td>
<td>44</td>
<td>48.68</td>
<td>0.97</td>
</tr>
<tr>
<td>4</td>
<td>23</td>
<td>26</td>
<td>32</td>
<td>79</td>
<td>47.19</td>
<td>0.97</td>
</tr>
</tbody>
</table>
<h3 id="comparing-results">Comparing results</h3>
<p>For simplicity, we’ll focus on the p90 metric with a single search client because this metric captures the performance of long-running vector search queries.</p>
<h4 id="service-time-comparison-p90">Service time comparison (p90)</h4>
<table>
<thead>
<tr>
<th>k-NN engine</th>
<th>Concurrent segment search disabled</th>
<th>Concurrent segment search enabled (Lucene default number of slices)</th>
<th>% Improvement</th>
<th>Concurrent segment search with max slice count = 2</th>
<th>% Improvement</th>
<th>Concurrent segment search with max slice count = 4</th>
<th>% Improvement</th>
<th>Concurrent segment search with max slice count = 8</th>
<th>% Improvement</th>
</tr>
</thead>
<tbody>
<tr>
<td>Lucene</td>
<td>37</td>
<td>15</td>
<td>59.5</td>
<td>16</td>
<td>56.8</td>
<td>15.9</td>
<td>57</td>
<td>16</td>
<td>56.8</td>
</tr>
<tr>
<td>NMSLIB</td>
<td>35</td>
<td>14</td>
<td>60</td>
<td>23</td>
<td>34.3</td>
<td>15</td>
<td>57.1</td>
<td>12</td>
<td>65.7</td>
</tr>
<tr>
<td>Faiss</td>
<td>37</td>
<td>14</td>
<td>62.2</td>
<td>22</td>
<td>40.5</td>
<td>15</td>
<td>59.5</td>
<td>16</td>
<td>56.8</td>
</tr>
</tbody>
</table>
<h4 id="cpu-utilization-comparison">CPU utilization comparison</h4>
<table>
<thead>
<tr>
<th>k-NN engine</th>
<th>Concurrent segment search disabled</th>
<th>Concurrent segment search enabled (Lucene default number of slices)</th>
<th>% Additional CPU utilization</th>
<th>Concurrent segment search with max slice count = 2</th>
<th>% Additional CPU utilization</th>
<th>Concurrent segment search with max slice count = 4</th>
<th>% Additional CPU utilization</th>
<th>Concurrent segment search with max slice count = 8</th>
<th>% Additional CPU utilization</th>
</tr>
</thead>
<tbody>
<tr>
<td>Lucene</td>
<td>11</td>
<td>47</td>
<td>36</td>
<td>41</td>
<td>30</td>
<td>49</td>
<td>38</td>
<td>43</td>
<td>32</td>
</tr>
<tr>
<td>NMSLIB</td>
<td>10</td>
<td>38</td>
<td>28</td>
<td>16</td>
<td>6</td>
<td>29</td>
<td>19</td>
<td>41</td>
<td>31</td>
</tr>
<tr>
<td>Faiss</td>
<td>10</td>
<td>34</td>
<td>24</td>
<td>19</td>
<td>9</td>
<td>30</td>
<td>20</td>
<td>44</td>
<td>34</td>
</tr>
</tbody>
</table>
<p>As demonstrated by our performance benchmarks, enabling concurrent segment search with the default slice count delivers roughly a <strong>60% improvement</strong> (59.5–62.2%, depending on the engine) in vector search service time while requiring only <strong>24–36% more CPU</strong>. This increase in CPU utilization is expected because concurrent segment search runs on more CPU threads; the number of threads is equal to twice the number of CPU cores.</p>
<p>We observed a similar improvement in service time when using multiple concurrent search clients. However, maximum CPU utilization also doubled, as expected, because of the increased number of active search threads running concurrently.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Our experiments clearly show that enabling concurrent segment search with the default slice count improves vector search query performance, albeit at the cost of higher CPU utilization. We recommend testing your workload to determine whether the additional parallelization achieved by increasing the slice count outweighs the additional processing overhead.</p>
<p>Before running concurrent segment search, we recommend force-merging segments into a single segment to achieve better performance. The major disadvantage of this approach is that the time required for force-merging increases as segments grow larger. Thus, we recommend reducing the number of segments in accordance with your use case.</p>
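<p>For example, you can trigger a force merge down to a single segment with the Force Merge API; the index name below is only a placeholder:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /my-vector-index/_forcemerge?max_num_segments=1
</code></pre></div></div>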
<p>By combining vector search with concurrent segment search, you can improve query performance and optimize search operations. To get started with concurrent segment search, explore the <a href="https://opensearch.org/docs/latest/search-plugins/concurrent-segment-search/">documentation</a>.</p></content><author><name>vijay</name></author><category term="technical-posts" /><category term="search" /><summary type="html">In OpenSearch, data is stored in shards, which are further divided into segments. When you execute a search query, it runs sequentially across all segments of each shard involved in the query. As the number of segments increases, this sequential execution can increase query latency (the time it takes to retrieve the results) because the query has to wait for each segment run to complete before moving on to the next one. This delay becomes especially noticeable if some segments take longer to process queries than others.</summary></entry><entry><title type="html">Improving search efficiency and accuracy with the newest v2 neural sparse models</title><link href="https://kolchfa-aws.github.io/blog/neural-sparse-v2-models/" rel="alternate" type="text/html" title="Improving search efficiency and accuracy with the newest v2 neural sparse models" /><published>2024-08-21T00:00:00+00:00</published><updated>2024-08-21T20:01:58+00:00</updated><id>https://kolchfa-aws.github.io/blog/neural-sparse-v2-models</id><content type="html" xml:base="https://kolchfa-aws.github.io/blog/neural-sparse-v2-models/"><p>Neural sparse search is a novel and efficient method for semantic retrieval, <a href="https://opensearch.org/blog/improving-document-retrieval-with-sparse-semantic-encoders/">introduced in OpenSearch 2.11</a>. Sparse encoding models encode text into (token, weight) entries, allowing OpenSearch to build indexes and perform searches using Lucene’s inverted index. Neural sparse search is efficient and generalizes well in out-of-domain (OOD) scenarios. We are excited to announce the release of our v2 series neural sparse models:</p>
<ul>
<li><strong>v2-distill model</strong>: This model <strong>reduces model parameters by 50%</strong>, resulting in lower memory requirements and costs. It <strong>increases ingestion throughput by 1.39x on GPUs and 1.74x on CPUs</strong>. The v2-distill architecture supports both the doc-only and bi-encoder modes.</li>
<li><strong>v2-mini model</strong>: This model <strong>reduces model parameters by 75%</strong>, also reducing memory requirements and costs. It <strong>increases ingestion throughput by 1.74x on GPUs and 4.18x on CPUs</strong>. The v2-mini architecture supports the doc-only mode.</li>
</ul>
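<p>As a minimal sketch of how one of these pretrained models can be deployed, the following ML Commons request registers the doc-only v2-distill model. The model name and version strings are illustrative and should be verified against the pretrained models documentation linked below:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>POST /_plugins/_ml/models/_register
{
  "name": "amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v2-distill",
  "version": "1.0.0",
  "model_format": "TORCH_SCRIPT"
}
</code></pre></div></div>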
<p>Additionally, all v2 models achieve <strong>better search relevance</strong>. The following table compares search relevance between the v1 and v2 models. All v2 models are now available in both <a href="https://opensearch.org/docs/latest/ml-commons-plugin/pretrained-models/#sparse-encoding-models">OpenSearch</a> and <a href="https://huggingface.co/opensearch-project">Hugging Face</a>.</p>
<table>
<thead>
<tr>
<th>Model</th>
<th>Requires no inference for retrieval</th>
<th>Model parameters</th>
<th>AVG NDCG@10</th>
</tr>
</thead>
<tbody>
<tr>