Description
There are packages whose number of fields grows with each version. These packages will at some point hit the limit of 2048 fields per data stream that we enforce now. An example is the amazon_security_lake package, which includes many fields from OCSF.
This limit, like others, exists to keep some control over the size of packages along different dimensions. In the case of data stream fields, the limit exists to avoid performance issues and other problems with indexes that have too many field mappings. See, for example, the warning about this in the Elasticsearch documentation (Mapping Limits docs).
The total number of fields allowed in a data stream (including dynamic mappings) can be configured in the data stream manifest, under `elasticsearch.index_template.settings`, through the `index.mapping.total_fields.limit` index setting.
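For reference, a data stream `manifest.yml` could raise the limit like this (a minimal sketch; the value 4096 is only an illustrative example, not a recommendation):

```yaml
# manifest.yml (data stream) — sketch, values are illustrative
title: Example data stream
type: logs
elasticsearch:
  index_template:
    settings:
      # Raise the maximum number of fields allowed in the index mappings.
      index.mapping.total_fields.limit: 4096
```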
Some options to explore:
- Allow skipping validation of the number of fields. I would avoid this, because it risks distributing problematic mappings.
- Allow overriding the limit on the number of fields. If we do that, I think we should still keep a hard limit that cannot be exceeded.
- Refactor the affected packages to make more use of dynamic mappings (see the sketch after this list). We can study the current case and provide a general recommendation to include in the docs. We could also be more flexible with the limits on definitions of dynamic mappings.
- Refactor the affected packages by splitting the data stream. Not desirable, as this would be a breaking change in most cases.
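As an illustration of the dynamic mappings option, a fields definition can group many leaf fields under a single dynamic object instead of declaring each one explicitly. A hedged sketch follows; the `ocsf.unmapped` name is hypothetical, and this assumes the `object` type with `object_type` as supported in package fields definitions:

```yaml
# fields/fields.yml — sketch; field names are hypothetical
# Instead of declaring hundreds of concrete OCSF leaf fields,
# one dynamic object definition covers anything under the prefix.
- name: ocsf.unmapped
  type: object
  object_type: keyword
  description: >
    Dynamically mapped OCSF fields without an explicit definition.
    Each matching leaf value is indexed as keyword.
```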
cc @mrodm @kpollich @ShourieG for thoughts about possible approaches.