Allow $ and @ characters in field names. (#4413)
* Allow $ and @ characters in field names.

* Add test and update docs.

* Fix typo.
fmassot authored Jan 22, 2024
1 parent f858455 commit af9aeef
Showing 3 changed files with 13 additions and 9 deletions.
7 changes: 4 additions & 3 deletions docs/configuration/index-config.md
@@ -491,12 +491,13 @@ src.port:53 AND query_params.ctk:e42bb897d
### Field name validation rules

Currently Quickwit only accepts field name that matches the following regular expression:
-`[a-zA-Z][_\.\-a-zA-Z0-9]*$`
+`^[@$_\-a-zA-Z][@$_\.\-a-zA-Z0-9]{0,254}$`

In plain language:
- it needs to have at least one character.
-- it should only contain latin letter `[a-zA-Z]` digits `[0-9]` or (`.`, `-`, `_`).
-- the first character needs to be a letter.
+- it can only contain uppercase and lowercase ASCII letters `[a-zA-Z]`, digits `[0-9]`, `.`, hyphens `-`, underscores `_`, at `@` and dollar `$` signs.
+- it must not start with a dot or a digit.
+- it must be different from Quickwit's reserved field mapping names `_source`, `_dynamic`, `_field_presence`.

:::caution
For field names containing the `.` character, you will need to escape it when referencing them. Otherwise the `.` character will be interpreted as a JSON object property access. Because of this, it is recommended to avoid using field names containing the `.` character.
2 changes: 2 additions & 0 deletions docs/configuration/storage-config.md
@@ -146,3 +146,5 @@ storage:
flavor: minio
endpoint: http://127.0.0.1:9000
```

+Note: `default_index_root_uri` or index URIs do not include the endpoint, you should set it as a typical S3 path such as `s3://indexes`.
13 changes: 7 additions & 6 deletions quickwit/quickwit-doc-mapper/src/default_doc_mapper/mod.rs
@@ -46,17 +46,17 @@ pub(crate) use self::tokenizer_entry::{
use crate::QW_RESERVED_FIELD_NAMES;

/// Regular expression validating a field mapping name.
-pub const FIELD_MAPPING_NAME_PATTERN: &str = r"^[_\-a-zA-Z][_\.\-a-zA-Z0-9]{0,254}$";
+pub const FIELD_MAPPING_NAME_PATTERN: &str = r"^[@$_\-a-zA-Z][@$_\.\-a-zA-Z0-9]{0,254}$";

/// Validates a field mapping name.
-/// Returns `Ok(())` if the name can be used for a field mapping. Does not check for reserved field
-/// mapping names such as `_source`.
+/// Returns `Ok(())` if the name can be used for a field mapping.
///
/// A field mapping name:
-/// - may only contain uppercase and lowercase ASCII letters `[a-zA-Z]`, digits `[0-9]`, hyphens
-/// `-`, and underscores `_`;
+/// - can only contain uppercase and lowercase ASCII letters `[a-zA-Z]`, digits `[0-9]`, `.`,
+/// hyphens `-`, underscores `_`, at `@` and dollar `$` signs;
/// - must not start with a dot or a digit;
-/// - must be different from Quickwit's reserved field mapping names `_source`, `_dynamic`;
+/// - must be different from Quickwit's reserved field mapping names `_source`, `_dynamic`,
+/// `_field_presence`;
/// - must not be longer than 255 characters.
pub fn validate_field_mapping_name(field_mapping_name: &str) -> anyhow::Result<()> {
static FIELD_MAPPING_NAME_PTN: Lazy<Regex> =
@@ -146,6 +146,7 @@ mod tests {
assert!(validate_field_mapping_name("my-field").is_ok());
assert!(validate_field_mapping_name("my.field").is_ok());
assert!(validate_field_mapping_name("my_field").is_ok());
+assert!(validate_field_mapping_name("$my_field@").is_ok());
assert!(validate_field_mapping_name(&"a".repeat(255)).is_ok());
}
}
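
For readers who want to try the new rules without pulling in the `regex` crate, the pattern `^[@$_\-a-zA-Z][@$_\.\-a-zA-Z0-9]{0,254}$` can be approximated with a std-only check. This is a minimal sketch, not the code in this commit: the function name `is_valid_field_name` and the local `RESERVED_FIELD_NAMES` array are hypothetical, and the reserved names are copied from the updated docs rather than from Quickwit's `QW_RESERVED_FIELD_NAMES`.

```rust
/// Reserved names as listed in the updated docs (assumption: the list is complete).
const RESERVED_FIELD_NAMES: [&str; 3] = ["_source", "_dynamic", "_field_presence"];

/// Hypothetical std-only check mirroring
/// `^[@$_\-a-zA-Z][@$_\.\-a-zA-Z0-9]{0,254}$` plus the reserved-name rule.
fn is_valid_field_name(name: &str) -> bool {
    let mut chars = name.chars();
    // First character: ASCII letter, `@`, `$`, `_` or `-` (no digit, no dot).
    let first_ok = match chars.next() {
        Some(c) => c.is_ascii_alphabetic() || matches!(c, '@' | '$' | '_' | '-'),
        None => return false, // the empty name is rejected
    };
    // Remaining characters additionally allow digits and `.`.
    let rest_ok =
        chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '@' | '$' | '_' | '-' | '.'));
    // Every accepted character is ASCII, so the byte length equals the character count.
    first_ok && rest_ok && name.len() <= 255 && !RESERVED_FIELD_NAMES.contains(&name)
}
```

Under these assumptions, `is_valid_field_name("$my_field@")` and `is_valid_field_name("@timestamp")` return `true`, while names such as `9lives`, `.hidden`, `_source`, and a 256-character string return `false`.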
