Missing type casting in AGG group qualifier encoding #227

@zipdoki

Bug

AGG queries return count=0 when a group field's property type is non-String (e.g., Long, Int).

Cause

Write path (mutation): Group field values are serialized using their native runtime type. For example, 1L (Long) is encoded via encodeInt64() (header 0x2C).

Read path (AGG query): WherePredicate.parse(ranges) parses the query string without schema context, so "someLongField:eq:1" yields a String "1". This value reaches ValueUtils.serialize() without any type casting, and is encoded via encodeString() (header 0x34).

The qualifier bytes differ, so the HBase lookup returns no match.
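
The mismatch can be reproduced in isolation. The following is a minimal sketch, not the project's actual encoders: `encodeInt64`, `encodeString`, and `serialize` are hypothetical stand-ins that share only their names and header bytes (0x2C, 0x34) with the description above, and the payload layout is assumed.

```kotlin
import java.nio.ByteBuffer

// Header bytes taken from the issue text; everything else here is assumed.
const val INT64_HEADER: Byte = 0x2C
const val STRING_HEADER: Byte = 0x34

// Hypothetical Int64 encoder: header byte followed by 8 big-endian bytes.
fun encodeInt64(value: Long): ByteArray =
    ByteBuffer.allocate(1 + Long.SIZE_BYTES)
        .put(INT64_HEADER)
        .putLong(value)
        .array()

// Hypothetical String encoder: header byte followed by the UTF-8 bytes.
fun encodeString(value: String): ByteArray =
    byteArrayOf(STRING_HEADER) + value.toByteArray(Charsets.UTF_8)

// Dispatches on the runtime type only, with no schema context.
fun serialize(value: Any): ByteArray = when (value) {
    is Long -> encodeInt64(value)
    is String -> encodeString(value)
    else -> throw IllegalArgumentException("unsupported type: ${value.javaClass.name}")
}
```

Under these assumptions, `serialize(1L)` and `serialize("1")` start with different header bytes, so a qualifier written from a Long can never equal one built from the parsed query string.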

The scan path avoids this problem by applying StartStopItem.ensureType() to cast predicate values to the correct schema type before encoding. The AGG path needs an equivalent type casting step.
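
As an illustration of what such a casting step might look like, here is a hedged sketch. The `FieldType` enum and `castToSchemaType` function are hypothetical names introduced for this example; the real schema model and the body of `ensureType()` are not shown in this issue.

```kotlin
// Hypothetical schema types; the project's real type model may differ.
enum class FieldType { STRING, INT, LONG }

// Coerces a predicate value (typically a String parsed from the query)
// to the runtime type the write path would have serialized.
// Throws NumberFormatException if a String cannot be parsed as a number.
fun castToSchemaType(value: Any, type: FieldType): Any = when (type) {
    FieldType.STRING -> value.toString()
    FieldType.INT -> when (value) {
        is Int -> value
        is Number -> value.toInt()
        else -> value.toString().toInt()
    }
    FieldType.LONG -> when (value) {
        is Long -> value
        is Number -> value.toLong()
        else -> value.toString().toLong()
    }
}
```

With this in place, `castToSchemaType("1", FieldType.LONG)` yields a `Long`, so a runtime-type-based serializer would pick the same encoding as the write path.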

Note: String-typed fields happen to work because both paths produce the same encoding. Fields with a bucket also work because bucket.handleQueryValue().toString() normalizes the value on both paths. These cases mask the problem.

Steps to reproduce

  1. Define a group with a non-String field:
mapOf(
    "group" to "my_group",
    "type" to "COUNT",
    "fields" to listOf(
        mapOf("name" to "someLongField", "bucket" to null),
        mapOf("name" to "timestamp", "bucket" to mapOf(/* date bucket */)),
    ),
)
  2. Insert edges with someLongField = 1L
  3. Query: GET .../edges/agg/my_group?...&ranges=someLongField:eq:1;time:eq:...

Expected: Returns the aggregated count
Actual: Returns count=0

Suggested fixes

Option A: Cast predicate values before encoding (minimal change)

Follow the same pattern as the scan path's StartStopItem.ensureType():

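val eqValues = firstPairs.map { (field, predicate) ->
    require(predicate is WherePredicate.Eq)
    // Unbucketed fields: cast to the schema type so the encoded qualifier
    // matches what the write path produced.
    val castedValue = if (field.bucket == null) {
        schema.getField(field.name).type.cast(predicate.value)
    } else {
        // Bucketed fields are normalized to String by the bucket itself.
        predicate.value
    }
    field.bucketOrGet(castedValue, ceil = false)
}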
  • Pros: Small change, consistent with the existing scan pattern
  • Cons: Only fixes V3QueryService.agg(). Future callers of bucketOrGet() remain unprotected.

Option B: Add type information to Group.Field (structural fix)

Make Group.Field type-aware so that bucketOrGet() itself handles casting:

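data class Field(
    val name: String,
    val bucket: Bucket? = null,
    val type: FieldType? = null, // schema type; null keeps the current behavior
) {
    // Order of precedence: bucket normalization, then schema cast, then raw value.
    fun bucketOrGet(value: Any, ceil: Boolean): Any =
        bucket?.handleQueryValue(value, ceil)?.toString()
            ?: type?.cast(value)
            ?: value
}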
  • Pros: Fixes the problem at its source. All callers of bucketOrGet() are safe by default.
  • Cons: Wider change — requires propagating type information when constructing Group.Field from the schema.

References

  • V3QueryService.agg() — parses ranges without schema
  • Group.Field.bucketOrGet() — no type casting when bucket is null
  • ValueUtils.serialize() — encodes based on runtime type
  • IndexedLabelMixin.StartStopItem.ensureType() — scan's type casting for reference
