Bug
AGG queries return count=0 when a group field's property type is non-String (e.g., Long, Int).
Cause
Write path (mutation): Group field values are serialized using their native runtime type. For example, 1L (Long) is encoded via encodeInt64() (header 0x2C).
Read path (AGG query): WherePredicate.parse(ranges) parses the query string without schema context, so "someLongField:eq:1" yields a String "1". This value reaches ValueUtils.serialize() without any type casting, and is encoded via encodeString() (header 0x34).
The qualifier bytes differ, so the HBase lookup returns no match.
The scan path avoids this problem by applying StartStopItem.ensureType() to cast predicate values to the correct schema type before encoding. The AGG path needs an equivalent type casting step.
Note: String-typed fields happen to work because both paths produce the same encoding. Fields with a bucket also work because bucket.handleQueryValue().toString() normalizes the value on both paths. These cases mask the problem.
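The mismatch can be illustrated with a minimal sketch. serializeSketch() below is a hypothetical stand-in for ValueUtils.serialize(), not the real implementation: only the two header bytes are taken from the description above, and the payload layout (8-byte big-endian for Long, UTF-8 for String) is an assumption.

```kotlin
import java.nio.ByteBuffer

// Header bytes quoted in this issue; everything else here is illustrative.
const val INT64_HEADER: Byte = 0x2C
const val STRING_HEADER: Byte = 0x34

// Hypothetical stand-in for ValueUtils.serialize(): the encoding is chosen
// purely from the runtime type of the value, with no schema involved.
fun serializeSketch(value: Any): ByteArray = when (value) {
    is Long -> byteArrayOf(INT64_HEADER) + ByteBuffer.allocate(8).putLong(value).array()
    is String -> byteArrayOf(STRING_HEADER) + value.toByteArray(Charsets.UTF_8)
    else -> throw IllegalArgumentException("unsupported type: ${value.javaClass.name}")
}
```

Under these assumptions, serializeSketch(1L) and serializeSketch("1") already differ in their first byte, so a qualifier built from the parsed String can never equal the one written by the mutation.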
Steps to reproduce
- Define a group with a non-String field:
mapOf(
    "group" to "my_group",
    "type" to "COUNT",
    "fields" to listOf(
        mapOf("name" to "someLongField", "bucket" to null),
        mapOf("name" to "timestamp", "bucket" to mapOf(/* date bucket */)),
    ),
)
- Insert edges with someLongField = 1L
- Query:
GET .../edges/agg/my_group?...&ranges=someLongField:eq:1;time:eq:...
Expected: Returns the aggregated count
Actual: Returns count=0
Suggested fixes
Option A: Cast predicate values before encoding (minimal change)
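Follow the same pattern as scan's StartStopItem.ensureType():
val eqValues = firstPairs.map { (field, predicate) ->
    require(predicate is WherePredicate.Eq)
    // Unbucketed fields: cast the parsed String to the schema type so it encodes like the write path.
    val castValue = if (field.bucket == null) {
        schema.getField(field.name).type.cast(predicate.value)
    } else {
        // Bucketed fields are normalized by the bucket itself, so no cast is needed.
        predicate.value
    }
    field.bucketOrGet(castValue, ceil = false)
}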
- Pros: Small change, consistent with the existing scan pattern
- Cons: Only fixes V3QueryService.agg(). Future callers of bucketOrGet() remain unprotected.
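What a cast from the parsed String to the schema type might look like is sketched below. FieldTypeSketch is a hypothetical enum introduced only for illustration; the real schema type and its cast() may behave differently, for example in how they report unparseable input.

```kotlin
// Hypothetical schema type with a cast() that converts a predicate value
// (typically a String from the query string) to the field's declared type.
enum class FieldTypeSketch {
    STRING, INT, LONG;

    fun cast(value: Any): Any = when (this) {
        STRING -> value.toString()
        INT -> when (value) {
            is Int -> value
            is Number -> value.toInt()
            else -> value.toString().toInt() // throws NumberFormatException on bad input
        }
        LONG -> when (value) {
            is Long -> value
            is Number -> value.toLong()
            else -> value.toString().toLong() // throws NumberFormatException on bad input
        }
    }
}
```

With this sketch, FieldTypeSketch.LONG.cast("1") yields the Long 1, which is the runtime type the write path serialized.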
Option B: Add type information to Group.Field (structural fix)
Make Group.Field type-aware so that bucketOrGet() itself handles casting:
data class Field(
    val name: String,
    val bucket: Bucket? = null,
    val type: FieldType? = null,
) {
    fun bucketOrGet(value: Any, ceil: Boolean): Any =
        bucket?.handleQueryValue(value, ceil)?.toString()
            ?: type?.cast(value)
            ?: value
}
- Pros: Fixes the problem at its source. All callers of bucketOrGet() are safe by default.
- Cons: Wider change; requires propagating type information when constructing Group.Field from the schema.
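The fallback order in Option B can be exercised in isolation with the sketch below. BucketSketch, CastSketch, FieldSketch, and toLongSketch are hypothetical names used only here; the nullable return type of handleQueryValue() is an assumption inferred from the ?. call in the proposal.

```kotlin
// Hypothetical stand-ins; the real Bucket and FieldType interfaces may differ.
fun interface BucketSketch {
    fun handleQueryValue(value: Any, ceil: Boolean): Any?
}

fun interface CastSketch {
    fun cast(value: Any): Any
}

data class FieldSketch(
    val name: String,
    val bucket: BucketSketch? = null,
    val type: CastSketch? = null,
) {
    // Same order as the proposal: bucket first, then schema type, then the raw value.
    fun bucketOrGet(value: Any, ceil: Boolean): Any =
        bucket?.handleQueryValue(value, ceil)?.toString()
            ?: type?.cast(value)
            ?: value
}

// Illustrative cast for a Long-typed field.
val toLongSketch = CastSketch { it.toString().toLong() }
```

For an unbucketed field, FieldSketch("someLongField", type = toLongSketch).bucketOrGet("1", ceil = false) returns a Long, while a field with neither bucket nor type passes the String through unchanged, which is the current behavior.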
References
V3QueryService.agg() — parses ranges without schema
Group.Field.bucketOrGet() — no type casting when bucket is null
ValueUtils.serialize() — encodes based on runtime type
IndexedLabelMixin.StartStopItem.ensureType() — scan's type casting for reference