[BUG]: unsigned values that exceed long limit should be converted to string #61

@jpmcmu

Description

Spark does not have unsigned types. Unsigned values in HPCC data that exceed what a long can represent should be converted to strings. Today, these values are misinterpreted when converted to Java types.
The nid field of this file is a good example: https://eclwatch-hpcc.us-prod400thor-prod.azure.lnrsg.io/esp/files/index.html#/files/data/thor_data400::base::emailage_ingest::20250818
I don't believe Spark can correctly represent any HPCC unsigned type larger than unsigned4.
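For illustration, a minimal plain-Java sketch (not connector code) of the misinterpretation described above: an unsigned8 value larger than Long.MAX_VALUE arrives as a raw 64-bit pattern that a signed Java long reads as a negative number, while a string or BigInteger representation preserves the true value.

```java
import java.math.BigInteger;

public class Unsigned8Demo {
    public static void main(String[] args) {
        // The unsigned8 value 2^64 - 1 arrives as the bit pattern 0xFF...FF,
        // which a signed Java long interprets as -1.
        long raw = 0xFFFFFFFFFFFFFFFFL;
        System.out.println(raw);                        // -1

        // Interpreting the same bits as unsigned recovers the true value.
        System.out.println(Long.toUnsignedString(raw)); // 18446744073709551615

        // A string (or BigInteger) representation preserves the full range.
        BigInteger value = new BigInteger(Long.toUnsignedString(raw));
        System.out.println(value);                      // 18446744073709551615
    }
}
```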

Steps to Reproduce

  1. Read a dataset with an unsigned8

Expected Behavior

Unsigned8 values are read correctly

Actual Behavior

Unsigned8 values are read as Java Longs; values above Long.MAX_VALUE overflow and appear negative in Spark

HPCC Systems Version

9.10.x

HPCC4J Module

spark-hpcc - Spark connector for reading/writing HPCC datasets

HPCC4J Version

9.10.x

Java Version

Java 11

Environment/Configuration

Databricks

Code Sample

Additional Context

No response

Pre-submission Checklist


🤖 AI Validation Response

  • Ready for re-review (Check this box when you've addressed the feedback and want the AI to validate again)

Thank you for the detailed bug report!

Your issue is well-documented and provides all the information we need to investigate the Unsigned8 handling in spark-hpcc. This is a known limitation due to Java's lack of unsigned 64-bit integer support, and your environment and steps are clear.

📋 Action Required

  1. Review the workaround in our documentation for handling Unsigned8 fields in Spark.
  2. If you need to preserve the full unsigned8 range, configure your pipeline to treat these fields as strings or BigIntegers.
  3. Let us know if the documented solution does not address your use case or if you have further questions.
ℹ️ How to Find This Information

Workaround for Unsigned8 Handling

  • See the Common Issues and Solutions wiki, section: "Long type overflow error Message".
  • In Spark, unsigned8 values exceeding 2^63 - 1 will overflow Java Longs and appear negative.
  • To avoid this, use the connector option to treat unsigned8 fields as strings or BigIntegers.
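As a stopgap for data already read into Spark, overflowed values can also be repaired after the fact, since a negative long is exactly an unsigned8 that exceeded Long.MAX_VALUE. A minimal plain-Java sketch of that reinterpretation (the helper name is illustrative, not a connector API):

```java
import java.math.BigInteger;

public class RecoverUnsigned8 {
    // Reinterpret a signed long (as read by spark-hpcc today) as the
    // original unsigned8 value. Negative longs are values that overflowed,
    // so adding 2^64 restores the intended magnitude.
    static BigInteger asUnsigned(long raw) {
        BigInteger v = BigInteger.valueOf(raw);
        return raw < 0 ? v.add(BigInteger.ONE.shiftLeft(64)) : v;
    }

    public static void main(String[] args) {
        System.out.println(asUnsigned(42L)); // 42
        System.out.println(asUnsigned(-1L)); // 18446744073709551615
    }
}
```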

Checking Your Configuration

  • Review your spark-hpcc connector options for unsigned8 handling.
  • If you need help configuring this, let us know your current pipeline setup.
⚠️ Important Notes
  • This is a known limitation: Java does not natively support unsigned 64-bit integers.
  • HPCC4J 9.10.x with HPCC Platform 9.10.x and Java 11 is a valid, supported combination.
  • Never share sensitive data (like passwords) in public issues.

Tip: This is covered in our documentation. Please review the suggested section and let us know if you need further clarification.


Please update this issue if you have additional questions or if the workaround does not resolve your problem. We're here to help!

