Data Type Change in Adapter Causes Data Storage Failure in InfluxDB #2954

Open
tenthe opened this issue Jun 25, 2024 · 1 comment · Fixed by #2958
Labels
bug Something isn't working
Milestone

Comments

@tenthe
Contributor

tenthe commented Jun 25, 2024


Problem

If you edit an adapter and change the data type of a field, new data is no longer saved in InfluxDB.

How to Reproduce:

  1. Create an adapter with a temperature field as an integer and create a pipeline to store it in the data lake.
  2. Change the data type of the temperature field to double.
  3. An exception appears in the data lake sink because it tries to write into the same measurement "temperature" with two different data types.
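
The failure in step 3 can be illustrated without a running database. The following is a minimal in-memory sketch, assuming that a measurement fixes the type of a field on the first write and rejects later writes that use a different type, which is the behavior the exception describes. `Measurement` and `FieldTypeConflict` are hypothetical names for illustration, not StreamPipes or InfluxDB APIs.

```python
class FieldTypeConflict(Exception):
    """Raised when a field is written with a type that differs from the stored one."""


class Measurement:
    """Toy stand-in for a measurement (hypothetical, not a real client API)."""

    def __init__(self, name):
        self.name = name
        self.field_types = {}  # field name -> Python type fixed by the first write
        self.points = []

    def write(self, fields):
        # Validate all fields first so a rejected point is not partially stored.
        for key, value in fields.items():
            stored = self.field_types.get(key)
            if stored is not None and stored is not type(value):
                raise FieldTypeConflict(
                    f"field '{key}' in '{self.name}' is {stored.__name__}, "
                    f"got {type(value).__name__}"
                )
        for key, value in fields.items():
            self.field_types.setdefault(key, type(value))
        self.points.append(dict(fields))


def reproduce():
    m = Measurement("temperature")
    m.write({"temperature": 21})        # step 1: adapter emits integers
    try:
        m.write({"temperature": 21.5})  # step 2: adapter now emits doubles
    except FieldTypeConflict as exc:
        return str(exc)                 # step 3: the write is rejected
    return None
```

Calling `reproduce()` returns the conflict message instead of storing the second point, mirroring the exception seen in the data lake sink.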

StreamPipes Committer

I acknowledge that I am a maintainer/committer of the Apache StreamPipes project.

@tenthe tenthe added the bug Something isn't working label Jun 25, 2024
@tenthe tenthe added this to the 0.97.0 milestone Jun 25, 2024
@tenthe
Contributor Author

tenthe commented Jun 26, 2024

When data is written to the data lake, an entry is created in CouchDB with the schema information and the data. If a user adjusts the event schema of an adapter, the CouchDB entry is updated as well, but the data already stored in InfluxDB remains unchanged.

Problem

  • Adding new properties or renaming a property works without issues.
  • Changing the data type of a property causes a runtime error when writing the data, because the existing column in the database has a different data type.
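
To make the distinction between harmless and breaking edits concrete, here is a small sketch that compares an old and a new schema. It assumes, purely for illustration, that a schema can be represented as a plain mapping from property name to data type name; `classify_schema_changes` is a hypothetical helper and not part of StreamPipes.

```python
def classify_schema_changes(old_schema, new_schema):
    """Compare two {property name: data type} mappings.

    Returns the properties that were added, removed, or kept under
    the same name with a different data type.
    """
    added = sorted(set(new_schema) - set(old_schema))
    removed = sorted(set(old_schema) - set(new_schema))
    retyped = sorted(
        name
        for name in set(old_schema) & set(new_schema)
        if old_schema[name] != new_schema[name]
    )
    return {"added": added, "removed": removed, "retyped": retyped}


# Example schemas (assumed type names, for illustration only).
old = {"timestamp": "long", "temperature": "integer"}
new = {"timestamp": "long", "temperature": "double", "humidity": "double"}
changes = classify_schema_changes(old, new)
# changes["retyped"] lists the properties that would hit the type conflict.
```

In this representation a rename shows up as one removed and one added property, so only the `retyped` list points at edits that break writing.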

Potential Solutions

  • Migrate old entries in the database to match the new schema.
  • Adapt the schema internally to prevent the issue. This requires checking for pipelines that write data to the DataLake when editing the adapter.
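
The first option, migrating old entries, could look roughly like the sketch below. It works on in-memory rows only and assumes a small set of type names (`integer`, `double`, `string`); a real migration would have to read the existing points from the measurement and rewrite them, which is not shown here. `CASTS` and `migrate_rows` are hypothetical names.

```python
# Assumed mapping from schema type names to Python casts (illustrative only).
CASTS = {"integer": int, "double": float, "string": str}


def migrate_rows(rows, field, new_type):
    """Return copies of the rows with `field` cast to `new_type`.

    Rows that lack the field, or hold None for it, are copied unchanged.
    """
    cast = CASTS[new_type]
    migrated = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not mutated
        if row.get(field) is not None:
            row[field] = cast(row[field])
        migrated.append(row)
    return migrated
```

Widening casts such as integer to double keep the stored values intact, while narrowing casts such as double to integer truncate them, so a migration in that direction would lose information.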

Current Workaround for Users

  • For now, we will show a warning when the user changes the data type.
  • Change the name of the property whose data type changed.
  • Write data to a new measurement.
  • Truncate the old data in the database.

tenthe added a commit that referenced this issue Jun 26, 2024
…n influxdb (#2958)

* fix(#2954): Add test to reproduce problem

* fix(#2954): Add warning message to data type change if adapter is edited
@tenthe tenthe reopened this Jun 26, 2024