
Feature: improve data set generator #52

Merged
sarusso merged 31 commits into develop from feature/improve_data_set_generator
Dec 2, 2025


clarasaja commented Dec 2, 2025

This PR updates the generate workflow in HumiTempDatasetGenerator.
It introduces a sub-series concatenation approach and supports multiple occurrences of the same anomaly type within a single series.
The plotting method is also updated to handle these cases.

Changes

New sub-series workflow
Each final time series is built by concatenating n sub-series, each containing zero or one anomaly.

if len(anomalies) == 0:
    # No anomalies requested: a single anomaly-free sub-series
    n = 1
elif auto_repeat_anomalies:
    # Anomalies are picked randomly with no constraint on repeats
    n = max_anomalies_per_series
else:
    # Anomalies are picked randomly, but each list item is used at most once
    n = min(max_anomalies_per_series, len(anomalies))

This approach allows a single series to include multiple anomalies, even of the same type.
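As a rough illustration of the selection logic above, here is a minimal sketch of how the anomaly for each sub-series might be chosen. The function name `pick_sub_series_anomalies` and the `rng` parameter are hypothetical and not taken from the PR; only `anomalies`, `max_anomalies_per_series` and `auto_repeat_anomalies` come from the description.

```python
import random


def pick_sub_series_anomalies(anomalies, max_anomalies_per_series,
                              auto_repeat_anomalies, rng=None):
    """Return one entry per sub-series: an anomaly name, or None for no anomaly.

    Hypothetical helper, not the actual HumiTempDatasetGenerator code.
    """
    rng = rng or random.Random()
    if len(anomalies) == 0:
        # A single sub-series without any anomaly
        return [None]
    if auto_repeat_anomalies:
        # Draw with replacement: the same anomaly may be picked several times
        n = max_anomalies_per_series
        return [rng.choice(anomalies) for _ in range(n)]
    # Draw without replacement: each list item is used at most once,
    # so an anomaly repeats only if it is listed more than once
    n = min(max_anomalies_per_series, len(anomalies))
    return rng.sample(anomalies, n)
```

With `auto_repeat_anomalies=False`, passing `['step_uv', 'step_uv', 'step_mv']` would still allow two `step_uv` sub-series, because sampling is per list item rather than per anomaly type.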

Support for variable anomaly ratios
Series without anomalies are alternated with series with anomalies according to anomalies_ratio.
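One way such an alternation could work is sketched below. It assumes that `anomalies_ratio` is the fraction of series that contain anomalies (between 0 and 1) and that anomalous series are spread evenly across the dataset; neither assumption is confirmed by the PR, and `plan_anomalous_series` is a hypothetical name.

```python
def plan_anomalous_series(n_series, anomalies_ratio):
    """Return a list of booleans, True where a series should contain anomalies.

    Hypothetical sketch: assumes anomalies_ratio is a fraction in [0, 1].
    """
    if not 0 <= anomalies_ratio <= 1:
        raise ValueError("anomalies_ratio must be between 0 and 1")
    n_anomalous = round(n_series * anomalies_ratio)
    # Integer arithmetic spreads the anomalous series evenly and
    # guarantees that exactly n_anomalous flags are True
    return [
        (i + 1) * n_anomalous // n_series > i * n_anomalous // n_series
        for i in range(n_series)
    ]
```

For example, a ratio of 0.5 over four series yields a strict alternation, starting with a series without anomalies.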

auto_repeat_anomalies flag
True: anomalies are automatically reused to fill the requested number per series.
False: anomalies appear only as many times as listed in the anomalies argument. Even with False, you can repeat anomalies by listing them multiple times.

Step anomaly limitation
step_uv / step_mv limitation: short series containing step anomalies are not yet supported.
_divide_time_interval() raises NotImplementedError if the time span is too short.
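The behaviour described above can be pictured with the following sketch, which splits a time span into contiguous sub-intervals and refuses spans that are too short. The function name, its signature and the one-hour `min_duration` threshold are assumptions made for illustration; the actual `_divide_time_interval()` may use a different criterion.

```python
from datetime import datetime, timedelta


def divide_time_interval_sketch(start, end, n, min_duration=timedelta(hours=1)):
    """Split [start, end] into n contiguous sub-intervals of equal length.

    Hypothetical sketch: the min_duration threshold is an assumption.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    step = (end - start) / n
    if step < min_duration:
        # Mirrors the limitation in the PR: short spans are not supported yet
        raise NotImplementedError(
            "time span too short for the requested number of sub-series"
        )
    return [(start + i * step, start + (i + 1) * step) for i in range(n)]
```

Raising `NotImplementedError` rather than `ValueError` signals that the input is valid in principle but the case is not handled yet.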

Static plotting method
plot_dataset() is now a static method and uses _plot_func to handle multiple anomalies of the same type.

et calls in timeseries_generators.py _plot_func
sarusso merged commit 011b618 into develop Dec 2, 2025
2 checks passed
agataben pushed a commit to agataben/ATS that referenced this pull request Dec 2, 2025
…taset_generator

Feature/improve dataset generator
clarasaja changed the title from "Feature/improve data set generator" to "Feature: improve data set generator" Dec 2, 2025
clarasaja deleted the feature/improve_data_set_generator branch December 14, 2025 18:52