[Failing Test]: MLTest.test_ml_preprocessing_yaml is flaky #36688

@mohamedawnallah

Where did this flake appear?

It appeared in the following CI workflow runs as part of PR #36684:
https://github.com/apache/beam/actions/runs/18947074452/job/54101320643?pr=36684#step:8:19830
https://github.com/apache/beam/actions/runs/18947074452/job/54101320654?pr=36684#step:8:19846
https://github.com/apache/beam/actions/runs/18947074452/job/54101320678?pr=36684#step:8:20260

Are there any captured failure logs for debugging?

self = <apache_beam.yaml.examples.testing.examples_test.MLExamplesTest testMethod=test_ml_preprocessing_yaml>

    @mock.patch('apache_beam.Pipeline', TestPipeline)
    def test_yaml_example(self):
      with open(pipeline_spec_file, encoding="utf-8") as f:
        lines = f.readlines()
      expected_key = '# Expected:\n'
      if expected_key in lines:
        expected = lines[lines.index('# Expected:\n') + 1:]
      else:
        raise ValueError(
            f"Missing '# Expected:' tag in example file '{pipeline_spec_file}'")
      for i, line in enumerate(expected):
        expected[i] = line.replace('#  ', '').replace('\n', '')
      expected = [line for line in expected if line]
    
      raw_spec_string = ''.join(lines)
      # Filter for any jinja preprocessor - this has to be done before other
      # preprocessors.
      jinja_preprocessor = [
          preprocessor for preprocessor in custom_preprocessors
          if 'jinja_preprocessor' in preprocessor.__name__
      ]
      if jinja_preprocessor:
        jinja_preprocessor = jinja_preprocessor[0]
        raw_spec_string = jinja_preprocessor(
            raw_spec_string, self._testMethodName)
        custom_preprocessors.remove(jinja_preprocessor)
    
      pipeline_spec = yaml.load(
          raw_spec_string, Loader=yaml_transform.SafeLineLoader)
    
      with TestEnvironment() as env:
        for fn in custom_preprocessors:
          pipeline_spec = fn(pipeline_spec, expected, env)
        with beam.Pipeline(options=PipelineOptions(
            pickle_library='cloudpickle',
            **yaml_transform.SafeLineLoader.strip_metadata(pipeline_spec.get(
                'options', {})))) as p:
          actual = [
>             yaml_transform.expand_pipeline(
                  p,
                  pipeline_spec,
                  [
                      yaml_provider.InlineProvider(
                          TEST_PROVIDERS, INPUT_TRANSFORM_TEST_PROVIDERS)
                  ])
          ]

apache_beam/yaml/examples/testing/examples_test.py:371: 
....
>                   gotit = waiter.acquire(True, timeout)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E                   Failed: Timeout (>600.0s) from pytest-timeout.

/opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/threading.py:359: Failed
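
For anyone reading the last frame: `gotit = waiter.acquire(True, timeout)` looks like the timed branch of `threading.Condition.wait`, which suggests the main thread was waiting on a signal that did not arrive before pytest-timeout stopped the test. That reading is an assumption based on the frame alone, not a confirmed root cause. The sketch below only illustrates the standard-library behavior; the helper names `wait_without_notifier` and `wait_with_notifier` are hypothetical and are not part of Beam.

```python
import threading


def wait_without_notifier(timeout: float) -> bool:
    """Wait on a condition that nobody notifies; returns False on timeout."""
    cond = threading.Condition()
    with cond:
        # Internally this performs a timed lock acquire, the same kind of
        # call that appears in the final traceback frame.
        return cond.wait(timeout)


def wait_with_notifier(delay: float, timeout: float) -> bool:
    """Wait on a condition that a helper thread notifies after `delay`."""
    cond = threading.Condition()

    def _notify() -> None:
        with cond:
            cond.notify_all()

    with cond:
        # Start the notifier while holding the lock, so it cannot notify
        # before this thread has entered wait().
        timer = threading.Timer(delay, _notify)
        timer.start()
        notified = cond.wait(timeout)
    timer.join()
    return notified


if __name__ == "__main__":
    print("no notifier:", wait_without_notifier(0.2))
    print("with notifier:", wait_with_notifier(0.05, 2.0))
```

If this reading is right, a thread dump taken while the test hangs should show which worker was expected to signal the waiting thread.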

Issue Failure

Failure: Test is flaky

Issue Priority

Priority: 1 (unhealthy code / failing or flaky postcommit so we cannot be sure the product is healthy)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
