
RecursionError when pickling Table object #2001

Open

adinamarca opened this issue Aug 23, 2024 · 2 comments

adinamarca commented Aug 23, 2024

Hello.

Environment details

  • OS type and version: macOS (latest) and Linux (Ubuntu 24.04, amd64)
  • Python version: 3.12.4
  • pip version: 24.0
  • google-cloud-bigquery version: tested from 3.18.0 through 3.29

Steps to reproduce

  1. Get results from any method that returns a RowIterator or _EmptyRowIterator.
  2. Serialize the results with pickle.dumps.
  3. Deserialize them with pickle.loads.

Code example

from os import environ
from pickle import dumps, loads

from google.cloud import bigquery

environ["GOOGLE_APPLICATION_CREDENTIALS"] = (
    "your_path"
)

def query_stackoverflow() -> None:
    client = bigquery.Client()
    results = client.query(
        """
        SELECT
          CONCAT(
            'https://stackoverflow.com/questions/',
            CAST(id as STRING)) as url,
          view_count
        FROM `bigquery-public-data.stackoverflow.posts_questions`
        WHERE tags like '%google-bigquery%'
        ORDER BY view_count DESC
        LIMIT 10"""
    )
    results = results.result()  # RowIterator
    results = list(results)  # list of Row objects

    pickled = dumps(results)  # pickling itself succeeds
    results = loads(pickled)  # RecursionError is raised here

query_stackoverflow()

Stack trace

Traceback (most recent call last):
  File "/some_path/repo/some_file.py", line 33, in query_stackoverflow
    results = loads(pickled)
              ^^^^^^^^^^^^^^
  File "/some_path/.cache/pypoetry/virtualenvs/some_env3.12/lib/python3.12/site-packages/google/cloud/bigquery/table.py", line 1586, in __getattr__
    value = self._xxx_field_to_index.get(name)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/some_path/.cache/pypoetry/virtualenvs/some_env3.12/lib/python3.12/site-packages/google/cloud/bigquery/table.py", line 1586, in __getattr__
    value = self._xxx_field_to_index.get(name)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/some_path/.cache/pypoetry/virtualenvs/some_env3.12/lib/python3.12/site-packages/google/cloud/bigquery/table.py", line 1586, in __getattr__
    value = self._xxx_field_to_index.get(name)
            ^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 995 more times]
RecursionError: maximum recursion depth exceeded

I first hit this in an Airflow task, where the maximum-recursion-depth exception was raised, and I then reproduced it on my personal computer.

Checking the code, I see that Row.__getattr__ (Row is what Table results are made of) reads self._xxx_field_to_index. On a freshly unpickled instance that attribute has not been restored yet, so the read itself falls back into __getattr__, which reads it again, and so on until the recursion limit is hit (my guess).
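
Here is a minimal, BigQuery-free sketch of what I think is happening. FragileRow is a made-up stand-in, not the real class, but it fails with the same traceback shape:

from pickle import dumps, loads

class FragileRow:
    """Hypothetical stand-in for bigquery's Row: __getattr__ reads an
    instance attribute that pickle has not restored yet."""

    def __init__(self, values, field_to_index):
        self._values = values
        self._field_to_index = field_to_index

    def __getattr__(self, name):
        # Python calls this for every attribute that normal lookup misses.
        # While unpickling, _field_to_index itself is still missing, so
        # this line re-enters __getattr__ and never terminates.
        index = self._field_to_index.get(name)
        if index is None:
            raise AttributeError(name)
        return self._values[index]

row = FragileRow(("hello",), {"greeting": 0})
print(row.greeting)  # works: prints "hello"
loads(dumps(row))    # RecursionError: maximum recursion depth exceeded

dumps succeeds because the live object has its attributes; loads fails because pickle probes the half-built object for __setstate__ before restoring its state.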

Thanks!

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Aug 23, 2024
@shollyman shollyman assigned chalmerlowe and unassigned yirutang Aug 27, 2024
@bhogan-bdai

We're encountering this when saving a Row object using Metaflow + Argo, which pickles task state as described above and hits the same infinite recursion.

The framework (in this case Metaflow) determines how state is saved (pickling); we determine what is saved. Saving a Row result is performant and simple when doing fan-out processing, one branch per row. Frameworks such as Airflow, Metaflow, and Argo delineate steps across Kubernetes pods, which is why state is saved between steps.

With this limitation, we need to take the additional step of converting each Row to a dict, which doesn't have such pickling limitations (see the sketch below).
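
The workaround, sketched against the code example above (client and the SQL string, here called QUERY, are as in the issue body; plain_rows is just our name):

rows = client.query(QUERY).result()

# Row supports keys()/__getitem__, so dict() flattens it into plain
# Python types that pickle round-trips without touching __getattr__.
plain_rows = [dict(row) for row in rows]

restored = loads(dumps(plain_rows))  # no RecursionError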

What is the priority of this work? It limits the use of BigQuery in large data-processing applications built on popular frameworks.


adinamarca commented Feb 12, 2025

I updated the issue with a reproducible code example.

Hope this gets fixed. I've been hitting this issue since 3.18.0, and it is still present in 3.29.

The only way I found to make pickling work was to avoid having any Row objects in the data before attempting serialization.
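
In case it helps triage, a guard at the top of Row.__getattr__ would break the cycle. This is only a sketch of one possible fix, not what the library does today:

def __getattr__(self, name):
    # During unpickling the internal attributes are not populated yet,
    # so a lookup of either of them must fail fast instead of
    # re-entering __getattr__ via self._xxx_field_to_index below.
    if name in ("_xxx_values", "_xxx_field_to_index"):
        raise AttributeError(name)
    value = self._xxx_field_to_index.get(name)
    if value is None:
        raise AttributeError(f"no row field {name!r}")
    return self._xxx_values[value]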
