Trino data reader #20127

joshuaFordyce · 2025-10-21T14:11:24Z

Description

This PR introduces a new TrinoReader (Data Loader) for the LlamaIndex ecosystem, built around the native trino-python-client. This implementation is designed to solve common production instability issues when querying Trino's distributed SQL engine.

Fixes #20126

Prevents connection leaks, which commonly lead to resource exhaustion on the Trino coordinator
Addresses the poor retrieval accuracy problem inherent in generic SQL Loaders
The transformation logic explicitly serializes data into a high-density key:value format and embeds this context into the Document's text and metadata. This significantly improves semantic search precision
Removes reliance on fragile, generic ODBC/JDBC wrappers
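
As a rough illustration of the key:value serialization described above, here is a minimal sketch. It is not the code in this PR: `Doc` is a hypothetical stand-in for LlamaIndex's `Document`, `rows_to_docs` is an illustrative name, and the comma-joined `key: value` layout is an assumption about what the format might look like. The only thing taken as given is the DB-API convention that each entry of `cursor.description` starts with the column name.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class Doc:
    """Hypothetical stand-in for a LlamaIndex Document (text plus metadata)."""

    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def rows_to_docs(rows: Sequence[Sequence[Any]], description: Sequence[Sequence[Any]]) -> List[Doc]:
    """Turn DB-API rows into documents whose text is 'column: value' pairs."""
    # Per the DB-API spec, the first item of each description entry is the column name.
    columns = [col[0] for col in description]
    docs = []
    for row in rows:
        record = dict(zip(columns, row))
        # Assumed layout: one "key: value" pair per column, joined by commas.
        text = ", ".join(f"{key}: {value}" for key, value in record.items())
        # The same pairs are kept as metadata so they stay filterable.
        docs.append(Doc(text=text, metadata=record))
    return docs


if __name__ == "__main__":
    description = [("id",) + (None,) * 6, ("name",) + (None,) * 6]
    for doc in rows_to_docs([(1, "alice"), (2, "bob")], description):
        print(doc.text, doc.metadata)
```

Keeping the column names next to the values in the text is what gives an embedding model something to match a query like "name of customer 2" against, which a bare tuple of values would not.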

New Package?

Yes

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

Yes
No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

Yes
No

Type of Change

Please delete options that are not relevant.

New feature (non-breaking change which adds functionality)
This change requires a documentation update

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

I added new unit tests to cover this change
I believe this change is already covered by existing unit tests

Suggested Checklist:

I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have added Google Colab support for the newly added notebooks.
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I ran uv run make format; uv run make lint to appease the lint gods

AstraBert

Some minor comments, but in order to get this PR approved you should change the location of the scripts as detailed in the first comment

AstraBert · 2025-10-23T15:51:38Z

llama-index-integrations/readers/llama-index-readers-trino/llama_index/readers/trino/base.py

If you want the package to be published correctly, the scripts should be moved under llama_index/readers/trino

AstraBert · 2025-10-23T15:55:17Z

llama-index-integrations/readers/llama-index-readers-trino/llama_index/base.py

+            "schema": schema,
+        }
+
+    def configureConnection(self) -> Tuple[trino.dbapi.Connection, trino.dbapi.Cursor]:


nit, better to use snake_case

AstraBert · 2025-10-23T15:57:12Z

llama-index-integrations/readers/llama-index-readers-trino/llama_index/base.py

+            cur.execute(query)
+
+            rows = cur.fetchall()
+            return [rows, cur.description]


Why don't we use the cursor and connection available within the class?
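
To make the suggestion concrete, here is a minimal sketch of a reader that creates its connection once, reuses it for every query, and always closes the cursor. Assumptions: `ReaderSketch`, `_configure_connection`, and `execute_query` are hypothetical names (snake_case, per the earlier nit), and the standard-library `sqlite3` module stands in for `trino.dbapi` so the example runs without a Trino cluster. Both follow the DB-API 2.0 interface, but this is not the PR's actual implementation.

```python
import sqlite3
from typing import Any, List, Optional, Tuple


class ReaderSketch:
    """Hypothetical reader that owns a single DB-API connection."""

    def __init__(self, database: str = ":memory:") -> None:
        self._database = database
        self._conn: Optional[sqlite3.Connection] = None

    def _configure_connection(self) -> sqlite3.Connection:
        # Create the connection lazily and reuse it on later calls.
        if self._conn is None:
            self._conn = sqlite3.connect(self._database)
        return self._conn

    def execute_query(self, query: str) -> Tuple[List[tuple], Any]:
        # Reuse the class-level connection instead of opening a new one per query.
        cur = self._configure_connection().cursor()
        try:
            cur.execute(query)
            return cur.fetchall(), cur.description
        finally:
            # Close the cursor even if execute() or fetchall() raises.
            cur.close()

    def close(self) -> None:
        # Release the connection explicitly so it cannot leak.
        if self._conn is not None:
            self._conn.close()
            self._conn = None


if __name__ == "__main__":
    reader = ReaderSketch()
    rows, description = reader.execute_query("SELECT 1 AS id, 'a' AS name")
    print(rows, [col[0] for col in description])
    reader.close()
```

Returning a tuple rather than a list for `(rows, description)` also matches the `Tuple[...]` style of annotation already used in the diff.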

AstraBert · 2025-10-23T16:08:11Z

llama-index-integrations/readers/llama-index-readers-trino/pyproject.toml

+# CRITICAL: Add the native Trino Python client as a dependency
+# Note: We use the llama-index-core version from the source template


Did you mean to commit these comments? 😅

AstraBert · 2025-10-23T16:09:30Z

llama-index-integrations/readers/llama-index-readers-trino/pyproject.toml

 exclude = ["_static", "build", "examples", "notebooks", "venv"]
 ignore_missing_imports = true
-python_version = "3.8"
+python_version = "3.9"  # Updated to align with your project's min Python version


We should be using 3.10, since 3.9 reached its EOL at the beginning of October... Also, I would get rid of the comment :)

AstraBert

Change requested as detailed below 👇

llama-index-integrations/readers/llama-index-readers-trino/llama_index/readers/trino/base.py

AstraBert

You should change the test import pattern, otherwise tests will fail

AstraBert · 2025-11-03T18:00:26Z

llama-index-integrations/readers/llama-index-readers-trino/tests/test_Trino_DataReader.py

+from unittest import mock
+from llama_index.core.readers.base import BaseReader
+import inspect
+from llama_index.base import TrinoReader


I might have overlooked this, but this module does not exist. If you want to import TrinoReader, you should use: from llama_index.readers.trino.base import TrinoReader. This is causing test failures.

Suggested change

-from llama_index.base import TrinoReader
+from llama_index.readers.trino.base import TrinoReader

Hi, thanks for that. Fixed it and looked over the feature to make sure everything else made sense.

AstraBert · 2025-11-03T18:07:45Z

llama-index-integrations/readers/llama-index-readers-trino/llama_index/readers/trino/init.py

This should be __init__.py. This is breaking package building

Hi, sorry about that, fixed it!
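
For readers wondering why the file name matters, here is a small self-contained sketch of one visible effect. It builds two throwaway packages in a temporary directory (the names `demo_ok`, `demo_bad`, and `DemoReader` are made up for illustration) and checks whether a class re-exported from the init file is importable. It assumes the reader is re-exported from the package's `__init__.py`, and it only demonstrates import behavior, not the packaging tooling itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def reader_is_exported(init_name: str, top: str) -> bool:
    """Build <top>/readers/trino/ with base.py and an init file named
    `init_name`, then report whether DemoReader is visible on the package."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / top / "readers" / "trino"
        pkg.mkdir(parents=True)
        (pkg / "base.py").write_text("class DemoReader:\n    pass\n")
        # Only a file literally named __init__.py is executed on import.
        (pkg / init_name).write_text("from .base import DemoReader\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(f"{top}.readers.trino")
            return hasattr(module, "DemoReader")
        finally:
            sys.path.remove(tmp)


if __name__ == "__main__":
    print("__init__.py:", reader_is_exported("__init__.py", "demo_ok"))
    print("init.py:", reader_is_exported("init.py", "demo_bad"))
```

With the misnamed file the directory is still importable as a namespace package, but the re-export never runs, so the class is missing from the package.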

joshuaFordyce added 5 commits October 7, 2025 11:31

added the TrinoReader stuff.

c28fe59

Added framework for this contribution

6932023

added comprehensive testing

b9ddb67

add documentation to TrinoReader

a060bdb

feat: add TrinoReader with comprehensive linting fixes

88762f5

joshuaFordyce marked this pull request as ready for review October 21, 2025 16:00

dosubot bot added the size:XL label (This PR changes 500-999 lines, ignoring generated files.) Oct 21, 2025

AstraBert requested changes Oct 23, 2025

completed review notes

a219c13

dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files.) and removed the size:XL label (This PR changes 500-999 lines, ignoring generated files.) Oct 27, 2025

AstraBert requested changes Oct 27, 2025

corrected OOP issues

d2d5bab

joshuaFordyce requested a review from AstraBert November 3, 2025 16:48

AstraBert requested changes Nov 3, 2025

AstraBert reviewed Nov 3, 2025

fixed package build breaking details

cc85b01

joshuaFordyce requested a review from AstraBert November 3, 2025 21:52

AstraBert and others added 2 commits November 4, 2025 14:16

fix: tests and package build (hopefully)

0c598b8

changed requires-python=4.0 to <=3.13 to fix maturin build error

48a674b
