amotl
Member

@amotl amotl commented Sep 28, 2025

About

Add information about how to use Conecta with CrateDB.

Preview

https://cratedb-guide--342.org.readthedocs.build/integrate/conecta/

coderabbitai bot commented Sep 28, 2025

Walkthrough

Adds a new documentation page docs/integrate/conecta/index.md that introduces the Conecta library with About, Features, Install, and Usage sections, including a Python read_sql example showing conversions to Arrow, pandas, and Polars, plus external links.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Docs: Conecta integration page (`docs/integrate/conecta/index.md`) | New documentation page describing Conecta (SQL→Arrow loader), Features (pooling, multithreading, partitioning, bind utilities), Install (pip command for conecta and pyarrow), Usage example querying PostgreSQL with read_sql, showing to_pylist, to_pandas, and pl.from_arrow, sample dataset fields, and external links to arro3, nanoarrow, pyarrow, and Conecta. |

Sequence Diagram(s)

(omitted — changes are documentation-only; no control-flow diagrams applicable)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

guidance

Suggested reviewers

  • hammerhead
  • karynzv
  • surister

Poem

A rabbit scurries, docs in paw,
Conecta carrots, neat and raw.
SQL nibbled, Arrow bright,
Dataframes hop into the light. 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The title succinctly and accurately describes the primary change, which is adding an index page for Conecta documentation, and matches the file addition in the changeset. |
| Description Check | ✅ Passed | The description clearly states that the pull request adds information on using Conecta with CrateDB and provides a preview link, which aligns with the added documentation page. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

Comment on lines +4 to +13
:::{rubric} About
:::

[Conecta] is a library designed to load data from SQL databases into Arrow
with maximum speed and memory efficiency by leveraging zero-copy and true
concurrency in Python.

Conecta integrates natively with the arrow ecosystem by supporting several
arrow libraries: [pyarrow], [arro3] and [nanoarrow]. Additionally, the
database results can easily be converted to Polars or pandas.
Member Author

@amotl amotl Sep 29, 2025

This section is a little flat and should elaborate more on Conecta's advanced features, right? Do you think it's still good enough for a start?

Member

I think it captures well what it is. If you want, add a feature list like the one we have in the README:

* Connection pooling
* Real multithreading
* Client-based query partition
* Utilities like: sql bind parameters
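
To illustrate what client-based query partitioning means in general, here is a minimal sketch that splits one integer key range into several range-restricted queries. The resulting list has the same shape as a `queries=[...]` argument to `read_sql`. The helper `partition_queries` and all of its parameters are hypothetical and not part of Conecta's API; Conecta's own partitioning may well work differently.

```python
from typing import List


def partition_queries(
    base_query: str, column: str, lower: int, upper: int, partitions: int
) -> List[str]:
    """Split the inclusive integer range [lower, upper] into contiguous
    half-open ranges and return one range-restricted query per range.

    Hypothetical helper for illustration only, not a Conecta function.
    """
    if partitions < 1:
        raise ValueError("partitions must be >= 1")
    if upper < lower:
        raise ValueError("upper must be >= lower")

    span = upper - lower + 1
    step = -(-span // partitions)  # ceiling division

    queries = []
    start = lower
    while start <= upper:
        end = min(start + step, upper + 1)
        # Wrap the base query so the range filter applies to its result.
        queries.append(
            f"SELECT * FROM ({base_query}) AS q "
            f"WHERE {column} >= {start} AND {column} < {end}"
        )
        start = end
    return queries


if __name__ == "__main__":
    for query in partition_queries(
        "SELECT mountain, height FROM sys.summits", "height", 0, 4999, 4
    ):
        print(query)
```

Each generated query covers a disjoint slice of the key range, so a client could run them concurrently and concatenate the results. The sketch assumes an integer column with known bounds; it may return fewer queries than requested when the range is smaller than the partition count.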

Member Author

Thanks. Added list with d8793d9.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a54b51 and d8793d9.

📒 Files selected for processing (1)
  • docs/integrate/conecta/index.md (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Build docs

Comment on lines +26 to +51
```shell
uv pip install --upgrade conecta pyarrow
```

:::{rubric} Usage
:::

```python
from pprint import pprint
from conecta import read_sql

table = read_sql(
    "postgres://crate:crate@localhost:5432/doc",
    queries=["SELECT country, region, mountain, height, latitude(coordinates), longitude(coordinates) FROM sys.summits ORDER BY height DESC LIMIT 3"],
)

# Display in Python format.
pprint(table.to_pylist())

# Optionally convert to pandas dataframe.
print(table.to_pandas())

# Optionally convert to Polars dataframe.
import polars as pl
print(pl.from_arrow(table))
```

@coderabbitai coderabbitai bot Oct 6, 2025

⚠️ Potential issue | 🟠 Major

Fix missing dependency for the Usage example.

The install snippet only pulls in conecta and pyarrow, yet the Usage section immediately does import polars as pl. Anyone following the docs verbatim will hit ModuleNotFoundError: No module named 'polars'. Either add polars to the install line (or document it as an optional extra) or wrap the Polars example so it doesn’t run unless Polars is present.

🤖 Prompt for AI Agents
In docs/integrate/conecta/index.md around lines 26 to 51, the Usage example
imports polars but the install snippet only installs conecta and pyarrow, which
will cause ModuleNotFoundError for polars; update the install instruction to
include polars (or note it as an optional extra like "pip install ... polars")
and/or mark the Polars conversion example as optional or guarded (e.g., add a
brief sentence that the Polars example requires installing polars or show it
behind an explicit optional section) so readers won’t encounter an import error
when following the docs verbatim.
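
One way to guard the Polars step, sketched here under the assumption that the example should degrade gracefully rather than fail, is to check for the module before importing it. The helper names `is_installed` and `to_polars_if_available` are made up for illustration; only `importlib.util.find_spec` and `pl.from_arrow` are real library calls.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


def to_polars_if_available(table):
    """Convert an Arrow table to a Polars DataFrame.

    Returns None instead of raising ModuleNotFoundError when Polars
    is not installed. Hypothetical helper, not part of Conecta.
    """
    if not is_installed("polars"):
        return None
    import polars as pl  # imported lazily, only when present

    return pl.from_arrow(table)
```

With such a guard, the documented snippet could print a short hint such as "install polars to enable this step" when the function returns None.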

Member Author

Thanks. I think polars has been added as a transitive dependency automatically?

/cc @coderabbitai, @surister

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check conecta package dependencies from PyPI
pip index versions conecta 2>/dev/null || echo "pip index not available"

# Try using pip show after installing conecta to see dependencies
pip install --quiet conecta 2>/dev/null
pip show conecta

# Also try to get the package metadata via PyPI JSON API
curl -s https://pypi.org/pypi/conecta/json | jq -r '.info.requires_dist // "No dependencies found"' | head -20

Length of output: 330


@amotl, I checked the conecta package dependencies on PyPI, and polars is NOT a transitive dependency of conecta. In fact, conecta version 0.0.5 has no required dependencies at all (the Requires: field is empty). This suggests conecta likely supports multiple arrow libraries as optional extras rather than hard dependencies.

The original concern is valid - users following the documentation will encounter ModuleNotFoundError: No module named 'polars' when running the Usage example.
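
The same dependency check can be reproduced locally from Python with the standard library's `importlib.metadata`. This is a minimal sketch: the helpers `requirement_name`, `declared_requirements`, and `depends_on` are hypothetical names, and the result reflects whichever version of the package is installed in the current environment, which may differ from the one inspected above.

```python
import re
from importlib.metadata import PackageNotFoundError, requires
from typing import List, Optional


def requirement_name(requirement: str) -> str:
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", requirement.strip())
    if match is None:
        raise ValueError(f"not a requirement string: {requirement!r}")
    # Normalize: runs of '-', '_' and '.' compare equal, case-insensitively.
    return re.sub(r"[-_.]+", "-", match.group(0)).lower()


def declared_requirements(dist_name: str) -> Optional[List[str]]:
    """Return the declared requirements of an installed distribution.

    Returns None if the distribution is not installed, and an empty
    list if it is installed but declares no requirements.
    """
    try:
        return requires(dist_name) or []
    except PackageNotFoundError:
        return None


def depends_on(dist_name: str, dependency: str) -> Optional[bool]:
    """Check whether `dist_name` declares `dependency`, extras included."""
    declared = declared_requirements(dist_name)
    if declared is None:
        return None
    wanted = requirement_name(dependency)
    return any(requirement_name(req) == wanted for req in declared)


if __name__ == "__main__":
    print(depends_on("conecta", "polars"))
```

Note that `depends_on` also matches requirements that only apply to optional extras, so a True result does not by itself mean the dependency is installed by default.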

I recommend one of these approaches:

  1. Add polars to the install command (if you want to showcase Polars integration):

         uv pip install --upgrade conecta pyarrow polars

  2. Mark the Polars example as optional with a clear note:

         # Optionally convert to Polars dataframe (requires: pip install polars).
         import polars as pl
         print(pl.from_arrow(table))

  3. Move the Polars example to a separate section titled "Optional: Polars Integration" with its own install instructions.

Which approach would you prefer for this documentation?
