Conecta: Add index page #342
base: main
Walkthrough: Adds a new documentation page.
6ab5250 to 70ffb90
:::{rubric} About
:::

[Conecta] is a library designed to load data from SQL databases into Arrow
with maximum speed and memory efficiency by leveraging zero-copy and true
concurrency in Python.

Conecta integrates natively with the arrow ecosystem by supporting several
arrow libraries: [pyarrow], [arro3] and [nanoarrow]. Additionally, the
database results can easily be converted to Polars or pandas.
This section is a little flat and should elaborate more on Conecta's advanced features, right? Do you think it's still good enough for a start?
I think it captures well what it is. If you want, add a feature list like we have in the README:
* Connection pooling
* Real multithreading
* Client-based query partition
* Utilities like: SQL bind parameters
Thanks. Added list with d8793d9.
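For readers unfamiliar with the "client-based query partition" item in that list, here is a minimal sketch of the general idea: the client splits one query into several range-restricted sub-queries that can then be executed concurrently. This is an illustration only and not Conecta's actual implementation; the helper name `partition_query` and its parameters are hypothetical.

```python
def partition_query(query: str, column: str, lower: int, upper: int, parts: int) -> list[str]:
    """Split `query` into range-restricted sub-queries over an integer `column`.

    Hypothetical helper for illustration; covers the half-open interval [lower, upper).
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    if upper <= lower:
        raise ValueError("upper must be greater than lower")
    span = upper - lower
    # Spread the boundaries as evenly as integer arithmetic allows.
    bounds = [lower + (span * i) // parts for i in range(parts + 1)]
    return [
        f"SELECT * FROM ({query}) AS part WHERE {column} >= {lo} AND {column} < {hi}"
        for lo, hi in zip(bounds, bounds[1:])
        if hi > lo  # skip empty ranges when parts exceeds the span
    ]


for sub_query in partition_query("SELECT id, name FROM t", "id", 0, 10, 3):
    print(sub_query)
```

Each sub-query selects a disjoint slice of the rows, so the partial results can be fetched in parallel and concatenated afterwards.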
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1): docs/integrate/conecta/index.md (1 hunk)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Build docs
```shell
uv pip install --upgrade conecta pyarrow
```

:::{rubric} Usage
:::

```python
from pprint import pprint
from conecta import read_sql

table = read_sql(
    "postgres://crate:crate@localhost:5432/doc",
    queries=["SELECT country, region, mountain, height, latitude(coordinates), longitude(coordinates) FROM sys.summits ORDER BY height DESC LIMIT 3"],
)

# Display in Python format.
pprint(table.to_pylist())

# Optionally convert to pandas dataframe.
print(table.to_pandas())

# Optionally convert to Polars dataframe.
import polars as pl
print(pl.from_arrow(table))
```
Fix missing dependency for the Usage example.

The install snippet only pulls in `conecta` and `pyarrow`, yet the Usage section immediately does `import polars as pl`. Anyone following the docs verbatim will hit `ModuleNotFoundError: No module named 'polars'`. Either add `polars` to the install line (or document it as an optional extra) or wrap the Polars example so it doesn’t run unless Polars is present.
🤖 Prompt for AI Agents
In docs/integrate/conecta/index.md around lines 26 to 51, the Usage example
imports polars but the install snippet only installs conecta and pyarrow, which
will cause ModuleNotFoundError for polars; update the install instruction to
include polars (or note it as an optional extra like "pip install ... polars")
and/or mark the Polars conversion example as optional or guarded (e.g., add a
brief sentence that the Polars example requires installing polars or show it
behind an explicit optional section) so readers won’t encounter an import error
when following the docs verbatim.
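The "guarded" variant mentioned in this comment could look roughly like the sketch below. It checks whether Polars is importable before converting, so the rest of the Usage example still runs without it. The helper names `polars_available` and `to_polars_or_none` are hypothetical and only serve the illustration; `pl.from_arrow` is the same call the docs example already uses.

```python
import importlib.util


def polars_available() -> bool:
    """Return True if the optional `polars` package can be imported."""
    return importlib.util.find_spec("polars") is not None


def to_polars_or_none(table):
    """Convert an Arrow table to a Polars DataFrame, or return None if Polars is missing."""
    if not polars_available():
        print("Polars is not installed; skipping (install it with: pip install polars)")
        return None
    # Imported lazily so that the module loads even without Polars.
    import polars as pl
    return pl.from_arrow(table)
```

In the docs example, `print(pl.from_arrow(table))` would then become `print(to_polars_or_none(table))`, at the cost of a slightly longer snippet.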
Thanks. I think polars has been added as a transitive dependency automatically?
/cc @coderabbitai, @surister
🧩 Analysis chain
🏁 Script executed:

```shell
#!/bin/bash
# Check conecta package dependencies from PyPI
pip index versions conecta 2>/dev/null || echo "pip index not available"

# Try using pip show after installing conecta to see dependencies
pip install --quiet conecta 2>/dev/null
pip show conecta

# Also try to get the package metadata via PyPI JSON API
curl -s https://pypi.org/pypi/conecta/json | jq -r '.info.requires_dist // "No dependencies found"' | head -20
```

Length of output: 330
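For anyone who wants to repeat this kind of check from Python rather than the shell, a minimal sketch using the standard library's `importlib.metadata` follows. It reads the declared requirements of an already installed package and ignores entries that only apply to optional extras. The helper names `hard_requirements` and `installed_hard_requirements` are hypothetical, and the marker handling is deliberately simplistic (a substring check, not a full requirement parser).

```python
import re
from importlib import metadata


def hard_requirements(requires_dist):
    """Return normalized names of non-optional requirements from Requires-Dist strings."""
    names = set()
    for entry in requires_dist or []:
        spec, _, marker = entry.partition(";")
        if "extra" in marker:
            continue  # only installed when the user asks for that extra
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", spec.strip())
        if match:
            # Normalize runs of "-", "_" and "." to "-" and lowercase the name.
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def installed_hard_requirements(package):
    """Hard requirements of an installed package, or None if it is not installed."""
    try:
        return hard_requirements(metadata.requires(package))
    except metadata.PackageNotFoundError:
        return None
```

With `conecta` installed, `"polars" in (installed_hard_requirements("conecta") or set())` would indicate whether Polars is pulled in automatically.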
@amotl, I checked the conecta package dependencies on PyPI, and polars is NOT a transitive dependency of conecta. In fact, conecta version 0.0.5 has no required dependencies at all (the `Requires:` field is empty). This suggests conecta likely supports multiple arrow libraries as optional extras rather than hard dependencies.

The original concern is valid - users following the documentation will encounter `ModuleNotFoundError: No module named 'polars'` when running the Usage example.

I recommend one of these approaches:

- Add polars to the install command (if you want to showcase Polars integration):

  ```shell
  uv pip install --upgrade conecta pyarrow polars
  ```

- Mark the Polars example as optional with a clear note:

  ```python
  # Optionally convert to Polars dataframe (requires: pip install polars).
  import polars as pl
  print(pl.from_arrow(table))
  ```

- Move the Polars example to a separate section titled "Optional: Polars Integration" with its own install instructions.

Which approach would you prefer for this documentation?
About
Add information about how to use Conecta with CrateDB.
Preview
https://cratedb-guide--342.org.readthedocs.build/integrate/conecta/