Jefffrey
Contributor

  • Pin pip packages and introduce a Dependabot config to auto-bump them for us. I believe the overall strategy has been to pin our dependencies to specific versions to mitigate supply chain risks, so this follows suit
  • Update the docs README a bit and add a uv instruction to make the docs easier to build
  • Remove the need for a temp dir, as we now run rustdoc_trim as part of the Sphinx process as an extension itself. Also remove the sed for relative links, as there are no relative links anymore
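
To illustrate what "pinned" means for the first bullet, here is a minimal sketch of a check that flags requirement lines without an exact `==` version. The function name `unpinned_requirements` and the sample package versions are hypothetical and are not part of this PR.

```python
def unpinned_requirements(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned with '=='."""
    loose = []
    for raw in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            loose.append(line)
    return loose


# Hypothetical requirements content, for illustration only.
example = """\
# docs requirements
sphinx==7.4.7
myst-parser>=3.0
pydata-sphinx-theme
"""
print(unpinned_requirements(example))
```

A check like this only looks at the top-level requirement lines; it does not verify hashes or transitive dependencies.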

@github-actions github-actions bot added documentation Improvements or additions to documentation development-process Related to development process of DataFusion labels Sep 29, 2025
mkdir temp
cp -rf source/* temp/
# replace relative URLs with absolute URLs
sed -i -e 's/\.\.\/\.\.\/\.\.\//https:\/\/github.com\/apache\/arrow-datafusion\/blob\/main\//g' temp/contributor-guide/index.md
Contributor Author

We only ever did this to one file, and it looks like that file doesn't have any more relative URLs

"sphinx.ext.napoleon",
"myst_parser",
"sphinx_reredirects",
"rustdoc_trim",
Contributor Author

Here we register rustdoc_trim so it runs during the Sphinx build. That means we no longer need to manually fix up source files copied into a temp dir, and we can use the original build dir for simplicity.

Tested by checking the catalog page; I can see it's properly trimmed:

Image

Source code:

`tables` is the key-value pair described above. The underlying state could also be another data structure or other storage mechanism such as a file or transactional database.
Then we implement the `SchemaProvider` trait for `MemorySchemaProvider`.
```rust
# use std::sync::Arc;
# use dashmap::DashMap;
# use datafusion::catalog::TableProvider;
#
# #[derive(Debug)]
# pub struct MemorySchemaProvider {
# tables: DashMap<String, Arc<dyn TableProvider>>,
# }
```

Contributor

I also double-checked and it looks good to me

@Jefffrey Jefffrey marked this pull request as ready for review September 29, 2025 07:14
Contributor

@alamb alamb left a comment

Looks beautiful to me -- thank you @Jefffrey

needing to create a virtual environment:

```sh
uv run --with-requirements requirements.txt bash build.sh
```
Contributor

nice!

@Jefffrey Jefffrey added this pull request to the merge queue Sep 30, 2025
Merged via the queue into apache:main with commit 935fb3e Sep 30, 2025
29 checks passed
@Jefffrey Jefffrey deleted the docs_refactor branch September 30, 2025 14:01