PROJ_DATA env var should take precedence over installation data #1448

Closed
@kidanger

Description

Hello,

Currently, setting the environment variable PROJ_DATA has no effect on pyproj when the installation of pyproj brings its own data. I think it would be good to lower the priority of the internal data, and let users override the proj data with the environment variable in more cases.

Example: (from a fresh virtual env, python 3.12)

$ pip install pyproj
...
Successfully installed certifi-2024.8.30 pyproj-3.7.0
$ # create a custom proj data dir, here just a copy of the default one
$ cp -r .venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj test/

$ # without the env var, pyproj finds its own data directory
$ pyproj -v
pyproj info:
    pyproj: 3.7.0
PROJ (runtime): 9.4.1
PROJ (compiled): 9.4.1
  data dir: /tmp/t/.venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj
...

$ # even with the env var, it uses its own directory
$ PROJ_DATA=test/ pyproj -v
...
  data dir: /tmp/t/.venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj
...

$ # remove the internal dir manually, now it works
$ rm -fr .venv/lib/python3.12/site-packages/pyproj/proj_dir/share/proj
$ PROJ_DATA=test/ pyproj -v
...
  data dir: test/
...

(related discussion: NixOS/nixpkgs#282139)

Activity

snowman2 (Member) commented on Oct 5, 2024

This is by design. The reason this is the case is to prevent using the PROJ_DIR for a different PROJ installation that is incompatible. The PROJ database must be the one provided for that specific PROJ version and should not be interchanged.

If you have a separate PROJ installation, you should install pyproj from source instead of from a wheel if that is what you would like to use.

https://pyproj4.github.io/pyproj/stable/api/datadir.html
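
The precedence discussed here can be modelled roughly as below. This is a simplified sketch, not pyproj's actual implementation: `pick_data_dir`, `is_valid_data_dir` and their parameters are hypothetical names, the `proj.db` presence check is an assumed stand-in for pyproj's validity test, and later fallbacks are left out.

```python
from pathlib import Path


def is_valid_data_dir(path):
    """Treat a directory as usable only if it contains proj.db (assumed check)."""
    return bool(path) and (Path(path) / "proj.db").is_file()


def pick_data_dir(user_dir, internal_dir, environ):
    """Return the first valid candidate, in the precedence discussed in this issue."""
    candidates = [
        user_dir,                  # set explicitly, e.g. via pyproj.datadir.set_data_dir
        internal_dir,              # data bundled inside the wheel
        environ.get("PROJ_DATA"),  # environment variable, only reached if the above fail
    ]
    for candidate in candidates:
        if is_valid_data_dir(candidate):
            return Path(candidate)
    return None
```

With an internal directory that contains `proj.db`, the `PROJ_DATA` entry is never reached, which matches the `pyproj -v` output in the report. Once the internal directory is removed, the environment variable wins.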

kidanger (Author) commented on Oct 5, 2024

Thank you for the fast answer.

Then I'm not sure why pyproj.datadir.set_data_dir would have precedence over pyproj internal data but PROJ_DATA doesn't, but I don't know all the details of pyproj and proj. Maybe this is not the goal of PROJ_DATA. My use-case is to bundle specific datum grids during the distribution of a software, to avoid network downloads or relying on user folders.

Feel free to close the issue if the behavior is intended.
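
For the bundled-grids use case, one possible approach is to leave pyproj's own `proj.db` in place and only add the grid directory to the search path. The sketch below assumes the grids are GeoTIFF files (`*.tif`) shipped in a single directory. `find_bundled_grids` and `register_bundled_grids` are hypothetical helper names, while `pyproj.datadir.append_data_dir` and `pyproj.network.set_network_enabled` are existing pyproj functions.

```python
from pathlib import Path


def find_bundled_grids(grid_dir):
    """Return the GeoTIFF grid files shipped in grid_dir, sorted by path."""
    return sorted(Path(grid_dir).glob("*.tif"))


def register_bundled_grids(grid_dir):
    """Add grid_dir to PROJ's search path without replacing pyproj's own data."""
    grids = find_bundled_grids(grid_dir)
    if not grids:
        raise FileNotFoundError(f"no *.tif grid files found in {grid_dir}")

    # Imported lazily so the check above also works where pyproj is absent.
    import pyproj.datadir
    import pyproj.network

    # Extend the search path; the data directory holding proj.db is unchanged.
    pyproj.datadir.append_data_dir(str(grid_dir))
    # Disable on-demand grid downloads so only the bundled files are used.
    pyproj.network.set_network_enabled(active=False)
    return grids
```

Calling `register_bundled_grids` once at application start-up, before creating any transformer, should be enough in this setup. Whether PROJ then selects a given grid still depends on the transformation being requested.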

snowman2 (Member) commented on Oct 5, 2024

> I'm not sure why pyproj.datadir.set_data_dir would have precedence over pyproj internal data but PROJ_DATA doesn't

The reason set_data_dir exists is to set the data directory if it cannot be found automatically. It is guaranteed to be for the specific instance of pyproj and not for another installation of PROJ.

With multiple installations of PROJ on a single machine, PROJ_DATA could potentially point to an incorrect directory that shouldn't be used by pyproj.
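
One way to see whether a directory named by PROJ_DATA belongs to a different PROJ installation is to compare the metadata stored in the two `proj.db` files. The sketch below rests on assumptions about the database schema: that `proj.db` is a SQLite file with a `metadata(key, value)` table and that the layout version is stored under the `DATABASE.LAYOUT.VERSION.MAJOR` and `DATABASE.LAYOUT.VERSION.MINOR` keys. The helper names are hypothetical.

```python
import sqlite3
from pathlib import Path

# Assumed key names for the database layout version in proj.db.
LAYOUT_KEYS = ("DATABASE.LAYOUT.VERSION.MAJOR", "DATABASE.LAYOUT.VERSION.MINOR")


def read_proj_db_metadata(data_dir):
    """Return the key/value pairs stored in the metadata table of proj.db."""
    db_path = Path(data_dir) / "proj.db"
    if not db_path.is_file():
        raise FileNotFoundError(f"no proj.db in {data_dir}")
    # Open read-only so that inspecting a data directory never modifies it.
    uri = db_path.resolve().as_uri() + "?mode=ro"
    connection = sqlite3.connect(uri, uri=True)
    try:
        rows = connection.execute("SELECT key, value FROM metadata").fetchall()
    finally:
        connection.close()
    return dict(rows)


def same_database_layout(dir_a, dir_b):
    """True if both proj.db files report the same layout version."""
    meta_a = read_proj_db_metadata(dir_a)
    meta_b = read_proj_db_metadata(dir_b)
    return all(
        key in meta_a and meta_a[key] == meta_b.get(key) for key in LAYOUT_KEYS
    )
```

A matching layout version is only a necessary condition. It does not show that the two directories come from the same PROJ release, so this is a sanity check rather than a guarantee of compatibility.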

          PROJ_DATA env var should take precedence over installation data · Issue #1448 · pyproj4/pyproj