Add a fetcher for AWS S3 Argo GDAC data #385

Draft: wants to merge 32 commits into base: master

gmaze (Member) commented Sep 3, 2024

Support for AWS S3 data files

This support is experimental and is made available primarily for benchmarking, as part of the ADMT working group activities on Argo cloud formats.

In this PR, we shall provide:

  • a data fetcher for netcdf files on AWS S3
  • a netcdf to zarr conversion utility (check this gist)
  • a data fetcher prototype for zarr files on AWS S3
  • a kerchunk helper for netcdf files on AWS S3

We assume that the GDAC directory structure is preserved, so that paths to zarr archives mirror paths to netcdf files. Paths go from:

./dac/<DAC>/<WMO>/*.nc
./dac/<DAC>/<WMO>/profiles/*.nc

to:

./dac/<DAC>/<WMO>/*.zarr
./dac/<DAC>/<WMO>/profiles/*.zarr

So that a file path such as
/pub/dac/coriolis/6903091/6903091_prof.nc
goes to
/pub/etc/ArgoZarr/dac/coriolis/6903091/6903091_prof.zarr
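
The path mapping above can be sketched as a small helper. This is only an illustration, not the code in this PR: the function name `netcdf_to_zarr_path` and the two root constants are hypothetical, and the roots are inferred from the single example path given above.

```python
from pathlib import PurePosixPath

# Assumed roots, inferred from the example path in this description
GDAC_ROOT = PurePosixPath("/pub")
ZARR_ROOT = PurePosixPath("/pub/etc/ArgoZarr")


def netcdf_to_zarr_path(nc_path: str) -> str:
    """Map a GDAC netcdf file path to its mirrored zarr archive path (hypothetical helper)."""
    path = PurePosixPath(nc_path)
    if path.suffix != ".nc":
        raise ValueError(f"Not a netcdf file path: {nc_path}")
    # Path relative to the GDAC root, e.g. dac/<DAC>/<WMO>/<file>.nc
    # (relative_to raises ValueError if the path is not under GDAC_ROOT)
    relative = path.relative_to(GDAC_ROOT)
    # Same directory layout under the zarr root, with the extension swapped
    return str(ZARR_ROOT / relative.with_suffix(".zarr"))


print(netcdf_to_zarr_path("/pub/dac/coriolis/6903091/6903091_prof.nc"))
# /pub/etc/ArgoZarr/dac/coriolis/6903091/6903091_prof.zarr
```

Because only the root and the extension change, the same helper would also cover files under `profiles/`.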

This is live, but experimental, on an S3 prototype.

You can browse it at:

https://argo-gdac-sandbox.s3.eu-west-3.amazonaws.com/pub/index.html#pub/etc/ArgoZarr/

gmaze added the enhancement (New feature or request), backends, and performance labels Sep 3, 2024
gmaze self-assigned this Sep 3, 2024
gmaze linked an issue Sep 3, 2024 that may be closed by this pull request
gmaze marked this pull request as draft September 4, 2024 06:08
commit 62ba4cb
Merge: 919484e ce6fed9
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 13:50:16 2024 +0200

    Merge pull request #389 from euroargodev/other-major-breaking-refactoring

    Implement other than bgc-2024 branch major breaking refactoring for major release v1.0.0

commit ce6fed9
Merge: fa05fa7 919484e
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 12:08:02 2024 +0200

    Merge branch 'master' into other-major-breaking-refactoring

commit fa05fa7
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 12:07:02 2024 +0200

    Delete test_deprecated.py

commit 919484e
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 11:37:21 2024 +0200

    Fix ci tests env

    fix error    libmamba Could not solve for environment specs
          The following packages are incompatible
          ├─ fsspec 2024.9.0*  is requested and can be installed;
          └─ s3fs 2024.6.1*  is not installable because it requires
             └─ fsspec 2024.6.1 , which conflicts with any installable versions previously reported.
      critical libmamba Could not solve for environment specs

commit 0dc9834
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 11:31:21 2024 +0200

    Add upstream tests with python 3.11 and 3.12

commit a1aedc5
Merge: 747ba13 549d8c3
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 11:25:09 2024 +0200

    Merge branch 'master' into other-major-breaking-refactoring

commit 549d8c3
Merge: 1e79ec0 2d4785d
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 11:20:42 2024 +0200

    Merge pull request #356 from euroargodev/bgc-2024

    Work on BGC from 2024 LOV visit

commit 2d4785d
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 10:30:17 2024 +0200

    Remove 45mins timeout for CI tests

commit 1797037
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 08:03:22 2024 +0200

    Update CI tests data

    include standard and research mode for erddap BGC

commit 82c20c8
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 07:44:50 2024 +0200

    Update CI tests data

commit f7ebc21
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 25 07:39:34 2024 +0200

    Update test_deprecated.py

commit 51355c3
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Tue Sep 24 12:08:05 2024 +0200

    update CI tests data

commit 809adc9
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Tue Sep 24 10:37:20 2024 +0200

    Update create_json_assets

commit 2ff193f
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Tue Sep 24 10:37:15 2024 +0200

    Update argovis_data.py

    make sure argovis is only using a single filestore

commit a73f727
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Tue Sep 24 10:36:53 2024 +0200

    Update CI tests data

commit cf41ba4
Merge: 4681d55 1e79ec0
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Mon Sep 23 14:59:03 2024 +0200

    Merge branch 'master' into bgc-2024

commit 4681d55
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Mon Sep 23 14:57:32 2024 +0200

    Clear CI tests for easier merge with master [skip-ci]

commit 1e79ec0
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Mon Sep 23 14:56:59 2024 +0200

    Clear CI tests data for easier merge [skip-ci]

commit c9de8b9
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Mon Sep 23 14:54:43 2024 +0200

    Clear CI tests data before merge

commit a21a644
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Mon Sep 23 09:56:26 2024 +0200

    Update whats-new.rst

commit fe8b91c
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 15:38:26 2024 +0200

    Update requirements.txt

commit 4ae5aab
Merge: 0f5a754 b135bfa
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 15:36:58 2024 +0200

    Merge pull request #394 from euroargodev/releasev0.1.17

    Prepare for v0.1.17 Bat Release 🦇

commit b135bfa
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 14:16:32 2024 +0200

    Update dev env definitions

commit 0f5a754
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 13:54:21 2024 +0200

    Update HOW_TO_RELEASE.md [skip-ci]

commit 4bc625e
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 13:49:08 2024 +0200

    Flake8

commit 34d1a46
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 13:45:15 2024 +0200

    codespell

commit 6259011
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 13:42:32 2024 +0200

    Fix CI tests data update

commit c5ab622
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 13:36:15 2024 +0200

    Update cheatsheet.rst

commit cb66217
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 13:28:25 2024 +0200

    Update cheatsheet PDF

commit 10ff2cf
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 11:50:15 2024 +0200

    Update CI tests data

commit ec0b14c
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 11:48:41 2024 +0200

    Update HOW_TO_RELEASE.md [skip-ci]

commit e2df789
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 11:28:55 2024 +0200

    Update static assets

commit cffefc0
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 11:28:24 2024 +0200

    Update reference_tables.py

commit 6cf2644
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 11:07:15 2024 +0200

    Update whats-new.rst

commit eb7e689
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 11:07:12 2024 +0200

    Update fetchers.py

commit d8121d8
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 10:58:12 2024 +0200

    Update HOW_TO_RELEASE.md [skip-ci]

commit 88ff363
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 10:34:20 2024 +0200

    Move to v0.1.17, to Beta

commit e48ab55
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 09:47:51 2024 +0200

    Update xarray.py

    don't anticipate too much on the upcoming filter_data_mode replacement

commit 29a5cfc
Merge: 5a31057 f3b0a56
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 09:45:45 2024 +0200

    Merge pull request #388 from euroargodev/deprec-before-major

    Introduces deprecation warnings before major v1.0.0 release

commit f3b0a56
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 20 08:56:53 2024 +0200

    Better deprecation introduction

commit 5a31057
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Thu Sep 19 14:15:02 2024 +0200

    Pin erddapy for python < 3.10

    See ioos/erddapy#359

commit 747ba13
Merge: 37f2495 0095fe6
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 18 15:33:08 2024 +0200

    Merge branch 'master' into other-major-breaking-refactoring

commit 37f2495
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 18 15:32:46 2024 +0200

    Update monitored_threadpool.py

commit 6d9be49
Merge: 62ece42 0095fe6
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 18 15:30:38 2024 +0200

    Merge branch 'master' into bgc-2024

commit 2669301
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 13 14:46:38 2024 +0200

    [skip-ci]

commit e87afe1
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 13 14:32:06 2024 +0200

    Create test_deprecated.py

    Ensure we're having warnings for deprecations

commit c319d0a
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 13 14:31:32 2024 +0200

    Update xarray.py

    fix deprecation warning

commit 19daad3
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 13 14:31:13 2024 +0200

    New deprecation for option 'ftp' replaced by 'gdac'

commit c890602
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Fri Sep 13 14:30:32 2024 +0200

    introduce new "OptionDeprecatedWarning"

commit 850adf1
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 4 11:14:10 2024 +0200

    Deprec for 'dataset' option replaced by 'ds'

commit 1371625
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 4 10:12:43 2024 +0200

    Update whats-new.rst

commit a988d79
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 4 10:10:43 2024 +0200

    Update xarray.py

commit acc789e
Author: Guillaume Maze <gmaze@ifremer.fr>
Date:   Wed Sep 4 10:08:07 2024 +0200

    Update xarray.py

codecov bot commented Oct 23, 2024

❌ 4 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
|---|---|---|---|
| 1637 | 4 | 1633 | 141 |
View the top 3 failed tests by shortest run time
test_fetchers_data_gdac.py::TestBackend::test_fetching_cached[host='c', ds='phy', mode='standard', {'region': [-20, -16.0, 0, 1, 0, 100.0, '1997-07-01', '1997-09-01']}]
Stack Traces | 0.725s run time
self = <argopy.tests.test_fetchers_data_gdac.TestBackend object at 0x00000260E3901510>
mocked_httpserver = 'http://127.0.0.1:9898'
cached_fetcher = <datafetcher.gdac>
🌐 Name: Ifremer GDAC Argo data fetcher for a space/time region
🗺  Domain: [x=-20.00/-16.00; y=0.00/... searched: True (3 matches, 0.1110%)
🏊 User mode: standard
🟡+🔵 Dataset: phy
🌤  Performances: cache=True, parallel=False

    @pytest.mark.parametrize("cached_fetcher", VALID_ACCESS_POINTS, indirect=True, ids=VALID_ACCESS_POINTS_IDS)
    def test_fetching_cached(self, mocked_httpserver, cached_fetcher):
        # Assert the fetcher (this trigger data fetching, hence caching as well):
>       assert_fetcher(mocked_httpserver, cached_fetcher, cacheable=True)

argopy\tests\test_fetchers_data_gdac.py:197: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
argopy\tests\test_fetchers_data_gdac.py:90: in assert_fetcher
    assert isinstance(this_fetcher.to_xarray(errors='raise'), xr.Dataset)
argopy\fetchers.py:616: in to_xarray
    xds = self.fetcher.to_xarray(**kwargs)
argopy\data_fetchers\gdac_data.py:399: in to_xarray
    results = self.fs.open_mfdataset(URI, **opts)
argopy\stores\filesystems.py:489: in open_mfdataset
    data = future.result()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\concurrent\futures\_base.py:449: in result
    return self.__get_result()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\concurrent\futures\_base.py:401: in __get_result
    raise self._exception
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\concurrent\futures\thread.py:58: in run
    result = self.fn(*self.args, **self.kwargs)
argopy\stores\filesystems.py:393: in _mfprocessor
    ds = self.open_dataset(url, **open_dataset_opts)
argopy\stores\filesystems.py:374: in open_dataset
    with self.open(path) as of:
argopy\stores\filesystems.py:210: in open
    return self.fs.open(path, *args, **kwargs)
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\spec.py:1303: in open
    f = self._open(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:707: in _open
    self.save_cache()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:206: in save_cache
    self._metadata.save()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cache_metadata.py:227: in save
    self._save(cache, fn)
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cache_metadata.py:75: in _save
    with atomic_write(fn, mode="w") as f:
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\contextlib.py:144: in __exit__
    next(self.gen)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpwusflrts\\cache'
mode = 'w'

    @contextlib.contextmanager
    def atomic_write(path: str, mode: str = "wb"):
        """
        A context manager that opens a temporary file next to `path` and, on exit,
        replaces `path` with the temporary file, thereby updating `path`
        atomically.
        """
        fd, fn = tempfile.mkstemp(
            dir=os.path.dirname(path), prefix=os.path.basename(path) + "-"
        )
        try:
            with open(fd, mode) as fp:
                yield fp
        except BaseException:
            with contextlib.suppress(FileNotFoundError):
                os.unlink(fn)
            raise
        else:
>           os.replace(fn, path)
E           PermissionError: [WinError 5] Access is denied: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpwusflrts\\cache-r5b1fm42' -> 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpwusflrts\\cache'

C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\utils.py:625: PermissionError
test_fetchers_data_gdac.py::TestBackend::test_fetching_cached[host='c', ds='phy', mode='expert', {'region': [-20, -16.0, 0, 1, 0, 100.0, '1997-07-01', '1997-09-01']}]
Stack Traces | 0.727s run time
self = <argopy.tests.test_fetchers_data_gdac.TestBackend object at 0x00000260E39027D0>
mocked_httpserver = 'http://127.0.0.1:9898'
cached_fetcher = <datafetcher.gdac>
🌐 Name: Ifremer GDAC Argo data fetcher for a space/time region
🗺  Domain: [x=-20.00/-16.00; y=0.00/...ex searched: True (3 matches, 0.1110%)
🏄 User mode: expert
🟡+🔵 Dataset: phy
🌤  Performances: cache=True, parallel=False

    @pytest.mark.parametrize("cached_fetcher", VALID_ACCESS_POINTS, indirect=True, ids=VALID_ACCESS_POINTS_IDS)
    def test_fetching_cached(self, mocked_httpserver, cached_fetcher):
        # Assert the fetcher (this trigger data fetching, hence caching as well):
>       assert_fetcher(mocked_httpserver, cached_fetcher, cacheable=True)

argopy\tests\test_fetchers_data_gdac.py:197: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
argopy\tests\test_fetchers_data_gdac.py:90: in assert_fetcher
    assert isinstance(this_fetcher.to_xarray(errors='raise'), xr.Dataset)
argopy\fetchers.py:616: in to_xarray
    xds = self.fetcher.to_xarray(**kwargs)
argopy\data_fetchers\gdac_data.py:399: in to_xarray
    results = self.fs.open_mfdataset(URI, **opts)
argopy\stores\filesystems.py:489: in open_mfdataset
    data = future.result()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\concurrent\futures\_base.py:449: in result
    return self.__get_result()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\concurrent\futures\_base.py:401: in __get_result
    raise self._exception
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\concurrent\futures\thread.py:58: in run
    result = self.fn(*self.args, **self.kwargs)
argopy\stores\filesystems.py:393: in _mfprocessor
    ds = self.open_dataset(url, **open_dataset_opts)
argopy\stores\filesystems.py:374: in open_dataset
    with self.open(path) as of:
argopy\stores\filesystems.py:210: in open
    return self.fs.open(path, *args, **kwargs)
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\spec.py:1303: in open
    f = self._open(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:707: in _open
    self.save_cache()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:206: in save_cache
    self._metadata.save()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cache_metadata.py:227: in save
    self._save(cache, fn)
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cache_metadata.py:75: in _save
    with atomic_write(fn, mode="w") as f:
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\contextlib.py:144: in __exit__
    next(self.gen)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpwusflrts\\cache'
mode = 'w'

    @contextlib.contextmanager
    def atomic_write(path: str, mode: str = "wb"):
        """
        A context manager that opens a temporary file next to `path` and, on exit,
        replaces `path` with the temporary file, thereby updating `path`
        atomically.
        """
        fd, fn = tempfile.mkstemp(
            dir=os.path.dirname(path), prefix=os.path.basename(path) + "-"
        )
        try:
            with open(fd, mode) as fp:
                yield fp
        except BaseException:
            with contextlib.suppress(FileNotFoundError):
                os.unlink(fn)
            raise
        else:
>           os.replace(fn, path)
E           PermissionError: [WinError 5] Access is denied: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpwusflrts\\cache-1pdjmbzx' -> 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpwusflrts\\cache'

C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\utils.py:625: PermissionError
test_fetchers_data_gdac.py::TestBackend::test_fetching_cached[host='http_mocked', ds='phy', mode='research', {'region': [-20, -16.0, 0, 1, 0, 100.0]}]
Stack Traces | 0.948s run time
self = <argopy.tests.test_fetchers_data_gdac.TestBackend object at 0x00000260E392A750>
mocked_httpserver = 'http://127.0.0.1:9898'
cached_fetcher = <datafetcher.gdac>
🌐 Name: Ifremer GDAC Argo data fetcher for a space/time region
🗺  Domain: [x=-20.00/-16.00; y=0.00/... searched: True (3 matches, 3.0000%)
🚣 User mode: research
🟡+🔵 Dataset: phy
🌤  Performances: cache=True, parallel=False

    @pytest.mark.parametrize("cached_fetcher", VALID_ACCESS_POINTS, indirect=True, ids=VALID_ACCESS_POINTS_IDS)
    def test_fetching_cached(self, mocked_httpserver, cached_fetcher):
        # Assert the fetcher (this trigger data fetching, hence caching as well):
>       assert_fetcher(mocked_httpserver, cached_fetcher, cacheable=True)

argopy\tests\test_fetchers_data_gdac.py:197: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
argopy\tests\test_fetchers_data_gdac.py:90: in assert_fetcher
    assert isinstance(this_fetcher.to_xarray(errors='raise'), xr.Dataset)
argopy\fetchers.py:616: in to_xarray
    xds = self.fetcher.to_xarray(**kwargs)
argopy\data_fetchers\gdac_data.py:399: in to_xarray
    results = self.fs.open_mfdataset(URI, **opts)
argopy\stores\filesystems.py:1258: in open_mfdataset
    data = future.result()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\concurrent\futures\_base.py:449: in result
    return self.__get_result()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\concurrent\futures\_base.py:401: in __get_result
    raise self._exception
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\concurrent\futures\thread.py:58: in run
    result = self.fn(*self.args, **self.kwargs)
argopy\stores\filesystems.py:936: in _mfprocessor_dataset
    ds = self.open_dataset(url, **open_dataset_opts)
argopy\stores\filesystems.py:889: in open_dataset
    target, _ = load_in_memory(
argopy\stores\filesystems.py:833: in load_in_memory
    data = self.download_url(url, **dwn_opts)
argopy\stores\filesystems.py:761: in download_url
    data, n = make_request(
argopy\stores\filesystems.py:713: in make_request
    data = ffs.cat_file(url, **cat_opts)
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\spec.py:773: in cat_file
    with self.open(path, "rb", **kwargs) as f:
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\spec.py:1303: in open
    f = self._open(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:707: in _open
    self.save_cache()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:446: in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cached.py:206: in save_cache
    self._metadata.save()
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cache_metadata.py:227: in save
    self._save(cache, fn)
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\implementations\cache_metadata.py:75: in _save
    with atomic_write(fn, mode="w") as f:
C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\contextlib.py:144: in __exit__
    next(self.gen)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

path = 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpwusflrts\\cache'
mode = 'w'

    @contextlib.contextmanager
    def atomic_write(path: str, mode: str = "wb"):
        """
        A context manager that opens a temporary file next to `path` and, on exit,
        replaces `path` with the temporary file, thereby updating `path`
        atomically.
        """
        fd, fn = tempfile.mkstemp(
            dir=os.path.dirname(path), prefix=os.path.basename(path) + "-"
        )
        try:
            with open(fd, mode) as fp:
                yield fp
        except BaseException:
            with contextlib.suppress(FileNotFoundError):
                os.unlink(fn)
            raise
        else:
>           os.replace(fn, path)
E           PermissionError: [WinError 5] Access is denied: 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpwusflrts\\cache-4efkwtn3' -> 'C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\tmpwusflrts\\cache'

C:\Users\runneradmin\micromamba\envs\argopy-tests\Lib\site-packages\fsspec\utils.py:625: PermissionError

To view individual test run time comparison to the main branch, go to the Test Analytics Dashboard

Successfully merging this pull request may close these issues.

New data source for GDAC from Amazon S3