Breaking change in 0.7 : a bug? #287

Closed
sjfleming opened this issue Jan 8, 2020 · 9 comments · Fixed by #289

@sjfleming

When I save a .h5ad file using adata.write() in anndata 0.7rc2, and then try to open it using scanpy.read_h5ad() with anndata 0.6.22.post1, I get an error.

@falexwolf
Member

Yes, the file format changed in 0.7. anndata prior to 0.7 cannot read new files. anndata 0.7 can evidently read old files!

@gokceneraslan
Contributor

We should start having proper AnnData file format versioning 🤨 so that we can print proper errors like "this file seems to be created by a newer anndata version"

@falexwolf
Member

We have it! It's in the new file format.

It's just not yet documented. @ivirshup, everyone waits for this documentation!

@flying-sheep
Member

Also the warning isn’t in place. AnnData should know which versions it supports and print a warning that it’s about to read a file written by a newer version, and that this might fail.
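
A minimal sketch of such a check, assuming the file carries a dotted format-version string. `SUPPORTED_FORMAT`, `parse_format_version` and `check_format_version` are hypothetical names for illustration, not part of AnnData's API:

```python
import warnings

# Hypothetical: the newest on-disk format this reader understands.
SUPPORTED_FORMAT = (0, 7)


def parse_format_version(text):
    """Turn '0.7' into (0, 7); only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def check_format_version(file_version, supported=SUPPORTED_FORMAT):
    """Warn (without failing) if the file is newer than the reader.

    Returns True if a warning was issued, False otherwise.
    """
    if parse_format_version(file_version) > supported:
        supported_text = ".".join(str(part) for part in supported)
        warnings.warn(
            f"This file was written with format {file_version}, which is "
            f"newer than the supported {supported_text}; reading may fail.",
            UserWarning,
        )
        return True
    return False
```

Warning instead of raising keeps files readable whenever the newer format happens to be compatible.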

@sjfleming
Author

Okay, thanks for the answer!

@flying-sheep
Member

Can you please (now and whenever you report a bug) post the full error? Use:

```pytb
...
```

@sjfleming
Author

Yeah I will do that in the future. Sorry for not posting this time: I lost the setup that I was using to test the behavior and then gave up. But I've recreated the error again just now. Evidently the error only occurs if there's a populated .obsm slot! (If I save an h5ad with no .obsm slot, then a version saved by anndata==0.7rc2 loads in anndata==0.6.22.post1... just discovered that.) Here's the message:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-8edd9de9b2e9> in <module>
----> 1 adata = sc.read_h5ad('small_adata_v7.h5ad')

/anaconda3/envs/scanpy144post1/lib/python3.7/site-packages/anndata/readwrite/read.py in read_h5ad(filename, backed, chunk_size)
    445     else:
    446         # load everything into memory
--> 447         constructor_args = _read_args_from_h5ad(filename=filename, chunk_size=chunk_size)
    448         X = constructor_args[0]
    449         dtype = None

/anaconda3/envs/scanpy144post1/lib/python3.7/site-packages/anndata/readwrite/read.py in _read_args_from_h5ad(adata, filename, mode, chunk_size)
    500     if not backed:
    501         f.close()
--> 502     return AnnData._args_from_dict(d)
    503 
    504 

/anaconda3/envs/scanpy144post1/lib/python3.7/site-packages/anndata/core/anndata.py in _args_from_dict(ddata)
   2155             if d_true_keys[true_key] is not None:
   2156                 for key in keys:
-> 2157                     if key in d_true_keys[true_key].dtype.names:
   2158                         d_true_keys[true_key] = pd.DataFrame.from_records(
   2159                             d_true_keys[true_key], index=key)

AttributeError: 'dict' object has no attribute 'dtype'
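
The traceback above shows the 0.6 reader looking up `.dtype.names` (which a NumPy structured array exposes) on a value that turned out to be a plain `dict`. A rough sketch of how a reader could detect that mismatch and fail with a clearer message; `record_field_names` and `check_legacy_layout` are hypothetical helpers, not functions in anndata:

```python
def record_field_names(value):
    """Return structured-array field names, or None if `value` has none."""
    dtype = getattr(value, "dtype", None)
    return getattr(dtype, "names", None)


def check_legacy_layout(slot, value):
    """Raise a readable error instead of an AttributeError.

    `slot` is only used for the message (e.g. "obsm").
    """
    if value is not None and record_field_names(value) is None:
        raise ValueError(
            f"{slot!r} is stored as {type(value).__name__}, not as a record "
            "array; the file may have been written by a newer version."
        )
```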

Also noticed if I do a fresh pip install of scanpy==1.4.4.post1, I get the latest anndata==0.7.rc2 (due to the setup file spec of anndata>=0.6.22rc1 I think). Is that the intended behavior? When I was using scanpy==1.4.4.post1 before, it was always with anndata==0.6.22.post1.

@flying-sheep
Member

flying-sheep commented Jan 10, 2020

Thank you, this is valuable information if we want to backport warnings to 0.6.

By marking this issue as “fixed”, I mean that there’s now code that warns people who use AnnData 0.7rc3 or higher if they try to load a file written by a future, not-yet-existing version of AnnData that uses a different serialization.

> Also noticed if I do a fresh pip install of scanpy==1.4.4.post1, I get the latest anndata==0.7.rc2 (due to the setup file spec of anndata>=0.6.22rc1 I think). Is that the intended behavior?

Ha, amazing. Apparently “rc*” means to pip that any rc version is fair game even though there are non-rc versions in between. In our case >=0.6.22rc1 should mean “0.6.22rc* or a non-rc, non-beta version above” IMHO. We should report this to pip! /edit: already reported as pypa/pip#4969 and closed as “working as intended”, wat. I filed pypa/pip#7579
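
To make the two readings of `>=0.6.22rc1` concrete, here is a simplified model. It is not pip's implementation: it only understands versions like `0.6.22rc1` or `0.6.22.post1`, it ignores `--pre` and other ways of opting in to pre-releases, and the function names are made up for illustration.

```python
import re

_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:\.?rc(\d+))?(?:\.post(\d+))?$")


def parse(text):
    """Parse 'N.N[.N][rcN][.postN]' into a sortable key."""
    match = _VERSION.match(text)
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    release = release + (0,) * (4 - len(release))  # pad so 0.7 == 0.7.0
    rc, post = match.group(2), match.group(3)
    # Release candidates sort before the final release of the same number.
    pre_key = (0, int(rc)) if rc is not None else (1, 0)
    post_key = int(post) if post is not None else -1
    return release, pre_key, post_key


def is_prerelease(text):
    return parse(text)[1][0] == 0


def pip_like_allows(minimum, candidate):
    """Behavior described above: an rc in the spec admits every later rc."""
    if parse(candidate) < parse(minimum):
        return False
    return is_prerelease(minimum) or not is_prerelease(candidate)


def strict_allows(minimum, candidate):
    """Proposed reading: an rc must share the spec's release number."""
    if parse(candidate) < parse(minimum):
        return False
    if not is_prerelease(candidate):
        return True
    return is_prerelease(minimum) and parse(candidate)[0] == parse(minimum)[0]
```

Under this model, `pip_like_allows("0.6.22rc1", "0.7rc2")` is true, while `strict_allows("0.6.22rc1", "0.7rc2")` is false and `strict_allows("0.6.22rc1", "0.6.22.post1")` stays true.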

@emdann
Member

emdann commented Feb 14, 2020

> Evidently the error only occurs if there's a populated .obsm slot! (If I save an h5ad with no .obsm slot, then a version saved by anndata==0.7rc2 loads in anndata==0.6.22.post1... just discovered that.)

I have encountered the same issue, and removing .varm and .obsm didn't help. I still get the error below when loading with anndata==0.6.22.post1.

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jovyan/.local/lib/python3.7/site-packages/anndata/readwrite/read.py", line 447, in read_h5ad
    constructor_args = _read_args_from_h5ad(filename=filename, chunk_size=chunk_size)
  File "/home/jovyan/.local/lib/python3.7/site-packages/anndata/readwrite/read.py", line 502, in _read_args_from_h5ad
    return AnnData._args_from_dict(d)
  File "/home/jovyan/.local/lib/python3.7/site-packages/anndata/core/anndata.py", line 2157, in _args_from_dict
    if key in d_true_keys[true_key].dtype.names:
AttributeError: 'dict' object has no attribute 'dtype'

btw the .h5ad files saved by anndata==0.7rc2 also can't be loaded in Seurat, breaking interoperability with tools in R.
