Merge pull request #18 from mpi2/docs
Update documentation
marinak-ebi authored Oct 24, 2024
2 parents 8e2e0c5 + 2b41464 commit 49c6f6c
Showing 3 changed files with 28 additions and 25 deletions.
48 changes: 26 additions & 22 deletions impc_module/README.md
@@ -1,33 +1,37 @@
# IMPC_API

`impc_api` is a python package.
`impc_api` is a Python package.

The functions in this package are intended for use on a Jupyter Notebook.
The functions in this package are intended for use in a Jupyter Notebook.

## Installation Instructions

1. **Create a virtual environment (optional but recommended)**:
On Mac:
`python3 -m venv .venv`
`source .venv/bin/activate`
```bash
python3 -m venv .venv
source .venv/bin/activate
```

2. **Install the package**: `pip install impc_api`
3. **Install Jupyter**: `pip install jupyter`
4. **Run the Jupyter Notebook**: `jupyter notebook`

2. **Install the package running**: `pip install impc_api`
3. **Try it out**: Create a [Jupyter Notebook](https://jupyter.org/install#jupyter-notebook) and try some of the examples below:
After executing the command, the Jupyter interface should open in your browser. If it does not, follow the instructions provided in the terminal.

## Installing the package for the first time
5. **Try it out**:

1. Clone the repository and navigate into it. Navigate into the package name until you can see `setup.py` and `pyproject.toml`
2. Run `python3 -m build`, this builds the package, a couple of new files/folders will appear.
3. Install the package running `pip install .`
4. Try it out: Go to Jupyter Notebook and some examples below:
Create a [Jupyter Notebook](https://jupyter-notebook.readthedocs.io/en/latest/) and try some of the examples below:

### Available functions
## Available functions

The available functions can be imported as:

```python
from impc_api import solr_request, batch_solr_request
```

## 1. Solr request
# 1. Solr request

The most basic request to the IMPC solr API

@@ -42,7 +46,7 @@ num_found, df = solr_request(
)
```

### a. Facet request
## a. Facet request

`solr_request` allows facet requests

@@ -60,11 +64,11 @@ num_found, df = solr_request(
)
```

### b. Solr request validation
## b. Solr request validation

A common pitfall when writing a query is the misspelling of `core` and `fields` arguments. For this, we have included a `validate` argument that raises a warning when these values are not as expected. Note this does not prevent you from executing a query; it just alerts you to a potential issue.

#### Core validation
### Core validation

```python
num_found, df = solr_request(
@@ -80,7 +84,7 @@ num_found, df = solr_request(
> dict_keys(['experiment', 'genotype-phenotype', 'impc_images', 'phenodigm', 'statistical-result'])
```
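
To illustrate the "warn but do not block" behaviour described above, here is a minimal, self-contained sketch. It is not the package's implementation: `check_core`, `KNOWN_CORES` and the `InvalidCoreWarning` class defined here are hypothetical names for this example, and the core list is simply copied from the output shown above.

```python
import warnings

# Core names copied from the example output above (assumed complete for this sketch).
KNOWN_CORES = [
    "experiment",
    "genotype-phenotype",
    "impc_images",
    "phenodigm",
    "statistical-result",
]


class InvalidCoreWarning(UserWarning):
    """Hypothetical warning category, defined only for this sketch."""


def check_core(core: str) -> bool:
    """Return True for a known core; otherwise warn and return False."""
    if core in KNOWN_CORES:
        return True
    # A warning is emitted instead of an exception, so the caller can continue.
    warnings.warn(
        f'Invalid core: "{core}", select from the available cores: {KNOWN_CORES}',
        InvalidCoreWarning,
        stacklevel=2,
    )
    return False


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok = check_core("genotype-phenotyp")  # misspelled on purpose

print(ok, len(caught))  # False 1
```

Because the check only warns, a query with a misspelled core would still be sent; the warning is a hint, not a guard.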

#### Field list validation
### Field list validation

```python
num_found, df = solr_request(
@@ -96,7 +100,7 @@ num_found, df = solr_request(
> To see expected fields check the documentation at: https://www.ebi.ac.uk/mi/impc/solrdoc/
```

## 2. Batch Solr Request
# 2. Batch Solr Request

`batch_solr_request` is available for large queries. This solves issues where a request is too large to fit into memory or where it puts a lot of strain on the API.

@@ -106,11 +110,11 @@ Use `batch_solr_request` for:
- Querying multiple items in a list
- Downloading data in `json` or `csv` format.

### Large queries
## Large queries

For large queries you can choose between seeing them in a DataFrame or downloading them in `json` or `csv` format.

### a. Large query - see in DataFrame
## a. Large query - see in DataFrame

This will fetch your data using the API responsibly and return a Pandas DataFrame
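
One way to picture "using the API responsibly" is paging: a large result set is requested in fixed-size slices rather than all at once. The sketch below only computes the slices; `batch_offsets` is a hypothetical helper, not part of `impc_api`, and it assumes the standard Solr paging parameters `start` and `rows`.

```python
from typing import Iterator


def batch_offsets(num_found: int, batch_size: int) -> Iterator[tuple[int, int]]:
    """Yield (start, rows) pairs that together cover `num_found` documents."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, num_found, batch_size):
        # The last page may be shorter than batch_size.
        yield start, min(batch_size, num_found - start)


pages = list(batch_offsets(num_found=12, batch_size=5))
print(pages)  # [(0, 5), (5, 5), (10, 2)]
```

Each pair could then be passed as `start` and `rows` in the request parameters, one request per pair.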

@@ -128,7 +132,7 @@ df = batch_solr_request(
print(df.head())
```

### b. Large query - Download
## b. Large query - Download

When using the `download=True` option, a file with the requested information will be saved as `filename`. The format is selected based on the `wt` parameter.
A DataFrame may be returned, provided it does not exceed the memory available on your laptop. If the DataFrame is too large, an error will be raised. For these cases, we recommend you read the downloaded file in batches/chunks.
@@ -147,7 +151,7 @@ df = batch_solr_request(
print(df.head())
```
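
For downloaded files that are too large to load at once, reading in chunks might look like the sketch below. It uses only the standard library so that it runs anywhere; `read_in_chunks`, the temporary file, and the column names are made up for the example. With pandas, `pd.read_csv(path, chunksize=...)` serves the same purpose.

```python
import csv
import os
import tempfile
from itertools import islice
from typing import Iterator


def read_in_chunks(path: str, chunk_size: int) -> Iterator[list[dict]]:
    """Yield lists of at most `chunk_size` rows from a CSV file."""
    with open(path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        while True:
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk


# Stand-in for a downloaded file; the columns are illustrative only.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    writer = csv.writer(tmp)
    writer.writerow(["marker_symbol", "p_value"])
    writer.writerows([[f"Gene{i}", i / 10] for i in range(5)])
    demo_path = tmp.name

chunk_sizes = [len(chunk) for chunk in read_in_chunks(demo_path, chunk_size=2)]
os.remove(demo_path)
print(chunk_sizes)  # [2, 2, 1]
```

Only one chunk is held in memory at a time, so the peak memory use depends on `chunk_size` rather than on the size of the file.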

### c. Query by multiple values
## c. Query by multiple values

`batch_solr_request` also allows you to search for multiple items in a list, provided they belong to the same field.
Pass the list to the `field_list` param and specify the type of `fl` in `field_type`.
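
As a rough illustration of what searching several values in one field means in Solr terms, the values can be combined into a single `OR` clause. This is a sketch under assumptions: `build_or_query` is a hypothetical helper, the gene symbols are illustrative, and the query string the package actually builds may differ.

```python
def build_or_query(field_type: str, field_list: list[str]) -> str:
    """Combine values into one Solr clause of the form field:("a" OR "b")."""
    if not field_list:
        raise ValueError("field_list must contain at least one value")
    # Quote each value so spaces or hyphens stay inside a single term.
    # Values that themselves contain double quotes are not handled here.
    quoted = " OR ".join(f'"{value}"' for value in field_list)
    return f"{field_type}:({quoted})"


query = build_or_query("marker_symbol", ["Zfp580", "Firrm", "Gpld1"])
print(query)  # marker_symbol:("Zfp580" OR "Firrm" OR "Gpld1")
```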
3 changes: 1 addition & 2 deletions impc_module/pyproject.toml
@@ -6,7 +6,6 @@ build-backend = "setuptools.build_meta"
name = "impc_api"
version = "1.0.1"
description = "A package to facilitate making API requests to the IMPC Solr API"
long_description_content_type = "text/markdown"
authors = [
{ name = "MPI2" },
{ name = "Marina Kan" },
@@ -19,7 +18,7 @@ dependencies = [
"pydantic>=2.9"
]

readme = "README.md"
readme = "impc_module/README.md"
requires-python = ">=3.10"

[project.optional-dependencies]
2 changes: 1 addition & 1 deletion impc_module/setup.py
@@ -2,7 +2,7 @@


# Read the README file for the long description
with open("README.md", "r", encoding="utf-8") as fh:
with open("impc_module/README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setup(