🫧 Featurescope: Image Feature Visualization

jellifysh-cast.mp4

πŸ‘†πŸΌ Jellyfish dataset from Kaggle; features extracted with DinoV2 and projected using PCA. You can download this example and try it yourself!

The Featurescope is a browser-based viewer that helps you understand how numerical features are distributed in a dataset of images.

  • Upload images and features to your web browser (they remain local).
  • Choose which features to plot in X and Y.
  • Explore the data by zooming in and out of the canvas.

Image features can be measurements, embedding values, or other numerical outputs from image analysis.

Note

Looking for the initial project, Spheriscope? You can find it on the spheriscope branch. However, we're not planning to develop this project further at the moment.

Installation

You can install the featurescope Python package using pip:

pip install featurescope

or clone this repository and install the development version:

git clone https://github.com/MalloryWittwer/featurescope.git
cd featurescope
pip install -e python

Usage

To open your images in the Featurescope viewer, you first need to format your dataset so that the viewer can read it. This involves saving the images in a local folder, together with the corresponding features in a file named features.csv.

Overview

Saving images and features is done in Python via the featurescope module.

import featurescope

featurescope provides several dataset formatting methods to address different use cases. The method to use depends on whether you are computing features on a dataset of images or on objects in a labelled image, and whether you want to apply a featurizer function or have already computed a features dataframe.

The concepts of image dataset, featurizer, labelled image and features dataframe are defined here.

Identify which method matches your use case:

Features relate to images in a dataset:

  • featurescope.apply_to_images(): Apply a featurizer to all images in a local folder and save the results for visualization (➡️ docs).
  • featurescope.apply_from_images_df(): Save a features dataframe matching images for visualization (➡️ docs).

Features relate to objects in a labelled image:

  • featurescope.apply_to_label_image(): Apply a featurizer to objects in a labelled image and save the results for visualization (➡️ docs).
  • featurescope.apply_from_label_image_df(): Save a features dataframe matching objects in a labelled image for visualization (➡️ docs).

Applying any of these methods will produce a local folder containing image files as well as a CSV file named features.csv containing the computed features.

images/
├── img1.png
├── img2.png
├── ...
├── features.csv  <- Contains the computed features
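
To sanity-check the output folder before opening the viewer, you can inspect features.csv with the Python standard library. The sketch below assumes the file is a regular comma-separated table whose first row is a header; the exact column layout is not specified here. `summarize_features_csv` is a hypothetical helper for illustration, not part of featurescope.

```python
import csv
from pathlib import Path
from typing import List, Tuple


def summarize_features_csv(csv_path: str) -> Tuple[List[str], int]:
    """Return the column names and the number of data rows of a CSV file.

    Assumption: the first row of the file is a header row.
    """
    with Path(csv_path).open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # Empty list if the file is empty
        n_rows = sum(1 for row in reader if row)  # Blank lines are skipped
    return header, n_rows
```

For example, `columns, n_rows = summarize_features_csv("./images/features.csv")` gives you the feature names you can expect to choose from for the X and Y axes, and a row count to compare against the number of image files.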

Visualization

Once your images and features.csv are saved, you can visualize these data in the web application.

That's it! You should be able to browse and visualize your images and features. 🎉

Dataset formatting methods

apply_to_images

Use this method if you have defined your own featurizer function in Python and want to apply it to all images in a folder. apply_to_images will load the images, run the featurizer, and save the results as a features.csv.

Example

You have:

images/
├── img1.png
├── img2.png
├── ...

Then, do:

from typing import Dict

import numpy as np

import featurescope

# Define your featurizer: it receives one image and returns a dict of named features
def minmax_featurizer(image: np.ndarray) -> Dict:
    image_min = image.min()
    image_max = image.max()
    return {
        "min": image_min,
        "max": image_max
    }

images_dir = "./images"  # Folder with images

# Apply the featurizer to all images in images_dir
csv_path = featurescope.apply_to_images(images_dir, minmax_featurizer)

print(csv_path)  # CSV got saved in the images folder (./images/features.csv)

Result:

images/
├── img1.png
├── img2.png
├── ...
├── features.csv  <- Contains the computed features
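
A featurizer can return any number of named features. As a further illustration, here is a sketch of a featurizer that follows the same contract as minmax_featurizer above (one image array in, one dictionary out). The name `stats_featurizer` and the chosen statistics are illustrative. Casting the values to plain Python floats is a choice made here; whether featurescope requires it is not stated.

```python
from typing import Dict

import numpy as np


def stats_featurizer(image: np.ndarray) -> Dict[str, float]:
    """Summarize an image with a few intensity statistics."""
    pixels = np.asarray(image, dtype=float)
    return {
        "mean": float(pixels.mean()),
        "std": float(pixels.std()),
        "p95": float(np.percentile(pixels, 95)),  # 95th percentile intensity
    }
```

It would be passed in the same way as in the example above, that is, as the second argument of featurescope.apply_to_images.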

apply_from_images_df

Use this method if you have already computed a features dataframe corresponding to images and want to use the Featurescope to visualize these features.

The rows in your dataframe must be matched with images. We distinguish two situations here:

Option 1: Match dataframe rows with image files via filename_column

You can pass a value for filename_column to specify a column in your dataframe that identifies image file names. The image files specified in filename_column must be found in images_dir.

Example

You have this df:

image_file  feature_01  feature_02  ...
00.png      1.2         4           ...
01.png      1.9         3           ...
02.png      2.3         0           ...

Then, do:

import featurescope

df = (...)  # A features DataFrame with a column `image_file` containing image file names
images_dir = "./images"  # A folder containing these image files

csv_path = featurescope.apply_from_images_df(df, images_dir, filename_column="image_file")
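
Because every file named in filename_column must exist in images_dir, it can help to check this before calling apply_from_images_df. The following is a minimal sketch using only the standard library. `find_missing_images` is a hypothetical helper, not a featurescope function, and how featurescope itself reacts to missing files is not described here.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_images(filenames: Iterable[str], images_dir: str) -> List[str]:
    """Return the file names that do not exist as files in images_dir."""
    folder = Path(images_dir)
    return [str(name) for name in filenames if not (folder / str(name)).is_file()]
```

With the dataframe above, `find_missing_images(df["image_file"], "./images")` should return an empty list if the dataframe and the folder are consistent.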

Option 2: Store the image arrays in an image_column

You can pass a value for image_column to indicate that your images are stored as numpy arrays in the dataframe. In this case, images_dir refers to an empty folder where these images will be saved in PNG format.

Example

You have this df:

image                      feature_01  feature_02  ...
np.ndarray([[0, 1, 2..]])  1.2         4           ...
np.ndarray([[0, 1, 2..]])  1.9         3           ...
np.ndarray([[0, 1, 2..]])  2.3         0           ...

Then, do:

import featurescope

df = (...)  # A features DataFrame with a column `image` containing images as numpy arrays
images_dir = "./images"  # An empty folder where to save the images

csv_path = featurescope.apply_from_images_df(df, images_dir, image_column="image")

In both cases, you should end up with a local folder containing the images and a features.csv file that you can use for visualization.

apply_to_label_image

Use this method if you have a labelled array and a featurizer function to apply to the segmented objects. The featurizer can be applied either to the binary mask or to the intensity image under the mask of each object.

For convenience, apply_to_label_image can directly compute properties from scikit-image's regionprops function instead of (or in addition to) applying a custom-defined featurizer function.

Example 1: compute the area and eccentricity features from regionprops:

import featurescope

label_image = (...) # A labelled array
images_dir = "./images"  # An empty folder where to save the results

csv_path = featurescope.apply_to_label_image(
    images_dir, 
    label_image, 
    properties=["area", "eccentricity"],
)

Example 2: apply a featurizer function to an intensity image:

from typing import Dict

import featurescope

def minmax_featurizer(image) -> Dict:
    (...)

label_image = (...) # A labelled array
image = (...) # An intensity image corresponding to the labelled array
images_dir = "./images"  # An empty folder where to save the results

csv_path = featurescope.apply_to_label_image(
    images_dir,
    label_image,
    image=image,
    featurizer_funct=minmax_featurizer,
)

Both of these examples save the computed features and images (crops around segmented objects) in the specified folder, which can be used for visualization with the Featurescope.
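
To make the idea of a crop around a segmented object concrete, here is a rough NumPy sketch that extracts the bounding box of one label, either from the intensity image or as a binary mask. It only illustrates the concept for 2D arrays. `crop_object` is a hypothetical helper, and the cropping that featurescope actually performs (padding, masking, file format) may differ.

```python
from typing import Optional

import numpy as np


def crop_object(
    label_image: np.ndarray, label: int, image: Optional[np.ndarray] = None
) -> np.ndarray:
    """Crop the bounding box of one labelled object (2D arrays only).

    Returns the intensity crop if `image` is given, else the binary mask crop.
    """
    mask = label_image == label
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError(f"Label {label} not found in label_image")
    # Bounding box of the object, end-inclusive in both directions
    window = (
        slice(rows.min(), rows.max() + 1),
        slice(cols.min(), cols.max() + 1),
    )
    return mask[window] if image is None else image[window]
```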

apply_from_label_image_df

Use this method if you have already computed a features dataframe corresponding to objects in a labelled image and want to visualize these features with the Featurescope.

The rows in your dataframe must be matched with label values in the labelled array via a label column.

If you pass an intensity image via the image parameter, the cropped regions of this image around the segmented objects will be used for visualization. Otherwise, a binary mask of the segmented objects in label_image will be used instead.

Example

You have this df:

label  feature_01  feature_02  ...
1      1.2         4           ...
2      1.9         3           ...
3      2.3         0           ...

Then, do:

import featurescope

df = (...)  # A features DataFrame with a column `label` identifying object labels
images_dir = "./images"  # An empty folder where to save the results
label_image = (...)  # A labelled array
image = (...)  # An intensity image

csv_path = featurescope.apply_from_label_image_df(df, images_dir, label_image, image)

Running this code will save the image crops and a features.csv file in the specified folder, so you can use this folder for visualization with the Featurescope.
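
Since dataframe rows are matched to objects through their label values, a quick consistency check between the label column and the labelled array can catch mismatches early. This is a sketch under the assumption that 0 marks the background and is not an object label (the convention used by scikit-image's regionprops). `compare_labels` is a hypothetical helper, not part of featurescope.

```python
from typing import Iterable, Set, Tuple

import numpy as np


def compare_labels(
    df_labels: Iterable[int], label_image: np.ndarray
) -> Tuple[Set[int], Set[int]]:
    """Compare dataframe labels with the labels present in a labelled array.

    Returns (labels only in the dataframe, labels only in the image).
    Assumption: 0 marks the background and is not an object label.
    """
    in_df = {int(label) for label in df_labels}
    in_image = {int(label) for label in np.unique(label_image)} - {0}
    return in_df - in_image, in_image - in_df
```

For the example above, `compare_labels(df["label"], label_image)` returning two empty sets means that every row has a matching object and every object has a row.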

FAQ

Does the data remain local?

Yes! Your images remain local (they are not uploaded to a remote server) even if you access the front-end app via a public URL. Your images and features are simply uploaded to your web browser's internal storage. If you reload the page, everything will be cleaned up and reset!

License

This software is distributed under the terms of the BSD-3 license.

Issues

If you encounter any problems, please file an issue along with a detailed description.