FrederikLauf/Image-Clustering-Tool

Image Clustering Tool

A tool for explorative image clustering, using algorithms for unsupervised machine learning provided by scikit-learn.

The result is visualised on a matplotlib frame as clusters of image thumbnails with coloured frames indicating the cluster membership. The plot is two-dimensional, but the viewing direction onto the higher-dimensional data can be selected via scrollbars.
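
Choosing a viewing direction can be pictured as picking two columns of the reduced data matrix, one for each plot axis. The following is a minimal sketch of that idea, not the tool's actual code. The function name `select_view` and the array layout (one row per image, one column per reduced dimension) are assumptions made for illustration.

```python
import numpy as np


def select_view(reduced, x_dim, y_dim):
    """Return the 2-D plot coordinates for the chosen pair of dimensions.

    `reduced` is assumed to have shape (n_images, n_components).
    """
    reduced = np.asarray(reduced)
    n_components = reduced.shape[1]
    for dim in (x_dim, y_dim):
        if not 0 <= dim < n_components:
            raise IndexError(
                f"dimension {dim} is out of range for {n_components} components"
            )
    # Fancy indexing picks the two requested columns in the requested order.
    return reduced[:, [x_dim, y_dim]]


# Three images described by four reduced components each (made-up values).
reduced = np.arange(12, dtype=float).reshape(3, 4)
xy = select_view(reduced, x_dim=0, y_dim=2)
```

Here `xy` holds one (x, y) pair per image, taken from components 0 and 2 of the reduced data.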

The clusters can be exported, i.e. the original images are copied to different subfolders according to their cluster membership.
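
The export step could look roughly like the sketch below, which uses only the Python standard library. It is an illustration under assumptions, not the tool's implementation: the function name `export_clusters` and the folder naming scheme `cluster_<label>` are hypothetical.

```python
import shutil
from pathlib import Path


def export_clusters(image_paths, labels, export_dir):
    """Copy each image into export_dir/cluster_<label>/ and return the new paths."""
    export_dir = Path(export_dir)
    copied = []
    for path, label in zip(image_paths, labels):
        target_dir = export_dir / f"cluster_{label}"
        target_dir.mkdir(parents=True, exist_ok=True)
        # copy2 also copies file metadata such as the modification time;
        # the original files stay where they are.
        copied.append(Path(shutil.copy2(path, target_dir)))
    return copied
```

Called as `export_clusters(paths, labels, folder / "export")`, this would leave the originals untouched and create one subfolder per cluster label inside `export`.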

For closer inspection, a cluster can be singled out and re-plotted by clicking on one of its images. Clicking on an image in this view opens the original image in a separate window. (To return to the view of all clusters, re-apply the cluster analysis with the “apply” button.)

The GUI is invoked by running image_clustering_app.py.

Example screenshots

demo/demo_screenshot_1.png demo/demo_screenshot_2.png

Input fields and other control elements on the GUI

Scaler
Select the scaler, which is applied to the data first.
Decomposer
Select the method for reducing the dimensionality of the data.
components
Enter the number of dimensions for the reduced data (irrelevant for decomposer “TSNE”).
Clusterer
Select the clustering method.
n_clusters
Enter the desired number of resulting clusters (irrelevant for clusterer “DBSCAN”).
dbscan_min
Enter the minimal cluster size for DBSCAN (irrelevant for other clusterers).
dbscan_eps
Enter the eps parameter (“cluster density”) for DBSCAN (irrelevant for other clusterers).
select folder
Select a folder with image files for analysis.
apply
Perform and display clustering analysis on the currently selected images according to the current inputs.
export
Copy the original images into a subfolder of their current folder, distributed over further subfolders according to their cluster membership.
scrollbars
Select the dimension of the reduced data to be displayed on the respective axis of the plot.
slider “image size”
Set the size of the image thumbnails that are loaded and used as the data for clustering. For a value n, the images are resized to (n, 2n/3), regardless of their original format.
slider “image display scaling”
Scale the size of the thumbnails shown in the plot.
check box “on release”
Rescale the thumbnails shown in the plot only when the slider is released (less demanding on performance).
button “axis off/on”
Switch axis visibility. (Displaying the axis can be useful for estimating the eps parameter for DBSCAN.)
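
The scaler, decomposer and clusterer settings above correspond to three consecutive scikit-learn steps. The sketch below shows one way such a chain might be wired together; it is not taken from the tool's source. The function name `cluster_images`, the fixed choice of `StandardScaler`, `PCA` and `KMeans`, and the mapping of `dbscan_min` onto DBSCAN's `min_samples` argument are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def cluster_images(data, components=2, n_clusters=2, dbscan_min=None, dbscan_eps=None):
    """Scale, reduce and cluster `data` (one flattened thumbnail per row).

    KMeans is used unless both DBSCAN parameters are given.
    """
    scaled = StandardScaler().fit_transform(data)
    reduced = PCA(n_components=components).fit_transform(scaled)
    if dbscan_min is not None and dbscan_eps is not None:
        # Assumption: dbscan_min is passed on as DBSCAN's min_samples.
        clusterer = DBSCAN(eps=dbscan_eps, min_samples=dbscan_min)
    else:
        clusterer = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = clusterer.fit_predict(reduced)
    return reduced, labels


# Synthetic stand-in for thumbnails: two well-separated groups of 10 "images",
# each flattened to 12 values.
rng = np.random.default_rng(0)
dark = rng.normal(loc=0.0, scale=0.05, size=(10, 12))
bright = rng.normal(loc=1.0, scale=0.05, size=(10, 12))
data = np.vstack([dark, bright])

reduced, labels = cluster_images(data, components=2, n_clusters=2)
```

With DBSCAN, scikit-learn assigns the label -1 to points it treats as noise, so the number of clusters follows from the eps and minimum-size settings rather than from n_clusters.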

About

An explorative image clustering tool for reviewing and reducing photo collections.
