Anagnorisis

Anagnorisis is a local recommendation system that lets you fine-tune models on your own data to predict your preferences. You can feed it as much personal data as you like without fear of it leaking, because all of it is stored and processed locally on your own computer.

The project uses Flask for the backend and Bulma as the frontend CSS framework. Transformers and PyTorch handle everything ML-related. This is the main technology stack; other libraries are used for specific purposes.

To learn more about the ideas behind the project, see these articles:
Anagnorisis. Part 1: A Vision for Better Information Management.
Anagnorisis. Part 2: The Music Recommendation Algorithm.
Anagnorisis. Part 3: Why Should You Go Local?

General

Here is the main workflow of the project:

You rate some data, such as text, audio, images, video or anything else, on a scale from 0 to 10, and these ratings are stored in the project database.
Once you have collected enough rated data points, you go to the 'Train' page and start fine-tuning the model so that it rates data as if it had been rated by you.
The new model is then used to sort new data by its predicted ratings; if you disagree with a score the model gave, you simply change it.

You repeat these steps again and again, each time getting a model that aligns better with your preferences.
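
The rate, train and sort loop described above can be illustrated with a toy sketch. This is not the project's training code: the project fine-tunes Transformers/PyTorch models, whereas the sketch below swaps in a simple nearest-neighbour average over hand-made feature vectors so that it runs with the standard library alone. All names (`ratings`, `predict_score`, `rank_items`) and all values are hypothetical.

```python
import math

# Hypothetical store of user ratings: item name -> (feature vector, score 0..10).
ratings = {
    "song_a": ([1.0, 0.0], 9),
    "song_b": ([0.9, 0.1], 8),
    "song_c": ([0.0, 1.0], 2),
}

def predict_score(features, rated, k=2):
    """Average the scores of the k rated items closest to `features`."""
    by_distance = sorted(rated.values(), key=lambda item: math.dist(item[0], features))
    nearest = by_distance[:k]
    return sum(score for _, score in nearest) / len(nearest)

def rank_items(candidates, rated):
    """Sort unrated items from highest to lowest predicted score."""
    scored = {name: predict_score(vec, rated) for name, vec in candidates.items()}
    return sorted(scored, key=scored.get, reverse=True)

# Step 3: sort new, unrated data by the scores the "model" predicts.
new_items = {"song_x": [0.95, 0.05], "song_y": [0.1, 0.9]}
ranking = rank_items(new_items, ratings)

# If you disagree with a prediction, store your own rating;
# it becomes part of the data used in the next round.
ratings["song_y"] = (new_items["song_y"], 7)
```

Each correction enlarges the rated set, which is why repeating the loop is expected to bring the predictions closer to your own scores.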

Music Module

Please watch this video to see the 'Music' module in use:

To see how the algorithm works in detail, please read this wiki page: Music

Images Module

Please watch this video to see the 'Images' module in use:

Or you can read the guide at the Images wiki page.

Running from Docker

The preferred way to run the project is from Docker. This should be much more stable than running it from the local environment, especially on Windows. But be aware that all paths in the project will be relative to the DATA_PATH folder that you mount into the container.

Make sure that you have Docker installed. If it is not, go to the Docker installation page and install it.

Clone this repository:

    git clone https://github.com/yourusername/Anagnorisis.git
    cd Anagnorisis

Launch the application:

```
DATA_PATH=/path/to/your/data EXTERNAL_PORT=5001 docker-compose up -d
```
Note: if you are using Docker Desktop, you have to explicitly grant access to the /path/to/your/data folder in the Docker settings. Otherwise, you will not be able to access it from the container. To do so, go to the Docker Desktop settings, then to Resources -> File Sharing, and add the path to your data folder.
Access the application at http://localhost:5001 (or whichever port you configured) in your web browser.

Running from the local environment

If you do not want to use Docker, you can also install the project manually with the commands below. Note that the project has only been tested on Ubuntu 22.04 with Python 3.10; there is no guarantee that it will work on other platforms or other versions of Python. Windows users are strongly encouraged to use Docker, as there might be unexpected issues.

Clone this repository:

    git clone https://github.com/yourusername/Anagnorisis.git
    cd Anagnorisis

Recreate the environment with the following commands:

    # For Linux
    python3 -m venv .env  # recreate the virtual environment
    source .env/bin/activate  # activate the virtual environment
    pip install -r requirements.txt  # install the required packages
    # For Windows
    python -m venv .env  # recreate the virtual environment
    .env\Scripts\activate  # activate the virtual environment
    pip install -r requirements.txt  # install the required packages

Then run the project with the command:

    # For Linux
    DATA_PATH=/path/to/your/data bash run.sh
    # For Windows (Command Prompt; assumes run.bat reads DATA_PATH from the environment)
    set DATA_PATH=C:\path\to\your\data
    run.bat

Access the application at http://localhost:5001 (or whichever port you configured) in your web browser.

Additional notes for installation

The Docker container includes Ubuntu 22.04, CUDA drivers, and several large machine learning models and dependencies, which results in a significant storage footprint. Once the container is built, it takes about 45GB of disk space. If you want to avoid that, consider running the project from the local environment.

If DATA_PATH is not provided, the /project_data folder in the project root will be used.
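
The fallback rule can be sketched as follows. This is only an illustration of the behaviour described above, not the project's actual code; the function name `resolve_data_path` and its parameters are hypothetical, and it assumes the default is a `project_data` folder directly under the project root.

```python
import os
from pathlib import Path

def resolve_data_path(project_root, env=None):
    """Return DATA_PATH from the environment, or <project_root>/project_data."""
    env = os.environ if env is None else env
    configured = env.get("DATA_PATH")
    if configured:
        return Path(configured)
    # Assumed default location, based on the note above.
    return Path(project_root) / "project_data"
```

Passing the environment as a parameter keeps the function easy to check without changing real environment variables.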

After initializing the project, you will find a new Anagnorisis-app folder inside the DATA_PATH folder. This folder stores the project's database, migrations, models and configuration file. After the project runs for the first time, the {DATA_PATH}/Anagnorisis-app/database/project.db file will be created. This database stores your preferences, which are later used to fine-tune the evaluation models. Back up this file from time to time, as it contains all of your preferences along with some additional data, such as playback history.
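
One possible way to take such backups is sketched below. It assumes that project.db is a SQLite file, which is not stated above, and uses the `sqlite3` module's online backup API from the Python standard library. The function name `backup_database` and the backup file naming scheme are hypothetical.

```python
import sqlite3
from datetime import datetime
from pathlib import Path

def backup_database(db_file, backup_dir):
    """Copy a SQLite database to a timestamped file and return the new path."""
    db_file = Path(db_file)
    if not db_file.is_file():
        # sqlite3.connect would silently create an empty database otherwise.
        raise FileNotFoundError(db_file)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"project-{stamp}.db"
    source = sqlite3.connect(str(db_file))
    dest = sqlite3.connect(str(target))
    try:
        source.backup(dest)  # consistent snapshot via SQLite's backup API
    finally:
        dest.close()
        source.close()
    return target
```

For example, `backup_database("/path/to/your/data/Anagnorisis-app/database/project.db", "/path/to/backups")` would write a dated copy next to any earlier backups.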

Running the project from the local environment should be somewhat more efficient as there is no Docker overhead when reading the data.

If you have a lot of data in your data folder, the hash cache and embedding cache are built on the first run. Please be patient, as this may take a while. The progress percentage is shown in the status bar.
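
To give a feel for why the first run is slow and later runs are faster, here is a minimal sketch of a file hash cache. It is not the project's implementation: the hash algorithm (SHA-256), the cache key (path, size, modification time) and the names `file_hash` and `build_hash_cache` are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path

def file_hash(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_hash_cache(folder, cache=None, on_progress=None):
    """Hash every file under `folder`, skipping entries already in `cache`."""
    cache = {} if cache is None else cache
    files = sorted(p for p in Path(folder).rglob("*") if p.is_file())
    for index, path in enumerate(files, start=1):
        stat = path.stat()
        # An unchanged file keeps the same key, so it is not hashed again.
        key = (str(path), stat.st_size, stat.st_mtime_ns)
        if key not in cache:
            cache[key] = file_hash(path)
        if on_progress:
            on_progress(100 * index // len(files))  # integer percentage
    return cache
```

Every file has to be read once to fill the cache; on later runs only new or changed files need hashing.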

The project requires a GPU to run properly. When running the project inside the Docker container, make sure that the NVIDIA Container Toolkit is installed on Linux, or WSL2 on Windows.

Security notes

When running the project in a local environment, the default address is set to 0.0.0.0 (this setting is necessary for the application to work properly inside the Docker container). This configuration means the application listens on all available network interfaces, making it accessible from any device on your local network (i.e., any computer connected to the same router). However, this does not automatically expose the service to the internet. Access from outside your local network depends on your firewall settings and router configuration.
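
The difference between listening on all interfaces and listening on the loopback interface only can be seen with plain sockets. The sketch below is a generic illustration using the Python standard library, not the project's server code; the helper `open_listener` is hypothetical, and whether the project offers a setting to change its bind address is not covered here.

```python
import socket

def open_listener(host, port=0):
    """Bind a TCP socket to `host`; port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, port))
    listener.listen()
    return listener

# 127.0.0.1: reachable only from this machine.
local_only = open_listener("127.0.0.1")
# 0.0.0.0: reachable through every network interface, including the LAN.
all_interfaces = open_listener("0.0.0.0")

local_address = local_only.getsockname()[0]
lan_address = all_interfaces.getsockname()[0]

local_only.close()
all_interfaces.close()
```

A server bound to 127.0.0.1 cannot be reached from other devices, which is why a process inside a container generally has to bind to 0.0.0.0 for port mapping to work.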

Embedding models

To make audio and visual search possible the project uses these models:
LAION-AI/CLAP
Google/SigLIP
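
As a rough illustration of how embedding-based search works in general, the sketch below ranks items by cosine similarity between a query vector and stored item vectors. It does not call CLAP or SigLIP: the three-dimensional vectors are made up, and the names `cosine_similarity`, `search` and `library` are hypothetical. In a real setup the vectors would come from the embedding models.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def search(query_vec, library, top_k=3):
    """Return the names of the top_k items most similar to the query."""
    ranked = sorted(
        library,
        key=lambda name: cosine_similarity(query_vec, library[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Made-up embeddings standing in for model output.
library = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.7, 0.3, 0.1],
    "invoice.png": [0.0, 0.1, 0.9],
}
results = search([1.0, 0.0, 0.0], library, top_k=2)
```

Because items and queries live in the same vector space, a text query can be matched against audio or image files once both are embedded.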

All embedding models are downloaded automatically when the project is started for the first time. This might take some time depending on your internet connection. If you run the project from the Docker container, you can follow the progress in the container_log.txt file that appears in the project's root folder.

Wiki

The project has its own wiki, which is integrated into the project itself. You can access it by running the project, or simply read it as Markdown files.

Here are some pages that might interest you:
Change history
Philosophy
Music
Images
Roadmap

In memory of Josh Greenberg, one of the creators of Grooveshark, a long-gone music service that had the best music recommendation system I've ever seen.
