monasticacademy/thompson-sampling-recommendation-microservice
Thompson Sampling Multi-Armed Bandit Microservice
=================================================

This microservice applies Thompson Sampling to the Multi-Armed Bandit problem. It exposes two endpoints: one updates the statistics of an arm, the other chooses an arm.

API Endpoints
-------------

1. **Update Arm Information** (`POST /update`)
   - Description: Updates the information of a given arm and returns the new arm information.
   - Request Body:
     - `update_arm`: The arm to update (mean, variance, effective size, label).
     - `reward`: The reward observed for the arm being updated.
   - Example Request:
     ```json
     {
       "update_arm": {"mean": 0.5, "variance": 0.2, "effective_size": 10, "label": "option1"},
       "reward": 1
     }
     ```
   - Response: Updated arm information.
   - Example Response:
     ```json
     {"arm": [{"label": "option1", "mean": 0.51, "variance": 0.19, "effective_size": 11}]}
     ```

2. **Choose an Arm** (`POST /choose`)
   - Description: Takes all the arms and their information, chooses one arm, and returns it.
   - Request Body:
     - `arms`: List of arms with their information (mean, variance, effective size, label).
   - Example Request:
     ```json
     {"arms": [{"mean": 0.5, "variance": 0.2, "effective_size": 10, "label": "option1"}]}
     ```
   - Response: Chosen arm label.
   - Example Response:
     ```json
     {"chosen_arm": "option1"}
     ```

API Documentation
-----------------

API documentation is available through Swagger UI at `/apidocs/index.html` on the running service.

Authentication
--------------

Both endpoints require authentication:

- Default Username: `user`
- Default Password: `password`
- Authentication Method: HTTP Basic Authentication

Logging
-------

Logging uses the Python `logging` module. Logs are written to stderr.

Installation on Cloud Run
-------------------------

1. Build the Docker image:
   ```
   docker build -t thompson-sampling .
   ```
2. Push the image to a container registry:
   ```
   docker push thompson-sampling
   ```
3. Deploy to Cloud Run:
   ```
   gcloud run deploy --image thompson-sampling --platform managed
   ```

The image name `thompson-sampling` is a placeholder. `docker push` and `gcloud run deploy --image` expect the full path of the image in your container registry, so tag the image accordingly before pushing.

Understanding Thompson Sampling
-------------------------------

Thompson Sampling is a probabilistic algorithm for the Multi-Armed Bandit problem. In this problem you have multiple options (arms), each with an unknown reward probability. The goal is to collect as much reward as possible over a series of trials, which requires finding the arm with the highest expected reward.

- **Mean**: The average reward observed for an arm. For binary rewards (0 or 1) it is the estimated probability of success for that arm.
- **Variance**: A measure of how much the rewards vary for an arm. A higher variance means more uncertainty.
- **Effective Size**: The number of trials that contributed to the mean and variance. A higher effective size means more confidence in both.
- **Label**: A unique identifier for each arm.

Thompson Sampling uses these parameters to model the uncertainty about the true reward probabilities. It balances exploration (trying arms whose reward is still uncertain) with exploitation (choosing the best-known arm).

For more detailed information, please refer to scholarly articles and textbooks on Thompson Sampling and Multi-Armed Bandits.
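To make the roles of mean, variance, and effective size concrete, here is a minimal Python sketch of one common way to run Thompson Sampling with these parameters. It is not taken from this service's source code. It assumes a Gaussian belief about each arm's mean with standard deviation `sqrt(variance / effective_size)`, and a plain running-mean update; the names `Arm`, `choose_arm`, and `update_arm` are hypothetical.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Arm:
    label: str
    mean: float
    variance: float
    effective_size: float


def choose_arm(arms, rng=random):
    """Draw one sample per arm and return the label with the largest draw."""
    best_label, best_draw = None, -math.inf
    for arm in arms:
        # Assumption: uncertainty about the mean shrinks as variance / effective_size.
        std_of_mean = math.sqrt(arm.variance / max(arm.effective_size, 1.0))
        draw = rng.gauss(arm.mean, std_of_mean)
        if draw > best_draw:
            best_label, best_draw = arm.label, draw
    return best_label


def update_arm(arm, reward):
    """Fold one reward into the arm using a running mean and population variance."""
    n = arm.effective_size + 1
    delta = reward - arm.mean
    new_mean = arm.mean + delta / n
    # Welford-style update of the sum of squared deviations.
    m2 = arm.variance * arm.effective_size + delta * (reward - new_mean)
    return Arm(arm.label, new_mean, m2 / n, n)
```

Arms with few trials or high variance get wide sampling distributions, so they occasionally produce the largest draw and get explored; well-measured arms are chosen mostly when their mean is actually high. The example response for `/update` above shows different numbers (mean 0.51 after a reward of 1), so the service may weight new rewards differently from this sketch.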