Leveraging LLMs for Informed Bitcoin Trading Decisions: Prompting with Social and News Data Reveals Promising Predictive Abilities
The work was carried out by:
This thesis investigates the potential of leveraging Large Language Models (LLMs) to support Bitcoin traders. Specifically, we analyze the correlation between Bitcoin price movements and the sentiment expressed in news headlines, posts, and comments on social media. We build a novel, large-scale dataset that aggregates various features related to Bitcoin and its price over time, spanning from 2016 to 2024, and includes data from news outlets, social media posts, and comments. Using this dataset, we evaluate the effectiveness of LLMs and Deep Learning models through standard classification tasks on real data, as well as through backtesting and demo trading accounts with different investment strategies. We build interactive interfaces to annotate real-time data via LLMs, perform custom backtesting, and visualize demo trading account performance. Our approach leverages the extended context capabilities of recent LLMs through simple prompting to generate outputs such as textual reasoning, sentiment, recommended trading actions, and confidence scores. Our findings reveal that LLMs represent a powerful tool for assisting trading decisions, opening up promising avenues for future research.
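For illustration only, a single LLM annotation of the kind described above (reasoning, sentiment, trading action, confidence) could be represented as follows; the field names and values are hypothetical and not the actual schema used in the project:

# Hypothetical shape of one LLM annotation: textual reasoning, a sentiment label,
# a recommended trading action, and a confidence score (illustrative, not the real schema).
example_annotation = {
    "reasoning": "News reports ETF inflows and social sentiment is broadly positive.",
    "sentiment": "positive",   # e.g. negative / neutral / positive
    "action": "buy",           # e.g. buy / hold / sell
    "confidence": 0.72,        # score in [0, 1]
}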
+- root
   +- backtest
   +- data_annotation
   +- data_exploratory_analysis
   +- data_mining
   +- data_predictions
   +- demo
   +- hf_data
   +- models
   +- secrets
   +- shared
   +- utils
   +- config.py
   +- requirements.py
Where:
backtest: Contains the scripts needed to backtest the strategies using the decisions made by the LLMs.
data_annotation: Contains the procedures to annotate the data using the LLMs.
data_exploratory_analysis: Contains the scripts that allow visualizing the collected data and the decisions made by the LLMs.
data_mining: Contains the procedures to retrieve all the data needed for the research and to generate a single dataset.
data_predictions: Contains the scripts that allow deep learning models to be used to make predictions based on the collected data.
demo: Contains the files needed to view real-time data annotation and the performance of demo trading accounts.
hf_data: Contains all datasets collected during the retrieval, annotation, and visualization process.
models: Contains the definition of the deep learning models used during the prediction process.
secrets: Contains the secrets used within the project, such as API keys and account credentials.
shared: Contains variables and constants shared by most of the files in the project.
utils: Contains methods that are shared by most of the files in the project.
config.py: Contains the configuration variables shared by most of the files in the project.
requirements.py: Contains the requirements to be installed before running the project.
We use Python 3.12.4, which is the latest Python version supported by PyTorch.
python3 -m venv .venv
.venv\scripts\activate
pip install -r requirements.py
You can download the needed data from this Hugging Face Repository.
Put the downloaded folders into the hf_data directory.
The annotated folder contains the original dataset with the annotations of the respective LLMs.
The merged folder contains the raw dataset without LLM annotations (price data, blockchain data, and sentiment indices).
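If you prefer to fetch the data programmatically, a minimal sketch using huggingface_hub is shown below; the repo_id is a placeholder, not the actual repository id:

# Sketch: download the dataset folders into hf_data via huggingface_hub.
# "<user>/<dataset_name>" is a placeholder for the real Hugging Face repository.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="<user>/<dataset_name>",
    repo_type="dataset",
    local_dir="hf_data",
)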
Download and install Ollama
Set up the following LLMs:
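Once Ollama is installed and the models are pulled, you can verify that the local server responds with a quick request to its HTTP API; this is only a sanity-check sketch, and "<model_name>" is a placeholder for one of the LLMs above:

# Sketch: query the local Ollama server (default port 11434) to confirm it is running.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "<model_name>", "prompt": "Reply with OK.", "stream": False},
)
print(resp.json()["response"])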
Create a Gemini API key.
Create a gemini.json file in the secrets directory and add:
{
"GOOGLE_API_KEY_1": "<api_key>",
}
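As an illustration of how this file could be consumed (an assumption, not necessarily how the project loads it), the google-generativeai package can be configured directly from it; the model name below is illustrative:

# Sketch: load the key from secrets/gemini.json and configure the Gemini client.
import json
import google.generativeai as genai

with open("secrets/gemini.json") as f:
    genai.configure(api_key=json.load(f)["GOOGLE_API_KEY_1"])

model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model name
print(model.generate_content("Hello").text)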
Create a Reddit API key.
Create a reddit.json file in the secrets directory and add:
{
"client_id": "<client_id>",
"client_secret": "<client_secret>",
"user_agent": "<user_agent>"
}
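For reference, these are the credentials a Reddit client such as PRAW expects; whether the project's data_mining scripts use PRAW is an assumption, and the subreddit below is only an example:

# Sketch: authenticate against the Reddit API with the credentials from secrets/reddit.json.
import json
import praw

with open("secrets/reddit.json") as f:
    creds = json.load(f)

reddit = praw.Reddit(**creds)  # client_id, client_secret, user_agent
for submission in reddit.subreddit("Bitcoin").hot(limit=5):
    print(submission.title)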
Download and install HTTP Toolkit
Configure a custom proxy on your PC:
IP: 127.0.0.1
Port: 8080
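If the Python scripts themselves need to route traffic through HTTP Toolkit (an assumption about this setup), the standard proxy environment variables can be set before any requests are made; whether they are honored depends on the HTTP client the project uses:

# Sketch: point outgoing HTTP(S) traffic at the HTTP Toolkit proxy.
import os

os.environ["HTTP_PROXY"] = "http://127.0.0.1:8080"
os.environ["HTTPS_PROXY"] = "http://127.0.0.1:8080"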
To run the demo, execute:
python -m demo.demo
To run the backtest, execute:
python -m backtest.backtest