Merged
19 changes: 18 additions & 1 deletion README.md
@@ -7,6 +7,7 @@
## Table of Contents

- [Introduction](#introduction)
- [2026-02-18 Hardening Notes](#2026-02-18-hardening-notes)
- [Security & Performance](#security--performance-breakthrough-new)
- [Enterprise Security](#️-enterprise-grade-hashicorp-vault-integration)
- [Maximum Speed Trading](#-maximum-speed-trading)
@@ -30,6 +31,21 @@

The **ELVIS** (**E**nhanced **L**everaged **V**irtual **I**nvestment **S**ystem) Trading Bot is a sophisticated, modular algorithmic trading system that leverages machine learning models for automated cryptocurrency trading. The system integrates multiple ML architectures, real-time data processing, risk management, and execution modules to facilitate intelligent trading strategies with comprehensive monitoring and visualization capabilities.

## 2026-02-18 Hardening Notes

The February 18, 2026 hardening update closed issues `#9`, `#10`, `#11`, `#13`, and `#16` and changed runtime defaults in ways operators need to know:

- `VAULT_TOKEN` is no longer hardcoded or auto-populated.
- `POSTGRES_PASSWORD` no longer has a hardcoded default.
- Trade History API bind is now local by default:
  - `TRADE_HISTORY_API_HOST=127.0.0.1`
  - `TRADE_HISTORY_API_PORT=5050`
- Repository hygiene policy now ignores local virtualenv/build trees by default (`env*/`, `venv*/`, `.venv/`, `tensorflow/`).
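
The defaults above can be sketched as a small settings loader. This is a minimal illustration, not the project's actual startup code: `load_runtime_settings`, `MissingSecretError`, and the `use_vault` flag are hypothetical names, and failing fast on a missing secret is a choice of this sketch rather than confirmed behavior.

```python
from typing import Mapping


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def load_runtime_settings(env: Mapping[str, str], use_vault: bool = False) -> dict:
    """Resolve settings without any hardcoded secret fallback (illustrative)."""
    # Secrets have no default: refuse to continue instead of inventing a value.
    password = env.get("POSTGRES_PASSWORD")
    if not password:
        raise MissingSecretError("POSTGRES_PASSWORD must be set explicitly")

    # VAULT_TOKEN is only required when Vault-backed loading is requested.
    vault_token = env.get("VAULT_TOKEN")
    if use_vault and not vault_token:
        raise MissingSecretError("VAULT_TOKEN must be set when Vault is used")

    return {
        "postgres_password": password,
        "vault_token": vault_token,
        # Non-secret settings keep the local-only defaults listed above.
        "api_host": env.get("TRADE_HISTORY_API_HOST", "127.0.0.1"),
        "api_port": int(env.get("TRADE_HISTORY_API_PORT", "5050")),
    }
```

Calling `load_runtime_settings(os.environ)` at startup would surface a missing secret immediately instead of at the first database or Vault call.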

Operational runbook and troubleshooting:
- `docs/ops/2026-02-18_container_observability_runbook.md`
- `SECURITY.md`

## 🚀 Current Status (July 2025)

**✅ FULLY OPERATIONAL TRADING BOT WITH ENTERPRISE SECURITY**
@@ -112,7 +128,8 @@ git clone https://github.com/cluster2600/ELVIS.git
```bash
cd ELVIS/ansible
chmod +x run_setup.sh
./run_setup.sh --docker
# Access at http://localhost:5050 when ready
# API health: http://localhost:5050/health
# Grafana: http://localhost:3001
```

**Option 2: Secure Development Setup**
26 changes: 23 additions & 3 deletions SECURITY.md
@@ -3,6 +3,26 @@
## Overview
ELVIS Trading Bot implements enterprise-grade security practices with HashiCorp Vault integration for secure secrets management and API key protection.

## Sprint-1 Hardening Updates (2026-02-18)

These security and exposure changes were shipped in the Sprint-1 hardening pass:

- Removed hardcoded Vault token behavior.
  - `VAULT_TOKEN` must now be provided externally when Vault-backed secret loading is required.
- Removed hardcoded Postgres password fallback.
  - `POSTGRES_PASSWORD` has no default and must be supplied via environment/secrets manager.
- Changed Trade History API exposure to local-only by default.
  - `TRADE_HISTORY_API_HOST` default: `127.0.0.1`
  - `TRADE_HISTORY_API_PORT` default: `5050`
- Added API key authentication to the Flask trade-history API (`X-API-Key` header).
  - `/health` is exempt for health checks.

Operational implications:
- If Prometheus scrapes `/metrics` through the authenticated Flask API, you must either:
  - configure Prometheus to send `X-API-Key`, or
  - explicitly exempt `/metrics` in API auth middleware.
- To reach the API from outside localhost (for example in containerized deployments), set `TRADE_HISTORY_API_HOST=0.0.0.0` deliberately.
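
The authentication rule and the `/metrics` exemption choice can be sketched as a framework-independent decision function. This is an illustration under stated assumptions, not the code in the Flask API: `auth_status` and `DEFAULT_EXEMPT_PATHS` are hypothetical names, and mapping an unset server-side key to `503` is an assumption of this sketch.

```python
import hmac
from typing import Mapping, Optional

# Per the notes above, only /health is exempt by default.
DEFAULT_EXEMPT_PATHS = frozenset({"/health"})


def auth_status(
    path: str,
    headers: Mapping[str, str],
    expected_key: Optional[str],
    exempt_paths: frozenset = DEFAULT_EXEMPT_PATHS,
) -> int:
    """Return the HTTP status an auth hook would yield (200 means allowed)."""
    if path in exempt_paths:
        return 200
    if not expected_key:
        # Assumption: a server without a configured key refuses all requests.
        return 503
    supplied = headers.get("X-API-Key", "")
    # Constant-time comparison avoids leaking key contents through timing.
    if hmac.compare_digest(supplied.encode(), expected_key.encode()):
        return 200
    return 401
```

Exempting `/metrics` then amounts to passing `exempt_paths=frozenset({"/health", "/metrics"})`. Note that a plain `dict` is case-sensitive, whereas real HTTP header lookups are not.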

## 🔐 HashiCorp Vault Integration

### Security Architecture
@@ -195,8 +215,8 @@ def test_vault_security():

---

**Last Updated**: July 20, 2025
**Last Updated**: February 18, 2026
**Security Review**: Complete
**Next Review**: January 20, 2026
**Next Review**: August 18, 2026

> **Note**: This security implementation represents enterprise-grade protection for cryptocurrency trading operations. All security measures are actively monitored and regularly audited.
6 changes: 5 additions & 1 deletion docs/README.md
@@ -6,6 +6,10 @@
- **[SECURITY.md](../SECURITY.md)** - Complete security implementation with HashiCorp Vault
- **[VAULT_SETUP.md](VAULT_SETUP.md)** - Step-by-step Vault configuration guide

### 🧰 Operations Runbooks
- **[2026-02-18 Container Observability Runbook](ops/2026-02-18_container_observability_runbook.md)** - Container startup, Grafana/Prometheus "No data" troubleshooting, and post-hardening env requirements
- **[2026-02-10 ELVIS No Data Debug](ops/2026-02-10_elvis_no_data_debug.md)** - Prior investigation notes for dashboard data issues

### 📊 System Architecture
- **[API Monitoring](../utils/api_connection_tester.py)** - Real-time API health monitoring
- **[Console Dashboard](../utils/console_dashboard.py)** - Live trading dashboard with visual indicators
@@ -116,4 +120,4 @@ Updated: 18:15:42

**For detailed setup instructions, see [VAULT_SETUP.md](VAULT_SETUP.md)**
**For security details, see [SECURITY.md](../SECURITY.md)**
**For support, check the main [README.md](../README.md)**
138 changes: 138 additions & 0 deletions docs/ops/2026-02-18_container_observability_runbook.md
@@ -0,0 +1,138 @@
# 2026-02-18 - Container observability and hardening runbook

## Scope
This runbook documents the post-hardening behavior introduced on February 18, 2026 (issues `#9`, `#10`, `#11`, `#13`, `#16`) and how to keep Grafana dashboards populated in container deployments.

## Security-sensitive runtime defaults

### Required environment behavior
- `VAULT_TOKEN` has no hardcoded fallback.
- `POSTGRES_PASSWORD` has no hardcoded fallback.
- Trade History API defaults to local bind:
  - `TRADE_HISTORY_API_HOST=127.0.0.1`
  - `TRADE_HISTORY_API_PORT=5050`
- Flask trade-history API now requires `X-API-Key: <API_KEY>` for all routes except `/health`.

### Practical impact
- If you do not provide `VAULT_TOKEN`, Vault-backed secret retrieval is unavailable.
- If you do not provide `POSTGRES_PASSWORD`, DB auth may fail.
- If you keep the default API host binding (`127.0.0.1`), remote or containerized scrapers cannot reach the API unless they share its network namespace.
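
The bind-address point can be checked mechanically. The sketch below is a hypothetical pre-flight helper (`reachable_from_other_hosts` is not part of the project) that classifies a `TRADE_HISTORY_API_HOST` value using the standard `ipaddress` module. It only reasons about the bind address and ignores firewalls and port publishing.

```python
import ipaddress


def reachable_from_other_hosts(bind_host: str) -> bool:
    """Return True if a listener on bind_host can accept non-local peers."""
    if bind_host == "localhost":
        return False
    # ip_address() raises ValueError for hostnames other than the case above.
    addr = ipaddress.ip_address(bind_host)
    # Loopback addresses (127.0.0.0/8, ::1) only accept connections that
    # originate inside the same network namespace.
    return not addr.is_loopback
```

With the default `127.0.0.1` the helper returns `False`, which matches the scraping failure described above. With `0.0.0.0` it returns `True`.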

## Container startup (monitoring stack)

From project root:

```bash
docker compose up -d postgres redis elvis-bot prometheus grafana loki promtail
docker compose ps
```

Expected host endpoints:
- Grafana: `http://localhost:3001`
- Prometheus: `http://localhost:9090`
- Trade API health: `http://localhost:5050/health`

## Verify metrics pipeline end-to-end

1. Verify ELVIS API process is healthy:

```bash
curl -s http://localhost:5050/health
```

2. Verify metrics endpoint responds:

Without API auth:
```bash
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:5050/metrics
```

With API auth:
```bash
curl -s -H "X-API-Key: ${API_KEY}" -o /dev/null -w "%{http_code}\n" http://localhost:5050/metrics
```

3. Verify Prometheus target status:

```bash
curl -s http://localhost:9090/api/v1/targets
```

For job `elvis`, confirm `health` is `up`.

4. In Grafana, verify datasource and dashboard:
- Datasource: `Prometheus` (provisioned via `grafana/provisioning/datasources/prometheus.yml`)
- Dashboard folder: `ELVIS`
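
Step 3 can be automated by parsing the targets response rather than reading it by eye. The sketch below assumes the usual shape of the Prometheus `/api/v1/targets` payload, where each entry in `data.activeTargets` carries `labels.job` and `health`. `job_health` is a hypothetical helper name.

```python
import json
from typing import Optional


def job_health(targets_json: str, job: str) -> Optional[str]:
    """Return the health of the first active target for job, or None."""
    payload = json.loads(targets_json)
    for target in payload.get("data", {}).get("activeTargets", []):
        if target.get("labels", {}).get("job") == job:
            # Typical values are "up", "down", or "unknown".
            return target.get("health")
    # The job is not among the active targets at all.
    return None
```

Feeding it the output of the `curl` command from step 3 and comparing the result to `"up"` gives a scriptable check. A return value of `None` points at a missing or misnamed scrape job rather than a failing one.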

## Grafana "No data" troubleshooting

### 1) ELVIS service not running
Symptom:
- Prometheus target state is `down` with connection refused.

Fix:
```bash
docker compose up -d elvis-bot
docker compose logs --tail=200 elvis-bot
```

### 2) Scrape target mismatch
If ELVIS runs inside the same Compose network as Prometheus:
- Preferred target: `elvis-bot:5050`

If ELVIS runs on host and Prometheus runs in container:
- Target can be: `host.docker.internal:5050`

Update `prometheus.yml` accordingly and restart Prometheus:
```bash
docker compose restart prometheus
```

### 3) API auth blocking `/metrics`
Because API auth now applies to every route except `/health`, `/metrics` may return `401`/`503`.

Fix option A (recommended):
- Configure Prometheus to send `X-API-Key`.

Example:
```yaml
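- job_name: 'elvis'
  static_configs:
    - targets: ['elvis-bot:5050']
  metrics_path: '/metrics'
  scheme: 'http'
  scrape_interval: 10s
  # Verify this key against the scrape_config reference for your Prometheus
  # release; support for custom request headers differs between versions.
  http_config:
    headers:
      X-API-Key: '<same value as API_KEY>'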
```

Fix option B:
- Exempt `/metrics` from Flask auth middleware if your deployment model requires anonymous local scrape.

### 4) API bound to localhost only
If Prometheus reaches the API across a container or host boundary, the local-only bind can prevent access.

Fix:
- Set `TRADE_HISTORY_API_HOST=0.0.0.0` intentionally in deployment env.

## Repository hygiene policy

To prevent repo bloat and accidental secret/artifact commits:
- Do not commit local environments (`env*/`, `venv*/`, `.venv/`).
- Do not commit local ML source/build directories (for example `tensorflow/`).
- Keep generated logs/data/model artifacts out of git unless explicitly versioned.

If large local directories are accidentally tracked:

```bash
git rm -r --cached env-coreml env-ydf .venv venv tensorflow
git commit -m "chore(repo): stop tracking local environment/build artifacts"
```
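
The policy can also be checked against the list of tracked files. The sketch below is a rough approximation of the ignore rules named in this section, not a full gitignore matcher. `IGNORED_DIR_PATTERNS` and `violates_hygiene_policy` are hypothetical names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Directory-name patterns taken from the policy (env*/, venv*/, .venv/, tensorflow/).
IGNORED_DIR_PATTERNS = ("env*", "venv*", ".venv", "tensorflow")


def violates_hygiene_policy(path: str) -> bool:
    """Return True if any directory component of path matches the policy."""
    # Drop the final component so plain files such as environment.yml pass.
    directories = PurePosixPath(path).parts[:-1]
    return any(
        fnmatchcase(part, pattern)
        for part in directories
        for pattern in IGNORED_DIR_PATTERNS
    )
```

Running it over each line of `git ls-files` output would list tracked paths that the `git rm -r --cached` command above should remove.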

## Related files
- `README.md`
- `SECURITY.md`
- `docker-compose.yml`
- `prometheus.yml`
- `trading/utils/trade_history_api.py`