Currency_state_evaluation
UPDATE: Added a brute-force parameter optimization step that determines the threshold (%) for a long trade trigger, the threshold (%) for a short trade trigger, the optimal % stop-loss trigger, and the optimal % profit target. The overall profit (in ticks) differs slightly between the CPU and GPU versions, but the optimal parameters are identical. The small difference may be due to rounding that accumulates over many operations on very small (1e-5) floating-point numbers.
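As a rough illustration of the brute-force step, the sketch below evaluates every combination of four candidate grids and keeps the most profitable one. The names `TradeParams`, `SearchResult`, `brute_force_search`, and the `profit_in_ticks` callback are hypothetical and stand in for whatever backtest the project actually runs; this is a minimal CPU-side sketch, not the project's code.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <vector>

// Hypothetical parameter bundle; the field names are illustrative only.
struct TradeParams {
    float long_threshold;   // % threshold that triggers a long trade
    float short_threshold;  // % threshold that triggers a short trade
    float stop_loss;        // % stop loss
    float profit_target;    // % profit target
};

struct SearchResult {
    TradeParams best;   // best combination found
    float best_profit;  // overall profit (in ticks) for `best`
};

// Exhaustively evaluates every combination of the four candidate grids.
// `profit_in_ticks` is an assumed callback that backtests one combination.
SearchResult brute_force_search(
    const std::vector<float>& long_grid,
    const std::vector<float>& short_grid,
    const std::vector<float>& stop_grid,
    const std::vector<float>& target_grid,
    const std::function<float(const TradeParams&)>& profit_in_ticks) {
    assert(!long_grid.empty() && !short_grid.empty());
    assert(!stop_grid.empty() && !target_grid.empty());

    SearchResult result{
        TradeParams{long_grid[0], short_grid[0], stop_grid[0], target_grid[0]},
        -std::numeric_limits<float>::infinity()};

    for (float lt : long_grid) {
        for (float st : short_grid) {
            for (float sl : stop_grid) {
                for (float pt : target_grid) {
                    const TradeParams candidate{lt, st, sl, pt};
                    const float profit = profit_in_ticks(candidate);
                    if (profit > result.best_profit) {
                        result.best = candidate;
                        result.best_profit = profit;
                    }
                }
            }
        }
    }
    return result;
}
```

Every combination is independent of the others, which is what makes this kind of search a natural fit for one-thread-per-combination evaluation on a GPU.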
This is a CPU and GPU implementation of a simple pattern classification application that takes 5-minute EURUSD data as input.
NOTE: This implementation only works on GPUs with compute capability 3.0 or higher (atomicAdd is used on float).
Both the GPU and CPU versions go through the entire data set, examine the last 'periods_back' time intervals (including the most recent 5-minute period), and classify that window into one of 2^12 = 4096 possible pattern types (states).
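To make the 2^12 figure concrete, here is a minimal sketch of packing 12 yes/no inputs into a single state index in the range 0 to 4095. It assumes the inputs are binary, which is what 2^12 states suggests; which 12 inputs the project actually derives from the price/volatility window is not shown here, and the names `encode_state` and `bit_is_set` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int kNumInputs = 12;               // inputs describing one state
constexpr int kNumStates = 1 << kNumInputs;  // 2^12 = 4096 possible states
static_assert(kNumStates == 4096, "12 binary inputs give 4096 states");

// Packs 12 boolean inputs into one integer: input i becomes bit i.
std::uint32_t encode_state(const std::array<bool, kNumInputs>& inputs) {
    std::uint32_t state = 0;
    for (int i = 0; i < kNumInputs; ++i) {
        if (inputs[i]) {
            state |= (1u << i);
        }
    }
    return state;
}

// Reads input i back out of an encoded state.
bool bit_is_set(std::uint32_t state, int i) {
    assert(i >= 0 && i < kNumInputs);
    return ((state >> i) & 1u) != 0u;
}
```

Because the state is a dense integer, it can be used directly as an index into per-state arrays.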
At that point the code not only caches the classification for the current pattern, but also looks forward 'periods forward' steps. Based on what happened over those next periods, it calculates the net average price change associated with that particular price/volatility pattern and stores that information in memory.
So essentially the code 'trains' on the past data; the cached information can then be used to evaluate the current state and, based on previous data, predict the expected price change for the next 4-interval period.
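The training pass described above can be sketched on the CPU as a sum and a count per state, with the average computed on lookup. This sketch assumes the net change is simply the close price 'periods forward' steps ahead minus the current close; the project's actual definition may differ, and `StateStats`, `fill_stats`, and `expected_change` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-state accumulators (hypothetical layout).
struct StateStats {
    std::vector<float> sum_change;  // summed forward price change per state
    std::vector<int> count;         // number of times each state occurred

    explicit StateStats(std::size_t num_states)
        : sum_change(num_states, 0.0f), count(num_states, 0) {}
};

// states[t] is the state index of period t, close[t] its closing price.
// Assumption: forward change = close[t + periods_forward] - close[t].
void fill_stats(const std::vector<int>& states,
                const std::vector<float>& close,
                int periods_forward,
                StateStats& stats) {
    assert(states.size() == close.size());
    assert(periods_forward > 0);
    const std::size_t fwd = static_cast<std::size_t>(periods_forward);
    for (std::size_t t = 0; t + fwd < close.size(); ++t) {
        assert(states[t] >= 0);
        const std::size_t s = static_cast<std::size_t>(states[t]);
        assert(s < stats.count.size());
        stats.sum_change[s] += close[t + fwd] - close[t];
        stats.count[s] += 1;
    }
}

// Average forward change seen after `state`; 0 if the state never occurred.
float expected_change(const StateStats& stats, int state) {
    assert(state >= 0);
    const std::size_t s = static_cast<std::size_t>(state);
    assert(s < stats.count.size());
    if (stats.count[s] == 0) {
        return 0.0f;
    }
    return stats.sum_change[s] / static_cast<float>(stats.count[s]);
}
```

In a GPU version where many threads update the same state concurrently, the two `+=` updates are where atomic adds would be needed, and the order of those adds is not fixed from run to run.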
This version demonstrates the fast and easy use of atomic functions, which have been significantly improved in the newer Nvidia Kepler class GPUs. It also demonstrates the increased speed of global memory access, which has also been improved in this new generation of Nvidia GPUs.
Fill/Scale Steps Only
Number of time periods (5 min) | Periods back | Periods forward | CPU time | GPU time | Speedup |
---|---|---|---|---|---|
230,422 | 9 | 4 | 38 ms | 1 ms | 38x |
Fill/Scale With 4-Parameter Optimization
Number of time periods (5 min) | Periods back | Periods forward | CPU time | GPU time | Speedup |
---|---|---|---|---|---|
230,422 | 9 | 4 | 166,322 ms | 2,890 ms | 57x |
Although this task is not ideal for the GPU, the GPU version still fully processes and fills in all of the 5-minute data for a 4-year period in less than 1 ms, compared to about 38 ms for the equivalent CPU version.
Since HFT is very dependent on the speed of the application, a more robust version of this prototype CUDA kernel could be used to output an expected future price change based on the most current state and the history associated with that state.
In this version, 12 inputs are used to describe the state; the application could be adjusted to use more custom inputs and to determine which are the most predictive.
The code displays the most predictive inputs for bullish and bearish price changes (based on past data) and also estimates the expected price change for each state.
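One possible way to surface the most bullish and most bearish patterns from per-state averages is sketched below. It ranks whole states rather than individual inputs, and the `min_count` filter (skipping states seen too rarely to trust) is an assumption added for illustration; `Extremes` and `find_extremes` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Extremes {
    int most_bullish;  // state with the highest average change, or -1
    int most_bearish;  // state with the lowest average change, or -1
};

// avg_change[s] is the average forward price change for state s,
// count[s] is how many times state s occurred in the training data.
Extremes find_extremes(const std::vector<float>& avg_change,
                       const std::vector<int>& count,
                       int min_count) {
    assert(avg_change.size() == count.size());
    Extremes e{-1, -1};
    for (std::size_t s = 0; s < avg_change.size(); ++s) {
        if (count[s] < min_count) {
            continue;  // assumed filter: too few samples to trust
        }
        if (e.most_bullish < 0 ||
            avg_change[s] > avg_change[static_cast<std::size_t>(e.most_bullish)]) {
            e.most_bullish = static_cast<int>(s);
        }
        if (e.most_bearish < 0 ||
            avg_change[s] < avg_change[static_cast<std::size_t>(e.most_bearish)]) {
            e.most_bearish = static_cast<int>(s);
        }
    }
    return e;
}
```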
Project hardware/software configuration:
Language: C++ using CUDA and the CUDA 5.0 SDK. Memory management is C-style.
CPU used: Intel Core i7-3770, 3.5 GHz, 3.9 GHz target
GPU used: Nvidia Tesla K20 5GB
Motherboard: MAXIMUS V GENE, PCIe 3.0