A deep learning model that analyzes forex and binary market conditions and forecasts price action.


PipsProphet -- Predicting price action with Deep Reinforcement Learning

The forex and synthetic indices markets see large price swings from high to low and low to high. Fundamental and technical analysis both attempt to determine the true state of price at each point in a time frame, and this has led many financial experts to look beyond the market itself to offer financial advice to traders across the world.

Many will argue that INDICATORS such as Fibonacci Retracement, Moving Average, Stochastic Oscillator, Relative Strength Index, Bollinger Bands and many more, which are essentially built on mathematical functions, are better suited to understanding price. Another group of financial experts prefers to study the price chart over a long period across the Higher Time Frame (HTF) and Lower Time Frame (LTF), an approach they call PRICE ACTION.

This project merges indicator-based and price-action analysis with deep learning algorithms to study price and help forecast it over both short and long horizons.
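To make the indicator side concrete, the sketch below computes two of the indicators named above, a simple moving average and the Relative Strength Index, from a pandas OHLCV frame; the 'close' column name and the window lengths are assumptions for illustration, not fixed by this project.

    import pandas as pd

    def add_indicator_features(df, sma_window=20, rsi_window=14):
        """Append SMA and RSI columns to an OHLCV DataFrame that has a 'close' column."""
        out = df.copy()
        # simple moving average of the closing price
        out['sma'] = out['close'].rolling(sma_window).mean()
        # Relative Strength Index: average gains relative to average losses
        delta = out['close'].diff()
        gain = delta.clip(lower=0).rolling(rsi_window).mean()
        loss = (-delta.clip(upper=0)).rolling(rsi_window).mean()
        out['rsi'] = 100 - 100 / (1 + gain / loss)
        return out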

Model Architecture

Our architecture is a feedforward neural network trained by backpropagation, acting as a Q-network. The network consists of 3 hidden layers of 20 ReLU neurons each, followed by an output layer of 3 linear neurons, and is trained inside a simulated financial market environment.

The size of the hidden layers was set experimentally, while the three linear output neurons are inherent to our system design: each represents the Q-value of a given action.
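For reference, here is a minimal Keras sketch of the feedforward Q-network described above, with 3 hidden layers of 20 ReLU neurons and 3 linear output neurons; the input size state_dim is a placeholder for the length of the state vector St.

    from keras.models import Sequential
    from keras.layers import Dense

    def build_q_network(state_dim, nb_actions=3):
        """Feedforward Q-network: 3 hidden ReLU layers of 20 units, one linear Q-value per action."""
        model = Sequential()
        model.add(Dense(20, activation='relu', input_shape=(state_dim,)))
        model.add(Dense(20, activation='relu'))
        model.add(Dense(20, activation='relu'))
        model.add(Dense(nb_actions, activation='linear'))
        return model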

Our network interacts with a simulated market environment in discrete steps t = 0, 1, 2, ..., receiving a state vector St as input at each step. After a forward pass, each of the three linear neurons outputs the Q-network's current estimate of the action value Q(St, a; Wk) for one of the three possible actions a, where Wk is the set of network weights after k updates.

The estimates Q(St, a; Wk) are fed to an ε-greedy action-selection method, which picks the action for step t either greedily, At = argmax_a Q(St, a; Wk), with probability 1 − ε, or uniformly at random with probability ε.
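The ε-greedy rule itself is only a few lines; in this sketch q_values stands for the three estimates Q(St, a; Wk) and epsilon is the exploration rate, both names being illustrative rather than taken from the project's code.

    import numpy as np

    def epsilon_greedy(q_values, epsilon=0.1, rng=np.random):
        """With probability epsilon pick a random action, otherwise the greedy one."""
        if rng.random() < epsilon:
            return rng.randint(len(q_values))   # explore
        return int(np.argmax(q_values))         # exploit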

The agent is externally constrained to invest position_size of the chosen asset at a time, a value set by the user, leaving it with five actions: open a long position, open a short position, close a long position, close a short position, and do nothing.
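For illustration, these five actions could be written as a simple enumeration (the names are placeholders, not the project's identifiers):

    from enum import IntEnum

    class Action(IntEnum):
        OPEN_LONG = 0
        OPEN_SHORT = 1
        CLOSE_LONG = 2
        CLOSE_SHORT = 3
        DO_NOTHING = 4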

The selected action At is then passed to the simulated market environment, whose role is to provide an accurate simulation of the foreign exchange market and to coordinate the flow of information reaching the system so that it follows the reinforcement learning paradigm.

Each state produced by the simulated environment includes the following information (a sketch of such a state vector follows the list):

  • Type of the current open position;
  • Value of any open position at the simulated market's current prices Bidi and Aski, where i is an index over the entries in the dataset used by the market;
  • Current size of the trading account;
  • Feature vector Fi, built from the market data entries by a preprocessing stage inspired by the technical analysis approach.
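Putting the pieces above together, one step's state can be assembled as a flat vector, as sketched below; the field names and their ordering are assumptions for illustration, not the project's exact layout.

    import numpy as np

    def build_state(position_type, unrealized_value, account_size, feature_vector):
        """Concatenate the items listed above into a single state vector S_t (illustrative)."""
        return np.concatenate((
            [position_type],       # e.g. 0 = none, 1 = long, -1 = short
            [unrealized_value],    # value of the open position at the current Bid_i / Ask_i
            [account_size],        # current size of the trading account
            np.asarray(feature_vector, dtype=float),   # preprocessed feature vector F_i
        ))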

As for the reward used in the network's backpropagation updates, each action is rewarded as follows (a sketch of the scheme appears after the list):

  • Opening a position is rewarded by the unrealized profit it creates;
  • Keeping a position open is rewarded by the fluctuation of the position's unrealized profit;
  • Closing a position is rewarded with the realized profit;
  • Doing nothing receives zero reward.
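This scheme can be sketched as a simple reward function; the action labels and the unrealized-profit bookkeeping below are illustrative names, not the project's actual implementation.

    def reward_for(action, prev_unrealized, curr_unrealized, realized_profit=0.0):
        """Reward signal following the scheme described above (illustrative)."""
        if action in ('open_long', 'open_short'):
            return curr_unrealized                      # unrealized profit the new position creates
        if action == 'hold':
            return curr_unrealized - prev_unrealized    # fluctuation of the unrealized profit
        if action in ('close_long', 'close_short'):
            return realized_profit                      # profit attained on closing
        return 0.0                                      # doing nothing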


Initialize Environment and Agent

    # imports (numpy, the Keras optimizer, and the keras-rl DQN components)
    import numpy as np
    from keras.optimizers import Adam
    from rl.agents.dqn import DQNAgent
    from rl.memory import SequentialMemory
    from rl.policy import EpsGreedyQPolicy
    # OhlcvEnv, NormalizerProcessor and create_model are defined in this project

    ENV_NAME = 'OHLCV-v0'
    TIME_STEP = 30

    # data paths
    TRAIN_PATH = "./data/train"
    TEST_PATH = "./data/test"
    env_train = OhlcvEnv(TIME_STEP, path=TRAIN_PATH)
    env_test = OhlcvEnv(TIME_STEP, path=TEST_PATH)

    # fix the random seeds for reproducibility
    np.random.seed(456)
    env_train.seed(562)

    # create model
    nb_actions = env_train.action_space.n
    model = create_model(shape=env_train.shape, nb_actions=nb_actions)
    print(model.summary())

    # finally, configure and compile our agent
    memory = SequentialMemory(limit=50000, window_length=TIME_STEP)
    policy = EpsGreedyQPolicy()

    # enable the dueling network extension of the DQN
    dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=200,
                   enable_dueling_network=True, dueling_type='avg', target_model_update=1e-2, policy=policy,
                   processor=NormalizerProcessor())
    dqn.compile(Adam(lr=1e-3), metrics=['mae'])
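The NormalizerProcessor passed to the agent above is defined elsewhere in this project and is not shown in the README. A plausible minimal version, built on keras-rl's Processor hook, could look like the sketch below; the zero-mean / unit-variance scaling is an assumption, not the project's actual scheme.

    import numpy as np
    from rl.core import Processor

    class NormalizerProcessor(Processor):
        """Scale each state batch before it reaches the Q-network (illustrative)."""
        def process_state_batch(self, batch):
            batch = np.asarray(batch, dtype='float32')
            # normalize the whole batch to zero mean and unit variance
            return (batch - batch.mean()) / (batch.std() + 1e-8)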

Train and Validate

    while True:
        # train on the training environment
        dqn.fit(env_train, nb_steps=5500, nb_max_episode_steps=10000, visualize=False, verbose=2)
        try:
            # validate on the held-out test environment
            info = dqn.test(env_test, nb_episodes=1, visualize=False)
            n_long = info['n_trades']['long']
            n_short = info['n_trades']['short']
            total_reward = info['total_reward']
            portfolio = int(info['portfolio'])
            # dump the validation summary and save the weights, tagged with the results
            tag = 'duel_dqn_{0}_weights_{1}LS_{2}_{3}_{4}'.format(ENV_NAME, portfolio, n_long, n_short, total_reward)
            np.array([info]).dump('./info/{}.info'.format(tag))
            dqn.save_weights('./model/{}.h5f'.format(tag), overwrite=True)
        except KeyboardInterrupt:
            # allow interrupting a validation run without stopping the training loop
            continue
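Once a checkpoint has been written by the loop above, the agent can be restored and evaluated again on the test environment. The weight filename below is a placeholder for one produced by that loop.

    # restore a saved checkpoint (filename is a placeholder) and run one evaluation episode
    dqn.load_weights('./model/duel_dqn_OHLCV-v0_weights_example.h5f')
    dqn.test(env_test, nb_episodes=1, visualize=False)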

Configure Neural Network

    from keras.models import Sequential
    from keras.layers import LSTM, Dense, Activation

    def create_model(shape, nb_actions):
        model = Sequential()
        # two stacked LSTM layers read the TIME_STEP-long window of features
        model.add(LSTM(64, input_shape=shape, return_sequences=True))
        model.add(LSTM(64))
        # 'adam' is an optimizer, not an activation; use a ReLU activation here instead
        model.add(Dense(32))
        model.add(Activation('relu'))
        # one linear output per action: the estimated Q-values
        model.add(Dense(nb_actions, activation='linear'))
        return model

Reference

(Guide - RL) Reinforcement Q-Learning from Scratch in Python with OpenAI Gym https://www.learndatasci.com/tutorials/reinforcement-q-learning-scratch-python-openai-gym/

Train a Deep Q Network with TF-Agents https://www.tensorflow.org/agents/tutorials/1_dqn_tutorial

Deep Reinforcement Learning for Automated Stock Trading https://towardsdatascience.com/deep-reinforcement-learning-for-automated-stock-trading-f1dad0126a02

Trading Environment(OpenAI Gym) + DDQN (Keras-RL) https://github.com/miroblog/deep_rl_trader

Reinforcement Learning tutorial series by Mofan Python (莫烦Python) https://www.bilibili.com/video/BV13W411Y75P?from=search&seid=13844167983297755236

Pull requests

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.