Mortal-Policy

This repository is a branch of the original Mortal repository, transitioning from value-based methods to policy-based methods.

Overview

Initially developed in 2022 on top of Mortal V2, this branch was migrated to Mortal V4 in 2024.
It features:

  • More stable performance optimization process
  • Enhanced final performance

Note:
The performance results are based on a comparison with the baseline model. The baseline used for testing has been uploaded to RiichiLab (mjai.app) and has maintained a stable rank across multiple evaluation batches.

Installation

Installation is the same as for the original repository; see the original Mortal documentation.
Torch requirement: torch 2.5.1+cu124 (install via pip).
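To confirm that the environment matches the Torch requirement above, a small check along the following lines may help. It is only a sketch: the helper names `torch_status` and `installed_torch_version` are hypothetical and not part of this repository, and the status labels are illustrative. The pinned build is usually obtained from PyTorch's CUDA 12.4 wheel index, typically with `pip install torch==2.5.1 --index-url https://download.pytorch.org/whl/cu124`.

```python
from importlib import metadata

REQUIRED_TORCH = "2.5.1+cu124"  # version string stated in this README


def torch_status(installed, required=REQUIRED_TORCH):
    """Compare an installed version string against the required one."""
    if installed is None:
        return "missing"
    if installed == required:
        return "ok"
    # Same release number, but a different (or absent) local build tag,
    # e.g. "2.5.1+cpu" or plain "2.5.1" instead of "2.5.1+cu124".
    if installed.split("+")[0] == required.split("+")[0]:
        return "build-differs"
    return "version-differs"


def installed_torch_version():
    """Return the installed torch version string, or None if torch is absent."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(torch_status(installed_torch_version()))
```

The comparison reads package metadata only, so it does not import torch or touch the GPU. A "build-differs" result is not necessarily a problem, since some wheels omit the local build tag.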

Run

Mortal-Policy adopts an offline-to-online training approach:

  1. Data Preparation
    Collect samples in mjai format.

  2. Configuration
    Rename config.example.toml to config.toml and set hyperparameters.

  3. Training Stages

    • Offline Phase1 (Advantage Weighted Regression):
      Run train_offline_phase1.py

    • Offline Phase2 (Behavior Proximal Policy Optimization):
      This phase is optional; the code is coming soon.

    • Online Phase (Policy Gradient with Importance Sampling and PPO-style Clipping):
      Run train_online.py

    While online-only training is possible, it is not recommended.
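For readers unfamiliar with the objectives named in the training stages, the sketch below shows the textbook forms of the advantage-weighted regression loss (Offline Phase1) and the clipped importance-sampling surrogate (Online Phase) in plain Python. It is an illustration under assumptions, not the code in `train_offline_phase1.py` or `train_online.py`: the function names, the temperature `beta`, the weight cap `max_weight`, and the clip range `eps` are all hypothetical. Offline Phase2 is omitted because its code is not yet available.

```python
import math


def awr_loss(log_probs, advantages, beta=1.0, max_weight=20.0):
    """Advantage-weighted regression: weighted negative log-likelihood.

    Each logged action is weighted by exp(advantage / beta), capped at
    max_weight for numerical stability. beta and max_weight are
    illustrative values, not the repository's hyperparameters.
    """
    weights = [min(math.exp(a / beta), max_weight) for a in advantages]
    return -sum(w * lp for w, lp in zip(weights, log_probs)) / len(log_probs)


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, eps=0.2):
    """Clipped importance-sampling policy-gradient surrogate (PPO-style)."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # importance weight pi_new / pi_old
        clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
        # Pessimistic bound: keep the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)
```

In the first function, actions with higher estimated advantage pull the policy toward them more strongly, which is what lets the offline phase improve on the logged behavior rather than merely imitate it. In the second, clipping the ratio limits how far a single update can move the policy away from the one that generated the samples.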

Weights & Configuration

This branch follows the same policy as the original Mortal repository. For details, see this post.
The weights, hyperparameters, and some online training features were removed from this branch when it was open-sourced.

License

Code

AGPL-3.0-or-later

Copyright (C) 2021-2022 Equim
Copyright (C) 2025 Nitasurin

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Assets

CC BY-SA 4.0