From 206503a738f552490c56a2441f9b8a16f8bd408c Mon Sep 17 00:00:00 2001
From: Daniel Bethell <55928249+team-daniel@users.noreply.github.com>
Date: Mon, 20 May 2024 10:28:08 +0100
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index f4e99ba..918c7c4 100644
--- a/README.md
+++ b/README.md
@@ -7,7 +7,7 @@ Empowering safe exploration of reinforcement learning (RL) agents during trainin
-Fig 1. A high-level overview of ADVICE including training, inference, and Adaptive ADVICE.
+Fig 1. A high-level overview of ADVICE including construction, execution, and adaptation.
 ### ADVICE
 ADVICE starts by collecting a dataset of state-action pairs, each classified as safe or unsafe based on the outcome it leads to in the training environment. This dataset is then used to train the contrastive autoencoder. Training leverages a contrastive loss function that compares similar (safe) and dissimilar (unsafe) pairs, improving the model's ability to identify and categorize new observations quickly. To classify unseen data, a nearest neighbours model is fitted on the final embedding space.
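
The pipeline described above can be sketched as follows. This is a minimal illustration, not ADVICE's actual implementation: the linear `encoder`, the contrastive margin, and the synthetic data are all assumptions made for the example; only the overall flow (embed pairs, contrastive loss over safe/unsafe pairs, nearest neighbours on the embedding space) comes from the text.

```python
# Minimal sketch: embed state-action pairs, score them with a standard
# pairwise contrastive loss, then fit a nearest-neighbours classifier
# on the embedding space. All names and shapes are illustrative.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def encoder(x, W):
    """Toy linear 'encoder' standing in for the contrastive autoencoder."""
    return x @ W

def contrastive_loss(z1, z2, same_label, margin=1.0):
    """Pull same-class embeddings together; push different-class
    embeddings at least `margin` apart."""
    d = np.linalg.norm(z1 - z2, axis=1)
    return np.where(same_label, d**2, np.maximum(0.0, margin - d) ** 2).mean()

# Synthetic state-action pairs: safe (label 0) and unsafe (label 1) modes.
safe = rng.normal(0.0, 0.3, size=(50, 4))
unsafe = rng.normal(2.0, 0.3, size=(50, 4))
X = np.vstack([safe, unsafe])
y = np.array([0] * 50 + [1] * 50)

W = rng.normal(size=(4, 2))  # untrained projection, for illustration only
Z = encoder(X, W)

# Nearest neighbours model fitted on the embedding space, then used to
# classify a previously unseen observation.
knn = KNeighborsClassifier(n_neighbors=5).fit(Z, y)
new_obs = encoder(rng.normal(2.0, 0.3, size=(1, 4)), W)
print(knn.predict(new_obs))  # predicted safety label for the new observation
```

In the real method the encoder weights would be optimised to minimise the contrastive loss over sampled safe/unsafe pairs before the nearest neighbours model is fitted; the frozen random projection here just keeps the sketch short.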