From 16105b5a628a50f085e37ba3b29d5984e68e2107 Mon Sep 17 00:00:00 2001
From: Gregor Betz <3662782+ggbetz@users.noreply.github.com>
Date: Fri, 24 Nov 2023 10:35:24 +0100
Subject: [PATCH] add 2023-10-27

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 8ea2109..a021b35 100644
--- a/README.md
+++ b/README.md
@@ -112,6 +112,7 @@ _Methods for analysing LLM deliberation and assessing reasoning quality._
 
 _Things that don't work, or are poorly understood._
 
+- πŸŽ“ LLMs may produce "encoded reasoning" that is unintelligible to humans, which may nullify any XAI gains from deliberative prompting. "Preventing Language Models From Hiding Their Reasoning." 2023-10-27. [[>paper](https://arxiv.org/abs/2310.18512)]
 - πŸŽ“ LLMs judge and decide in function of available arguments (reason-responsiveness), but are more strongly influenced by fallacious and deceptive reasons as compared to sound ones. "How susceptible are LLMs to Logical Fallacies?" 2023-08-18. [[>paper](https://arxiv.org/abs/2308.09853)]
 - πŸŽ“ Incorrect reasoning improves answer accuracy (nearly) as much as correct one. "Invalid Logic, Equivalent Gains: The Bizarreness of Reasoning in Language Model Prompting." 2023-07-20. [[>paper](https://arxiv.org/abs/2307.10573)]
 - πŸŽ“ Zeroshot CoT reasoning in sensitive domains increases a LLM's likelihood to produce harmful or undesirable output. "On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning." 2023-06-23. [[>paper](https://arxiv.org/abs/2212.08061)]