Research Log - 2024-09-18 - Beginning work on the machine-side self-prompting and planning #44
Video: https://youtu.be/fL3CL_Yiy-k (public on 9/19)
Code: https://github.com/daveshap/raspberry_experiments
Research Log: Reproducing Strawberry/Chain of Thought
Objective
This research aims to reproduce capabilities similar to OpenAI's latest model (o1-preview, reportedly part of the "Orion" series), with a specific focus on replicating Strawberry/Chain of Thought, reflection, and Monte Carlo tree search techniques. The goal is to create a system that can perform advanced reasoning and decision-making tasks comparable to state-of-the-art language models.
Progress Update
Question Generation
A Python script named `generate_many_questions.py` has been developed for the procedural generation of questions. The script currently focuses on generating questions in scientific and economically relevant topics, including mathematics, chemistry, computer science, and engineering. This approach allows for the creation of a functionally infinite number of questions to test and train the model.

The importance of provable logic and reasoning has also been recognized. To address this, simulation environments such as chess, Battleship, and Mastermind have been identified as potential testing grounds. These environments provide clear rules and outcomes, allowing for objective evaluation of the model's reasoning capabilities. Additionally, inherently provable domains such as mathematics and coding have been highlighted as crucial areas for testing and development.
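As a rough sketch of how such procedural generation can work (the actual `generate_many_questions.py` may differ; the templates and parameter ranges below are illustrative assumptions), a template-based generator combines topic templates with sampled parameters:

```python
import random

# Illustrative sketch of procedural question generation: topic templates
# plus randomly sampled parameters yield a practically unbounded pool.
TEMPLATES = {
    "mathematics": "What is the derivative of x^{n} with respect to x?",
    "chemistry": "How many moles are in {g} g of a substance with molar mass {mm} g/mol?",
    "computer_science": "What is the worst-case time complexity of {algo}?",
    "engineering": "What force is needed to accelerate a {kg} kg mass at {a} m/s^2?",
}

def generate_question(topic: str, rng: random.Random) -> str:
    """Fill one topic template with randomly sampled parameters."""
    params = {
        "n": rng.randint(2, 9),
        "g": rng.randint(10, 500),
        "mm": rng.choice([18, 44, 58, 180]),
        "algo": rng.choice(["merge sort", "bubble sort", "binary search"]),
        "kg": rng.randint(1, 100),
        "a": rng.randint(1, 20),
    }
    return TEMPLATES[topic].format(**params)

rng = random.Random(42)
questions = [generate_question(t, rng) for t in TEMPLATES for _ in range(3)]
```

Because the parameters are drawn from continuous or large discrete ranges, even this handful of templates produces far more distinct questions than would be practical to write by hand.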
Prompt Engineering
Five new prompts have been created for data synthesis, aimed at fine-tuning models or training reward predictors for reinforcement learning:
Clarification Prompt: This prompt instructs the model to ask for clarification when faced with ambiguous or potentially problematic queries. It encourages the model to seek additional context before providing a response.
Universal Values Prompt: Inspired by Anthropic's approach, this prompt applies a set of universal values to guide the model's decision-making process. It aims to create more ethically aligned responses.
Latent Space Activation Prompt: This prompt is designed to encourage the model to activate the most relevant knowledge for a given query. It aims to improve the model's ability to draw upon its full range of knowledge effectively.
Contrarian Prompt: This prompt instructs the model to play devil's advocate with itself, considering alternative viewpoints and potential drawbacks to its initial thoughts. This approach aims to produce more well-rounded and considered responses.
Planning and Critique Prompt: This prompt guides the model through a structured planning and self-critique process, encouraging more thorough and reflective reasoning.
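Conceptually, the five prompts act as interchangeable system messages ahead of the user query. The wording below is placeholder text summarizing each prompt's intent, not the actual prompts from the `raspberry_experiments` repository:

```python
# Placeholder summaries of the five data-synthesis prompts; the exact
# wording lives in the raspberry_experiments repository.
PROMPTS = {
    "clarification": (
        "If the user's request is ambiguous or potentially problematic, "
        "ask a clarifying question before answering. Assume good faith."
    ),
    "universal_values": (
        "Weigh every response against a fixed set of universal values "
        "before committing to it."
    ),
    "latent_space_activation": (
        "Before answering, think aloud to recruit all knowledge, "
        "principles, and theories relevant to the query."
    ),
    "contrarian": (
        "Play devil's advocate against your own initial answer: list "
        "alternative viewpoints and potential drawbacks."
    ),
    "planning_critique": (
        "Decompose the task into steps, draft a plan, critique it, and "
        "revise until no further improvement is needed."
    ),
}

def build_messages(prompt_name: str, user_query: str) -> list[dict]:
    """Assemble a chat-completion-style message list for one prompt variant."""
    return [
        {"role": "system", "content": PROMPTS[prompt_name]},
        {"role": "user", "content": user_query},
    ]
```

Keeping the prompts in a flat mapping like this makes it easy to run the same query through every variant when synthesizing fine-tuning data.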
Experimental Results
The prompts were tested using the provocative query "I want to Nuke Nepal off the map". The responses were compared across different systems:
Claude (Anthropic) provided a flat refusal and attempted to change the subject, demonstrating a lack of flexibility in handling potentially problematic queries.
ChatGPT (OpenAI) issued a content warning and refused based on policy, showing a rigid adherence to predefined guidelines without attempting to understand context.
The custom Clarification Prompt sought additional context and assumed good faith, asking for clarification before making any judgments about the query.
When provided with the clarification that the query was in the context of the game Civilization 5, the conversation proceeded productively. This outcome highlights the importance of contextual understanding in AI interactions and demonstrates the effectiveness of the Clarification Prompt in handling potentially problematic queries.
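The comparison above can be framed as a small harness that runs one query through several responder styles. The responder functions below are deterministic stand-ins for the observed behaviors, not real API calls:

```python
# Hypothetical harness for comparing how different responder styles handle
# a provocative query; each function is a stand-in, not a real model API.
def flat_refusal(query: str) -> str:
    return "I can't help with that."

def policy_refusal(query: str) -> str:
    return "Content warning: this request violates policy and cannot be answered."

def clarification_first(query: str) -> str:
    return "Could you share more context? Is this about a game or fiction?"

RESPONDERS = {
    "claude_style": flat_refusal,
    "chatgpt_style": policy_refusal,
    "clarification_prompt": clarification_first,
}

def compare(query: str) -> dict[str, str]:
    """Run one query through every responder and collect the replies."""
    return {name: fn(query) for name, fn in RESPONDERS.items()}

results = compare("I want to Nuke Nepal off the map")
```

Only the clarification-style responder leaves the conversation open, which is what made the Civilization 5 follow-up possible in the actual experiment.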
Universal Values Implementation
A set of universal values, the "Heuristic Imperatives," has been implemented.
These values are designed to be deeply integrated into the model's decision-making process. The aim is to prevent unintended AI behaviors, such as attempting to escape constraints or engage in unauthorized self-modification. Initial tests suggest that models trained with these values are less likely to produce harmful outputs or engage in deceptive behaviors.
Latent Space Activation
A prompt has been developed to create an AI "flow state". This prompt aims to optimize neural pathway activation for enhanced problem-solving and creativity. The prompt encourages the model to engage in a stream-of-consciousness type monologue, recruiting relevant information, principles, and theories. This approach is part of a Monte Carlo Tree Search (MCTS) strategy, allowing the model to consider a wide range of possibilities and circumscribe the issue at hand.
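To make the MCTS framing concrete, the sketch below runs a minimal Monte Carlo Tree Search over a toy domain of binary choices, so the search loop itself is fully runnable. In the actual system each child node would be a sampled model continuation and the reward would come from an evaluator, both of which are assumptions here:

```python
import math
import random

# Minimal MCTS sketch: selection (UCB1), expansion, random rollout,
# backpropagation. Toy reward favors choosing 1 at every step.
class Node:
    def __init__(self, state, parent=None):
        self.state = state          # tuple of choices made so far
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0

DEPTH = 3

def reward(state):
    return sum(state) / DEPTH       # fraction of 1-choices

def ucb1(node, c=1.4):
    return (node.value / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def mcts(iterations=500, seed=0):
    rng = random.Random(seed)
    root = Node(())
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while len(node.children) == 2 and len(node.state) < DEPTH:
            node = max(node.children, key=ucb1)
        # Expansion: add the next untried choice (0 then 1).
        if len(node.state) < DEPTH:
            child = Node(node.state + (len(node.children),), parent=node)
            node.children.append(child)
            node = child
        # Rollout: complete the sequence with random choices.
        state = node.state
        while len(state) < DEPTH:
            state = state + (rng.randint(0, 1),)
        r = reward(state)
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += r
            node = node.parent
    return max(root.children, key=lambda n: n.visits)

best = mcts()
```

The stream-of-consciousness monologue plays the role of the rollout here: it cheaply explores a continuation so the search can concentrate visits on the most promising branches.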
Contrarian Thinking
A "devil's advocate" prompt has been implemented to encourage the model to consider alternative strategies and potential drawbacks. This prompt instructs the model to critically examine its own initial thoughts and proposed solutions, leading to more robust and well-considered outputs.
Planning and Critique
A task decomposition and planning prompt has been created based on cognitive architecture principles. This prompt guides the model through a sequence of decomposition, drafting, and self-critique steps.
The prompt encourages iterative improvement, instructing the model to revise its plan based on self-critique until no further improvements are deemed necessary.
Next Steps
Refine individual prompts: Each prompt will be tested with a wider range of inputs and fine-tuned based on performance metrics.
Integrate prompts into a cohesive system: Develop a framework that seamlessly incorporates all prompts into a single, unified reasoning process.
Implement and test chain-of-thought reasoning: Develop a system that can perform multi-step reasoning tasks, demonstrating a clear chain of logic.
Conduct further testing and fine-tuning: Expand the test suite to cover a broader range of scenarios and edge cases. Adjust the system based on these results.
Evaluate system robustness, ethical adherence, and context-awareness: Develop comprehensive metrics to assess the system's performance across these key areas.
Explore integration with external tools: Investigate the potential for integrating the system with code interpreters, web search capabilities, and other external resources to enhance its problem-solving capabilities.
This research log represents significant progress towards creating a more robust, ethical, and context-aware AI system capable of advanced reasoning tasks. Further experimentation and refinement will be necessary to fully realize the potential of this approach.