
Assertions are very inconsistent #87

Open
Agustin-Perezz opened this issue Feb 20, 2025 · 5 comments

Comments

@Agustin-Perezz

I'm testing Detox Copilot, but it behaves inconsistently even when the task is very simple.

Here is my handler:

import OpenAI from 'openai';
import type { ChatCompletionMessageParam } from 'openai/resources/chat/completions';

class OpenAIPromptHandler {
	private readonly openai = new OpenAI({
		apiKey: process.env.EXPO_PUBLIC_OPENAI_API_KEY,
	});

	// GPT-3.5-turbo has a ~4K-token context window; note this limit is measured in characters, not tokens
	private readonly MAX_PROMPT_LENGTH = 4000;
	private readonly MAX_TOKENS = 256; // Reasonable limit for response tokens

	private readonly SYSTEM_PROMPT = `You are a Detox E2E test assistant for React Native. You MUST ONLY generate Detox commands.

CORRECT Detox patterns to use:
expect(element(by.text("Welcome"))).toBeVisible()
expect(element(by.id("button"))).toExist()
expect(element(by.id("input"))).toHaveText("text")
await element(by.text("Submit")).tap()

INCORRECT patterns (DO NOT USE):
❌ onView(withText("text"))              // This is Espresso
❌ cy.get("[data-test=button]")          // This is Cypress
❌ await page.locator("text").click()    // This is Playwright
❌ import statements or setup code
❌ comments or explanations

Rules:
1. Return ONLY the Detox command
2. No imports, no comments, no setup
3. No code blocks or markdown
4. Keep exact text/labels from the prompt

Example inputs and outputs:
Input: "Verify that the Welcome message is visible"
Output: expect(element(by.text("Welcome"))).toBeVisible()

Input: "Check if Submit button exists"
Output: expect(element(by.text("Submit"))).toExist()

Input: 'Verify that the "Hello!" message is displayed'
Output: expect(element(by.text("Hello!"))).toBeVisible()`;

	async runPrompt(prompt: string): Promise<string> {
		const truncatedPrompt =
			prompt.length > this.MAX_PROMPT_LENGTH
				? prompt.substring(0, this.MAX_PROMPT_LENGTH) + '...(truncated)'
				: prompt;

		const messages: ChatCompletionMessageParam[] = [
			{
				role: 'system',
				content: this.SYSTEM_PROMPT,
			},
			{ role: 'user', content: truncatedPrompt },
		];

		try {
			const response = await this.openai.chat.completions.create({
				model: 'gpt-3.5-turbo',
				messages: messages,
				max_tokens: this.MAX_TOKENS,
				temperature: 0.1, 
			});

			return (response.choices[0].message.content ?? '').trim();
		} catch (error: any) {
			console.error('OpenAI API Error:', error);
			throw new Error(
				`Failed to generate test commands: ${error?.message || 'Unknown error'}`,
			);
		}
	}

	isSnapshotImageSupported() {
		return true;
	}
}

export default OpenAIPromptHandler;

The test:

import { copilot, expect } from 'detox';

import OpenAIPromptHandler from './OpenAIPromptHandler';

describe('Home Screen', () => {
	beforeAll(async () => {
		await device.launchApp();
		const promptHandler = new OpenAIPromptHandler();
		copilot.init(promptHandler);
	});

	beforeEach(async () => {
		await device.reloadReactNative();
	});

	it('Should render home view', async () => {
		await copilot.perform('Verify that the "Welcome!" message is displayed');
	});
});

And the result:
(screenshot attached)

@Agustin-Perezz
Author

If I leave the handler implementation as shown in the docs, it throws a too-many-tokens error for the request.
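One likely cause of the token error: `MAX_PROMPT_LENGTH` in the handler above is a character count, while the API limit is measured in tokens. A rough character-based token budget can be sketched like this (a heuristic only, assuming ~4 characters per token for English text; a real tokenizer such as tiktoken would give exact counts):

```typescript
// Rough token budgeting for chat prompts. The OpenAI limit is counted in
// tokens, not characters; ~4 characters per token is a common English-text
// heuristic (an assumption here -- a tokenizer like tiktoken is exact).
const APPROX_CHARS_PER_TOKEN = 4;

function truncateToTokenBudget(prompt: string, maxTokens: number): string {
	const maxChars = maxTokens * APPROX_CHARS_PER_TOKEN;
	return prompt.length <= maxChars
		? prompt
		: prompt.substring(0, maxChars) + '...(truncated)';
}
```

With this, truncating to the model's input budget (context window minus `max_tokens` reserved for the response) keeps the request under the limit regardless of how verbose the prompt gets.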

@asafkorem
Collaborator

Thanks for the report @Agustin-Perezz, did you try other LLMs like Sonnet?

@asafkorem
Collaborator

Also, Detox Copilot uses an older version of Pilot; we'll upgrade its version soon, which might improve your tests.

@asafkorem
Collaborator

@Agustin-Perezz try removing the instructions from the system prompt; Detox Pilot already gives the LLM the necessary context for the supported APIs. It's not clear why it writes Espresso code. It also looks like your issue is with the LLM you're using.
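Following that suggestion, a slimmed-down handler that forwards Pilot's prompt without any custom system prompt might look like the sketch below. The method names (`runPrompt`, `isSnapshotImageSupported`) follow the handler posted earlier in this thread; the `ChatClient` interface is an assumption describing only the one call used, so the real `OpenAI` client satisfies it and a stub can stand in for tests. The model name is a placeholder, not a recommendation:

```typescript
// Minimal shape of the chat client this handler needs. The real `OpenAI`
// client from the `openai` package has this structure; a stub works too.
interface ChatClient {
	chat: {
		completions: {
			create(args: {
				model: string;
				messages: { role: 'user'; content: string }[];
				max_tokens?: number;
			}): Promise<{ choices: { message: { content: string | null } }[] }>;
		};
	};
}

class MinimalPromptHandler {
	constructor(private readonly client: ChatClient) {}

	// Pilot already supplies the Detox API context inside `prompt`,
	// so no extra system prompt is added here.
	async runPrompt(prompt: string): Promise<string> {
		const response = await this.client.chat.completions.create({
			model: 'gpt-4o', // placeholder; any chat-capable model works
			messages: [{ role: 'user', content: prompt }],
		});
		return (response.choices[0].message.content ?? '').trim();
	}

	isSnapshotImageSupported() {
		return false; // no image support in this sketch
	}
}
```

Injecting the client through the constructor keeps the handler trivially testable without an API key, while `new MinimalPromptHandler(new OpenAI({ apiKey }))` wires up the real thing.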

@Agustin-Perezz
Author

Agustin-Perezz commented Mar 5, 2025

Thanks, I will try another LLM
