How to improve the prompt
Strategy
Improve the prompt for Test Generation (see Hub Test Generation). Target: improve the rate of tests which are Right-the-first-time (i.e. that do not need any intervention from the user to make them pass).
Improvements
1. Following the examples
Observation: the generated test does not fully follow the examples (it does not use the renderWithProviders method, does not always follow the style of the example tests...).
Action: update the prompt to push GPT to follow the examples more closely:
- You are a React-Native developer with 20 years of experience.
- Your task is to write a test of a react-native component.
+ You are an experienced React Native developer, and your task is
+ to write a test for a specific component following the exact methods,
+ structure, and conventions shown in the examples provided below.
2. Manually correct recurring errors
Observation: GPT still makes the same errors, e.g. using render instead of
renderWithProviders and destructuring the returned object from
renderWithProviders instead of using the screen object.
Action: add explicit instructions in the prompt to address these issues:
+ Your tests MUST ALWAYS use the provided renderWithProviders method
+ instead of the render method.
+ You MUST also NOT destructure the screen object.
Note: I had to use CAPITALS to force GPT to respect these criteria.
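These two rules can also be checked automatically on GPT's output. The helper below is a hypothetical post-processing guard (it is not part of the pipeline described here) that flags the two recurring errors in a generated test:
import re

def find_recurring_errors(generated_test: str) -> list[str]:
    """Flag the two errors GPT keeps making despite the CAPITALIZED instructions."""
    errors = []
    # `render(...)` called directly instead of the provided `renderWithProviders(...)`
    if re.search(r"\brender\(", generated_test):
        errors.append("uses render instead of renderWithProviders")
    # `const { ... } = renderWithProviders(...)` instead of using the `screen` object
    if re.search(r"const\s*\{[^}]*\}\s*=\s*renderWithProviders\(", generated_test):
        errors.append("destructures the result of renderWithProviders instead of using screen")
    return errors
Such a check could be used to re-prompt GPT or simply to measure how often the instructions are ignored.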
3. Isolate the file to test
Observation: when passing the sub-files imported by the file to test, GPT sometimes returns a test for each of these files. This creates duplicated imports in the generated test file and thus results in broken files.
Action: separate the sub-files from the main file to test in the prompt:
context_files = get_imports(filepath, project_root)
context_files_content = ""
for path in context_files:
    # Prefix each sub-file with its path so GPT knows where it comes from
    with open(path, "r") as file:
        context_files_content += f"// {path}\n"
        context_files_content += file.read()

prompt = f"""
...
Examples of good tests :
{examples_of_tests}
------------
Now the component to test :
// {filepath}
{file_to_test_content}
------------
Finally here are some context files to help you understand the
component to test:
{context_files_content}
"""
Results
All tests are generated with the sub-files as context. The generated tests are all passed through the clean method (see Use sub-files to add more context to the prompt). We do NOT perform any extra human action afterwards.
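The two counts below come straight from Jest. A minimal sketch of how they can be collected, assuming Jest is run on the project containing the generated tests (the report file name and CLI wrapper are illustrative):
import json
import subprocess

# Jest exits with a non-zero code when some tests fail, hence check=False
subprocess.run(
    ["npx", "jest", "--json", "--outputFile=jest-report.json", "--silent"],
    check=False,
)

with open("jest-report.json", "r") as report_file:
    report = json.load(report_file)

total = report["numTotalTests"]    # "Number of tests found by jest"
passed = report["numPassedTests"]  # "Number of passing tests"
print(f"{passed}/{total} passing ({100 * passed / total:.1f} %)")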
|  | Number of tests found by jest | Number of passing tests | Percentage |
| --- | --- | --- | --- |
| BEFORE (previous prompt) | 63 | 5 | 7.9 % |
| NOW (new prompt) | 85 | 28 | 32.9 % |
Analysis
- More tests are found by jest (i.e. we have fewer invalid TypeScript files!)
- More tests are passing
- A lot of tests still do not pass because of incorrect mocks or methods that do not exist
Next steps
- Fine-tuning with BAM examples to make the model learn our way of doing mocks?
  - Which projects have better tests for training? => ask Pierre Z.
  - Which projects can we use for testing? They must be different from the training ones.
- Comparing AI-generated tests and human-written tests
  - Re-using the model for comparing code from the search feature, to evaluate the distance between the two files?
- Loop improvement: ask GPT to correct its own code
  - Problem: we already reach the token limit with the first message; how can we reduce it to be able to send a second message in the same chat? (see the token-counting sketch after this list)
  - Fallback to the ChatGPT web interface?
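To investigate the token-limit problem, a first step could be to measure the size of the current prompt. A sketch using the tiktoken library; the model name is an assumption and should match the model actually called:
import tiktoken

def count_tokens(prompt: str, model: str = "gpt-3.5-turbo") -> int:
    """Rough size of the prompt, to see how much room is left for a follow-up message."""
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(prompt))

# e.g. count_tokens(prompt) with the prompt built in improvement 3
print(count_tokens("Your tests MUST ALWAYS use the provided renderWithProviders method."))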