Why it does not work

After some research, we did not succeed in fixing bugs with AI. Here is why.

Which types of bugs are we trying to fix?

There are many types of bugs, but the ones we are trying to fix are those most common at BAM, i.e. those for which we usually do a QRQC or that require a fix commit. Judging from the content in Notion, there are usually more product bugs than code bugs (ESLint and TypeScript already catch most code bugs).

Why can't we learn from these bugs (QRQC or FixCommits)?

The problem of the context

To understand WHY a bug is a bug, and how it was solved, we always need some context. It could be:

one or more pull requests
the Notion/Trello ticket(s) relevant to the bug
the design (Figma)
the BPMN
the global context of the project and/or of the feature (what were we trying to do, etc.)
some screenshots

To understand a bug, then, we need to:

have access to all of this context
be able to pick out what is relevant from all of this context

An AI could be given access to all of this context, but that is a lot of work: screenshots, diagrams, Figma designs and BPMN have to be transcribed to text, and data has to be retrieved from Notion and other tools. Also, with too much data as input, an AI is not able to pick out what is relevant, so we would need to filter all the data before passing it to the AI.
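The filtering step described above could be sketched as follows. This is only a minimal illustration, assuming every piece of context has already been transcribed to text. The ContextItem type, the selectContext function and the keyword-overlap scoring are all hypothetical; they are not something we built or evaluated.

```typescript
// Hypothetical shape for one piece of context (not an existing BAM tool).
type ContextItem = {
  source: "pull_request" | "ticket" | "figma" | "bpmn" | "project" | "screenshot";
  text: string; // screenshots and diagrams must already be transcribed to text
};

// Lowercased words of at least four letters, used as a crude relevance signal.
function keywords(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z]{4,}/g) ?? []);
}

// Keep the items sharing the most keywords with the bug report,
// skipping any item that would exceed the character budget.
function selectContext(
  bugReport: string,
  items: ContextItem[],
  maxChars: number
): ContextItem[] {
  const wanted = keywords(bugReport);
  const scored = items
    .map((item) => ({
      item,
      score: Array.from(keywords(item.text)).filter((word) => wanted.has(word)).length,
    }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score);

  const selected: ContextItem[] = [];
  let used = 0;
  for (const { item } of scored) {
    if (used + item.text.length > maxChars) continue;
    selected.push(item);
    used += item.text.length;
  }
  return selected;
}
```

A naive score like this one would probably miss relevant context that is worded differently from the bug report, which is part of why the filtering is hard to automate.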

The problem of the types of bugs

When given code, an LLM such as ChatGPT tends to find coding bugs, but never product/integration bugs: for example, it can't understand that a popup should be displayed when the request returns an error. Detecting this second type of bug requires a lot more context first, but also capabilities that AI does not have yet.
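To make the distinction concrete, here is a hypothetical sketch of the popup example. The types and function names are invented for illustration and do not come from a BAM project. The first function type-checks and raises no obvious lint warning, yet it is buggy from a product point of view. Nothing in the code itself says that a popup is expected.

```typescript
// Hypothetical types, for illustration only.
type RequestResult =
  | { status: "success"; data: string }
  | { status: "error"; message: string };

type Screen = { content: string; popup: string | null };

// Valid TypeScript, no obvious code bug. The product bug is that the
// error branch never sets a popup, so the user is not told anything.
function renderBuggy(result: RequestResult): Screen {
  if (result.status === "success") {
    return { content: result.data, popup: null };
  }
  return { content: "", popup: null };
}

// What the product rule expects. That rule lives in the ticket or the
// design, not in the code, so it cannot be inferred from the code alone.
function renderExpected(result: RequestResult): Screen {
  if (result.status === "success") {
    return { content: result.data, popup: null };
  }
  return { content: "", popup: result.message };
}
```

Both functions look equally plausible when read in isolation. Only the missing context (the ticket or the design) tells which one is correct.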

On what should we focus now?

A more pragmatic approach to AI would be to build tools that generate code: AI is especially efficient at creating small pieces of code that are redundant or similar to other pieces of code. The next topic we have chosen is generating test files, since they could add great value to the product and help developers detect product/integration bugs.
