Can AI Help with Repository-Level Code Understanding?

Understanding and maintaining large codebases is a common challenge in software development, one that costs significant time and resources. Addressing it is essential for improving developer productivity and reducing technical debt.

 

What is Code?

Code is a recipe for solving a concrete problem. Given only the code, you can reverse-engineer which problem it solves and how it does so. That reverse engineering lets you formulate user stories describing the problem, and from those user stories AI can generate new code. Is this just theory, or can current technology help build tools that do it?

 

Current State of AI Systems

At DTIT, particularly within AI4Coding, we are thinking about technical debt and how to address it. We start from the premise that current AI systems cannot offer the in-depth contextual understanding needed for effective coding support at the repository level. Users of AI tools for code generation and completion often run into reliability issues when working with larger codebases.

 

The Role of RAG

Our research indicates that RAG (retrieval-augmented generation) is beneficial but has limits. Even agentic approaches built on Chain-of-Thought or Tree-of-Thoughts prompting are insufficient and can be costly.
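To make the retrieval step of RAG concrete, here is a minimal sketch. Everything in it is an assumption made for illustration: the three file contents are hypothetical, and a simple word-overlap score stands in for the embedding similarity a real system would use. It is not how our product retrieves code.

```python
import re

# Hypothetical repository content, one "chunk" per file.
CHUNKS = {
    "billing.py": "def compute_total(items): return sum(item.price for item in items)",
    "shop.py": "def checkout(cart): return compute_total(cart.items)",
    "README.md": "How the total price is computed at checkout",
}


def tokens(text):
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def similarity(query, chunk):
    """Jaccard word overlap (an assumed stand-in for embedding similarity)."""
    q, c = tokens(query), tokens(chunk)
    union = q | c
    return len(q & c) / len(union) if union else 0.0


def retrieve(query, chunks, k=1):
    """Return the names of the k chunks most similar to the query."""
    ranked = sorted(chunks, key=lambda name: similarity(query, chunks[name]), reverse=True)
    return ranked[:k]


top = retrieve("where is the order total computed", CHUNKS)
print(top)
```

With this toy scorer the README ranks first and billing.py, which holds the actual implementation, ranks last. Retrieval orders chunks by how similar they look to the question, not by how the code is connected, which is one way to picture the limits mentioned above.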

 

Exploring Abstract Syntax Trees (ASTs)

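Abstract Syntax Trees (ASTs) are useful, but on their own they don’t provide a repository-level understanding of the code. An AST describes the structure of a single source file, not the relationships between files.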
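A small sketch with Python's standard ast module shows both sides of this. The source snippet and the names in it (billing, compute_total, checkout) are hypothetical and exist only for the example.

```python
import ast

# Hypothetical contents of a single file in a repository.
SOURCE = '''
from billing import compute_total

def checkout(cart):
    return compute_total(cart.items)
'''


def summarize(source):
    """Collect the functions a file defines, calls by plain name, and imports."""
    tree = ast.parse(source)
    defined, called, imported = [], [], []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            defined.append(node.name)
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            called.append(node.func.id)
        elif isinstance(node, ast.ImportFrom):
            imported.extend(alias.name for alias in node.names)
    return {"defined": defined, "called": called, "imported": imported}


summary = summarize(SOURCE)
print(summary)
```

The tree tells us that checkout calls compute_total and that the name is imported from billing. What compute_total does, and who else depends on it, is not in this tree at all. That information lives in other files.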

 

Knowledge Graphs: A Game-Changer

Current research shows that knowledge graphs excel at modeling complex relationships and dependencies within code across entire repositories. We use RAG, agentic approaches, and ASTs, but knowledge graphs have been a game-changer for our product, Advanced Coding Assistant. Why do we still have “assistant” in the title? Even though we try to use the best approaches we know of, keeping the developer in the loop is crucial.
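As a rough illustration of the idea, the sketch below links per-file ASTs into a tiny graph of "calls" relationships and then asks a repository-level question. It is not the implementation behind Advanced Coding Assistant. The three-file repository is hypothetical, the graph is a plain dictionary rather than a graph database, and functions are resolved by bare name, which assumes every function name is unique across the repository.

```python
import ast
from collections import defaultdict

# Hypothetical three-file repository, held in memory for the sketch.
REPO = {
    "billing": "def compute_total(items):\n    return sum(items)\n",
    "shop": "from billing import compute_total\n\n"
            "def checkout(cart):\n    return compute_total(cart)\n",
    "api": "from shop import checkout\n\n"
           "def handle_order(request):\n    return checkout(request)\n",
}


def build_call_graph(repo):
    """Return caller -> {callees} edges between functions defined in the repo."""
    trees = {module: ast.parse(source) for module, source in repo.items()}
    # First pass: record which module defines each top-level function.
    defined_in = {}
    for module, tree in trees.items():
        for node in tree.body:
            if isinstance(node, ast.FunctionDef):
                defined_in[node.name] = module
    # Second pass: add an edge for every call to a function the repo defines.
    edges = defaultdict(set)
    for module, tree in trees.items():
        for func in tree.body:
            if not isinstance(func, ast.FunctionDef):
                continue
            for node in ast.walk(func):
                if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
                    name = node.func.id
                    if name in defined_in:  # skips builtins such as sum
                        edges[f"{module}.{func.name}"].add(f"{defined_in[name]}.{name}")
    return edges


def callers_of(edges, target):
    """Return every function that reaches target directly or transitively."""
    reverse = defaultdict(set)
    for caller, callees in edges.items():
        for callee in callees:
            reverse[callee].add(caller)
    seen, stack = set(), [target]
    while stack:
        for caller in reverse[stack.pop()]:
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen


graph = build_call_graph(REPO)
affected = callers_of(graph, "billing.compute_total")
print(sorted(affected))
```

Asking what is affected by a change to billing.compute_total returns both shop.checkout and api.handle_order, even though api never mentions billing. A question like this spans files, so a single AST cannot answer it, but a graph traversal can.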

 

Conclusion

So my answer to the theoretical question from the introduction is yes. But we are not in the Harry Potter universe: AI is not a magic wand, and you cannot expect a “one-click” solution. However, giving developers tools that improve code understanding at the project level lets them not only work faster but also tackle tasks that were previously unsolvable.

 

For more information, please read the articles by my colleagues:

https://medium.com/@cyrilsadovsky/advanced-coding-chatbot-knowledge-graphs-and-asts-0c18c90373be

https://medium.com/@ziche94/building-knowledge-graph-over-a-codebase-for-llm-245686917f96

Stay tuned for more information. We will definitely share results from our research.