In Table 8, we first evaluate each large language model's code generation capability in a single-file setting as a baseline. Next, using BM25 as the similarity metric, we retrieve similar code from within the project based on the context and include it in the prompt, then re-evaluate each model's generation performance. Finally, "w/Ref." denotes the oracle setting in which the correct reference code is assumed to be known: we retrieve similar code from the project using the reference as the query and re-evaluate with that code in the prompt. Overall, the aiXcoder-7B model achieves the best performance across all languages, indicating that it has the strongest ability to exploit contextual information, especially cross-file context.
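To make the retrieval step concrete, the sketch below implements Okapi BM25 scoring from scratch and uses it to pick the most similar snippet from a toy "project" given the current context. The corpus, tokenizer, and parameter defaults (k1=1.2, b=0.75) are illustrative assumptions, not the paper's exact setup.

```python
import math
import re
from collections import Counter

def tokenize(code: str) -> list[str]:
    # Naive code tokenizer: split on non-word characters, lowercase.
    return [t for t in re.split(r"\W+", code.lower()) if t]

def bm25_scores(query: str, corpus: list[str],
                k1: float = 1.2, b: float = 0.75) -> list[float]:
    """Score every document in `corpus` against `query` with Okapi BM25."""
    docs = [tokenize(d) for d in corpus]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency of each term across the corpus.
    df = Counter()
    for d in docs:
        df.update(set(d))
    q_terms = tokenize(query)
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in q_terms:
            if t not in tf:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1)
            s += idf * tf[t] * (k1 + 1) / (
                tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

# Hypothetical in-project snippets; the context is the unfinished code
# the model is about to complete.
corpus = [
    "def parse_config(path): return json.load(open(path))",
    "def save_model(model, path): torch.save(model.state_dict(), path)",
    "def load_model(model, path): model.load_state_dict(torch.load(path))",
]
context = "def load_model(model, path):"
scores = bm25_scores(context, corpus)
retrieved = corpus[scores.index(max(scores))]  # prepended to the prompt
```

The retrieved snippet would then be placed before the context in the prompt, so the model can imitate project-local conventions when generating the completion.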