Is it possible to configure the SonarQube code duplication detector to stop at method boundaries?

I'm using SonarQube for my Java projects and want to eliminate code duplication from our code as far as possible.

My problem is that SonarQube's code duplication detection does not respect method boundaries. It reports identical parts of files as duplicates, and a duplication often starts in the middle of one method and ends in the middle of another. Such duplicates can hardly be refactored.

Here is an example. Click on the MavenArtifactRepository.java file in the upper right list and look at the fourth duplication block at the bottom of the page.

Is there a way to configure the duplicate code detection plugin so that it shows only duplications that are syntactically coherent?

+8
java sonarqube code-duplication
2 answers

Currently, you cannot achieve this by configuring SonarQube itself. However, you can try our SourceMeter tool with its SonarQube plugin, which implements AST-based clone detection and therefore presents syntactically coherent duplications inside SonarQube. For an example, you can look at the online demo.

+4

The problem you describe is well known in the clone detection research community and is one of the main reasons why many people do not use clone detection in practice. SonarQube implements a fairly simple and naive algorithm that detects code duplication based on sequences of tokens, so it has no notion of what a method is (among a number of other problems). So the answer to your question is no.
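To illustrate why a purely token-based detector ignores method boundaries, here is a minimal sketch. It is not SonarQube's actual algorithm: the class `TokenCloneSketch`, the crude regex tokenizer, and the sample snippets are all assumptions made for illustration. It simply finds the longest run of identical consecutive tokens shared by two sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenCloneSketch {
    // Crude tokenizer (assumption): a word, or any single non-space symbol.
    private static final Pattern TOKEN = Pattern.compile("\\w+|[^\\s\\w]");

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    /** Returns the longest run of identical consecutive tokens shared by a and b. */
    static List<String> longestCommonRun(List<String> a, List<String> b) {
        int best = 0;
        int endInA = 0;
        // len[i][j] = length of the common run ending at a[i-1] and b[j-1]
        int[][] len = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    len[i][j] = len[i - 1][j - 1] + 1;
                    if (len[i][j] > best) {
                        best = len[i][j];
                        endInA = i;
                    }
                }
            }
        }
        return new ArrayList<>(a.subList(endInA - best, endInA));
    }

    public static void main(String[] args) {
        // Hypothetical sources: each contains two small methods.
        String fileA = "void a() { x = 1; log(x); } void b() { y = 2; save(y); }";
        String fileB = "void c() { z = 3; log(x); } void b() { y = 2; send(y); }";
        List<String> run = longestCommonRun(tokenize(fileA), tokenize(fileB));
        // The reported run starts inside the first method and ends inside the second.
        System.out.println(String.join(" ", run));
    }
}
```

The matcher only sees a flat token stream, so the closing brace of one method and the signature of the next are just two more tokens in the run. Nothing in it can make the match stop at a method boundary.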

One solution would be to look for a clone detection tool that detects code duplication based on abstract syntax trees (ASTs). But, as far as I know, no such tool is available for free.

An alternative solution would be to use ConQAT. ConQAT also uses a token-based clone detection approach, but adds fairly sophisticated post-processing steps. One of them is the so-called "AST alignment", in which the detected duplicate fragments are aligned with the syntactic units (for example, methods) of the source code. This should be exactly what you are looking for.
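The idea of aligning a raw clone with syntactic units can be sketched roughly as follows. This is not ConQAT's implementation: the class `CloneAlignmentSketch`, the `Range` type, and the hard-coded line numbers are hypothetical, and a real tool would take the method ranges from a parser. The sketch keeps only the methods that lie completely inside the raw clone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CloneAlignmentSketch {
    /** Inclusive line range, used for both raw clones and method bodies. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }

        boolean contains(Range other) {
            return start <= other.start && other.end <= end;
        }

        @Override
        public String toString() {
            return start + "-" + end;
        }
    }

    /** Keeps only the methods that are fully covered by the raw clone. */
    static List<Range> alignToMethods(Range rawClone, List<Range> methods) {
        List<Range> aligned = new ArrayList<>();
        for (Range method : methods) {
            if (rawClone.contains(method)) {
                aligned.add(method);
            }
        }
        return aligned;
    }

    public static void main(String[] args) {
        // Hypothetical method line ranges; in practice these come from an AST.
        List<Range> methods = Arrays.asList(
                new Range(10, 20), new Range(22, 35), new Range(37, 50));
        // Raw token clone that starts mid-method and ends mid-method.
        Range rawClone = new Range(15, 42);
        System.out.println(alignToMethods(rawClone, methods)); // prints [22-35]
    }
}
```

In this simplified version, a clone that covers no complete method yields an empty list and would be dropped from the report.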

+2