Ensure CPD uses tab width of 1 for tabs consistently

The columns that are reported by CPD were inconsistent across languages
before. A language like Java (using a JavaCC-based tokenizer) would use
a width of 8 for tabs, whereas a language like C# (using an Antlr-based
tokenizer) would use 1 instead.

This includes unit tests for most languages to ensure a tab character is
counted as 1. The configuration for JavaCC has been adjusted to respect
this as well.
Maikel Steneker
2020-07-20 10:42:21 +02:00
parent 25405eb870
commit 6fb5ac59b9
45 changed files with 724 additions and 62 deletions


@@ -40,4 +40,9 @@ public class KotlinTokenizerTest extends CpdTextComparisonTest {
    public void testImportsIgnored() {
        doTest("imports");
    }

    @Test
    public void testTabWidth() {
        doTest("tabWidth");
    }
}


@@ -0,0 +1,5 @@
var x = 0

fun increment() {
	x += 1
}


@@ -0,0 +1,19 @@
[Image] or [Truncated image[ Bcol Ecol
L1
[var] 1 3
[x] 5 5
[=] 7 7
[0] 9 9
L3
[fun] 1 3
[increment] 5 13
[(] 14 14
[)] 15 15
[{] 17 17
L4
[x] 2 2
[+=] 4 5
[1] 7 7
L5
[}] 1 1
EOF
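
The begin columns expected above for line 4 (`x` at 2, `+=` at 4, `1` at 7) follow from counting the leading tab as a single column. The sketch below reproduces that arithmetic and contrasts it with a tab width of 8. It is not CPD's or JavaCC's code: the class `TabColumns` and the method `columnOf` are hypothetical names used only for illustration, and the tab-stop formula is an assumption about how a tab width greater than 1 is typically applied.

```java
public final class TabColumns {

    private TabColumns() {
    }

    /**
     * Returns the 1-based column of the character at charIndex in line,
     * where each tab advances to the next tab stop for the given tabWidth.
     * Hypothetical helper, not part of PMD.
     */
    static int columnOf(String line, int charIndex, int tabWidth) {
        int column = 1;
        for (int i = 0; i < charIndex; i++) {
            if (line.charAt(i) == '\t') {
                // Jump to the next tab stop; with tabWidth == 1 this is column + 1.
                column = ((column - 1) / tabWidth + 1) * tabWidth + 1;
            } else {
                column++;
            }
        }
        return column;
    }

    public static void main(String[] args) {
        String line = "\tx += 1";
        // Begin column of 'x' when a tab counts as 1 column.
        System.out.println(columnOf(line, 1, 1));
        // Begin column of 'x' when a tab expands to width 8.
        System.out.println(columnOf(line, 1, 8));
    }
}
```

With a tab width of 1 the column is simply the character offset plus one, so reported columns are the same across tokenizers regardless of how an editor renders tabs.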