pmd/pmd-core
Jan van Nunen b1769846e5 Created extra CSV output format 'csv_with_linecount_per_file' which outputs the correct line count per file.
Some of the tokenizers ignore comments and therefore the line count of a
duplication can differ per file. Take for example the following files:

FileA.java:

1: public class FileA {
2:  pulbic String Foo() {
3:    return "Foo";
4:   }
5:  }

FileB.java:

1: public class FileB {
2:   pulbic String Foo() {
3:     // This is a comment
4:     return "Foo";
5:   }
6: }

When comments are ignored and not tokenized, the duplication consist of
the following tokens:

'{', 'public', 'String', 'Foo', '(', ')', '{', 'return', 'Foo', ';',
'}', '}'

For 'FileA.java' the duplication is 5 lines long, it starts at line 1
and ends at line 5. For 'FileB.java' the duplication is 6 lines long, it
starts at line 1 and ends at line 6.

Note that this is just 1 example, because for most tokenizers comments
and white spaces are not significant. For example the following file
contains the same duplication all on 1 line:

FileC.java

1: public class FileC { public String Foo() { return "Foo"; } }

For us the correct line count per file is important, because we
highlight the duplications in an annotated source view and show the
percentage of duplicated code the file contains. The current output
formats only contain 1 line count per duplication and file set. For the
above example CPD would output the following:

Found a 4 line (12 tokens) duplication in the following files:
Starting at line 1 of FileA.java
Starting at line 1 of FileB.java

For FileB.java this is not correct and would lead to incorrect
percentage of duplicated code. (66% (4 of 6 lines) instead of the
correct 83% (5 of 6 lines)).

To fix the problem, I created an extra output format
'csv_with_linecount_per_file' which outputs the correct line count per
file. The format contains the following:

tokens,occurrences
<nr of tokens>,<nr of occurrences>(,<begin line>,<line count>,<file
name>)+

For the above example the output would be

tokens,occurrences
12,2,1,4,FileA.java,1,5,FileB.java
2015-01-21 13:55:49 +01:00
..
2015-01-18 12:19:52 +01:00

How to build PMD ?
==================

Simply use maven: $ mvn compile

PMD now uses a small plugin to generate its website, so if you want to build the
website ($ mvn site), you'll need to install it:

$ cd ../maven-plugin-pmd-build
$ mvn clean install

That's all !

How to quickly build a "release" (zipfiles - for testing purpose only) ?
------------------------------------------------------------------------

$ mvn -Dmaven.test.skip=true -Dmaven.clover.skip=true verify post-site

Full release process is documented in src/site/xdocs/pmd-release-process.xml