Update doc about "How PMD Works"

2017-09-29 10:30:23 +02:00
parent f7d67df442
commit eb6624deb6
1 changed files with 37 additions and 14 deletions
--- a/docs/pages/pmd/devdocs/how_pmd_works.md
+++ b/docs/pages/pmd/devdocs/how_pmd_works.md
@ -1,22 +1,45 @@
 ---
-title: PMD How it Works
+title: How PMD Works
 tags: [customizing]
-summary: How PMD Works
-last_updated: July 3, 2016
+summary: Processing overview of the different steps taken by PMD.
+last_updated: September 2017
 permalink: pmd_devdocs_how_pmd_works.html
-author: Tom Copeland
+author: Tom Copeland, Andreas Dangel <andreas.dangel@adangel.org>
 ---

-# How it works
+## Overview

-PMD checks source code against rules and produces a report. Like this:
+The processing starts e.g. with the main class: `net.sourceforge.pmd.PMD`

-*   Something passes a file name and a RuleSet into PMD.
-*   PMD hands an InputStream of the source file to a JavaCC-generated parser.
-*   PMD gets a reference to an Abstract Syntax Tree back from the parser.
-*   PMD hands the AST off to the symbol table layer which builds scopes, finds declarations, and find usages.
-*   If any rules need data flow analysis, PMD hands the AST over to the DFA layer for building control flow graphs and data flow nodes.
-*   Each Rule in the RuleSet gets to traverse the AST and check for problems. The rules can also poke around the symbol table and DFA nodes.
-*   The Report is now filled with RuleViolations, and those get printed out in XML or HTML or whatever.
+{%include note.html content="This is the command line interface. There are many other means, who
+PMD can be invoked. E.g. via ant, maven, gradle..." %}

-Not much detail here… if you think this document can be improved, please post [here](http://sourceforge.net/p/pmd/discussion/188192) and let us know how. Thanks!
+*   Parse command line parameters (see net.sourceforge.pmd.cli.PMDParameters)
+    Also load the incremental analysis cache file
+*   Load rulesets/rules
+*   Determine languages (rules of different languages might be mixed in rulesets)
+*   Determine files (uses the given source directory, filter by the language's file extensions)
+*   Prepare the renderer
+*   Sort the files by name
+*   Check whether we can use the incremental analysis cache (if the rulesets changed, it will be invalid)
+*   Prepare the SourceCodeProcessor based on the configuration
+*   Analyze the files. Either single threaded or multi-threaded parallel. This task is encapsulated
+    in `net.sourceforge.pmd.processor.PMDRunnable`:
+    *   Create input stream
+    *   Call source code processor (`net.sourceforge.pmd.SourceCodeProcessor`):
+        1.  Determine the language
+        2.  Check whether the file is already analyzed and a result is available from the analysis cache
+        3.  Parse the source code. Result is the root AST node.
+        4.  Always run the SymbolFacade visitor. It builds scopes, finds declarations and usages.
+        5.  Run DFA (data flow analysis) visitor (if at least one rule requires it) for building
+            control flow graphs and data flow nodes.
+        6.  Run TypeResolution visitor (if at least one rule requires it)
+        7.  FUTURE: Run multifile analysis (if at least one rule requires it)
+        8.  Execute the rules:
+            *   First run the rules that opted in for the rule chain mechanism
+            *   Run all the other rules and let them traverse the AST. The rules can use the symbol table,
+                type resolution information and DFA nodes.
+            *   The rules will report found problems as RuleViolations.
+*   Render the found violations into the wanted format (XML, text, HTML, ...)
+*   Store the incremental analysis cache
+*   Depending on the number of violations found, exit with code 0 or 4.