<h1 class="post-title-main">Adding PMD support for a new language</h1>
</div>
<div class="post-content">
<div class="summary">How to add a new language to PMD.</div>
<h2 id="1--start-with-a-new-sub-module">1. Start with a new sub-module.</h2>
<ul>
<li>See pmd-java or pmd-vm for examples.</li>
</ul>
<h2 id="2--implement-an-ast-parser-for-your-language">2. Implement an AST parser for your language</h2>
<ul>
<li>Ideally an AST parser should be implemented as a JJT file <em>(see VmParser.jjt or Java.jjt for example)</em></li>
<li>Any other parser implementation works too, as long as you have some way to convert an input stream into an AST. Implementing the parser as a JJT file simplifies maintenance down the road.</li>
<li>See this link for reference: <a href="https://javacc.java.net/doc/JJTree.html">https://javacc.java.net/doc/JJTree.html</a></li>
</ul>
<h2 id="3--create-ast-node-classes">3. Create AST node classes</h2>
<ul>
<li>For each AST node that your parser can generate, there should be a class.</li>
<li>The name of the AST class should be “AST” followed by the name of the node in the JJT file.
<ul>
<li>For example, if the JJT file contains a node called “IfStatement”, there should be a class called “ASTIfStatement”.</li>
</ul>
</li>
<li>Each AST class should have two constructors: one that takes an int id, and one that takes an instance of the parser and an int id.</li>
<li>It’s a good idea to create a parent AST class for all AST classes of the language. This simplifies rule creation later. <em>(see SimpleNode for Velocity and AbstractJavaNode for Java for example)</em></li>
<li>Note: These AST node classes are usually generated once by javacc/jjtree and can then be modified as needed.</li>
</ul>
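<p>The naming and constructor conventions above can be sketched as follows. This is only an illustration: <code class="language-plaintext highlighter-rouge">MyLangParser</code> and <code class="language-plaintext highlighter-rouge">AbstractMyLangNode</code> are hypothetical names, and the real classes generated by JJTree carry more state than shown here.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the JJTree-generated parser of a language "MyLang".
class MyLangParser {
}

// Common parent for all AST classes of the language
// (compare SimpleNode for Velocity or AbstractJavaNode for Java).
abstract class AbstractMyLangNode {
    protected final int id;              // node kind id assigned by the parser
    protected final MyLangParser parser; // null when built without a parser
    private final List<AbstractMyLangNode> children = new ArrayList<>();

    protected AbstractMyLangNode(int id) {
        this(null, id);
    }

    protected AbstractMyLangNode(MyLangParser parser, int id) {
        this.parser = parser;
        this.id = id;
    }

    public int getId() {
        return id;
    }

    public void addChild(AbstractMyLangNode child) {
        children.add(child);
    }

    public int getNumChildren() {
        return children.size();
    }
}

// JJT node "IfStatement" -> class "ASTIfStatement", with both constructors.
class ASTIfStatement extends AbstractMyLangNode {
    public ASTIfStatement(int id) {
        super(id);
    }

    public ASTIfStatement(MyLangParser parser, int id) {
        super(parser, id);
    }
}
```

<p>Keeping shared behaviour in the parent class means a rule can treat every node of the language uniformly and only look at the concrete type where it matters.</p>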
<h2 id="4--compile-your-parser-if-using-jjt">4. Compile your parser (if using JJT)</h2>
<ul>
<li>An Ant script compiles the JJT files into classes. It is located in the <code class="language-plaintext highlighter-rouge">pmd-&lt;lang&gt;/src/main/ant/alljavacc.xml</code> file.</li>
<li>Create an <code class="language-plaintext highlighter-rouge">alljavacc.xml</code> file for your language; you can use the one from <code class="language-plaintext highlighter-rouge">pmd-java</code> as an example.</li>
<li>You will probably want to adjust the contents of the <code class="language-plaintext highlighter-rouge">&lt;delete&gt;</code> tag: start with an empty <code class="language-plaintext highlighter-rouge">&lt;fileset&gt;</code> and add <code class="language-plaintext highlighter-rouge">&lt;include&gt;</code>s for those AST nodes you had to rewrite manually (moving those node classes from the autogenerated directory to the regular source tree).</li>
</ul>
<h2 id="5--create-a-tokenmanager">5. Create a TokenManager</h2>
<ul>
<li>Create a new class that implements the <code class="language-plaintext highlighter-rouge">TokenManager</code> interface <em>(see VmTokenManager or JavaTokenManager for example)</em></li>
</ul>
<h2 id="6--create-a-pmd-parser-adapter">6. Create a PMD parser “adapter”</h2>
<ul>
<li>Create a new class that extends AbstractParser</li>
<li>There are two important methods to implement:
<ul>
<li>The <code class="language-plaintext highlighter-rouge">createTokenManager</code> method should return a new instance of a token manager for your language <em>(see step #5)</em></li>
<li>The <code class="language-plaintext highlighter-rouge">parse</code> method should return the root node of the AST obtained by parsing the Reader source</li>
<li>See the <code class="language-plaintext highlighter-rouge">VmParser</code> class as an example</li>
</ul>
</li>
</ul>
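<p>The division of labour between the two methods might look like the sketch below. It does not extend PMD’s <code class="language-plaintext highlighter-rouge">AbstractParser</code>; every type in it (<code class="language-plaintext highlighter-rouge">ToyTokenManager</code>, <code class="language-plaintext highlighter-rouge">LineNode</code>, <code class="language-plaintext highlighter-rouge">MyLangParserAdapter</code>) is a hypothetical stand-in with simplified signatures, and the toy “grammar” just turns each non-blank line into a node.</p>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical stand-in for a token manager interface (not the PMD type).
interface ToyTokenManager {
    Object getNextToken(); // returns null at end of input in this sketch
}

// Hypothetical AST node: the root holds one child per non-blank source line.
class LineNode {
    final String text;
    final List<LineNode> children = new ArrayList<>();

    LineNode(String text) {
        this.text = text;
    }
}

// Toy token manager: every whitespace-separated word is one token.
class MyLangTokenManager implements ToyTokenManager {
    private final Scanner scanner;

    MyLangTokenManager(Reader source) {
        this.scanner = new Scanner(source);
    }

    @Override
    public Object getNextToken() {
        return scanner.hasNext() ? scanner.next() : null;
    }
}

// Sketch of the adapter: one method hands out token managers,
// the other turns the Reader into the root node of an AST.
class MyLangParserAdapter {
    ToyTokenManager createTokenManager(Reader source) {
        return new MyLangTokenManager(source);
    }

    LineNode parse(Reader source) {
        LineNode root = new LineNode("root");
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.trim().isEmpty()) {
                    root.children.add(new LineNode(line.trim()));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return root;
    }
}
```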
<h2 id="7--create-a-rule-violation-factory">7. Create a rule violation factory</h2>
<ul>
<li>Extend <code class="language-plaintext highlighter-rouge">AbstractRuleViolationFactory</code> <em>(see VmRuleViolationFactory for example)</em></li>
<li>The purpose of this class is to create a rule violation instance specific to your language</li>
</ul>
<h2 id="8--create-a-version-handler">8. Create a version handler</h2>
<ul>
<li>Extend <code class="language-plaintext highlighter-rouge">AbstractLanguageVersionHandler</code> <em>(see VmHandler for example)</em></li>
<li>This class is a gateway between PMD and all parsing logic specific to your language. It has three purposes:
<ul>
<li><code class="language-plaintext highlighter-rouge">getRuleViolationFactory</code> returns an instance of your rule violation factory <em>(see step #7)</em></li>
<li><code class="language-plaintext highlighter-rouge">getParser</code> returns an instance of your parser adapter <em>(see step #6)</em></li>
<li><code class="language-plaintext highlighter-rouge">getDumpFacade</code> returns a <code class="language-plaintext highlighter-rouge">VisitorStarter</code> that dumps a text representation of the AST into a writer <em>(likely for debugging purposes)</em></li>
</ul>
</li>
</ul>
<h2 id="9--create-a-parser-visitor-adapter">9. Create a parser visitor adapter</h2>
<ul>
<li>If you use JJT to generate your parser, it should also generate an interface for a parser visitor <em>(see VmParserVisitor for example)</em></li>
<li>Create a class that implements this auto-generated interface <em>(see VmParserVisitorAdapter for example)</em></li>
<li>The purpose of this class is to serve as a pass-through <code class="language-plaintext highlighter-rouge">visitor</code> implementation, which, for all AST types in your language, just executes visit on the base AST type</li>
</ul>
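<p>A pass-through adapter of this kind could look roughly like the sketch below. The visitor interface is written by hand here to mimic the shape of a JJTree-generated one; the node types (<code class="language-plaintext highlighter-rouge">VisitableNode</code>, <code class="language-plaintext highlighter-rouge">ASTLoop</code>, <code class="language-plaintext highlighter-rouge">ASTBlock</code>) and the <code class="language-plaintext highlighter-rouge">LoopCounter</code> visitor are hypothetical.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Mimics the shape of a generated visitor interface: one visit method
// per node type. All names are hypothetical stand-ins.
interface MyLangParserVisitor {
    Object visit(VisitableNode node, Object data);
    Object visit(ASTLoop node, Object data);
    Object visit(ASTBlock node, Object data);
}

abstract class VisitableNode {
    private final List<VisitableNode> children = new ArrayList<>();

    VisitableNode add(VisitableNode child) {
        children.add(child);
        return this;
    }

    abstract Object jjtAccept(MyLangParserVisitor visitor, Object data);

    Object childrenAccept(MyLangParserVisitor visitor, Object data) {
        for (VisitableNode child : children) {
            child.jjtAccept(visitor, data);
        }
        return data;
    }
}

class ASTLoop extends VisitableNode {
    @Override
    Object jjtAccept(MyLangParserVisitor visitor, Object data) {
        return visitor.visit(this, data);
    }
}

class ASTBlock extends VisitableNode {
    @Override
    Object jjtAccept(MyLangParserVisitor visitor, Object data) {
        return visitor.visit(this, data);
    }
}

// Pass-through adapter: every typed visit delegates to the base-type visit,
// which simply descends into the children.
class MyLangParserVisitorAdapter implements MyLangParserVisitor {
    @Override
    public Object visit(VisitableNode node, Object data) {
        return node.childrenAccept(this, data);
    }

    @Override
    public Object visit(ASTLoop node, Object data) {
        return visit((VisitableNode) node, data);
    }

    @Override
    public Object visit(ASTBlock node, Object data) {
        return visit((VisitableNode) node, data);
    }
}

// A visitor that only cares about loops overrides a single method.
class LoopCounter extends MyLangParserVisitorAdapter {
    int count;

    @Override
    public Object visit(ASTLoop node, Object data) {
        count++;
        return super.visit(node, data);
    }
}
```

<p>Because the adapter supplies a default for every node type, a concrete visitor such as the counter above stays small and keeps working when new node types are added to the grammar.</p>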
<h2 id="10-create-a-rule-chain-visitor">10. Create a rule chain visitor</h2>
<ul>
<li>Extend <code class="language-plaintext highlighter-rouge">AbstractRuleChainVisitor</code> <em>(see VmRuleChainVisitor for example)</em></li>
<li>This class should implement two important methods:
<ul>
<li><code class="language-plaintext highlighter-rouge">indexNodes</code> generates a map of “node type” to “list of nodes of that type”. This is used to visit all applicable nodes when a rule is applied.</li>
<li><code class="language-plaintext highlighter-rouge">visit</code> should determine what kind of rule is being applied and execute the appropriate logic. Usually it just checks whether the rule is a “parser visitor” kind of rule specific to your language and, if so, executes the visitor. If it is an XPath rule, it only needs to call evaluate on it.</li>
</ul>
</li>
</ul>
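<p>The index built by <code class="language-plaintext highlighter-rouge">indexNodes</code> is essentially a map from node type to the nodes of that type. The sketch below shows that idea in isolation; it is not the PMD implementation, and <code class="language-plaintext highlighter-rouge">IndexedNode</code> and <code class="language-plaintext highlighter-rouge">NodeIndexer</code> are hypothetical names.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical AST node that only knows its type name and its children.
class IndexedNode {
    final String typeName; // e.g. "IfStatement"
    final List<IndexedNode> children = new ArrayList<>();

    IndexedNode(String typeName, IndexedNode... kids) {
        this.typeName = typeName;
        this.children.addAll(Arrays.asList(kids));
    }
}

class NodeIndexer {
    // Walks the tree once and groups the nodes by type name.
    static Map<String, List<IndexedNode>> indexNodes(IndexedNode root) {
        Map<String, List<IndexedNode>> index = new LinkedHashMap<>();
        collect(root, index);
        return index;
    }

    private static void collect(IndexedNode node, Map<String, List<IndexedNode>> index) {
        index.computeIfAbsent(node.typeName, key -> new ArrayList<>()).add(node);
        for (IndexedNode child : node.children) {
            collect(child, index);
        }
    }
}
```

<p>With such an index, a rule that declares interest in a single node type can be applied to exactly those nodes without another full tree traversal.</p>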
<h2 id="11-make-pmd-recognize-your-language">11. Make PMD recognize your language</h2>
<ul>
<li>Create your own subclass of <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.BaseLanguageModule</code>. <em>(see VmLanguageModule or JavaLanguageModule as an example)</em></li>
<li>You’ll need to refer to the rule chain visitor created in step #10.</li>
<li>Add a call to <code class="language-plaintext highlighter-rouge">addVersion</code> in your language module’s constructor for each version of your language.</li>
<li>Create the service registration via the text file <code class="language-plaintext highlighter-rouge">src/main/resources/META-INF/services/net.sourceforge.pmd.lang.Language</code>. Add your fully qualified class name as a single line into it.</li>
</ul>
<h2 id="12-add-ast-regression-tests">12. Add AST regression tests</h2>
<p>For languages that use an external library for parsing, the AST can easily change when the library is upgraded.
Such tests are also useful for languages whose grammar is under our control.</p>
<p>The tests parse one or more source files and generate a textual representation of the AST. This text is compared
against a previously recorded version. If there are differences, the test fails.</p>
<p>This helps to detect any change in the AST structure, including unexpected ones.</p>
<ul>
<li>Create a test class in the package <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.$lang.ast</code> with the name <code class="language-plaintext highlighter-rouge">$langTreeDumpTest</code>.</li>
<li>This test class must extend <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.ast.test.BaseTreeDumpTest</code>. Note: This class
is written in Kotlin and is available in the module “lang-test”.</li>
<li>
<p>Add a default constructor that calls the super constructor; the super call includes the file extension of your test files.</p>
<p>Replace “$lang” and “$extension” accordingly.</p>
</li>
<li>Implement the method <code class="language-plaintext highlighter-rouge">getParser()</code>. It must return a
subclass of <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.ast.test.BaseParsingHelper</code>. See
<code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.ecmascript.ast.JsParsingHelper</code> for an example.
With this parser helper you can also specify where the test files are looked up, by using
the method <code class="language-plaintext highlighter-rouge">withResourceContext(Class&lt;?&gt;, String)</code>.</li>
<li>
<p>Add one or more test methods. Each test method parses one file and compares the result. The base
class has a helper method <code class="language-plaintext highlighter-rouge">doTest(String)</code> that does all the work, so a test method only needs to call it.</p>
</li>
<li>On the first test run the test fails. A text file (with the extension <code class="language-plaintext highlighter-rouge">.txt</code>) is created that records the
current AST. On the next run, the text file is used for comparison and the test should pass. Don’t forget
to commit the generated text file.</li>
</ul>
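<p>The record-and-compare cycle described above can be illustrated with a small self-contained sketch. It is not how <code class="language-plaintext highlighter-rouge">BaseTreeDumpTest</code> is implemented; <code class="language-plaintext highlighter-rouge">DumpNode</code>, <code class="language-plaintext highlighter-rouge">TreeDumpBaseline</code> and the indentation-based dump format are assumptions made for the example.</p>

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical AST node with a name and children.
class DumpNode {
    final String name;
    final List<DumpNode> children = new ArrayList<>();

    DumpNode(String name, DumpNode... kids) {
        this.name = name;
        this.children.addAll(Arrays.asList(kids));
    }
}

class TreeDumpBaseline {
    // Renders one node per line, indenting each level by two spaces.
    static String dump(DumpNode root) {
        StringBuilder out = new StringBuilder();
        dump(root, 0, out);
        return out.toString();
    }

    private static void dump(DumpNode node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.name).append('\n');
        for (DumpNode child : node.children) {
            dump(child, depth + 1, out);
        }
    }

    // Returns true only if a recorded baseline exists and matches.
    // If there is no baseline yet, the dump is recorded and the check fails.
    static boolean matchesBaseline(Path baseline, String actual) {
        try {
            if (!Files.exists(baseline)) {
                Files.write(baseline, actual.getBytes(StandardCharsets.UTF_8));
                return false;
            }
            String expected = new String(Files.readAllBytes(baseline), StandardCharsets.UTF_8);
            return expected.equals(actual);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

<p>Failing on the first run is deliberate in this sketch: it forces a look at the freshly recorded file before it becomes the reference for later runs.</p>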
<p>A complete example can be seen in the JavaScript module: <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.ecmascript.ast.JsTreeDumpTest</code>.
The test resources are in the subpackage “testdata”: <code class="language-plaintext highlighter-rouge">pmd-javascript/src/test/resources/net/sourceforge/pmd/lang/ecmascript/ast/testdata/</code>.</p>
<p>The Scala module also has a test, written in Kotlin instead of Java.</p>
<h2 id="13-create-an-abstract-rule-class-for-the-language">13. Create an abstract rule class for the language</h2>
<ul>
<li>Extend <code class="language-plaintext highlighter-rouge">AbstractRule</code> and implement the parser visitor interface for your language <em>(see AbstractVmRule for example)</em></li>
<li>All other rules for your language should extend this class. Its purpose is to implement visit methods for all AST types that simply delegate to the default behavior. This is useful because most rules care only about specific AST nodes, but PMD needs to know what to do with each node, so this class lets you use the default behavior for nodes you don’t care about.</li>
</ul>
<h2 id="14-create-rules">14. Create rules</h2>
<ul>
<li>Rules are created by extending the abstract rule class created in step 13 <em>(see <code class="language-plaintext highlighter-rouge">EmptyForeachStmtRule</code> for example)</em></li>
<li>Creating rules is already well documented in PMD, and it is no different for a new language, except that you may have different AST nodes.</li>
</ul>
<h2 id="15-test-the-rules">15. Test the rules</h2>
<ul>
<li>See BasicRulesTest for example</li>
<li>You have to create a rule set for your language <em>(see vm/basic.xml for example)</em></li>
<li>For each rule in this set you want to test, call the <code class="language-plaintext highlighter-rouge">addRule</code> method in setUp of the unit test
<ul>
<li>This triggers the unit test to read the corresponding XML file with rule test data <em>(see <code class="language-plaintext highlighter-rouge">EmptyForeachStmtRule.xml</code> for example)</em></li>
<li>This test XML file contains sample pieces of code, each of which should trigger a specified number of violations of this rule. The unit test executes the rule on each piece of code and verifies that the number of violations matches</li>
</ul>
</li>
<li>
<p>To verify the validity of the created ruleset, create a subclass of <code class="language-plaintext highlighter-rouge">AbstractRuleSetFactoryTest</code> <em>(see <code class="language-plaintext highlighter-rouge">RuleSetFactoryTest</code> in pmd-vm for example)</em>.
This will load all rulesets and verify that all required attributes are provided.</p>
<p><em>Note:</em> You’ll need to add your ruleset to <code class="language-plaintext highlighter-rouge">rulesets.properties</code>, so that it can be found.</p>
</li>
</ul>
<h2 id="debugging-with-rule-designer">Debugging with Rule Designer</h2>
<p>When implementing your grammar it may be very useful to see how PMD parses your example files.
This can be achieved with Rule Designer:</p>
<ul>
<li>Override the <code class="language-plaintext highlighter-rouge">getXPathNodeName</code> method in your AST nodes so that Designer shows node names.</li>
<li>Make sure to override both <code class="language-plaintext highlighter-rouge">jjtOpen</code> and <code class="language-plaintext highlighter-rouge">jjtClose</code> in your AST node base class so that they set both the start and the end line and column, which is needed to highlight node bounds properly.</li>
<li><em>Not strictly required but trivial and useful:</em> implement syntax highlighting for Rule Designer:
<ul>
<li>Fork and clone the <a href="https://github.com/pmd/pmd-designer">pmd/pmd-designer</a> repository.</li>
<li>Add a syntax highlighter implementation to <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.util.fxdesigner.util.codearea.syntaxhighlighting</code> (you could use Java as an example).</li>
<li>Register it in the <code class="language-plaintext highlighter-rouge">AvailableSyntaxHighlighters</code> enumeration.</li>
<li>Now build your implementation and place the <code class="language-plaintext highlighter-rouge">target/pmd-ui-&lt;version&gt;-SNAPSHOT.jar</code> in the <code class="language-plaintext highlighter-rouge">lib</code> directory inside your <code class="language-plaintext highlighter-rouge">pmd-bin-...</code> distribution (you have to delete the old <code class="language-plaintext highlighter-rouge">pmd-ui-*.jar</code> from there).</li>