
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="How to add a new language to PMD using JavaCC grammar.">
<meta name="keywords" content="devdocsextending, ">
<title>Adding PMD support for a new JavaCC grammar based language | PMD Source Code Analyzer</title>
<link rel="stylesheet" type="text/css" href="assets/fontawesome-free-5.15.4-web/css/all.min.css">
<link rel="stylesheet" type="text/css" href="assets/bootstrap-4.5.2-dist/css/bootstrap.min.css">
<link rel="stylesheet" type="text/css" href="css/syntax.css">
<link rel="stylesheet" type="text/css" href="css/modern-business.css">
<link rel="stylesheet" type="text/css" href="css/customstyles.css">
<link rel="stylesheet" type="text/css" href="css/theme-green.css">
<link rel="stylesheet" type="text/css" href="css/pmd-customstyles.css">
<link rel="shortcut icon" href="images/logo/favicon.ico" type="image/x-icon">
<link rel="icon" href="images/logo/favicon.ico" type="image/x-icon">
<link rel="alternate" type="application/rss+xml" title="" href="feed.xml">
</head>
<body>
<!-- Content is offset by the height of the topnav bar. -->
<!-- There's already a padding-top rule in modern-business.css, but it apparently doesn't work on Firefox 60 and Chrome 67 -->
<div id="topbar-content-offset">
<!-- Navigation -->
<nav class="navbar navbar-expand-lg fixed-top navbar-dark">
<div class="container topnavlinks">
<a class="navbar-brand fas fa-home fa-lg" href="index.html">&nbsp;<span class="projectTitle"> PMD Source Code Analyzer Project</span></a>
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarSupportedContent">
<ul class="navbar-nav mr-auto mt-2 mt-lg-0"></ul>
<ul class="navbar-nav">
<!-- toggle sidebar button -->
<li class="nav-item"><a id="tg-sb-link" class="nav-link" href="#"><i id="tg-sb-icon" class="fas fa-toggle-on"></i> Nav</a></li>
<!-- entries without drop-downs appear here -->
<li class="nav-item"><a class="nav-link" href="https://github.com/pmd/pmd/releases/latest" target="_blank">Download</a></li>
<li class="nav-item"><a class="nav-link" href="https://github.com/pmd/pmd" target="_blank">Fork us on github</a></li>
<!-- entries with drop-downs appear here -->
<!-- conditional logic to control which topnav appears for the audience defined in the configuration file.-->
</ul>
<form class="form-inline my-2 my-lg-0">
<input class="form-control mr-sm-2" type="search" placeholder="search..." id="search-input">
<ul id="results-container"></ul>
</form>
</div>
</div>
</nav>
<!-- Page Content -->
<div class="container-toc-wrapper">
<div class="container">
<div class="col-lg-12">&nbsp;</div>
<!-- Content Row -->
<div class="row">
<!-- Sidebar Column -->
<div class="col-md-3" id="tg-sb-sidebar">
<ul id="mysidebar" class="nav">
<li class="sidebarTitle">PMD 7.3.0-SNAPSHOT</li>
<div class="sidebarTitleDate">Release date: 28-June-2024</div>
<li>
<a href="#">About</a>
<ul>
<li><a href="index.html">Home</a></li>
<li><a href="pmd_release_notes.html">Release notes</a></li>
<li><a href="pmd_release_notes_pmd7.html">Release notes (PMD 7)</a></li>
<li><a href="pmd_about_help.html">Getting help</a></li>
<li><a href="pmd_about_release_policies.html">Release policies</a></li>
<li><a href="pmd_about_support_lifecycle.html">Support lifecycle</a></li>
</ul>
</li>
<li>
<a href="#">User Documentation</a>
<ul>
<li><a href="pmd_userdocs_migrating_to_pmd7.html">Migration Guide for PMD 7</a></li>
<li><a href="pmd_userdocs_installation.html">Installation and basic CLI usage</a></li>
<li><a href="pmd_userdocs_making_rulesets.html">Making rulesets</a></li>
<li><a href="pmd_userdocs_configuring_rules.html">Configuring rules</a></li>
<li><a href="pmd_userdocs_best_practices.html">Best practices</a></li>
<li><a href="pmd_userdocs_suppressing_warnings.html">Suppressing warnings</a></li>
<li><a href="pmd_userdocs_incremental_analysis.html">Incremental analysis</a></li>
<li><a href="pmd_userdocs_cli_reference.html">PMD CLI reference</a></li>
<li><a href="pmd_userdocs_report_formats.html">PMD Report formats</a></li>
<li><a href="pmd_userdocs_3rdpartyrulesets.html">3rd party rulesets</a></li>
<li class="subfolders">
<a href="#">CPD reference</a>
<ul>
<li><a href="pmd_userdocs_cpd.html">Copy-paste detection</a></li>
<li><a href="pmd_userdocs_cpd_report_formats.html">CPD Report formats</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Extending PMD</a>
<ul>
<li><a href="pmd_userdocs_extending_writing_rules_intro.html">Introduction to writing rules</a></li>
<li><a href="pmd_userdocs_extending_your_first_rule.html">Your first rule</a></li>
<li><a href="pmd_userdocs_extending_writing_xpath_rules.html">XPath rules</a></li>
<li><a href="pmd_userdocs_extending_writing_java_rules.html">Java rules</a></li>
<li><a href="pmd_userdocs_extending_designer_reference.html">Rule designer reference</a></li>
<li><a href="pmd_userdocs_extending_defining_properties.html">Defining rule properties</a></li>
<li><a href="pmd_userdocs_extending_rule_guidelines.html">Rule guidelines</a></li>
<li><a href="pmd_userdocs_extending_testing.html">Testing your rules</a></li>
<li><a href="pmd_userdocs_extending_ast_dump.html">Creating (XML) dump of the AST</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Tools / Integrations</a>
<ul>
<li><a href="pmd_userdocs_tools_maven.html">Maven PMD Plugin</a></li>
<li><a href="pmd_userdocs_tools_gradle.html">Gradle</a></li>
<li><a href="pmd_userdocs_tools_ant.html">Ant</a></li>
<li><a href="pmd_userdocs_tools_java_api.html">PMD Java API</a></li>
<li><a href="pmd_userdocs_tools_bld.html">bld PMD Extension</a></li>
<li><a href="pmd_userdocs_tools_ci.html">CI integrations</a></li>
<li><a href="pmd_userdocs_tools.html">Other Tools / Integrations</a></li>
</ul>
</li>
</ul>
</li>
<li>
<a href="#">Rule Reference</a>
<ul>
<li class="subfolders">
<a href="#">Apex Rules</a>
<ul>
<li><a href="pmd_rules_apex.html">Index</a></li>
<li><a href="pmd_rules_apex_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_apex_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_apex_design.html">Design</a></li>
<li><a href="pmd_rules_apex_documentation.html">Documentation</a></li>
<li><a href="pmd_rules_apex_errorprone.html">Error Prone</a></li>
<li><a href="pmd_rules_apex_performance.html">Performance</a></li>
<li><a href="pmd_rules_apex_security.html">Security</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">HTML Rules</a>
<ul>
<li><a href="pmd_rules_html.html">Index</a></li>
<li><a href="pmd_rules_html_bestpractices.html">Best Practices</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Java Rules</a>
<ul>
<li><a href="pmd_rules_java.html">Index</a></li>
<li><a href="pmd_rules_java_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_java_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_java_design.html">Design</a></li>
<li><a href="pmd_rules_java_documentation.html">Documentation</a></li>
<li><a href="pmd_rules_java_errorprone.html">Error Prone</a></li>
<li><a href="pmd_rules_java_multithreading.html">Multithreading</a></li>
<li><a href="pmd_rules_java_performance.html">Performance</a></li>
<li><a href="pmd_rules_java_security.html">Security</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Java Server Pages Rules</a>
<ul>
<li><a href="pmd_rules_jsp.html">Index</a></li>
<li><a href="pmd_rules_jsp_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_jsp_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_jsp_design.html">Design</a></li>
<li><a href="pmd_rules_jsp_errorprone.html">Error Prone</a></li>
<li><a href="pmd_rules_jsp_security.html">Security</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">JavaScript Rules</a>
<ul>
<li><a href="pmd_rules_ecmascript.html">Index</a></li>
<li><a href="pmd_rules_ecmascript_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_ecmascript_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_ecmascript_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Kotlin Rules</a>
<ul>
<li><a href="pmd_rules_kotlin.html">Index</a></li>
<li><a href="pmd_rules_kotlin_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_kotlin_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Maven POM Rules</a>
<ul>
<li><a href="pmd_rules_pom.html">Index</a></li>
<li><a href="pmd_rules_pom_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Modelica Rules</a>
<ul>
<li><a href="pmd_rules_modelica.html">Index</a></li>
<li><a href="pmd_rules_modelica_bestpractices.html">Best Practices</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">PLSQL Rules</a>
<ul>
<li><a href="pmd_rules_plsql.html">Index</a></li>
<li><a href="pmd_rules_plsql_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_plsql_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_plsql_design.html">Design</a></li>
<li><a href="pmd_rules_plsql_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Salesforce Visualforce Rules</a>
<ul>
<li><a href="pmd_rules_visualforce.html">Index</a></li>
<li><a href="pmd_rules_visualforce_security.html">Security</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Scala Rules</a>
<ul>
<li><a href="pmd_rules_scala.html">Index</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Swift Rules</a>
<ul>
<li><a href="pmd_rules_swift.html">Index</a></li>
<li><a href="pmd_rules_swift_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_swift_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Velocity Template Language (VTL) Rules</a>
<ul>
<li><a href="pmd_rules_velocity.html">Index</a></li>
<li><a href="pmd_rules_velocity_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_velocity_design.html">Design</a></li>
<li><a href="pmd_rules_velocity_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">WSDL Rules</a>
<ul>
<li><a href="pmd_rules_wsdl.html">Index</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">XML Rules</a>
<ul>
<li><a href="pmd_rules_xml.html">Index</a></li>
<li><a href="pmd_rules_xml_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_xml_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">XSL Rules</a>
<ul>
<li><a href="pmd_rules_xsl.html">Index</a></li>
<li><a href="pmd_rules_xsl_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_xsl_performance.html">Performance</a></li>
</ul>
</li>
</ul>
</li>
<li>
<a href="#">Language-Specific Documentation</a>
<ul>
<li><a href="pmd_languages_index.html">Overview</a></li>
<li><a href="pmd_languages_configuration.html">Language configuration</a></li>
<li><a href="pmd_languages_apex.html">Apex</a></li>
<li><a href="pmd_languages_cpp.html">C/C++</a></li>
<li><a href="pmd_languages_cs.html">C#</a></li>
<li><a href="pmd_languages_coco.html">Coco</a></li>
<li><a href="pmd_languages_dart.html">Dart</a></li>
<li><a href="pmd_languages_fortran.html">Fortran</a></li>
<li><a href="pmd_languages_gherkin.html">Gherkin</a></li>
<li><a href="pmd_languages_go.html">Go</a></li>
<li><a href="pmd_languages_html.html">HTML</a></li>
<li><a href="pmd_languages_java.html">Java</a></li>
<li><a href="pmd_languages_js_ts.html">JavaScript / TypeScript</a></li>
<li><a href="pmd_languages_jsp.html">JSP</a></li>
<li><a href="pmd_languages_julia.html">Julia</a></li>
<li><a href="pmd_languages_kotlin.html">Kotlin</a></li>
<li><a href="pmd_languages_lua.html">Lua</a></li>
<li><a href="pmd_languages_matlab.html">Matlab</a></li>
<li><a href="pmd_languages_modelica.html">Modelica</a></li>
<li><a href="pmd_languages_objectivec.html">Objective-C</a></li>
<li><a href="pmd_languages_perl.html">Perl</a></li>
<li><a href="pmd_languages_php.html">PHP</a></li>
<li><a href="pmd_languages_plsql.html">PLSQL</a></li>
<li><a href="pmd_languages_python.html">Python</a></li>
<li><a href="pmd_languages_ruby.html">Ruby</a></li>
<li><a href="pmd_languages_scala.html">Scala</a></li>
<li><a href="pmd_languages_swift.html">Swift</a></li>
<li><a href="pmd_languages_tsql.html">T-SQL</a></li>
<li><a href="pmd_languages_visualforce.html">Visualforce</a></li>
<li><a href="pmd_languages_velocity.html">Velocity Template Language (VTL)</a></li>
<li><a href="pmd_languages_xml.html">XML and XML dialects</a></li>
</ul>
</li>
<li>
<a href="#">Developer Documentation</a>
<ul>
<li><a href="pmd_devdocs_development.html">Developer resources</a></li>
<li><a href="pmd_devdocs_building.html">Building PMD from source</a></li>
<li><a href="https://github.com/pmd/pmd/blob/master/CONTRIBUTING.md" target="_blank">Contributing</a></li>
<li><a href="pmd_devdocs_writing_documentation.html">Writing documentation</a></li>
<li><a href="pmd_devdocs_roadmap.html">Roadmap</a></li>
<li><a href="pmd_devdocs_how_pmd_works.html">How PMD works</a></li>
<li><a href="pmd_devdocs_pmdtester.html">Pmdtester</a></li>
<li><a href="pmd_devdocs_rule_deprecation_policy.html">Rule Deprecation Policy</a></li>
<li class="subfolders">
<a href="#">Major contributions</a>
<ul>
<li><a href="pmd_devdocs_major_rule_guidelines.html">Rule Guidelines</a></li>
<li class="active"><a href="pmd_devdocs_major_adding_new_language_javacc.html">Adding a new language (JavaCC)</a></li>
<li><a href="pmd_devdocs_major_adding_new_language_antlr.html">Adding a new language (ANTLR)</a></li>
<li><a href="pmd_devdocs_major_adding_new_cpd_language.html">Adding a new CPD language</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Experimental features</a>
<ul>
<li><a href="tag_experimental.html">List of experimental Features</a></li>
</ul>
</li>
</ul>
</li>
<li>
<a href="#">Project documentation</a>
<ul>
<li class="subfolders">
<a href="#">Trivia about PMD</a>
<ul>
<li><a href="pmd_projectdocs_trivia_news.html">PMD in the press</a></li>
<li><a href="pmd_projectdocs_trivia_products.html">Products & books related to PMD</a></li>
<li><a href="pmd_projectdocs_trivia_similarprojects.html">Similar projects</a></li>
<li><a href="pmd_projectdocs_trivia_meaning.html">What does 'PMD' mean?</a></li>
</ul>
</li>
<li><a href="pmd_projectdocs_logo.html">Logo</a></li>
<li><a href="pmd_projectdocs_faq.html">FAQ</a></li>
<li><a href="license.html">License</a></li>
<li><a href="pmd_projectdocs_credits.html">Credits</a></li>
<li><a href="pmd_release_notes_old.html">Old release notes</a></li>
<li><a href="pmd_projectdocs_decisions.html">Decisions</a></li>
<li class="subfolders">
<a href="#">Project management</a>
<ul>
<li><a href="pmd_projectdocs_committers_infrastructure.html">Infrastructure</a></li>
<li><a href="pmd_projectdocs_committers_releasing.html">Release process</a></li>
<li><a href="pmd_projectdocs_committers_merging_pull_requests.html">Merging pull requests</a></li>
<li><a href="pmd_projectdocs_committers_main_landing_page.html">Main Landing page</a></li>
</ul>
</li>
</ul>
</li>
</ul>
</div>
<!-- Content Column -->
<div class="col-md-9" id="tg-sb-content">
<header>
<div class="row">
<div class="col-lg-12">
<a href="./" role="button"
><i class="fa fa-home fa-lg"></i
></a>
» Adding PMD support for a new JavaCC grammar based language
<a
target="_blank"
href="https://github.com/pmd/pmd/blob/master/docs/pages/pmd/devdocs/major_contributions/adding_a_new_javacc_based_language.md"
class="float-right"
role="button"
><i class="fab fa-github fa-lg"></i> Edit on GitHub</a
>
</div>
</div>
<hr />
</header>
<div class="post-header">
<h1 class="post-title-main">Adding PMD support for a new JavaCC grammar based language</h1>
</div>
<div class="post-content" data-github-edit-url="https://github.com/pmd/pmd/blob/master/docs/pages/pmd/devdocs/major_contributions/adding_a_new_javacc_based_language.md">
<div class="summary">How to add a new language to PMD using JavaCC grammar.</div>
<details id="inline-toc-details">
<summary>Table of Contents</summary>
<div id="inline-toc"><!-- empty, move TOC here when screen size too small --></div>
</details>
<div class="bs-callout bs-callout-warning">
<strong>Before you start…</strong><br /><br />
This is a big contribution and can't be done as a drive-by contribution. It requires dedicated passion
and long-term commitment to implement support for a new language.<br /><br />
This step-by-step guide is just a small intro to get the basics started, and it's also not necessarily up-to-date
or complete. You have to be able to fill in the blanks.<br /><br />
After the basic support for a language is there, many features are still missing. Typical features
that can greatly improve rule writing are: symbol table, type resolution, call/data flow analysis.<br /><br />
A symbol table keeps track of variables and their usages. Type resolution tries to find the actual class type
of each used type, following along method calls (including overloaded and overridden methods), which allows
querying subtypes and the type hierarchy. This requires additional configuration of an auxiliary classpath.
Call and data flow analysis keep track of the data as it moves through the different execution paths
of a program.<br /><br />
These features are out of scope of this guide. Type resolution and data flow are features that
definitely don't come for free. Implementing them takes considerable effort and perseverance.<br /><br />
</div>
<h2 id="steps">Steps</h2>
<h3 id="1--start-with-a-new-sub-module">1. Start with a new sub-module</h3>
<ul>
<li>See pmd-java or pmd-vm for examples.</li>
<li>Make sure to add your new module to PMD's parent pom as a <code class="language-plaintext highlighter-rouge">&lt;module&gt;</code> entry, so that it is built alongside the
other languages.</li>
<li>Also add your new module to the dependencies list in “pmd-languages-deps/pom.xml”, so that the new language
is automatically available in the binary distribution (pmd-dist).</li>
</ul>
<h3 id="2--implement-an-ast-parser-for-your-language">2. Implement an AST parser for your language</h3>
<ul>
<li>Ideally an AST parser should be implemented as a JJT file <em>(see VmParser.jjt or Java.jjt for example)</em></li>
<li>There is nothing preventing any other parser implementation, as long as you have some way to convert an input
stream into an AST tree. Doing it as a JJT simplifies maintenance down the road.</li>
<li>See this link for reference: <a href="https://javacc.java.net/doc/JJTree.html">https://javacc.java.net/doc/JJTree.html</a></li>
</ul>
<h3 id="3--create-ast-node-classes">3. Create AST node classes</h3>
<ul>
<li>For each AST node that your parser can generate, there should be a class</li>
<li>The name of the AST class should be “AST” + “whatever is the name of the node in JJT file”.
<ul>
<li>For example, if JJT contains a node called “IfStatement”, there should be a class called “ASTIfStatement”</li>
</ul>
</li>
<li>Each AST class should have one package-private constructor, that takes an <code class="language-plaintext highlighter-rouge">int id</code>.</li>
<li>It's a good idea to create a parent AST class for all AST classes of the language. This simplifies rule
creation later. <em>(see SimpleNode for Velocity and AbstractJavaNode for Java for example)</em></li>
<li>Note: These AST node classes are usually generated once by javacc/jjtree and can then be modified as needed.</li>
<li>You can add additional methods in your AST node classes, that can be used in rules. Most getters
are also available for XPath rules, see section <a href="#xpath-integration">XPath integration</a> below.</li>
</ul>
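<p>The naming and constructor conventions from this step can be sketched as follows. This is only a minimal illustration:
the language name “Foo”, the parent class <code class="language-plaintext highlighter-rouge">AbstractFooNode</code> and the getter <code class="language-plaintext highlighter-rouge">hasElse</code> are hypothetical, and the
classes are simplified stand-ins rather than the base classes that PMD or JJTree actually provide.</p>

```java
// Simplified stand-ins; these are NOT PMD's or JJTree's real base classes.

// Hypothetical common parent for all AST classes of the example language "Foo".
abstract class AbstractFooNode {
    private final int id;

    // Package-private constructor that takes the node id.
    AbstractFooNode(int id) {
        this.id = id;
    }

    int getId() {
        return id;
    }

    // Recovers the grammar node name by dropping the "AST" prefix of the class name.
    String getNodeName() {
        return getClass().getSimpleName().substring("AST".length());
    }
}

// Class for a grammar node called "IfStatement": "AST" + "IfStatement".
final class ASTIfStatement extends AbstractFooNode {
    ASTIfStatement(int id) {
        super(id);
    }

    // Example of an additional getter that rules could use.
    boolean hasElse() {
        return false; // placeholder value for this sketch
    }
}
```

<p>Keeping shared behavior in one parent class means that rules only need to know a single base type for the language.</p>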
<h3 id="4--generate-your-parser-using-jjt">4. Generate your parser (using JJT)</h3>
<ul>
<li>An Ant script is used to compile JJT files into classes. It is defined in the <code class="language-plaintext highlighter-rouge">javacc-wrapper.xml</code> file in the
top-level PMD sources.</li>
<li>The Ant script is executed via the <code class="language-plaintext highlighter-rouge">maven-antrun-plugin</code>. Add this plugin to your <code class="language-plaintext highlighter-rouge">pom.xml</code> file and configure
it with the language name. You can use <code class="language-plaintext highlighter-rouge">pmd-java/pom.xml</code> as an example.</li>
<li>The Ant script is called in the phase <code class="language-plaintext highlighter-rouge">generate-sources</code> whenever the whole project is built. You can also
call <code class="language-plaintext highlighter-rouge">./mvnw generate-sources</code> directly for your module if you want your parser to be generated.</li>
</ul>
<h3 id="5--create-a-pmd-parser-adapter">5. Create a PMD parser “adapter”</h3>
<ul>
<li>Create a new class that extends <code class="language-plaintext highlighter-rouge">JjtreeParserAdapter</code>.</li>
<li>This is a generic class, and you need to declare the root AST node.</li>
<li>There are two important methods to implement:
<ul>
<li>The <code class="language-plaintext highlighter-rouge">tokenBehavior</code> method should return a new instance of <code class="language-plaintext highlighter-rouge">TokenDocumentBehavior</code> constructed with the list
of tokens in your language. The generation step (#4) will generate a class <code class="language-plaintext highlighter-rouge">$langTokenKinds</code> which has
all the available tokens in the field <code class="language-plaintext highlighter-rouge">TOKEN_NAMES</code>.</li>
<li>The <code class="language-plaintext highlighter-rouge">parseImpl</code> method should return the root node of the AST obtained by parsing the CharStream source.</li>
<li>See <code class="language-plaintext highlighter-rouge">VmParser</code> class as an example</li>
</ul>
</li>
</ul>
<h3 id="6--create-a-language-version-handler">6. Create a language version handler</h3>
<ul>
<li>Extend <code class="language-plaintext highlighter-rouge">AbstractPmdLanguageVersionHandler</code> <em>(see VmHandler for example)</em></li>
<li>This class is sort of a gateway between PMD and all parsing logic specific to your language.</li>
<li>For a minimal implementation, it just needs to return a parser <em>(see step #5)</em>.</li>
<li>It can be used to provide other features for your language like
<ul>
<li>violation suppression logic</li>
<li><a href="https://docs.pmd-code.org/apidocs/pmd-core/7.3.0-SNAPSHOT/net/sourceforge/pmd/reporting/ViolationDecorator.html#"><code>ViolationDecorator</code></a>s, which add language-specific information to the
created violations. The <a href="pmd_languages_java.html#violation-decorators">Java language module</a> uses this to
provide the name of the method or class where the violation occurred.</li>
<li>metrics (see below “Optional features”)</li>
<li>custom XPath functions</li>
</ul>
</li>
<li>See <code class="language-plaintext highlighter-rouge">VmHandler</code> class as an example</li>
</ul>
<h3 id="7--create-a-base-visitor">7. Create a base visitor</h3>
<ul>
<li>A parser visitor adapter is not needed anymore with PMD 7. The visitor interface now provides a default
implementation.</li>
<li>The visitor for a JavaCC-based AST is generated along with the parser from the grammar file. The
base interface for a visitor is <a href="https://github.com/pmd/pmd/blob/pmd/7.0.x/pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/AstVisitor.java"><code class="language-plaintext highlighter-rouge">AstVisitor</code></a>.</li>
<li>The generated visitor class for VM is called <code class="language-plaintext highlighter-rouge">VmVisitor</code>.</li>
<li>To make this visitor easier to use later on, a base visitor class should be created.
See <code class="language-plaintext highlighter-rouge">VmVisitorBase</code> as an example.</li>
</ul>
<h3 id="8-make-pmd-recognize-your-language">8. Make PMD recognize your language</h3>
<ul>
<li>Create your own subclass of <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.impl.SimpleLanguageModuleBase</code>. <em>(see VmLanguageModule or
JavaLanguageModule as an example)</em></li>
<li>For each version of your language, add a call to <code class="language-plaintext highlighter-rouge">addVersion</code> in your language module's constructor.
Use <code class="language-plaintext highlighter-rouge">addDefaultVersion</code> for defining the default version.</li>
<li>You'll need to refer to the language version handler created in step #6.</li>
<li>Create the service registration via the text file <code class="language-plaintext highlighter-rouge">src/main/resources/META-INF/services/net.sourceforge.pmd.lang.Language</code>.
Add your fully qualified class name as a single line into it.</li>
</ul>
<h3 id="9-add-ast-regression-tests">9. Add AST regression tests</h3>
<p>For languages that use an external library for parsing, the AST can easily change when the library is upgraded.
Such tests are also useful for languages whose grammar is under our control.</p>
<p>The tests parse one or more source files and generate a textual representation of the AST. This text is compared
against a previously recorded version. If there are differences, the test fails.</p>
<p>This helps to detect anything in the AST structure that changed, maybe unexpectedly.</p>
<ul>
<li>Create a test class in the package <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.$lang.ast</code> with the name <code class="language-plaintext highlighter-rouge">$langTreeDumpTest</code>.</li>
<li>This test class must extend <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.test.ast.BaseTreeDumpTest</code>. Note: This class
is written in Kotlin and is available in the module “lang-test”.</li>
<li>
<p>Add a default constructor that calls the super constructor like so:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="kd">public</span> <span class="n">$langTreeDumpTest</span><span class="o">()</span> <span class="o">{</span>
<span class="kd">super</span><span class="o">(</span><span class="nc">NodePrintersKt</span><span class="o">.</span><span class="na">getSimpleNodePrinter</span><span class="o">(),</span> <span class="s">".$extension"</span><span class="o">);</span>
<span class="o">}</span>
</code></pre></div> </div>
<p>Replace “$lang” and “$extension” accordingly.</p>
</li>
<li>Implement the method <code class="language-plaintext highlighter-rouge">getParser()</code>. It must return a
subclass of <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.test.ast.BaseParsingHelper</code>. See
<code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.ecmascript.ast.JsParsingHelper</code> for an example.
With this parsing helper, you can also specify where the test files are looked up by using
the method <code class="language-plaintext highlighter-rouge">withResourceContext(Class&lt;?&gt;, String)</code>.</li>
<li>
<p>Add one or more test methods. Each test method parses one file and compares the result. The base
class has a helper method <code class="language-plaintext highlighter-rouge">doTest(String)</code> that does all the work. This method just needs to be called:</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="nd">@Test</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">myFirstAstTest</span><span class="o">()</span> <span class="o">{</span>
<span class="n">doTest</span><span class="o">(</span><span class="s">"filename-without-extension"</span><span class="o">);</span>
<span class="o">}</span>
</code></pre></div> </div>
</li>
<li>On the first run, the test fails and a text file (with the extension <code class="language-plaintext highlighter-rouge">.txt</code>) is created that records the
current AST. On the next run, the text file is used for comparison and the test should pass. Don't forget
to commit the generated text file.</li>
</ul>
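<p>The record-then-compare behavior described above can be illustrated with a small plain-Java sketch. It is not how
<code class="language-plaintext highlighter-rouge">BaseTreeDumpTest</code> is implemented; the class <code class="language-plaintext highlighter-rouge">SnapshotCheck</code> and its method are hypothetical names
that only demonstrate the idea, assuming the textual AST dump is already available as a string.</p>

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Plain-Java illustration of the record-then-compare idea; not PMD's BaseTreeDumpTest.
final class SnapshotCheck {

    private SnapshotCheck() {
    }

    /**
     * Compares actualDump with the recorded snapshot file.
     * If no snapshot exists yet, it is written and the check reports failure,
     * so that the new file gets reviewed and committed.
     */
    static boolean matchesSnapshot(Path snapshotFile, String actualDump) throws IOException {
        if (Files.notExists(snapshotFile)) {
            Files.write(snapshotFile, actualDump.getBytes(StandardCharsets.UTF_8));
            return false; // first run: record only
        }
        String expected = new String(Files.readAllBytes(snapshotFile), StandardCharsets.UTF_8);
        return expected.equals(actualDump);
    }
}
```

<p>Failing on the first run is deliberate in this sketch: it forces a look at the recorded file before it becomes the reference.</p>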
<p>A complete example can be seen in the JavaScript module: <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.ecmascript.ast.JsTreeDumpTest</code>.
The test resources are in the subpackage “testdata”: <code class="language-plaintext highlighter-rouge">pmd-javascript/src/test/resources/net/sourceforge/pmd/lang/ecmascript/ast/testdata/</code>.</p>
<p>The Scala module also has a test, written in Kotlin instead of Java:
<code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.lang.scala.ast.ScalaParserTests</code>.</p>
<h3 id="10-create-an-abstract-rule-class-for-the-language">10. Create an abstract rule class for the language</h3>
<ul>
<li>Extend <code class="language-plaintext highlighter-rouge">AbstractRule</code> and implement the parser visitor interface for your language <em>(see AbstractVmRule for example)</em></li>
<li>All other rules for your language should extend this class. The purpose of this class is to implement visit
methods for all AST types that simply delegate to default behavior. This is useful because most rules care only
about specific AST nodes, but PMD needs to know what to do with each node, so this lets you use the default
behavior for nodes you don't care about.</li>
</ul>
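<p>The delegation idea can be sketched in plain Java as follows. All types here (<code class="language-plaintext highlighter-rouge">FooNode</code>, <code class="language-plaintext highlighter-rouge">FooVisitor</code>,
<code class="language-plaintext highlighter-rouge">CountWhileStatements</code> and the two node classes) are hypothetical stand-ins, not PMD's <code class="language-plaintext highlighter-rouge">AbstractRule</code> or a
generated visitor. The sketch only shows how default visit methods let a rule override just the node types it cares about.</p>

```java
// Simplified stand-ins for node classes and a generated visitor; not PMD's real API.

abstract class FooNode {
    private final FooNode[] children;

    FooNode(FooNode... children) {
        this.children = children;
    }

    FooNode[] getChildren() {
        return children;
    }

    // Double dispatch: each concrete node calls the matching visit method.
    abstract void accept(FooVisitor visitor);
}

final class ASTBlock extends FooNode {
    ASTBlock(FooNode... children) {
        super(children);
    }

    @Override
    void accept(FooVisitor visitor) {
        visitor.visit(this);
    }
}

final class ASTWhileStatement extends FooNode {
    ASTWhileStatement(FooNode... children) {
        super(children);
    }

    @Override
    void accept(FooVisitor visitor) {
        visitor.visit(this);
    }
}

// Every visit method has a default implementation that just walks the children.
interface FooVisitor {
    default void visitChildren(FooNode node) {
        for (FooNode child : node.getChildren()) {
            child.accept(this);
        }
    }

    default void visit(ASTBlock node) {
        visitChildren(node);
    }

    default void visit(ASTWhileStatement node) {
        visitChildren(node);
    }
}

// A "rule" overrides only the node type it is interested in.
final class CountWhileStatements implements FooVisitor {
    int count;

    @Override
    public void visit(ASTWhileStatement node) {
        count++;
        visitChildren(node); // keep descending to find nested statements
    }
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">accept</code> on the root node with a <code class="language-plaintext highlighter-rouge">CountWhileStatements</code> instance walks the whole tree, and
only the overridden method does rule-specific work.</p>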
<h3 id="11-create-rules">11. Create rules</h3>
<ul>
<li>Rules are created by extending the abstract rule class created in step 10 <em>(see <code class="language-plaintext highlighter-rouge">EmptyForeachStmtRule</code> for example)</em></li>
<li>Creating rules is already pretty well documented in PMD, and it's no different for a new language,
except you may have different AST nodes.</li>
</ul>
<h3 id="12-test-the-rules">12. Test the rules</h3>
<ul>
<li>Testing rules is described in depth in <a href="pmd_userdocs_extending_testing.html">Testing your rules</a>.
<ul>
<li>Each rule has its own test class: Create a test class for your rule extending <code class="language-plaintext highlighter-rouge">PmdRuleTst</code>
<em>(see AvoidReassigningParametersTest in pmd-vm for example)</em></li>
<li>Create a category rule set for your language <em>(see category/vm/bestpractices.xml for example)</em></li>
<li>Place the test XML file with the test cases in the correct location</li>
<li>When executing the test class
<ul>
<li>this triggers the unit test to read the corresponding XML file with the rule test data
<em>(see <code class="language-plaintext highlighter-rouge">AvoidReassigningParameters.xml</code> for example)</em></li>
<li>This test XML file contains sample pieces of code which should trigger a specified number of
violations of this rule. The unit test will execute the rule on this piece of code, and verify
that the number of violations matches.</li>
</ul>
</li>
</ul>
</li>
<li>
<p>To verify the validity of the created ruleset, create a subclass of <code class="language-plaintext highlighter-rouge">AbstractRuleSetFactoryTest</code>
<em>(see <code class="language-plaintext highlighter-rouge">RuleSetFactoryTest</code> in pmd-vm for example)</em>.
This will load all rulesets and verify that all required attributes are provided.</p>
<p><em>Note:</em> You'll need to add your category ruleset to <code class="language-plaintext highlighter-rouge">categories.properties</code>, so that it can be found.</p>
</li>
</ul>
<h3 id="13-create-documentation-page">13. Create documentation page</h3>
<p>Finish up your new language module by adding a page in the documentation. Create a new markdown file
<code class="language-plaintext highlighter-rouge">&lt;langId&gt;.md</code> in <code class="language-plaintext highlighter-rouge">docs/pages/pmd/languages/</code>. This file should have the following frontmatter:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>---
title: &lt;Language Name&gt;
permalink: pmd_languages_&lt;langId&gt;.html
last_updated: &lt;Month&gt; &lt;Year&gt; (&lt;PMD Version&gt;)
tags: [languages, PmdCapableLanguage, CpdCapableLanguage]
---
</code></pre></div></div>
<p>On this page, language specifics can be documented, e.g. when the language was first supported by PMD.
There is also the following Jekyll include, which creates a summary box for the language:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
{% include language_info.html name='&lt;Language Name&gt;' id='&lt;langId&gt;' implementation='&lt;langId&gt;::lang.&lt;langId&gt;.&lt;langId&gt;LanguageModule' supports_cpd=true supports_pmd=true %}
</code></pre></div></div>
<h2 id="xpath-integration">XPath integration</h2>
<p>PMD exposes the AST nodes for use by XPath based rules (see <a href="pmd_userdocs_extending_writing_xpath_rules.html#dom-representation-of-asts">DOM representation of ASTs</a>).
Most Java getters in the AST classes are made available by default. These getters constitute the API of the language.
If a getter method is renamed, then every XPath rule that uses this getter also needs to be adjusted. In order to
have more control over this, there are two annotations that can be used for AST classes and their methods:</p>
<ul>
<li>
<p><a href="https://docs.pmd-code.org/apidocs/pmd-core/7.3.0-SNAPSHOT/net/sourceforge/pmd/lang/rule/xpath/DeprecatedAttribute.html#"><code>DeprecatedAttribute</code></a>: Getters can be annotated with this to indicate that
the getter method should not be used in XPath rules. When an XPath rule uses such a method, a warning is
issued. If the method additionally has the standard Java <code class="language-plaintext highlighter-rouge">@Deprecated</code> annotation, then the getter is also
deprecated for Java usage. Otherwise, the getter is only deprecated for usage in XPath rules.</p>
<p>When a getter is deprecated and there is a different getter to be used instead, then the
attribute <code class="language-plaintext highlighter-rouge">replaceWith</code> should be used.</p>
</li>
<li>
<p><a href="https://docs.pmd-code.org/apidocs/pmd-core/7.3.0-SNAPSHOT/net/sourceforge/pmd/lang/rule/xpath/NoAttribute.html#"><code>NoAttribute</code></a>: This annotation can be used on an AST node type or on individual
methods in order to filter out which methods are available for XPath rules.
When used on a type, either all methods can be filtered or only inherited methods (see attribute <code class="language-plaintext highlighter-rouge">scope</code>).
When used directly on an individual method, then only this method will be filtered out.
That way, methods that should only be used in Java rules, e.g. auxiliary methods, can be added to AST nodes.</p>
</li>
</ul>
<div class="alert alert-info" role="alert"><i class="fas fa-info-circle"></i> <b>Note:</b>
Not all getters are available for XPath rules. Availability depends on the result type.
In particular, <strong>Lists</strong> and Collections in general are <strong>not supported</strong>.</div>
<p>Only the following Java result types are supported:</p>
<ul>
<li>String</li>
<li>any Enum-type</li>
<li>int</li>
<li>boolean</li>
<li>double</li>
<li>long</li>
<li>char</li>
<li>float</li>
</ul>
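<p>To make the type restriction concrete, the following is a minimal sketch that filters the getters of a toy node class by the result types listed above. It is an illustrative model only and not PMD's actual attribute discovery code: <code class="language-plaintext highlighter-rouge">XPathAttributeSketch</code>, <code class="language-plaintext highlighter-rouge">ToyNode</code> and their methods are hypothetical names, and treating <code class="language-plaintext highlighter-rouge">get</code>/<code class="language-plaintext highlighter-rouge">is</code> prefixes as the getter convention is an assumption of this sketch.</p>

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative model only: this is NOT PMD's actual attribute discovery code.
final class XPathAttributeSketch {

    // Hypothetical toy AST node with a mix of supported and unsupported getters.
    static final class ToyNode {
        public String getImage() { return "foo"; }
        public int getBeginLine() { return 1; }
        public boolean isFinal() { return false; }
        public List<String> getModifiers() { return new ArrayList<>(); } // collection: filtered out
        public Object getUserData() { return null; }                      // arbitrary object: filtered out
    }

    // Mirrors the list of supported result types given above.
    static boolean isSupportedType(Class<?> type) {
        return type == String.class
            || type.isEnum()
            || type == int.class
            || type == boolean.class
            || type == double.class
            || type == long.class
            || type == char.class
            || type == float.class;
    }

    // Collects public, no-argument getters declared on the class whose result type is supported.
    static List<String> exposedGetters(Class<?> nodeClass) {
        List<String> names = new ArrayList<>();
        for (Method m : nodeClass.getDeclaredMethods()) {
            boolean looksLikeGetter = m.getName().startsWith("get") || m.getName().startsWith("is");
            if (Modifier.isPublic(m.getModifiers())
                    && m.getParameterCount() == 0
                    && looksLikeGetter
                    && isSupportedType(m.getReturnType())) {
                names.add(m.getName());
            }
        }
        Collections.sort(names); // getDeclaredMethods() has no guaranteed order
        return names;
    }

    public static void main(String[] args) {
        // prints [getBeginLine, getImage, isFinal]
        System.out.println(exposedGetters(ToyNode.class));
    }
}
```

<p>In this toy model, <code class="language-plaintext highlighter-rouge">getModifiers()</code> and <code class="language-plaintext highlighter-rouge">getUserData()</code> are dropped because of their result types, which is the practical consequence to keep in mind when designing getters for AST nodes.</p>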
<h2 id="debugging-with-rule-designer">Debugging with Rule Designer</h2>
<p>When implementing your grammar, it can be very useful to see how PMD parses your example files.
This can be achieved with the Rule Designer:</p>
<ul>
<li>Override <code class="language-plaintext highlighter-rouge">getXPathNodeName</code> in your AST nodes so that the Designer shows node names.</li>
<li>Make sure to override both <code class="language-plaintext highlighter-rouge">jjtOpen</code> and <code class="language-plaintext highlighter-rouge">jjtClose</code> in your AST node base class so that they set both start and end line and column for proper node bound highlighting.</li>
<li><em>Not strictly required but trivial and useful:</em> implement syntax highlighting for Rule Designer:
<ul>
<li>Fork and clone the <a href="https://github.com/pmd/pmd-designer">pmd/pmd-designer</a> repository.</li>
<li>Add a syntax highlighter implementation to <code class="language-plaintext highlighter-rouge">net.sourceforge.pmd.util.fxdesigner.util.codearea.syntaxhighlighting</code> (you could use Java as an example).</li>
<li>Register it in the <code class="language-plaintext highlighter-rouge">AvailableSyntaxHighlighters</code> enumeration.</li>
<li>Now build your implementation and place the <code class="language-plaintext highlighter-rouge">target/pmd-designer-&lt;version&gt;-SNAPSHOT.jar</code> into the <code class="language-plaintext highlighter-rouge">lib</code> directory inside your <code class="language-plaintext highlighter-rouge">pmd-bin-...</code> distribution (you have to delete the old <code class="language-plaintext highlighter-rouge">pmd-designer-*.jar</code> from there).</li>
</ul>
</li>
</ul>
<h2 id="optional-features">Optional features</h2>
<h3 id="metrics">Metrics</h3>
<p>If you want to add support for computing metrics:</p>
<ul>
<li>Create a package <code class="language-plaintext highlighter-rouge">lang.&lt;langname&gt;.metrics</code></li>
<li>Create a utility class <code class="language-plaintext highlighter-rouge">&lt;langname&gt;Metrics</code></li>
<li>Implement new metrics and add them as static constants. Be sure to document them.</li>
<li>Implement <a href="https://docs.pmd-code.org/apidocs/pmd-core/7.3.0-SNAPSHOT/net/sourceforge/pmd/lang/LanguageVersionHandler.html#getLanguageMetricsProvider()"><code>getLanguageMetricsProvider</code></a> to make the metrics available in the designer.</li>
</ul>
<p>See <a href="https://docs.pmd-code.org/apidocs/pmd-java/7.3.0-SNAPSHOT/net/sourceforge/pmd/lang/java/metrics/JavaMetrics.html#"><code>JavaMetrics</code></a> for an example.</p>
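<p>The "utility class with static constants" layout described above can be sketched as follows. This is a self-contained illustration that deliberately does not use PMD's metrics API: <code class="language-plaintext highlighter-rouge">ToyLangMetricsSketch</code>, <code class="language-plaintext highlighter-rouge">ToyNode</code> and <code class="language-plaintext highlighter-rouge">ToyMetric</code> are hypothetical stand-ins, and the two metrics are made up for the example. A real language module would build on the PMD types used by <code class="language-plaintext highlighter-rouge">JavaMetrics</code> instead.</p>

```java
import java.util.function.ToIntFunction;

// Illustrative sketch only: ToyNode and ToyMetric are hypothetical stand-ins, not PMD types.
final class ToyLangMetricsSketch {

    // Hypothetical AST node: a kind label plus child nodes.
    static final class ToyNode {
        final String kind;
        final ToyNode[] children;

        ToyNode(String kind, ToyNode... children) {
            this.kind = kind;
            this.children = children;
        }
    }

    // A named computation over a node.
    static final class ToyMetric {
        final String displayName;
        private final ToIntFunction<ToyNode> computer;

        ToyMetric(String displayName, ToIntFunction<ToyNode> computer) {
            this.displayName = displayName;
            this.computer = computer;
        }

        int computeFor(ToyNode node) {
            return computer.applyAsInt(node);
        }
    }

    /** Number of nodes in the subtree, including the root. */
    static final ToyMetric NODE_COUNT =
        new ToyMetric("Node count", ToyLangMetricsSketch::countNodes);

    /** One plus the number of "if" and "while" nodes in the subtree. */
    static final ToyMetric BRANCH_COMPLEXITY =
        new ToyMetric("Branch complexity", node -> 1 + countBranches(node));

    private ToyLangMetricsSketch() {
        // utility class, not meant to be instantiated
    }

    private static int countNodes(ToyNode node) {
        int count = 1;
        for (ToyNode child : node.children) {
            count += countNodes(child);
        }
        return count;
    }

    private static int countBranches(ToyNode node) {
        int count = node.kind.equals("if") || node.kind.equals("while") ? 1 : 0;
        for (ToyNode child : node.children) {
            count += countBranches(child);
        }
        return count;
    }

    // Small hand-built tree: a method containing an "if" (with one statement), a "while" and a statement.
    static ToyNode sampleMethod() {
        return new ToyNode("method",
            new ToyNode("if", new ToyNode("statement")),
            new ToyNode("while"),
            new ToyNode("statement"));
    }

    public static void main(String[] args) {
        ToyNode method = sampleMethod();
        System.out.println(NODE_COUNT.displayName + ": " + NODE_COUNT.computeFor(method));
        System.out.println(BRANCH_COMPLEXITY.displayName + ": " + BRANCH_COMPLEXITY.computeFor(method));
    }
}
```

<p>Keeping each metric as a documented constant in one class gives rule authors a single place to discover what can be computed for the language.</p>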
<h3 id="symbol-table">Symbol table</h3>
<p>A symbol table keeps track of variables and their usages. It is part of semantic analysis and would
be executed in your parser adapter as an additional pass after you have obtained the initial AST.</p>
<p>There is no general language-independent API in PMD core. For now, each language will need to implement
its own solution. The symbol information that has been resolved in the additional parser pass
can be made available on the AST nodes via extra methods, e.g. <code class="language-plaintext highlighter-rouge">getSymbolTable()</code>, <code class="language-plaintext highlighter-rouge">getSymbol()</code>, or
<code class="language-plaintext highlighter-rouge">getUsages()</code>.</p>
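<p>As a rough illustration of such an additional pass, the sketch below walks a toy AST once, records declarations and usages, and exposes them through a small lookup class. Everything in it is hypothetical (<code class="language-plaintext highlighter-rouge">SymbolTableSketch</code>, <code class="language-plaintext highlighter-rouge">ToyNode</code>, <code class="language-plaintext highlighter-rouge">Kind</code>, <code class="language-plaintext highlighter-rouge">SymbolTable</code>), and it ignores scoping entirely, which a real implementation would have to handle.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative toy model: none of these types exist in PMD, and scopes are ignored.
final class SymbolTableSketch {

    enum Kind { DECLARATION, REFERENCE, OTHER }

    // Hypothetical AST node; a real node would come from the JavaCC-generated parser.
    static final class ToyNode {
        final Kind kind;
        final String name;
        final int line;
        final List<ToyNode> children = new ArrayList<>();

        ToyNode(Kind kind, String name, int line) {
            this.kind = kind;
            this.name = name;
            this.line = line;
        }

        ToyNode add(ToyNode child) {
            children.add(child);
            return this;
        }
    }

    // Maps each declared variable name to the lines where it is used.
    static final class SymbolTable {
        private final Map<String, List<Integer>> usagesByName = new LinkedHashMap<>();

        void declare(String name) {
            usagesByName.putIfAbsent(name, new ArrayList<>());
        }

        void recordUsage(String name, int line) {
            List<Integer> usages = usagesByName.get(name);
            if (usages != null) {
                usages.add(line); // references to undeclared names are silently ignored here
            }
        }

        List<Integer> getUsages(String name) {
            return usagesByName.getOrDefault(name, Collections.emptyList());
        }

        List<String> unusedSymbols() {
            List<String> unused = new ArrayList<>();
            for (Map.Entry<String, List<Integer>> entry : usagesByName.entrySet()) {
                if (entry.getValue().isEmpty()) {
                    unused.add(entry.getKey());
                }
            }
            return unused;
        }
    }

    // The additional pass: a depth-first walk over the finished AST.
    static SymbolTable resolve(ToyNode root) {
        SymbolTable table = new SymbolTable();
        visit(root, table);
        return table;
    }

    private static void visit(ToyNode node, SymbolTable table) {
        if (node.kind == Kind.DECLARATION) {
            table.declare(node.name);
        } else if (node.kind == Kind.REFERENCE) {
            table.recordUsage(node.name, node.line);
        }
        for (ToyNode child : node.children) {
            visit(child, table);
        }
    }

    // Hand-built tree: "a" is declared and used on lines 3 and 5, "b" is declared but never used.
    static ToyNode sampleTree() {
        return new ToyNode(Kind.OTHER, "root", 1)
            .add(new ToyNode(Kind.DECLARATION, "a", 1))
            .add(new ToyNode(Kind.DECLARATION, "b", 2))
            .add(new ToyNode(Kind.REFERENCE, "a", 3))
            .add(new ToyNode(Kind.REFERENCE, "a", 5));
    }

    public static void main(String[] args) {
        SymbolTable table = resolve(sampleTree());
        System.out.println(table.getUsages("a"));   // prints [3, 5]
        System.out.println(table.unusedSymbols());  // prints [b]
    }
}
```

<p>A rule could then query the resolved information, for example to report declared but unused variables, without walking the tree again.</p>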
<p>Currently only Java provides a symbol table implementation,
see <a href="pmd_languages_java.html">Java-specific features and guidance</a>.</p>
<div class="alert alert-info" role="alert"><i class="fas fa-info-circle"></i> <b>Note:</b>
With PMD 7.0.0, the symbol table and type resolution implementation has been
rewritten from scratch. There is still an old API for symbol table support that is used by PLSQL,
see <a href="https://docs.pmd-code.org/apidocs/pmd-core/7.3.0-SNAPSHOT/net/sourceforge/pmd/lang/symboltable/package-summary.html#"><code>net.sourceforge.pmd.lang.symboltable</code></a>. This will be deprecated and should not be used.
</div>
<h3 id="type-resolution">Type resolution</h3>
<p>For typed languages like Java, type information can be useful for writing rules that trigger only on
specific types. Resolving the types of expressions and variables would be done in your parser
adapter as yet another additional pass, potentially after resolving the symbol table.</p>
<p>Type resolution tries to find the actual class type of each used type, following along method calls
(including overloaded and overridden methods), which makes it possible to query subtypes and the type hierarchy.
This might require additional configuration for the language, e.g. in Java you need
to configure an auxiliary classpath.</p>
<p>There is no general language-independent API in PMD core. For now, each language will need to implement
its own solution. The type information can be made available on the AST nodes via extra methods,
e.g. <code class="language-plaintext highlighter-rouge">getType()</code>.</p>
<p>Currently only Java provides an implementation for type resolution,
see <a href="pmd_languages_java.html">Java-specific features and guidance</a>.</p>
<h3 id="call-and-data-flow-analysis">Call and data flow analysis</h3>
<p>Call and data flow analysis keep track of data as it moves through the different execution paths
of a program. This would be yet another analysis pass.</p>
<p>There is no general language-independent API in PMD core. For now, each language will need to implement
its own solution.</p>
<p>Currently Java has some limited support for data flow analysis,
see <a href="pmd_languages_java.html">Java-specific features and guidance</a>.</p>
<div class="tags">
<b>Tags: </b>
<a href="tag_devdocs.html" class="btn btn-outline-secondary navbar-btn cursorNorm" role="button">devdocs</a>
<a href="tag_extending.html" class="btn btn-outline-secondary navbar-btn cursorNorm" role="button">extending</a>
</div>
</div>
<footer>
<hr />
<div class="row">
<div class="col-lg-12 footer">
&copy;2024 PMD Open Source Project. All rights
reserved. <br />
<span>Page last updated:</span>
December 2023 (7.0.0)<br /> Site last generated: Jun 28, 2024 <br />
<p>
<img src="images/logo/pmd-logo-70px.png" alt="PMD logo"/>
</p>
</div>
</div>
</footer>
</div>
<!-- /.row -->
</div>
<!-- /.container -->
</div>
<!-- Sticky TOC column -->
<div class="toc-col">
<div id="toc"></div>
</div>
<!-- /.toc-container-wrapper -->
</div>
</div>
<script type="application/javascript" src="assets/jquery-3.5.1/jquery-3.5.1.min.js"></script>
<script type="application/javascript" src="assets/anchorjs-4.2.2/anchor.min.js"></script>
<script type="application/javascript" src="assets/navgoco-0.2.1/src/jquery.navgoco.min.js"></script>
<script type="application/javascript" src="assets/bootstrap-4.5.2-dist/js/bootstrap.bundle.min.js"></script>
<script type="application/javascript" src="assets/Simple-Jekyll-Search-1.0.8/dest/jekyll-search.js"></script>
<script type="application/javascript" src="assets/jekyll-table-of-contents/toc.js"></script>
<script type="application/javascript" src="js/tabstate.js"></script>
<script type="application/javascript" src="js/customscripts.js"></script>
</body>
</html>