Migrating to PMD 7 from PMD 6.x

Before you update

Before updating to PMD 7, you should first update to the latest PMD 6 version 6.55.0 and try to fix all deprecation warnings.

You might encounter the following deprecations in PMD 6:

  • Properties: In order to define property descriptors, you should use PropertyFactory now. This factory can create properties of any type. E.g. instead of StringProperty.named(...) use PropertyFactory.stringProperty(...).

    Also note that uiOrder is gone. You can just remove it.

    See also Defining rule properties

  • When reporting a violation, you might see a deprecation of the addViolation methods. These methods have been moved to RuleContext. E.g. instead of addViolation(data, node, ...) use asCtx(data).addViolation(node, ...).

  • When you are calling PMD from CLI, you need to stop using deprecated CLI params, e.g.
    • -no-cache ➡️ --no-cache
    • -failOnViolation ➡️ --fail-on-violation
    • -reportfile ➡️ --report-file
    • -language ➡️ --use-version
  • If you have written a custom XPath rule, look out for warnings about deprecated XPath attributes. These warnings might look like
    WARNING: Use of deprecated attribute 'VariableId/@Image' by XPath rule 'VariableNaming' (in ruleset 'VariableNamingRule'), please use @Name instead
    

    and often already suggest an alternative.

  • If you still reference rulesets or rules the old way which has been deprecated since 6.46.0:
    • <lang-name>-<ruleset-name>, eg java-basic, which resolves to rulesets/java/basic.xml
    • the internal release number, eg 600, which resolves to rulesets/releases/600.xml

    Such usages produce deprecation warnings that should be easy to spot, e.g.

    Ruleset reference 'java-basic' uses a deprecated form, use 'rulesets/java/basic.xml' instead
    

    Use the explicit forms of these references to be compatible with PMD 7.

    Note: Since PMD 6, all rules are sorted into categories (such as “Best Practices”, “Design”, “Error Prone”) and the old rulesets like basic.xml have been deprecated and have been removed with PMD 7. It is about time to create a custom ruleset.
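The two deprecated reference forms above map mechanically onto explicit paths, so existing configuration can be rewritten with a small script. The following is a minimal sketch of that mapping, not part of PMD: the class name `RulesetRefMigrator` and its method are hypothetical, and it assumes that neither the language name nor the ruleset name contains a hyphen. It only fixes the form of the reference; rulesets that were removed in PMD 7 still have to be replaced by a custom ruleset.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PMD: rewrites deprecated ruleset
// references into the explicit form described in this guide.
class RulesetRefMigrator {

    // Internal release number such as "600".
    private static final Pattern RELEASE = Pattern.compile("\\d+");
    // "<lang-name>-<ruleset-name>" such as "java-basic".
    // Assumption: neither part contains a hyphen or a slash.
    private static final Pattern SHORT_FORM = Pattern.compile("([a-z]+)-([A-Za-z0-9_]+)");

    static String toExplicitForm(String ref) {
        if (RELEASE.matcher(ref).matches()) {
            return "rulesets/releases/" + ref + ".xml";
        }
        Matcher m = SHORT_FORM.matcher(ref);
        if (m.matches()) {
            return "rulesets/" + m.group(1) + "/" + m.group(2) + ".xml";
        }
        return ref; // already explicit or not recognized: leave untouched
    }

    public static void main(String[] args) {
        System.out.println(toExplicitForm("java-basic"));
        System.out.println(toExplicitForm("600"));
        System.out.println(toExplicitForm("rulesets/java/basic.xml"));
    }
}
```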

Use cases

I’m using only built-in rules

When you are using only built-in rules, check whether you use any deprecated rules. With PMD 7, many deprecated rules have finally been removed. You can see a complete list of the removed rules in the release notes for PMD 7. The release notes also mention the replacement rule that should be used instead. For some rules, there is no replacement.

Then many rules have been changed or improved. New properties have been added to make them more versatile or properties have been removed, if they are not necessary anymore. See changed rules in the release notes for PMD 7.

All properties which accept multiple values now use a comma (,) as a delimiter. The previous default was a pipe character (|). The delimiter is not configurable anymore. If needed, the comma can be escaped with a backslash. This affects the following rules: AvoidUsingHardCodedIP, LooseCoupling, UnusedPrivateField, UnusedPrivateMethod, AtLeastOneConstructor, CommentDefaultAccessModifier, FieldNamingConventions, LinguisticNaming, UnnecessaryConstructor, CyclomaticComplexity, NcssCount, SingularField, AvoidBranchingStatementAsLastInLoop, CloseResource.
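If a ruleset contains many multi-valued properties, the delimiter change can be applied mechanically. The sketch below is a hypothetical helper, not part of PMD. It assumes the old value used the default pipe delimiter and contains no escaped pipes, and it escapes only literal commas, as described above.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper, not part of PMD.
class PropertyDelimiterMigrator {

    // Converts a PMD 6 pipe-delimited property value into the
    // comma-delimited form expected by PMD 7.
    // Assumption: the old value used the default '|' delimiter.
    static String pipeToComma(String pmd6Value) {
        return Arrays.stream(pmd6Value.split("\\|"))
                // a comma inside a single value must be escaped with a backslash
                .map(v -> v.replace(",", "\\,"))
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(pipeToComma("java.util.ArrayList|java.util.HashMap"));
        System.out.println(pipeToComma("a,b|c"));
    }
}
```

Review the converted values by hand afterwards, especially for properties whose values contain regular expressions.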

A handful of rules are new to PMD 7. You might want to check these out: new rules.

Once you have reviewed your ruleset(s), you can switch to PMD 7.

I’m using custom rules

Ideally, you have written good tests already for your custom rules - see Testing your rules. This helps to identify problems early on.

Ruleset XML

The <rule> tag, that defines your custom rule, is required to have a language attribute now. This was always the case for XPath rules, but is now a requirement for Java rules.

XPath rules

If you have XPath based rules, the first step will be to migrate to XPath 2.0 and then to XPath 3.1. XPath 2.0 is available in PMD 6 already and can be used right away. PMD 7 will use by default XPath 3.1 and won’t support XPath 1.0 anymore. The difference between XPath 2.0 and XPath 3.1 is not big, so your XPath 2.0 can be expected to work in PMD 7 without any further changes. So the migration path is to simply migrate to XPath 2.0.

After you have migrated your XPath rules to XPath 2.0, remove the “version” property, since that has been removed with PMD 7. PMD 7 by default uses XPath 3.1. See the section XPath: Migrating from 1.0 to 2.0 below for details.

Then change the class attribute of your rule to net.sourceforge.pmd.lang.rule.xpath.XPathRule - because the class XPathRule has been moved into subpackage net.sourceforge.pmd.lang.rule.xpath.

There are some general changes for AST nodes regarding the @Image attribute. See below General AST Changes to avoid @Image.

Additional information:

  • The custom XPath function typeOf has been removed (deprecated since 6.4.0). Use the function pmd-java:typeIs or pmd-java:typeIsExactly instead. See PMD extension functions for available functions.

Java rules

If you have Java based rules and you are using the rulechain, this works a bit differently now. The RuleChain API has changed, see [core] Simplify the rulechain (#2490) for the full details. But in short, you don’t call addRuleChainVisit(...) in the rule’s constructor anymore. Instead, you override the method buildTargetSelector:

    @Override
    protected RuleTargetSelector buildTargetSelector() {
        // replaces the former addRuleChainVisit(...) call in the constructor
        return RuleTargetSelector.forTypes(ASTVariableId.class);
    }

Java AST changes

The API to navigate the AST also changed significantly:

  • Tree traversal using Node API
  • Consider using the new NodeStream API to navigate with null-safety. This is optional.

Additionally, if you have created rules for Java, regardless of whether they are XPath-based or Java-based, you might need to adjust your queries or visitor methods. The Java AST has been refactored substantially. The easiest way is to use the PMD Rule Designer to see the structure of the AST. See the section Java AST below for details.

I’ve extended PMD with a custom language…

The guides for Adding a new language with JavaCC and Adding a new CPD language have been updated.

Most notable changes are:

  • As an alternative, PMD 7 now supports ANTLR in addition to JavaCC: Adding a new language with ANTLR.
  • There is a shared ant script that wraps the calls to javacc: javacc-wrapper.xml. This should be used now.
  • PMD’s parser adapter for JavaCC generated parsers is now called JjtreeParserAdapter. This is the class that needs to be implemented now.
  • There is no need anymore to write a custom TokenManager - we have now a common base class for JavaCC generated token managers. This base class is AbstractTokenManager.
  • A rule violation factory is not needed anymore. For language specific information on rule violations, there is now a ViolationDecorator that a language can implement. These ViolationDecorators are called when a violation is reported and they can provide the additional information. This information can be used by renderers via RuleViolation#getAdditionalInfo.
  • A parser visitor adapter is not needed anymore. The visitor interface now provides a default implementation. Instead, a base visitor for the language should be created, which extends AstVisitorBase.
  • A rule chain visitor is not needed anymore. PMD provides a common implementation that fits all languages.

I’ve extended PMD with a custom feature…

In that case we can’t provide a general guide unless we know the specific custom feature. If you have difficulty finding your way around the PMD source code and javadocs, and the aspect of PMD you are using is not documented, we are probably missing documentation. Please reach out to us by opening a discussion. We can then enhance the documentation and/or the PMD API.

Special topics

Release downloads

  • The asset filenames of PMD on GitHub Releases are now pmd-dist-<version>-bin.zip, pmd-dist-<version>-src.zip and pmd-dist-<version>-doc.zip. Keep that in mind, if you have an automated download script.
  • The structure inside the ZIP files stays the same, e.g. the binary distribution ZIP file still contains the base directory pmd-bin-<version>.
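For an automated download script, the naming scheme above can be captured in a few helper methods. This is a minimal sketch with hypothetical class and method names. It derives only the file and directory names stated above and makes no assumption about the download URL.

```java
// Hypothetical helper, not part of PMD: derives the PMD 7 release asset
// names and the base directory inside the binary ZIP from a version string.
class PmdReleaseAssets {

    static String binaryZip(String version) {
        return "pmd-dist-" + version + "-bin.zip";
    }

    static String sourceZip(String version) {
        return "pmd-dist-" + version + "-src.zip";
    }

    static String docZip(String version) {
        return "pmd-dist-" + version + "-doc.zip";
    }

    // The directory name inside the binary ZIP is unchanged from PMD 6.
    static String baseDirectory(String version) {
        return "pmd-bin-" + version;
    }

    public static void main(String[] args) {
        String version = args.length > 0 ? args[0] : "7.0.0";
        System.out.println(binaryZip(version) + " unpacks to " + baseDirectory(version));
    }
}
```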

CLI Changes

The CLI has been revamped completely (see Release Notes: Revamped Command Line Interface).

Most notable changes:

  • Unified start script on all platforms for all commands (PMD, CPD, Designer). Instead of run.sh and pmd.bat, we now have pmd only (technically on Windows, there is still a pmd.bat, but it behaves the same).
    • Executing PMD from CLI now means: run.sh pmd / pmd.bat ➡️ pmd check
    • Executing CPD: run.sh cpd / cpd.bat ➡️ pmd cpd
    • Executing Designer: run.sh designer / designer.bat ➡️ pmd designer
    • Executing CPD GUI: run.sh cpd-gui / cpdgui.bat ➡️ pmd cpd-gui
  • There are some changes to the CLI arguments:
    • --fail-on-violation false ➡️ --no-fail-on-violation

      If you don’t replace this argument, then “false” will be interpreted as a file to analyze. You might see then an error message such as [main] ERROR net.sourceforge.pmd.cli.commands.internal.PmdCommand - No such file false.

    • PMD tries to display a progress bar. If you don’t want this (e.g. on a CI build server), you can disable this with --no-progress.
    • --no-ruleset-compatibility has been removed
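Build scripts that assemble the PMD argument list programmatically can apply these replacements in one pass. The sketch below is a hypothetical helper, not part of PMD, and covers only replacements named in this guide. The `-language` parameter is deliberately left out, since the value passed along with it may need adjusting too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of PMD: rewrites PMD 6 style arguments
// using only replacements listed in this guide.
class CliArgMigrator {

    private static final Map<String, String> RENAMED = Map.of(
            "-no-cache", "--no-cache",
            "-failOnViolation", "--fail-on-violation",
            "-reportfile", "--report-file");

    static List<String> migrate(List<String> pmd6Args) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < pmd6Args.size(); i++) {
            String arg = RENAMED.getOrDefault(pmd6Args.get(i), pmd6Args.get(i));
            if (arg.equals("--no-ruleset-compatibility")) {
                continue; // removed in PMD 7
            }
            boolean nextIsFalse = i + 1 < pmd6Args.size()
                    && pmd6Args.get(i + 1).equals("false");
            if (arg.equals("--fail-on-violation") && nextIsFalse) {
                out.add("--no-fail-on-violation");
                i++; // drop "false", PMD 7 would treat it as a file to analyze
                continue;
            }
            out.add(arg);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(migrate(
                List.of("-failOnViolation", "false", "-no-cache", "-reportfile", "out.xml")));
    }
}
```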

Custom distribution packages

When creating a custom distribution which only integrates the languages you need, there are some changes to apply:

  • In addition to the language dependencies you want, you also need to add a dependency on net.sourceforge.pmd:pmd-cli in order to get the CLI classes.
  • When fetching the scripts for the CLI with “maven-dependency-plugin”, you need to additionally fetch the logging configuration. That means, the line <includes>scripts/**,LICENSE</includes> needs to be changed to <includes>scripts/**,LICENSE,conf/**</includes>.
  • Since the assembly descriptor pmd-bin now also includes a BOM (bill of materials), you need to create one for your custom distribution as well. Simply add the following plugin configuration:
       <plugin>
          <groupId>org.cyclonedx</groupId>
          <artifactId>cyclonedx-maven-plugin</artifactId>
          <version>2.7.6</version>
          <executions>
            <execution>
              <phase>package</phase>
              <goals>
                <goal>makeAggregateBom</goal>
              </goals>
            </execution>
          </executions>
          <!-- https://github.com/CycloneDX/cyclonedx-maven-plugin/issues/326 -->
          <dependencies>
            <dependency>
              <groupId>org.ow2.asm</groupId>
              <artifactId>asm</artifactId>
              <version>9.5</version>
            </dependency>
          </dependencies>
        </plugin>
    

Rule tests are now using JUnit5

When you have custom rules, and you have written rule tests according to the guide Testing your rules, you might want to consider upgrading your other tests to JUnit 5. The tests in PMD 7 have been migrated to JUnit5 - including the rule tests for the built-in rules.

When executing the rule tests, you need to make sure to have JUnit5 on the classpath, which you automatically get when you depend on net.sourceforge.pmd:pmd-test. If you also have JUnit4 tests, make sure that junit-vintage-engine is on the test classpath as well, so that all tests are executed. This means you might now need to add an explicit dependency on JUnit4.

CPD: Reported endcolumn is now exclusive

In PMD 6, the reported positions of the duplicated tokens in CPD were always inclusive. For example, the following described a duplication of length 4 in PMD 6: beginLine=1, endLine=1, beginColumn=1, endColumn=4. These are the first 4 characters in the first line. With PMD 7, the endColumn is now exclusive. The same duplication will be reported in PMD 7 as: beginLine=1, endLine=1, beginColumn=1, endColumn=5.

The reported positions in a file now follow the usual meaning: line numbering starts from 1, begin line and end line are inclusive, begin column is inclusive and end column is exclusive. This is the usual behavior of most common text editors, and PMD itself (as opposed to CPD) has already used this meaning for RuleViolations for a long time in PMD 6.

This only affects the XML report format as the others don’t provide column information.
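Tools that consume the CPD XML report and were written against PMD 6 need to account for the shifted end column. The following is a minimal sketch of the arithmetic, using hypothetical helper names. It reproduces the example above, where an inclusive endColumn of 4 becomes an exclusive endColumn of 5.

```java
// Hypothetical helper, not part of PMD or CPD.
class CpdColumns {

    // PMD 6 reported an inclusive end column, PMD 7 reports an exclusive one.
    static int toExclusiveEndColumn(int pmd6EndColumn) {
        return pmd6EndColumn + 1;
    }

    // Length of a span within one line, given PMD 7 positions
    // (begin column inclusive, end column exclusive).
    static int singleLineLength(int beginColumn, int endColumnExclusive) {
        return endColumnExclusive - beginColumn;
    }

    public static void main(String[] args) {
        int end = toExclusiveEndColumn(4);
        System.out.println(end);                      // 5
        System.out.println(singleLineLength(1, end)); // 4
    }
}
```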

Node API

Starting from one node in the AST, you can navigate to children or parents with the following methods. This is the “traditional” way for simple cases. For more complex cases, consider using the new NodeStream API.

Many methods available in PMD 6 have been deprecated and removed for a slicker API with consistent naming, that also integrates tightly with the NodeStream API.

  • getNthParent(n) ➡️ ancestors().get(n - 1)
  • getFirstParentOfType(parentType) ➡️ ancestors(parentType).first()
  • getParentsOfType(parentType) ➡️ ancestors(parentType).toList()
  • findChildrenOfType(childType) ➡️ children(childType).toList()
  • findDescendantsOfType(targetType) ➡️ descendants(targetType).toList()
  • getFirstChildOfType(childType) ➡️ firstChild(childType)
  • getFirstDescendantOfType(descendantType) ➡️ descendants(descendantType).first()
  • hasDescendantOfType(type) ➡️ descendants(type).nonEmpty()

Unchanged methods that work as before:

New methods:

New methods that integrate with NodeStream:

Methods removed completely:

  • getFirstParentOfAnyType(parentTypes): There is no direct replacement, but something along these lines works:
          ancestors()
                  .filter(n -> Arrays.stream(parentTypes)
                          .anyMatch(c -> c.isInstance(n)))
                  .first();
    
  • findChildNodesWithXPath: Has been removed, because it is very inefficient. Use NodeStream instead.
  • hasDescendantMatchingXPath: Has been removed, because it is very inefficient. Use NodeStream instead.
  • jjt* like jjtGetParent. These methods were implementation specific. Use the equivalent methods like getParent().

See Node for the details.

NodeStream API

In Java rule implementations, you often need to navigate the AST to find the interesting nodes. In PMD 6, this was often done by calling jjtGetChild(int) or jjtGetParent() and then checking the node type with instanceof. There are also helper methods available, like getFirstChildOfType(Class) or findDescendantsOfType(Class). These methods might return null, and you need to check for this at every level.

The new NodeStream API provides easy to use methods that follow the Java Stream API (java.util.stream).

Many complex predicates about nodes can be expressed by testing the emptiness of a node stream. E.g. the following tests if the node is a variable declarator id initialized to the value 0:

Example:

     NodeStream.of(someNode)                           // the stream here is empty if the node is null
               .filterIs(ASTVariableId.class)          // the stream here is empty if the node was not a variable id
               .followingSiblings()                    // the stream here contains only the siblings, not the original node
               .children(ASTNumericLiteral.class)
               .filter(ASTNumericLiteral::isIntLiteral)
               .filterMatching(ASTNumericLiteral::getValueAsInt, 0)
               .nonEmpty(); // If the stream is non empty here, then all the pipeline matched

See NodeStream for the details. Note: This was implemented via PR #1622 [core] NodeStream API

XPath: Migrating from 1.0 to 2.0

XPath 1.0 and 2.0 have some incompatibilities. The XPath 2.0 specification describes them precisely. Those are however mostly corner cases and XPath rules usually don’t feature any of them.

The incompatibilities that are most relevant to migrating your rules are not caused by the specification, but by the different engines we use to run XPath 1.0 and 2.0 queries. Here’s a list of known incompatibilities:

  • The namespace prefixes fn: and string: should not be mentioned explicitly. In XPath 2.0 mode, the engine will complain about an undeclared namespace, but the functions are in the default namespace. Removing the namespace prefixes fixes it.
    • fn:substring("Foo", 1) ➡️ substring("Foo", 1)
  • Conversely, calls to custom PMD functions like typeIs must be prefixed with the namespace of the declaring module (pmd-java).
    • typeIs("Foo") ➡️ pmd-java:typeIs("Foo")
  • Boolean attribute values on our 1.0 engine are represented as the string values "true" and "false". In 2.0 mode though, boolean values are truly represented as boolean values, which in XPath may only be obtained through the functions true() and false(). If your XPath 1.0 rule tests an attribute like @Private="true", then it just needs to be changed to @Private=true() when migrating. A type error will warn you that you must update the comparison. More is explained on issue #1244.
    • "true", 'true' ➡️ true()
    • "false", 'false' ➡️ false()
  • In XPath 1.0, comparing a number to a string coerces the string to a number. In XPath 2.0, a type error occurs. Like for boolean values, numeric values are represented by our 1.0 implementation as strings, meaning that @BeginLine > "1" worked —that’s not the case in 2.0 mode.
    • @ArgumentCount > '1' ➡️ @ArgumentCount > 1
  • In XPath 1.0, the expression /Foo matches the children of the root named Foo. In XPath 2.0, that expression matches the root, if it is named Foo. Consider the following tree:
    Foo
    └─ Foo
    └─ Foo
    

    Then /Foo will match the root in XPath 2.0, and the other nodes (but not the root) in XPath 1.0. See e.g. an issue caused by this in Apex, with nested classes.

  • The custom function “pmd:matches” which checks a regular expression against a string has been removed, since there is a built-in function available since XPath 2.0 which can be used instead. If you use “pmd:matches” simply remove the “pmd:” prefix.
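Several of the rewrites listed above are purely textual, so a regex-based first pass can handle the bulk of a large rule collection. The sketch below is a hypothetical helper, not part of PMD. It does not parse XPath, so it may also touch text inside string literals, and the /Foo root-matching difference still has to be checked by hand. Treat the output as a draft to review in the rule designer.

```java
// Hypothetical helper, not part of PMD: a regex-based first pass over
// XPath 1.0 rule expressions. Always review the result by hand.
class XPathMigrator {

    static String migrate(String xpath) {
        return xpath
                // fn: functions are in the default namespace in 2.0 mode
                .replaceAll("\\bfn:", "")
                // pmd:matches was removed in favour of the built-in matches()
                .replaceAll("\\bpmd:matches\\(", "matches(")
                // custom PMD functions need the prefix of the declaring module
                .replaceAll("(?<![\\w:-])typeIs\\(", "pmd-java:typeIs(")
                // boolean attributes are real booleans in 2.0 mode
                .replaceAll("(=\\s*)[\"']true[\"']", "$1true()")
                .replaceAll("(=\\s*)[\"']false[\"']", "$1false()")
                // numeric attributes are real numbers: unquote number literals
                // used with the comparison operators <, <=, > and >=
                .replaceAll("([<>]=?\\s*)[\"'](\\d+)[\"']", "$1$2");
    }

    public static void main(String[] args) {
        System.out.println(migrate("fn:substring(\"Foo\", 1)"));
        System.out.println(migrate("typeIs(\"Foo\")"));
        System.out.println(migrate("@Private=\"true\""));
        System.out.println(migrate("@ArgumentCount > '1'"));
    }
}
```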

General AST Changes to avoid @Image

An abstract syntax tree should be abstract, but at the same time, it should not be too abstract. One of the base interfaces for PMD’s AST for all languages is Node, which provides the methods getImage and hasImageEqualTo. However, these methods don’t necessarily make sense for all nodes in all contexts. That’s why getImage() often just returns null. Also, the name is not very descriptive. AST nodes should try to use more specific names, such as getValue() or getName().

For PMD 7, most languages have been adapted. And when writing XPath rules, you need to replace @Image with whatever is appropriate now (e.g. @Name). See below for details.

Apex and Visualforce

There are many usages of @Image. These will be refactored after PMD 7 is released by deprecating the attribute and providing alternatives.

See also issue Deprecate getImage/@Image #4787.

Html

  • ASTHtmlTextNode: @Image ➡️ @Text, @NormalizedText ➡️ @Text, @Text ➡️ @WholeText.

Java

There are still many usages of @Image which are not refactored yet. This will be done after PMD 7 is released by deprecating the attribute and providing alternatives.

See also issue Deprecate getImage/@Image #4787.

Some nodes have already the image attribute (and others) deprecated. These deprecated attributes are removed now:

JavaScript

JSP

Modelica

PLSQL

There are many usages of @Image. These will be refactored after PMD 7 is released by deprecating the attribute and providing alternatives.

See also issue Deprecate getImage/@Image #4787.

Scala

XML (and POM)

When using XPathRule, text of text nodes was exposed as @Image of normal element type nodes. Now the attribute is called @Text.

Note: In general, it is recommended to use DomXPathRule instead, which exposes text nodes as real XPath/XML text nodes, conforming to the XPath spec. For DomXPathRule nothing changes: the text of text nodes can be selected using text().

Java AST

The Java grammar has been refactored substantially in order to make it easier to maintain and more correct regarding the Java Language Specification.

Here you can see the most important changes as a comparison between the PMD 6 AST (“Old AST”) and PMD 7 AST (“New AST”) and with some background info about the changes.

When in doubt, it is recommended to use the PMD Designer which can also display the AST.

Renamed classes / interfaces

Annotations

  • What: Annotations are consolidated into a single node. SingleMemberAnnotation, NormalAnnotation and MarkerAnnotation are removed in favour of ASTAnnotation. The Name node is removed, replaced by an ASTClassType.
  • Why: Those different node types implement a syntax-only distinction, which only gives semantically equivalent annotations different possible representations. For example, @A and @A() are semantically equivalent, yet they were parsed as MarkerAnnotation resp. NormalAnnotation. Similarly, @A("") and @A(value="") were parsed as SingleMemberAnnotation resp. NormalAnnotation. This also makes parsing much simpler. The nested ClassType is used to share the disambiguation logic.
  • Related issue: [java] Use single node for annotations (#2282)
Annotation AST Examples
Code:
@A
Old AST (PMD 6):
└─ Annotation "A"
   └─ MarkerAnnotation "A"
      └─ Name "A"
New AST (PMD 7):
└─ Annotation "A"
   └─ ClassType "A"

Code:
@A()
Old AST (PMD 6):
└─ Annotation "A"
   └─ NormalAnnotation "A"
      └─ Name "A"
New AST (PMD 7):
└─ Annotation "A"
   ├─ ClassType "A"
   └─ AnnotationMemberList

Code:
@A(value="v")
Old AST (PMD 6):
└─ Annotation "A"
   └─ NormalAnnotation "A"
      ├─ Name "A"
      └─ MemberValuePairs
         └─ MemberValuePair "value"
            └─ MemberValue
               └─ PrimaryExpression
                  └─ PrimaryPrefix
                     └─ Literal '"v"'
New AST (PMD 7):
└─ Annotation "A"
   ├─ ClassType "A"
   └─ AnnotationMemberList
      └─ MemberValuePair "value" [ @Shorthand = false() ]
         └─ StringLiteral '"v"'

Code:
@A("v")
Old AST (PMD 6):
└─ Annotation "A"
   └─ SingleMemberAnnotation "A"
      ├─ Name "A"
      └─ MemberValue
         └─ PrimaryExpression
            └─ PrimaryPrefix
               └─ Literal '"v"'
New AST (PMD 7):
└─ Annotation "A"
   ├─ ClassType "A"
   └─ AnnotationMemberList
      └─ MemberValuePair "value" [ @Shorthand = true() ]
         └─ StringLiteral '"v"'

Code:
@A(value="v", on=true)
Old AST (PMD 6):
└─ Annotation "A"
   └─ NormalAnnotation "A"
      ├─ Name "A"
      └─ MemberValuePairs
         ├─ MemberValuePair "value"
         │  └─ MemberValue
         │     └─ PrimaryExpression
         │        └─ PrimaryPrefix
         │           └─ Literal '"v"'
         └─ MemberValuePair "on"
            └─ MemberValue
               └─ PrimaryExpression
                  └─ PrimaryPrefix
                     └─ Literal
                        └─ BooleanLiteral [ @True = true() ]
New AST (PMD 7):
└─ Annotation "A"
   ├─ ClassType "A"
   └─ AnnotationMemberList
      ├─ MemberValuePair "value" [ @Shorthand = false() ]
      │  └─ StringLiteral '"v"'
      └─ MemberValuePair "on"
         └─ BooleanLiteral [ @True = true() ]
Annotation nesting
Annotation nesting Examples
Method
Code:
@A
public void set(int x) { }
Old AST (PMD 6):
└─ ClassOrInterfaceBodyDeclaration
   ├─ Annotation "A"
   │  └─ MarkerAnnotation "A"
   │     └─ Name "A"
   └─ MethodDeclaration
      ├─ ResultType[ @Void = true ]
      ├─ ...
New AST (PMD 7):
└─ MethodDeclaration
   ├─ ModifierList
   │  └─ Annotation "A"
   │     └─ ClassType "A"
   ├─ VoidType
   ├─ ...

Top-level type declaration
Code:
@A class C {}
Old AST (PMD 6):
└─ TypeDeclaration
   ├─ Annotation "A"
   │  └─ MarkerAnnotation "A"
   │     └─ Name "A"
   └─ ClassOrInterfaceDeclaration "C"
      └─ ClassOrInterfaceBody
New AST (PMD 7):
└─ ClassDeclaration
   ├─ ModifierList
   │  └─ Annotation "A"
   │     └─ ClassType "A"
   └─ ClassBody

Cast expression
Code:
var x = (@A T.@B S) expr;
Old AST (PMD 6):
└─ CastExpression
   ├─ Annotation "A"
   │  └─ MarkerAnnotation "A"
   │     └─ Name "A"
   ├─ Type
   │  └─ ReferenceType
   │     └─ ClassOrInterfaceType "T.S"
   │        └─ Annotation "B"
   │           └─ MarkerAnnotation "B"
   │              └─ Name "B"
   └─ PrimaryExpression
      └─ PrimaryPrefix
         └─ Name "expr"
New AST (PMD 7):
└─ CastExpression
   ├─ ClassType "S"
   │  ├─ ClassType "T"
   │  │  └─ Annotation "A"
   │  │     └─ ClassType "A"
   │  └─ Annotation "B"
   │     └─ ClassType "B"
   └─ VariableAccess "expr"

Cast expression with intersection
Code:
var x = (@A T & S) expr;
Old AST (PMD 6):
└─ CastExpression
   ├─ Annotation "A"
   │  └─ MarkerAnnotation "A"
   │     └─ Name "A"
   ├─ Type
   │  └─ ReferenceType
   │     └─ ClassOrInterfaceType "T"
   ├─ ReferenceType
   │  └─ ClassOrInterfaceType "S"
   └─ PrimaryExpression
      └─ PrimaryPrefix
         └─ Name "expr"
New AST (PMD 7):
└─ CastExpression
   ├─ IntersectionType
   │  ├─ ClassType "T"
   │  │  └─ Annotation "A"
   │  │     └─ ClassType "A"
   │  └─ ClassType "S"
   └─ VariableAccess "expr"
Notice @A binds to T, not T & S

Constructor call
Code:
new @A T()
Old AST (PMD 6):
└─ AllocationExpression
   ├─ Annotation "A"
   │  └─ MarkerAnnotation "A"
   │     └─ Name "A"
   ├─ ClassOrInterfaceType "T"
   └─ Arguments
New AST (PMD 7):
└─ ConstructorCall
   ├─ ClassType "T"
   │  └─ Annotation "A"
   │     └─ ClassType "A"
   └─ ArgumentList

Array allocation
Code:
new @A int[0]
Old AST (PMD 6):
└─ AllocationExpression
   ├─ Annotation "A"
   │  └─ MarkerAnnotation "A"
   │     └─ Name "A"
   ├─ PrimitiveType "int"
   └─ ArrayDimsAndInits
      └─ Expression
         └─ PrimaryExpression
            └─ PrimaryPrefix
               └─ Literal "0"
New AST (PMD 7):
└─ ArrayAllocation
   └─ ArrayType
      ├─ PrimitiveType "int"
      │  └─ Annotation "A"
      │     └─ ClassType "A"
      └─ ArrayDimensions
         └─ ArrayDimExpr
            └─ NumericLiteral "0"

Array type
Code:
@A int @B[] x;
Old AST (PMD 6):
└─ LocalVariableDeclaration
   ├─ Annotation "A"
   │  └─ MarkerAnnotation "A"
   │     └─ Name "A"
   ├─ Type[ @ArrayType = true() ]
   │  └─ ReferenceType
   │     ├─ PrimitiveType "int"
   │     └─ Annotation "B"
   │        └─ MarkerAnnotation "B"
   │           └─ Name "B"
   └─ VariableDeclarator
      └─ VariableDeclaratorId "x"
New AST (PMD 7):
└─ LocalVariableDeclaration
   ├─ ModifierList
   │  └─ Annotation "A"
   │     └─ ClassType "A"
   ├─ ArrayType
   │  ├─ PrimitiveType "int"
   │  └─ ArrayDimensions
   │     └─ ArrayTypeDim
   │        └─ Annotation "B"
   │           └─ ClassType "B"
   └─ VariableDeclarator
      └─ VariableId "x"

Type parameters
Code:
<@A T, @B S extends @C Object>
Old AST (PMD 6):
└─ TypeParameters
   ├─ TypeParameter "T"
   │  └─ Annotation "A"
   │     └─ MarkerAnnotation "A"
   │        └─ Name "A"
   └─ TypeParameter "S"
      ├─ Annotation "B"
      │  └─ MarkerAnnotation "B"
      │     └─ Name "B"
      └─ TypeBound
         ├─ Annotation "C"
         │  └─ MarkerAnnotation "C"
         │     └─ Name "C"
         └─ ClassOrInterfaceType "Object"
New AST (PMD 7):
└─ TypeParameters
   ├─ TypeParameter "T"
   │  └─ Annotation "A"
   │     └─ ClassType "A"
   └─ TypeParameter "S" [ @TypeBound = true() ]
      ├─ Annotation "B"
      │  └─ ClassType "B"
      └─ ClassType "Object"
         └─ Annotation "C"
            └─ ClassType "C"
  • TypeParameters now only can have TypeParameter as a child
  • Annotations that apply to the param are in the param
  • Annotations that apply to the bound are in the type
  • This removes the need for TypeBound, because annotations are cleanly placed.

Enum constants
Code:
enum E {
  @A E1, @B E2;
}
Old AST (PMD 6):
└─ EnumBody
   ├─ Annotation "A"
   │  └─ MarkerAnnotation "A"
   │     └─ Name "A"
   ├─ EnumConstant "E1"
   ├─ Annotation "B"
   │  └─ MarkerAnnotation "B"
   │     └─ Name "B"
   └─ EnumConstant "E2"
New AST (PMD 7):
└─ EnumBody
   ├─ EnumConstant "E1"
   │  ├─ ModifierList
   │  │  └─ Annotation "A"
   │  │     └─ ClassType "A"
   │  └─ VariableId "E1"
   └─ EnumConstant "E2"
      ├─ ModifierList
      │  └─ Annotation "B"
      │     └─ ClassType "B"
      └─ VariableId "E2"
  • Annotations are not just randomly in the enum body anymore

Types

Type and ReferenceType
  • What:
  • Why:
    • Some syntactic contexts only allow reference types, others allow any kind of type. If you want to match all types of a program, then matching Type would be the intuitive solution. But in 6.0.x, it wouldn’t have sufficed, since in some contexts, no Type node was pushed, only a ReferenceType.
    • Regardless of the original syntactic context, any reference type is a type, and searching for ASTType should yield all the types in the tree.
    • Using interfaces allows to abstract behaviour and make a nicer and safer API.
  • Migrating
    • There is currently no way to match abstract types (or interfaces) with XPath, so Type and ReferenceType name tests won’t match anything anymore.
    • Type/ReferenceType/ClassOrInterfaceType ➡️ ClassType
    • Type/PrimitiveType ➡️ PrimitiveType.
    • Type/ReferenceType[@ArrayDepth > 1]/ClassOrInterfaceType ➡️ ArrayType/ClassType.
    • Type/ReferenceType/PrimitiveType ➡️ ArrayType/PrimitiveType.
    • Note that in most cases you should check the type of a variable with e.g. VariableId[pmd-java:typeIs("java.lang.String[]")] because it considers the additional dimensions on declarations like String foo[];. The Java equivalent is TypeTestUtil.isA(String[].class, id);
Type and ReferenceType Examples
Code:
// in the context of a variable declaration
List<String> strs;
Old AST (PMD 6):
└─ Type (1)
   └─ ReferenceType
      └─ ClassOrInterfaceType "List"
         └─ TypeArguments
            └─ TypeArgument
               └─ ReferenceType (2)
                  └─ ClassOrInterfaceType "String"
  1. Notice that there is a Type node here, since a local var can have a primitive type.
  2. In contrast, notice that there is no Type here, since only reference types are allowed as type arguments.
New AST (PMD 7):
└─ ClassType "List"
   └─ TypeArguments
      └─ ClassType "String"
  • ClassType implements ASTReferenceType, which implements ASTType.
Array changes
Array Examples
Code:
String[][] myArray;
Old AST (PMD 6):
└─ Type[ @ArrayType = true() ]
   └─ ReferenceType
      └─ ClassOrInterfaceType[ @Array = true() ][ @ArrayDepth = 2 ] "String"
New AST (PMD 7):
└─ ArrayType[ @ArrayDepth = 2 ]
   ├─ ClassType "String"
   └─ ArrayDimensions[ @Size = 2 ]
      ├─ ArrayTypeDim
      └─ ArrayTypeDim

Code:
String @Annotation1[] @Annotation2[] myArray;
Old AST (PMD 6):
└─ Type[ @ArrayType = true() ]
   └─ ReferenceType
      ├─ ClassOrInterfaceType[ @Array = true() ][ @ArrayDepth = 2 ] "String"
      ├─ Annotation "Annotation1"
      │  └─ MarkerAnnotation "Annotation1"
      │     └─ Name "Annotation1"
      └─ Annotation "Annotation2"
         └─ MarkerAnnotation "Annotation2"
            └─ Name "Annotation2"
New AST (PMD 7):
└─ ArrayType[ @ArrayDepth = 2 ]
   ├─ ClassType "String"
   └─ ArrayDimensions[ @Size = 2 ]
      ├─ ArrayTypeDim
      │  └─ Annotation "Annotation1"
      │     └─ ClassType "Annotation1"
      └─ ArrayTypeDim
         └─ Annotation "Annotation2"
            └─ ClassType "Annotation2"

Code:
new int[2][];
Old AST (PMD 6):
└─ AllocationExpression
   ├─ PrimitiveType "int"
   └─ ArrayDimsAndInits[ @ArrayDepth = 2 ]
      └─ Expression
         └─ PrimaryExpression
            └─ PrimaryPrefix
               └─ Literal "2"
New AST (PMD 7):
└─ ArrayAllocation[ @ArrayDepth = 2 ]
   └─ ArrayType[ @ArrayDepth = 2 ]
      ├─ PrimitiveType "int"
      └─ ArrayDimensions[ @Size = 2 ]
         ├─ ArrayDimExpr
         │  └─ NumericLiteral "2"
         └─ ArrayTypeDim

Code:
new @Bar int[3][2];
Old AST (PMD 6):
└─ AllocationExpression
   ├─ Annotation "Bar"
   │  └─ MarkerAnnotation "Bar"
   │     └─ Name "Bar"
   ├─ PrimitiveType "int"
   └─ ArrayDimsAndInits[ @ArrayDepth = 2 ]
      ├─ Expression
      │  └─ PrimaryExpression
      │     └─ PrimaryPrefix
      │        └─ Literal "3"
      └─ Expression
         └─ PrimaryExpression
            └─ PrimaryPrefix
               └─ Literal "2"
New AST (PMD 7):
└─ ArrayAllocation[ @ArrayDepth = 2 ]
   └─ ArrayType[ @ArrayDepth = 2 ]
      ├─ PrimitiveType "int"
      │  └─ Annotation "Bar"
      │     └─ ClassType "Bar"
      └─ ArrayDimensions[ @Size = 2 ]
         ├─ ArrayDimExpr
         │  └─ NumericLiteral "3"
         └─ ArrayDimExpr
            └─ NumericLiteral "2"

Code:
new Foo[] { f, g };
Old AST (PMD 6):
└─ AllocationExpression
   ├─ ClassOrInterfaceType "Foo"
   └─ ArrayDimsAndInits[ @ArrayDepth = 1 ]
      └─ ArrayInitializer
         ├─ VariableInitializer
         │  └─ Expression
         │     └─ PrimaryExpression
         │        └─ PrimaryPrefix
         │           └─ Name "f"
         └─ VariableInitializer
            └─ Expression
               └─ PrimaryExpression
                  └─ PrimaryPrefix
                     └─ Name "g"
New AST (PMD 7):
└─ ArrayAllocation[ @ArrayDepth = 1 ]
   ├─ ArrayType[ @ArrayDepth = 1 ]
   │  ├─ ClassType "Foo"
   │  └─ ArrayDimensions[ @Size = 1 ]
   │     └─ ArrayTypeDim
   └─ ArrayInitializer[ @Length = 2 ]
      ├─ VariableAccess "f"
      └─ VariableAccess "g"
ClassType nesting
ClassType Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
Map.Entry<K,V>
└─ ClassOrInterfaceType "Map.Entry"
   └─ TypeArguments
      ├─ TypeArgument
        └─ ReferenceType
           └─ ClassOrInterfaceType "K"
      └─ TypeArgument
         └─ ReferenceType
            └─ ClassOrInterfaceType "V"
└─ ClassType "Entry"
   ├─ ClassType "Map"
   └─ TypeArguments[ @Size = 2 ]
      ├─ ClassType "K"
      └─ ClassType "V"
First<K>.Second.Third<V>
└─ ClassOrInterfaceType "First.Second.Third"
   ├─ TypeArguments
     └─ TypeArgument
        └─ ReferenceType
           └─ ClassOrInterfaceType "K"
   └─ TypeArguments
      └─ TypeArgument
         └─ ReferenceType
            └─ ClassOrInterfaceType "V"
└─ ClassType "Third"
   ├─ ClassType "Second"
      └─ ClassType "First"
         └─ TypeArguments[ @Size = 1 ]
            └─ ClassType "K"
   └─ TypeArguments[ @Size = 1 ]
      └─ ClassType "V"
TypeArgument and WildcardType
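  • What: Remove the TypeArgument node. TypeArguments now contains the type nodes directly, and a wildcard is represented by its own WildcardType node.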
  • Why: Because wildcard types are types in their own right, and having a node to represent them skims several levels of nesting off.
TypeArgument and WildcardType Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
Entry<String, ? extends Node>
└─ ClassOrInterfaceType "Entry"
   └─ TypeArguments
      ├─ TypeArgument
        └─ ReferenceType
           └─ ClassOrInterfaceType "String"
      └─ TypeArgument[ @Wildcard = true() ]
         └─ WildcardBounds[ @UpperBound = true() ]
            └─ ReferenceType
               └─ ClassOrInterfaceType "Node"
└─ ClassType "Entry"
   └─ TypeArguments[ @Size = 2 ]
      ├─ ClassType "String"
      └─ WildcardType[ @UpperBound = true() ]
         └─ ClassType "Node"
List<?>
└─ ClassOrInterfaceType "List"
   └─ TypeArguments
      └─ TypeArgument[ @Wildcard = true() ]
└─ ClassType "List"
   └─ TypeArguments[ @Size = 1 ]
      └─ WildcardType[ @UpperBound = true() ]

Declarations

Import and Package declarations
  • What: Remove the Name node in imports and package declaration nodes.
  • Why: Name is a TypeNode, but it’s equivalent to ASTAmbiguousName in that it describes nothing about what it represents. The name in an import may represent a method name, a type name, a field name… It’s too ambiguous to treat in the parser and could just be the image of the import, or package, or module.
  • Related issue: [java] Remove Name nodes in Import- and PackageDeclaration (#1888)
Import and Package declarations Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
import java.util.ArrayList;
import static java.util.Comparator.reverseOrder;
import java.util.*;
├─ ImportDeclaration
  └─ Name "java.util.ArrayList"
├─ ImportDeclaration[ @Static=true() ]
  └─ Name "java.util.Comparator.reverseOrder"
└─ ImportDeclaration[ @ImportOnDemand = true() ]
   └─ Name "java.util"
├─ ImportDeclaration "java.util.ArrayList"
├─ ImportDeclaration[ @Static = true() ] "java.util.Comparator.reverseOrder"
└─ ImportDeclaration[ @ImportOnDemand = true() ] "java.util"
package com.example.tool;
└─ PackageDeclaration
   └─ Name "com.example.tool"
└─ PackageDeclaration "com.example.tool"
   └─ ModifierList
Modifier lists
  • What: ModifierOwner (formerly AccessNode) is now based on a node: ASTModifierList. That node represents modifiers occurring before a declaration. It provides a flexible API to query modifiers, both explicit and implicit. All declaration nodes now have such a modifier list, even if it’s implicit (no explicit modifiers).
  • Why: ModifierOwner (formerly AccessNode) gave a lot of irrelevant methods to its subtypes. E.g. ASTFieldDeclaration::isSynchronized makes no sense. Now, these irrelevant methods don’t clutter the API. The API of ModifierList is both more general and flexible.
  • Related issue: [java] Rework AccessNode (#2259)
Modifier lists Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
Method
@A
public void set(final int x, int y) { }
└─ ClassOrInterfaceBodyDeclaration
   ├─ Annotation "A"
     └─ MarkerAnnotation "A"
        └─ Name "A"
   └─ MethodDeclaration[ @Public = true() ] "set"
      ├─ ResultType[ @Void = true() ]
      └─ MethodDeclarator
         └─ FormalParameters[ @Size = 2 ]
            ├─ FormalParameter[ @Final = true() ]
              ├─ Type
                └─ PrimitiveType "int"
              └─ VariableDeclaratorId "x"
            └─ FormalParameter[ @Final = false() ]
               ├─ Type
                 └─ PrimitiveType "int"
               └─ VariableDeclaratorId "y"
└─ MethodDeclaration[ pmd-java:modifiers() = 'public' ] "set"
   ├─ ModifierList
     └─ Annotation "A"
        └─ ClassType "A"
   ├─ VoidType
   └─ FormalParameters
      ├─ FormalParameter[ pmd-java:modifiers() = 'final' ]
        ├─ ModifierList
        └─ VariableId "x"
      └─ FormalParameter[ pmd-java:modifiers() = () ]
         ├─ ModifierList
         └─ VariableId "y"
Top-level type declaration
public @A class C {}
└─ TypeDeclaration
   ├─ Annotation "A"
     └─ MarkerAnnotation "A"
        └─ Name "A"
   └─ ClassOrInterfaceDeclaration[ @Public = true() ] "C"
      └─ ClassOrInterfaceBody
└─ ClassDeclaration[ pmd-java:modifiers() = 'public' ] "C"
   ├─ ModifierList
     └─ Annotation "A"
        └─ ClassType "A"
   └─ ClassBody
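The difference between explicit and implicit modifiers can be sketched with a small stand-alone model. The class ModifierListSketch and its methods below are hypothetical names chosen for illustration, not PMD's API. The sketch only shows why a single modifier-list object can answer both "was this keyword written?" and "does this modifier apply?".

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stand-alone model for illustration; not PMD's ASTModifierList.
final class ModifierListSketch {
    enum Mod { PUBLIC, PRIVATE, ABSTRACT, FINAL, STATIC }

    private final Set<Mod> explicit = EnumSet.noneOf(Mod.class);   // written in the source
    private final Set<Mod> effective = EnumSet.noneOf(Mod.class);  // explicit plus implied by context

    ModifierListSketch(Set<Mod> explicit, Set<Mod> implied) {
        this.explicit.addAll(explicit);
        this.effective.addAll(explicit);
        this.effective.addAll(implied);
    }

    // True only if the keyword appears in the source.
    boolean hasExplicitly(Mod m) { return explicit.contains(m); }

    // True if the modifier applies, whether written or implied.
    boolean has(Mod m) { return effective.contains(m); }

    public static void main(String[] args) {
        // Models `void run();` declared inside an interface: nothing is written,
        // but the language implies public and abstract.
        ModifierListSketch run = new ModifierListSketch(
                EnumSet.noneOf(Mod.class), EnumSet.of(Mod.PUBLIC, Mod.ABSTRACT));
        System.out.println(run.hasExplicitly(Mod.PUBLIC)); // false
        System.out.println(run.has(Mod.PUBLIC));           // true
    }
}
```

In this model a declaration without any keywords still owns a modifier list, which mirrors the empty ModifierList nodes in the trees above.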
Flattened body declarations
Flattened body declarations Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
public class Flat {
    private int f;
}
└─ CompilationUnit
   └─ TypeDeclaration
      └─ ClassOrInterfaceDeclaration "Flat"
         └─ ClassOrInterfaceBody
            └─ ClassOrInterfaceBodyDeclaration
               └─ FieldDeclaration
                  ├─ Type
                    └─ PrimitiveType "int"
                  └─ VariableDeclarator
                     └─ VariableDeclaratorId "f"
└─ CompilationUnit
   └─ ClassDeclaration "Flat"
      ├─ ModifierList
      └─ ClassBody
         └─ FieldDeclaration
            ├─ ModifierList
            ├─ PrimitiveType "int"
            └─ VariableDeclarator
               └─ VariableId "f"
public @interface FlatAnnotation {
    String value() default "";
}
└─ CompilationUnit
   └─ TypeDeclaration
      └─ AnnotationTypeDeclaration "FlatAnnotation"
         └─ AnnotationTypeBody
            └─ AnnotationTypeMemberDeclaration
               └─ AnnotationMethodDeclaration "value"
                  ├─ Type
                    └─ ReferenceType
                       └─ ClassOrInterfaceType "String"
                  └─ DefaultValue
                     └─ MemberValue
                        └─ PrimaryExpression
                           └─ PrimaryPrefix
                              └─ Literal "\"\""
└─ CompilationUnit
   └─ AnnotationTypeDeclaration "FlatAnnotation"
      ├─ ModifierList
      └─ AnnotationTypeBody
         └─ MethodDeclaration "value"
            ├─ ModifierList
            ├─ ClassType "String"
            ├─ FormalParameters
            └─ DefaultValue
               └─ StringLiteral "\"\""
Module declarations
  • What: Removes the generic Name node and uses ASTClassType instead where appropriate. Also uses specific node types for the different directives (requires, exports, uses, provides).
  • Why: Simplify queries, support type resolution
  • Related issue: [java] Improve module grammar (#3890)
Module declarations Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
open module com.example.foo {
    requires com.example.foo.http;
    requires java.logging;
    requires transitive com.example.foo.network;

    exports com.example.foo.bar;
    exports com.example.foo.internal to com.example.foo.probe;

    uses com.example.foo.spi.Intf;

    provides com.example.foo.spi.Intf with com.example.foo.Impl;
}
└─ CompilationUnit
   └─ ModuleDeclaration[ @Image = 'com.example.foo' ][ @Open = true() ]
      ├─ ModuleDirective[ @Type = 'REQUIRES' ]
        └─ ModuleName[ @Image = 'com.example.foo.http' ]
      ├─ ModuleDirective[ @Type = 'REQUIRES' ]
        └─ ModuleName[ @Image = 'java.logging' ]
      ├─ ModuleDirective[ @Type = 'REQUIRES' ][ @RequiresModifier = 'TRANSITIVE' ]
        └─ ModuleName[ @Image = 'com.example.foo.network' ]
      ├─ ModuleDirective[ @Type = 'EXPORTS' ]
        └─ Name[ @Image = 'com.example.foo.bar' ]
      ├─ ModuleDirective[ @Type = 'EXPORTS' ]
        ├─ Name[ @Image = 'com.example.foo.internal' ]
        └─ ModuleName[ @Image = 'com.example.foo.probe' ]
      ├─ ModuleDirective[ @Type = 'USES' ]
        └─ Name[ @Image = 'com.example.foo.spi.Intf' ]
      └─ ModuleDirective[ @Type = 'PROVIDES' ]
         ├─ Name[ @Image = 'com.example.foo.spi.Intf' ]
         └─ Name[ @Image = 'com.example.foo.Impl' ]
└─ CompilationUnit
   └─ ModuleDeclaration[ @Name = 'com.example.foo' ][ @Open = true() ]
      ├─ ModuleName[ @Name = 'com.example.foo' ]
      ├─ ModuleRequiresDirective
        └─ ModuleName[ @Name = 'com.example.foo.http' ]
      ├─ ModuleRequiresDirective
        └─ ModuleName[ @Name = 'java.logging' ]
      ├─ ModuleRequiresDirective[ @Transitive = true() ]
        └─ ModuleName[ @Name = 'com.example.foo.network' ]
      ├─ ModuleExportsDirective[ @PackageName = 'com.example.foo.bar' ]
      ├─ ModuleExportsDirective[ @PackageName = 'com.example.foo.internal' ]
        └─ ModuleName[ @Name = 'com.example.foo.probe' ]
      ├─ ModuleUsesDirective
        └─ ClassType[ pmd-java:typeIs("com.example.foo.spi.Intf") ]
      └─ ModuleProvidesDirective
         ├─ ClassType[ pmd-java:typeIs("com.example.foo.spi.Intf") ]
         └─ ClassType[ pmd-java:typeIs("com.example.foo.Impl") ]
Anonymous class declarations
Anonymous class declarations Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
Object anonymous = new Object() {  };
└─ LocalVariableDeclaration
   ├─ Type
     └─ ReferenceType
        └─ ClassOrInterfaceType[ @Image = 'Object' ]
   └─ VariableDeclarator
      ├─ VariableDeclaratorId "anonymous"
      └─ VariableInitializer
         └─ Expression
            └─ PrimaryExpression
               └─ PrimaryPrefix
                  └─ AllocationExpression
                     ├─ ClassOrInterfaceType[ @AnonymousClass = true() ][ @Image = 'Object' ]
                     ├─ Arguments
                     └─ ClassOrInterfaceBody
└─ LocalVariableDeclaration
   ├─ ModifierList
   ├─ ClassType[ @SimpleName = 'Object' ]
   └─ VariableDeclarator
      ├─ VariableId[ @Name = 'anonymous' ]
      └─ ConstructorCall
         ├─ ClassType[ @SimpleName = 'Object' ]
         ├─ ArgumentList
         └─ AnonymousClassDeclaration
            ├─ ModifierList
            └─ ClassBody

Method and Constructor declarations

Method grammar simplification
  • What: Simplify and align the grammar used for method and constructor declarations. The methods in an annotation type are now also method declarations.
  • Why: The method declaration had a nested node “MethodDeclarator”, which was not available for constructor declarations. This made it difficult to write rules that concern both methods and constructors without explicitly differentiating between the two.
  • Related issue: [java] Align method and constructor declaration grammar (#2034)
Method grammar Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
public class Sample {
    public Sample(int arg) throws Exception {
        super();
        greet(arg);
    }
    public void greet(int arg) throws Exception {
        System.out.println("Hello");
    }
}
└─ ClassOrInterfaceBody
   ├─ ClassOrInterfaceBodyDeclaration
     └─ ConstructorDeclaration[ @Image = 'Sample' ]
        ├─ FormalParameters
          └─ FormalParameter
             ├─ ...
        ├─ NameList
          └─ Name[ @Image = 'Exception' ]
        ├─ ExplicitConstructorInvocation
          └─ Arguments
        └─ BlockStatement
           └─ Statement
              └─ ...
   └─ ClassOrInterfaceBodyDeclaration
      └─ MethodDeclaration[ @Name = 'greet' ]
         ├─ ResultType
         ├─ MethodDeclarator[ @Image = 'greet' ]
           └─ FormalParameters
              └─ FormalParameter
                 ├─ ...
         ├─ NameList
           └─ Name[ @Image = 'Exception' ]
         └─ Block
            └─ BlockStatement
               └─ Statement
                  └─ ...
└─ ClassBody
   ├─ ConstructorDeclaration[ @Name = 'Sample' ]
     ├─ ModifierList
     ├─ FormalParameters
       └─ FormalParameter
          ├─ ...
     ├─ ThrowsList
       └─ ClassType[ @SimpleName = 'Exception' ]
     └─ Block
        ├─ ExplicitConstructorInvocation
          └─ ArgumentList
        └─ ExpressionStatement
           └─ ...
   └─ MethodDeclaration[ @Name = 'greet' ]
      ├─ ModifierList
      ├─ VoidType
      ├─ FormalParameters
        └─ FormalParameter
           ├─ ...
      ├─ ThrowsList
        └─ ClassType[ @SimpleName = 'Exception' ]
      └─ Block
         └─ ExpressionStatement
            └─ ...
public @interface MyAnnotation {
    int value() default 1;
}
└─ AnnotationTypeDeclaration[ @SimpleName = 'MyAnnotation' ]
   └─ AnnotationTypeBody
      └─ AnnotationTypeMemberDeclaration
         └─ AnnotationMethodDeclaration[ @Image = 'value' ]
            ├─ Type ...
            └─ DefaultValue ...
└─ AnnotationTypeDeclaration[ @SimpleName = 'MyAnnotation' ]
   ├─ ModifierList
   └─ AnnotationTypeBody
      └─ MethodDeclaration[ @Name = 'value' ]
         ├─ ModifierList
         ├─ PrimitiveType
         ├─ FormalParameters
         └─ DefaultValue ...
Formal parameters
  • What: Use ASTFormalParameter only for method and constructor declarations. Lambdas use ASTLambdaParameter, catch clauses use ASTCatchParameter.
  • Why: FormalParameter’s API is different from the other ones.
    • FormalParameter must mention a type node.
    • LambdaParameter can be inferred
    • CatchParameter cannot be varargs
    • CatchParameter can have multiple exception types (an ASTUnionType now)
Formal parameters Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
try {

} catch (@A IOException | IllegalArgumentException e) {

}
└─ TryStatement
   ├─ Block
   └─ CatchStatement
      ├─ FormalParameter
        ├─ Annotation[ @AnnotationName = 'A' ]
          └─ MarkerAnnotation[ @AnnotationName = 'A' ]
             └─ Name[ @Image = 'A' ]
        ├─ Type
          └─ ReferenceType
             └─ ClassOrInterfaceType[ @Image = 'IOException' ]
        ├─ Type
          └─ ReferenceType
             └─ ClassOrInterfaceType[ @Image = 'IllegalArgumentException' ]
        └─ VariableDeclaratorId[ @Name = 'e' ]
      └─ Block
└─ TryStatement
   ├─ Block
   └─ CatchClause
      ├─ CatchParameter
        ├─ ModifierList
          └─ Annotation[ @SimpleName = 'A' ]
             └─ ClassType[ @SimpleName = 'A' ]
        ├─ UnionType
          ├─ ClassType[ @SimpleName = 'IOException' ]
          └─ ClassType[ @SimpleName = 'IllegalArgumentException' ]
        └─ VariableId[ @Name = 'e' ]
      └─ Block
(a, b) -> {};
c -> {};
(@A var d) -> {};
(@A int e) -> {};
└─ StatementExpression
   └─ PrimaryExpression
      └─ PrimaryPrefix
         └─ LambdaExpression
            ├─ VariableDeclaratorId[ @Name = 'a' ]
            ├─ VariableDeclaratorId[ @Name = 'b' ]
            └─ Block

└─ StatementExpression
   └─ PrimaryExpression
      └─ PrimaryPrefix
         └─ LambdaExpression
            ├─ VariableDeclaratorId[ @Name = 'c' ]
            └─ Block

└─ StatementExpression
   └─ PrimaryExpression
      └─ PrimaryPrefix
         └─ LambdaExpression
            ├─ FormalParameters
              └─ FormalParameter
                 ├─ Annotation[ @AnnotationName = 'A' ]
                   └─ MarkerAnnotation[ @AnnotationName = 'A' ]
                      └─ Name[ @Image = 'A' ]
                 └─ VariableDeclaratorId[ @Name = 'd' ]
            └─ Block

└─ StatementExpression
   └─ PrimaryExpression
      └─ PrimaryPrefix
         └─ LambdaExpression
            ├─ FormalParameters
              └─ FormalParameter
                 ├─ Annotation[ @AnnotationName = 'A' ]
                   └─ MarkerAnnotation[ @AnnotationName = 'A' ]
                      └─ Name[ @Image = 'A' ]
                 ├─ Type
                   └─ PrimitiveType[ @Image = 'int' ]
                 └─ VariableDeclaratorId[ @Name = 'e' ]
            └─ Block
└─ ExpressionStatement
   └─ LambdaExpression
      ├─ LambdaParameterList
        ├─ LambdaParameter
          ├─ ModifierList
          └─ VariableId[ @Name = 'a' ]
        └─ LambdaParameter
           ├─ ModifierList
           └─ VariableId[ @Name = 'b' ]
      └─ Block

└─ ExpressionStatement
   └─ LambdaExpression
      ├─ LambdaParameterList
        └─ LambdaParameter
           ├─ ModifierList
           └─ VariableId[ @Name = 'c' ]
      └─ Block

└─ ExpressionStatement
   └─ LambdaExpression
      ├─ LambdaParameterList
        └─ LambdaParameter
           ├─ ModifierList
             └─ Annotation[ @SimpleName = 'A' ]
                └─ ClassType[ @SimpleName = 'A' ]
           └─ VariableId[ @Name = 'd' ]
      └─ Block

└─ ExpressionStatement
   └─ LambdaExpression
      ├─ LambdaParameterList
        └─ LambdaParameter
           ├─ ModifierList
             └─ Annotation[ @SimpleName = 'A' ]
                └─ ClassType[ @SimpleName = 'A' ]
           ├─ PrimitiveType[ @Kind = 'int' ]
           └─ VariableId[ @Name = 'e' ]
      └─ Block
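The language-level differences between the three parameter kinds can be seen in ordinary Java, independent of PMD. The class ParameterKindsDemo below is a made-up example: the method parameters carry mandatory types, the lambda parameters have their types inferred, and the catch parameter names two alternative exception types.

```java
import java.util.function.BiFunction;

// Plain Java illustration; the class and method names are made up for this sketch.
final class ParameterKindsDemo {

    // Method parameters: each one must spell out its type.
    static int add(int a, int b) {
        // Lambda parameters: the types of x and y are inferred from BiFunction.
        BiFunction<Integer, Integer, Integer> sum = (x, y) -> x + y;
        return sum.apply(a, b);
    }

    // Catch parameter: a single variable typed by a union of exception types.
    static String classify(Runnable action) {
        try {
            action.run();
            return "ok";
        } catch (IllegalStateException | IllegalArgumentException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));
        System.out.println(classify(() -> { throw new IllegalArgumentException("bad"); }));
    }
}
```

Each of the three forms accepts a different subset of syntax, which is the reason given above for representing them with three different node types.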
New node for explicit receiver parameter
explicit receiver parameter Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
void myMethod(@A Foo this, Foo other) {}
└─ FormalParameters (1)
   ├─ FormalParameter[ @ExplicitReceiverParameter = true() ]
     ├─ Annotation "A"
       └─ MarkerAnnotation "A"
          └─ Name "A"
     ├─ Type
       └─ ReferenceType
          └─ ClassOrInterfaceType "Foo"
     └─ VariableDeclaratorId[ @ExplicitReceiverParameter = true() ] "this"
   └─ FormalParameter
      ├─ Type
        └─ ReferenceType
           └─ ClassOrInterfaceType "Foo"
      └─ VariableDeclaratorId "other"
└─ FormalParameters (1)
   ├─ ReceiverParameter
     └─ ClassType "Foo"
        └─ Annotation "A"
           └─ ClassType "A"
   └─ FormalParameter
      ├─ ModifierList
      ├─ ClassType "Foo"
      └─ VariableId "other"
Varargs
  • What: parse the varargs ellipsis as an ASTArrayType.
  • Why: this improves regularity of the grammar, and allows type annotations to be added to the ellipsis
Varargs Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
void myMethod(int... is) {}
└─ FormalParameter[ @Varargs = true() ]
   ├─ Type
     └─ PrimitiveType "int"
   └─ VariableDeclaratorId "is"
└─ FormalParameter[ @Varargs = true() ]
   ├─ ModifierList
   ├─ ArrayType
     ├─ PrimitiveType "int"
     └─ ArrayDimensions
        └─ ArrayTypeDim[ @Varargs = true() ]
   └─ VariableId "is"
void myMethod(int @A ... is) {}
└─ FormalParameter[ @Varargs = true() ]
   ├─ Type
     └─ PrimitiveType "int"
   ├─ Annotation "A"
     └─ MarkerAnnotation "A"
        └─ Name "A"
   └─ VariableDeclaratorId "is"
└─ FormalParameter[ @Varargs = true() ]
   ├─ ModifierList
   ├─ ArrayType
     ├─ PrimitiveType "int"
     └─ ArrayDimensions
        └─ ArrayTypeDim[ @Varargs = true() ]
           └─ Annotation "A"
              └─ ClassType "A"
   └─ VariableId "is"
void myMethod(int[]... is) {}
└─ FormalParameter[ @Varargs = true() ]
   ├─ Type[ @ArrayType = true() ]
     └─ ReferenceType
        └─ PrimitiveType "int"
   └─ VariableDeclaratorId "is"
└─ FormalParameter[ @Varargs = true() ]
   ├─ ModifierList
   ├─ ArrayType (2)
     ├─ PrimitiveType "int"
     └─ ArrayDimensions (2)
        ├─ ArrayTypeDim
        └─ ArrayTypeDim[ @Varargs = true() ]
   └─ VariableId "is"
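That the ellipsis really adds one array dimension can be checked with plain Java reflection, independent of PMD. The class VarargsTypeDemo below is a made-up example that reuses the signature of the last row: the parameter declared as int[]... is reported with the type int[][], which matches the depth-2 ArrayType in the new AST.

```java
import java.lang.reflect.Method;

// Plain Java illustration; the class name is made up for this sketch.
final class VarargsTypeDemo {

    // Same parameter shape as the last example above.
    static void myMethod(int[]... is) { }

    private static Method method() {
        try {
            // The varargs parameter is looked up by its array type.
            return VarargsTypeDemo.class.getDeclaredMethod("myMethod", int[][].class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // The declared type of the only parameter.
    static Class<?> parameterType() {
        return method().getParameterTypes()[0];
    }

    // Whether the method was declared with an ellipsis.
    static boolean isVarArgs() {
        return method().isVarArgs();
    }

    public static void main(String[] args) {
        System.out.println(parameterType().getSimpleName()); // int[][]
        System.out.println(isVarArgs());                     // true
    }
}
```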
Add void type node to replace ResultType
Void Type Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
void foo();
└─ MethodDeclaration "foo"
   ├─ ResultType[ @Void = true() ]
   └─ MethodDeclarator
      └─ FormalParameters
└─ MethodDeclaration "foo"
   ├─ ModifierList
   ├─ VoidType
   └─ FormalParameters
int foo();
└─ MethodDeclaration "foo"
   ├─ ResultType[ @Void = false() ]
     └─ Type
        └─ PrimitiveType "int"
   └─ MethodDeclarator
      └─ FormalParameters
└─ MethodDeclaration "foo"
   ├─ ModifierList
   ├─ PrimitiveType "int"
   └─ FormalParameters

Statements

Statements are flattened
  • What: Statements are flattened. There are no superfluous BlockStatement and Statement nodes anymore. All children of an ASTBlock are by definition ASTStatement nodes; ASTStatement is now an interface implemented by all statements.
  • Why: This simplifies tree traversal. The removed nodes BlockStatement and Statement didn’t add any information. We only need a Statement abstraction. BlockStatement was used to enforce that no variable or local class declaration appears alone as the child of e.g. an unbraced if, else, for, etc. This is a parser-only distinction that is of little use for later analysis.
  • Related issue: [java] Improve statement grammar (#2164)
Statements Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
int i;
i = 1;
└─ Block
   ├─ BlockStatement
     └─ LocalVariableDeclaration
        ├─ Type
          └─ PrimitiveType "int"
        └─ VariableDeclarator
           └─ VariableDeclaratorId "i"
   └─ BlockStatement
      └─ Statement
         └─ StatementExpression
            ├─ PrimaryExpression
              └─ PrimaryPrefix
                 └─ Name "i"
            ├─ AssignmentOperator "="
            └─ Expression
               └─ PrimaryExpression
                  └─ PrimaryPrefix
                     └─ Literal "1"
└─ Block
   ├─ LocalVariableDeclaration
     ├─ ModifierList
     ├─ PrimitiveType "int"
     └─ VariableDeclarator
        └─ VariableId "i"
   └─ ExpressionStatement
      └─ AssignmentExpression "="
         ├─ VariableAccess "i"
         └─ NumericLiteral "1"
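The flattened shape can be modelled in a few lines of stand-alone Java. All types in FlatStatementsSketch below are hypothetical stand-ins that only borrow the node names from the trees above; they are not PMD classes. The sketch shows a block whose direct children are the statements themselves, with a shared interface taking the place of the former wrapper nodes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model for illustration; not PMD's AST classes.
final class FlatStatementsSketch {

    // Shared abstraction: every statement kind implements this directly.
    interface Statement { String describe(); }

    static final class LocalVariableDeclaration implements Statement {
        final String name;
        LocalVariableDeclaration(String name) { this.name = name; }
        public String describe() { return "LocalVariableDeclaration " + name; }
    }

    static final class ExpressionStatement implements Statement {
        final String text;
        ExpressionStatement(String text) { this.text = text; }
        public String describe() { return "ExpressionStatement " + text; }
    }

    static final class Block implements Statement {
        final List<Statement> children;
        Block(Statement... children) { this.children = Arrays.asList(children); }
        public String describe() { return "Block"; }
    }

    // One loop over the direct children; there is no wrapper node to unwrap.
    static List<String> describeChildren(Block block) {
        List<String> out = new ArrayList<>();
        for (Statement s : block.children) {
            out.add(s.describe());
        }
        return out;
    }

    public static void main(String[] args) {
        Block body = new Block(new LocalVariableDeclaration("i"), new ExpressionStatement("i = 1"));
        System.out.println(describeChildren(body));
    }
}
```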
New node for For-each statements
  • What: New node for For-each statements: ASTForeachStatement instead of ForStatement.
  • Why: This makes it a lot easier to distinguish between for loops and for-each loops in the AST. Some rules only apply to one or the other, and it was complicated to write a rule that handles both subtree shapes (for loops have the additional children ForInit and ForUpdate).
  • Related issue: [java] Improve statement grammar (#2164)
For-each statement Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
for (String s : List.of("a", "b")) { }
└─ BlockStatement
   └─ Statement
      └─ ForStatement[ @Foreach = true() ]
         ├─ LocalVariableDeclaration
           ├─ Type
             └─ ReferenceType
                └─ ClassOrInterfaceType "String"
           └─ VariableDeclarator
              └─ VariableDeclaratorId "s"
         ├─ Expression
           └─ PrimaryExpression
              ├─ PrimaryPrefix
                └─ Name "List.of"
              └─ PrimarySuffix
                 └─ Arguments (2)
                    └─ ArgumentList (2)
                       ├─ Expression
                         └─ PrimaryExpression
                            └─ PrimaryPrefix
                               └─ Literal[ @StringLiteral = true() ][ @Image = '"a"' ]
                       └─ Expression
                          └─ PrimaryExpression
                             └─ PrimaryPrefix
                                └─ Literal[ @StringLiteral = true() ][ @Image = '"b"' ]
         └─ Statement
            └─ Block
└─ Block
   └─ ForeachStatement
      ├─ LocalVariableDeclaration
        ├─ ModifierList
        ├─ ClassType "String"
        └─ VariableDeclarator "s"
           └─ VariableId "s"
      ├─ MethodCall "of"
        ├─ TypeExpression
          └─ ClassType "List"
        └─ ArgumentList (2)
           ├─ StringLiteral[ @Image = '"a"' ]
           └─ StringLiteral[ @Image = '"b"' ]
      └─ Block
New nodes for ExpressionStatement, LocalClassStatement
ExpressionStatement, LocalClassStatement Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
i++;
class LocalClass {}
└─ Block
   ├─ BlockStatement
     └─ Statement
        └─ StatementExpression
           └─ PostfixExpression "++"
              └─ PrimaryExpression
                 └─ PrimaryPrefix
                    └─ Name "i"
   └─ BlockStatement
      └─ ClassOrInterfaceDeclaration[ @Local = true() ] "LocalClass"
         └─ ClassOrInterfaceBody
└─ Block
   ├─ ExpressionStatement
     └─ UnaryExpression "++"
        └─ VariableAccess "i"
   └─ LocalClassStatement
      └─ ClassDeclaration "LocalClass"
         ├─ ModifierList
         └─ ClassBody
Improve try-with-resources grammar
Try-With-Resources Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
try (InputStream in = new FileInputStream(); OutputStream out = new FileOutputStream();) { }
└─ TryStatement
   └─ ResourceSpecification
      └─ Resources
         ├─ Resource
           ├─ Type
             └─ ReferenceType
                └─ ClassOrInterfaceType "InputStream"
           ├─ VariableDeclaratorId "in"
           └─ Expression
              └─ ...
         └─ Resource
            ├─ Type
              └─ ReferenceType
                 └─ ClassOrInterfaceType "OutputStream"
            ├─ VariableDeclaratorId "out"
            └─ Expression
               └─ ...
└─ TryStatement
   └─ ResourceList[ @TrailingSemiColon = true() ] (2)
      ├─ Resource[ @ConciseResource = false() ] "in"
        └─ LocalVariableDeclaration
           ├─ ModifierList
           ├─ ClassType "InputStream"
           └─ VariableDeclarator
              ├─ VariableId "in"
              └─ ConstructorCall
                 ├─ ClassType "FileInputStream"
                 └─ ArgumentList (0)
      └─ Resource[ @ConciseResource = false() ] "out"
         └─ LocalVariableDeclaration
            ├─ ModifierList
            ├─ ClassType "OutputStream"
            └─ VariableDeclarator
               ├─ VariableId "out"
               └─ ConstructorCall
                  ├─ ClassType "FileOutputStream"
                  └─ ArgumentList (0)
InputStream in = new FileInputStream();
try (in) {}
└─ TryStatement
   └─ ResourceSpecification
      └─ Resources
         └─ Resource "in"
            └─ Name "in"
└─ TryStatement
   └─ ResourceList[ @TrailingSemiColon = false() ] (1)
      └─ Resource[ @ConciseResource = true() ] "in"
         └─ VariableAccess "in"

Expressions

  • ASTExpression and ASTPrimaryExpression have been turned into interfaces. As nodes, they added no information to the AST and increased its depth unnecessarily. All expressions implement ASTExpression. Neither node can be found in ASTs any longer.

  • Migrating:

    • Basically, Expression/X or Expression/PrimaryExpression/X just becomes X.
    • There is currently no way to match abstract or interface types with XPath, so Expression or PrimaryExpression name tests won’t match anything anymore. However, the axis step *[@Expression=true()] matches any expression.
New nodes for different literals types
Literals Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
char c = 'c';
boolean b = true;
int i = 1;
double d = 1.0;
String s = "s";
Object n = null;
└─ Literal[ @CharLiteral = true() ] "'c'"
└─ Literal
   └─ BooleanLiteral[ @True = true() ]
└─ Literal[ @IntLiteral = true() ] "1"
└─ Literal[ @DoubleLiteral = true() ] "1.0"
└─ Literal[ @StringLiteral = true() ] "\"s\""
└─ Literal
   └─ NullLiteral
└─ CharLiteral "'c'"
└─ BooleanLiteral[ @True = true() ]
└─ NumericLiteral[ @IntLiteral = true() ] "1"
└─ NumericLiteral[ @DoubleLiteral = true() ] "1.0"
└─ StringLiteral "\"s\""
└─ NullLiteral
Method calls, constructor calls, array allocations
Method calls, constructor calls, array allocations Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
o.myMethod("a");
new Object("b");
new int[10];
new int[] { 1, 2, 3 };
└─ PrimaryExpression
   ├─ PrimaryPrefix
     └─ Name "o.myMethod"
   └─ PrimarySuffix
      └─ Arguments
         └─ ArgumentList (1)
            └─ Expression
               └─ PrimaryExpression
                  └─ PrimaryPrefix
                     └─ Literal "\"a\""

└─ PrimaryExpression
   └─ PrimaryPrefix
      └─ AllocationExpression
         ├─ ClassOrInterfaceType "Object"
         └─ Arguments
            └─ ArgumentList
               └─ Expression
                  └─ PrimaryExpression
                     └─ PrimaryPrefix
                        └─ Literal "\"b\""

└─ PrimaryExpression
   └─ PrimaryPrefix
      └─ AllocationExpression
         ├─ PrimitiveType "int"
         └─ ArrayDimsAndInits
            └─ Expression
               └─ PrimaryExpression
                  └─ PrimaryPrefix
                     └─ Literal "10"

└─ PrimaryPrefix
   └─ AllocationExpression
      ├─ PrimitiveType "int"
      └─ ArrayDimsAndInits
         └─ ArrayInitializer
            ├─ VariableInitializer
              └─ Expression
                 └─ PrimaryExpression
                    └─ PrimaryPrefix
                       └─ Literal "1"
            ├─ VariableInitializer
              └─ Expression
                 └─ PrimaryExpression
                    └─ PrimaryPrefix
                       └─ Literal "2"
            └─ VariableInitializer
               └─ Expression
                  └─ PrimaryExpression
                     └─ PrimaryPrefix
                        └─ Literal "3"
└─ MethodCall "myMethod"
   ├─ VariableAccess "o"
   └─ ArgumentList (1)
      └─ StringLiteral "\"a\""

└─ ConstructorCall
   ├─ ClassType "Object"
   └─ ArgumentList (1)
      └─ StringLiteral "\"b\""

└─ ArrayAllocation[ @ArrayDepth = 1 ]
   └─ ArrayType
      ├─ PrimitiveType "int"
      └─ ArrayDimensions (1)
         └─ ArrayDimExpr
            └─ NumericLiteral "10"

└─ ArrayAllocation[ @ArrayDepth = 1 ]
   ├─ ArrayType
     ├─ PrimitiveType "int"
     └─ ArrayDimensions (1)
        └─ ArrayTypeDim
   └─ ArrayInitializer[ @Length = 3 ]
      ├─ NumericLiteral "1"
      ├─ NumericLiteral "2"
      └─ NumericLiteral "3"
Method call chains are left-recursive
Method call chain Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
new Foo().bar.foo(1);
└─ StatementExpression
   └─ PrimaryExpression
      ├─ PrimaryPrefix
        └─ AllocationExpression
           ├─ ClassOrInterfaceType "Foo"
           └─ Arguments (0)
      ├─ PrimarySuffix "bar"
      ├─ PrimarySuffix "foo"
      └─ PrimarySuffix[ @Arguments = true() ]
         └─ Arguments (1)
            └─ ArgumentList
               └─ Expression
                  └─ PrimaryExpression
                     └─ PrimaryPrefix
                        └─ Literal "1"
└─ ExpressionStatement
   └─ MethodCall "foo"
      ├─ FieldAccess "bar"
        └─ ConstructorCall
           ├─ ClassType "Foo"
           └─ ArgumentList (0)
      └─ ArgumentList (1)
         └─ NumericLiteral "1"

Instead of being flat, the subexpressions are now nested within one another. The nesting follows the naturally recursive structure of expressions:

new Foo().bar.foo(1)
└───────┘          ConstructorCall
└───────────┘       FieldAccess
└──────────────────┘ MethodCall

This makes the AST more regular and easier to navigate. Each node contains the other nodes that are relevant to it (e.g. arguments) instead of them being spread out over several siblings. The API of all nodes has been enriched with high-level accessors to query the AST in a semantic way, without bothering with the placement details.

The number of grammar changes this entails is enormous, but firing up the designer to inspect the new structure should quickly give you the information you need.

Note: this doesn’t affect binary expressions like ASTAdditiveExpression. E.g. a+b+c is not parsed as

AdditiveExpression
+ AdditiveExpression
  + (a)
  + (b)
+ (c)  

It’s still

AdditiveExpression
+ (a)
+ (b)
+ (c)  

which is easier to navigate, especially from XPath.
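The nesting of call chains can be modelled with a tiny stand-alone tree. The class CallChainSketch and its Node type below are hypothetical and not PMD's API. Each node holds the expression it is applied to (called the qualifier here), so walking from the outermost node inward visits the chain from right to left.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone model for illustration; not PMD's AST classes.
final class CallChainSketch {

    static final class Node {
        final String kind;     // e.g. "MethodCall"
        final String name;     // e.g. "foo"
        final Node qualifier;  // the expression this node is applied to, or null

        Node(String kind, String name, Node qualifier) {
            this.kind = kind;
            this.name = name;
            this.qualifier = qualifier;
        }
    }

    // Builds the tree for `new Foo().bar.foo(1)` (the argument list is omitted).
    static Node sample() {
        Node ctor = new Node("ConstructorCall", "Foo", null);
        Node field = new Node("FieldAccess", "bar", ctor);
        return new Node("MethodCall", "foo", field);
    }

    // Follows the qualifier links starting from the outermost node.
    static List<String> kindsOutermostFirst(Node node) {
        List<String> kinds = new ArrayList<>();
        for (Node n = node; n != null; n = n.qualifier) {
            kinds.add(n.kind);
        }
        return kinds;
    }

    public static void main(String[] args) {
        System.out.println(kindsOutermostFirst(sample()));
    }
}
```

In this model the root of the expression is the last operation in source order, which is the same orientation as the MethodCall at the top of the new AST above.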

Field access, array access, variable access
Field access, array access, variable access Examples
Code | Old AST (PMD 6) | New AST (PMD 7)
field = 1;
localVar = 1;
array[0] = 1;
Foo.staticField = localVar;
└─ BlockStatement
   └─ Statement
      └─ StatementExpression
         ├─ PrimaryExpression
           └─ PrimaryPrefix
              └─ Name "field"
         ├─ AssignmentOperator "="
         └─ Expression
            └─ PrimaryExpression
               └─ PrimaryPrefix
                  └─ Literal "1"

└─ BlockStatement
   └─ Statement
      └─ StatementExpression
         ├─ PrimaryExpression
           └─ PrimaryPrefix
              └─ Name "localVar"
         ├─ AssignmentOperator "="
         └─ Expression
            └─ PrimaryExpression
               └─ PrimaryPrefix
                  └─ Literal "1"

└─ BlockStatement
   └─ Statement
      └─ StatementExpression
         ├─ PrimaryExpression
           ├─ PrimaryPrefix
             └─ Name "array"
           └─ PrimarySuffix[ @ArrayDereference = true() ]
              └─ Expression
                 └─ PrimaryExpression
                    └─ PrimaryPrefix
                       └─ Literal "0"
         ├─ AssignmentOperator "="
         └─ Expression
            └─ PrimaryExpression
               └─ PrimaryPrefix
                  └─ Literal "1"

└─ BlockStatement
   └─ Statement
      └─ StatementExpression
         ├─ PrimaryExpression
           └─ PrimaryPrefix
              └─ Name "Foo.staticField"
         ├─ AssignmentOperator "="
         └─ Expression
            └─ PrimaryExpression
               └─ PrimaryPrefix
                  └─ Name "localVar"
└─ ExpressionStatement
   └─ AssignmentExpression "="
      ├─ VariableAccess "field"
      └─ NumericLiteral "1"

└─ ExpressionStatement
   └─ AssignmentExpression "="
      ├─ VariableAccess "localVar"
      └─ NumericLiteral "1"

└─ ExpressionStatement
   └─ AssignmentExpression "="
      ├─ ArrayAccess[ @AccessType = "WRITE" ]
        ├─ VariableAccess "array"
        └─ NumericLiteral "0"
      └─ NumericLiteral "1"

└─ ExpressionStatement
   └─ AssignmentExpression "="
      ├─ FieldAccess[ @AccessType = "WRITE" ] "staticField"
        └─ TypeExpression
           └─ ClassType "Foo"
      └─ VariableAccess[ @AccessType = "READ" ] "localVar"
  • As seen above, an unqualified field access currently shows up as a VariableAccess. This may be fixed in future versions of PMD.
Explicit nodes for this/super expressions
this/super expressions Examples

Code:
this.field = 1;
super.field = 1;

this.method();
super.method();

Old AST (PMD 6):
└─ BlockStatement
   └─ Statement
      └─ StatementExpression
         ├─ PrimaryExpression
         │  ├─ PrimaryPrefix[ @ThisModifier = true() ]
         │  └─ PrimarySuffix "field"
         ├─ AssignmentOperator "="
         └─ Expression
            └─ PrimaryExpression
               └─ PrimaryPrefix
                  └─ Literal "1"

└─ BlockStatement
   └─ Statement
      └─ StatementExpression
         ├─ PrimaryExpression
         │  ├─ PrimaryPrefix[ @SuperModifier = true() ]
         │  └─ PrimarySuffix "field"
         ├─ AssignmentOperator "="
         └─ Expression
            └─ PrimaryExpression
               └─ PrimaryPrefix
                  └─ Literal "1"

└─ BlockStatement
   └─ Statement
      └─ StatementExpression
         └─ PrimaryExpression
            ├─ PrimaryPrefix[ @ThisModifier = true() ]
            ├─ PrimarySuffix "method"
            └─ PrimarySuffix[ @Arguments = true() ]
               └─ Arguments (0)

└─ BlockStatement
   └─ Statement
      └─ StatementExpression
         └─ PrimaryExpression
            ├─ PrimaryPrefix[ @SuperModifier = true() ]
            ├─ PrimarySuffix "method"
            └─ PrimarySuffix[ @Arguments = true() ]
               └─ Arguments (0)

New AST (PMD 7):
└─ ExpressionStatement
   └─ AssignmentExpression "="
      ├─ FieldAccess[ @AccessType = "WRITE" ] "field"
      │  └─ ThisExpression
      └─ NumericLiteral "1"

└─ ExpressionStatement
   └─ AssignmentExpression "="
      ├─ FieldAccess[ @AccessType = "WRITE" ] "field"
      │  └─ SuperExpression
      └─ NumericLiteral "1"

└─ ExpressionStatement
   └─ MethodCall "method"
      ├─ ThisExpression
      └─ ArgumentList (0)

└─ ExpressionStatement
   └─ MethodCall "method"
      ├─ SuperExpression
      └─ ArgumentList (0)

Type expressions
Type expressions Examples

Code:
Foo.staticMethod();
if (x instanceof Foo) {}
var x = Foo::method;

Old AST (PMD 6):
└─ BlockStatement
   └─ Statement
      └─ StatementExpression
         └─ PrimaryExpression
            ├─ PrimaryPrefix
            │  └─ Name "Foo.staticMethod"
            └─ PrimarySuffix[ @Arguments = true() ]
               └─ Arguments (0)

└─ BlockStatement
   └─ Statement
      └─ IfStatement
         ├─ Expression
         │  └─ InstanceOfExpression
         │     ├─ PrimaryExpression
         │     │  └─ PrimaryPrefix
         │     │     └─ Name "x"
         │     └─ Type
         │        └─ ReferenceType
         │           └─ ClassOrInterfaceType "Foo"
         └─ Statement
            └─ Block

└─ BlockStatement
   └─ LocalVariableDeclaration
      └─ VariableDeclarator
         ├─ VariableDeclaratorId "x"
         └─ VariableInitializer
            └─ Expression
               └─ PrimaryExpression
                  ├─ PrimaryPrefix
                  │  └─ Name "Foo"
                  └─ PrimarySuffix
                     └─ MemberSelector
                        └─ MethodReference "method"

New AST (PMD 7):
└─ ExpressionStatement
   └─ MethodCall "staticMethod"
      ├─ TypeExpression
      │  └─ ClassType "Foo"
      └─ ArgumentList (0)

└─ IfStatement
   ├─ InfixExpression "instanceof"
   │  ├─ VariableAccess[ @AccessType = "READ" ] "x"
   │  └─ TypeExpression
   │     └─ ClassType "Foo"
   └─ Block

└─ LocalVariableDeclaration
   ├─ ModifierList
   └─ VariableDeclarator
      ├─ VariableId "x"
      └─ MethodReference "method"
         └─ TypeExpression
            └─ ClassType "Foo"

Merge unary expressions
Unary Expressions Examples

Code:
++a;
--b;
c++;
d--;

Old AST (PMD 6):
└─ StatementExpression
   └─ PreIncrementExpression
      └─ PrimaryExpression
         └─ PrimaryPrefix
            └─ Name "a"

└─ StatementExpression
   └─ PreDecrementExpression
      └─ PrimaryExpression
         └─ PrimaryPrefix
            └─ Name "b"

└─ StatementExpression
   └─ PostfixExpression "++"
      └─ PrimaryExpression
         └─ PrimaryPrefix
            └─ Name "c"

└─ StatementExpression
   └─ PostfixExpression "--"
      └─ PrimaryExpression
         └─ PrimaryPrefix
            └─ Name "d"

New AST (PMD 7):
└─ ExpressionStatement
   └─ UnaryExpression[ @Prefix = true() ][ @Operator = '++' ]
      └─ VariableAccess[ @AccessType = "WRITE" ] "a"

└─ ExpressionStatement
   └─ UnaryExpression[ @Prefix = true() ][ @Operator = '--' ]
      └─ VariableAccess[ @AccessType = "WRITE" ] "b"

└─ ExpressionStatement
   └─ UnaryExpression[ @Prefix = false() ][ @Operator = '++' ]
      └─ VariableAccess[ @AccessType = "WRITE" ] "c"

└─ ExpressionStatement
   └─ UnaryExpression[ @Prefix = false() ][ @Operator = '--' ]
      └─ VariableAccess[ @AccessType = "WRITE" ] "d"

Code:
x = ~a;
x = +a;

Old AST (PMD 6):
└─ UnaryExpressionNotPlusMinus "~"
   └─ PrimaryExpression
      └─ PrimaryPrefix
         └─ Name "a"

└─ UnaryExpression "+"
   └─ PrimaryExpression
      └─ PrimaryPrefix
         └─ Name "a"

New AST (PMD 7):
└─ UnaryExpression[ @Prefix = true() ] "~"
   └─ VariableAccess "a"

└─ UnaryExpression[ @Prefix = true() ] "+"
   └─ VariableAccess "a"

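When porting XPath rules, the mapping shown in the examples above can be written down as a small lookup from the PMD 6 node name to the UnaryExpression step with its @Prefix and @Operator attributes. This is only a sketch: the class and method names are hypothetical and not part of PMD, and the generated strings simply mirror the attribute notation used in the examples.

```java
// Illustrative lookup only; class and method names are hypothetical, not PMD API.
class UnaryMigrationSketch {

    /**
     * Returns an XPath step for PMD 7 that corresponds to a PMD 6 unary node.
     * The image is the operator text, which PMD 6 kept on some of these nodes.
     */
    static String toPmd7XPath(String pmd6Node, String image) {
        switch (pmd6Node) {
            case "PreIncrementExpression":
                return unary(true, "++");
            case "PreDecrementExpression":
                return unary(true, "--");
            case "PostfixExpression":
                return unary(false, image);
            case "UnaryExpression":
            case "UnaryExpressionNotPlusMinus":
                return unary(true, image);
            default:
                throw new IllegalArgumentException("Not a PMD 6 unary node: " + pmd6Node);
        }
    }

    /** Renders the step; the boolean becomes the XPath function call true() or false(). */
    private static String unary(boolean prefix, String operator) {
        return "UnaryExpression[@Prefix = " + prefix + "()][@Operator = '" + operator + "']";
    }

    public static void main(String[] args) {
        System.out.println(toPmd7XPath("PreIncrementExpression", "++"));
        System.out.println(toPmd7XPath("PostfixExpression", "--"));
        System.out.println(toPmd7XPath("UnaryExpressionNotPlusMinus", "~"));
    }
}
```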
Binary operators are left-recursive
  • What: In PMD 6, each operator had its own AST node type (like AdditiveExpression, AndExpression, …). These are now unified into a single InfixExpression, which gives access to the operator via getOperator() and to the operands (getLhs(), getRhs()). Additionally, the resulting AST is no longer flat but a nested tree.
  • Why: Separate AST node types add no information that the operator doesn’t already provide. Because expressions are now parsed left-recursively, the AST is closer to the JLS grammar, which makes the type mapping algorithms easier. The nested structure also records which operands are used with which operator. In PMD 6, this information was lost whenever more than two operands were used, because the tree was flattened.
  • Related issue: [java] Make binary operators left-recursive (#1979)
Binary operators Examples

Code:
int i = 1 * 2 * 3 % 4;

Old AST (PMD 6):
└─ Expression
   └─ MultiplicativeExpression "%"
      ├─ PrimaryExpression
      │  └─ PrimaryPrefix
      │     └─ Literal "1"
      ├─ PrimaryExpression
      │  └─ PrimaryPrefix
      │     └─ Literal "2"
      ├─ PrimaryExpression
      │  └─ PrimaryPrefix
      │     └─ Literal "3"
      └─ PrimaryExpression
         └─ PrimaryPrefix
            └─ Literal "4"

New AST (PMD 7):
└─ InfixExpression[ @Operator = '%' ]
   ├─ InfixExpression[@Operator='*']
   │  ├─ InfixExpression[@Operator='*']
   │  │  ├─ NumericLiteral[@ValueAsInt=1]
   │  │  └─ NumericLiteral[@ValueAsInt=2]
   │  └─ NumericLiteral[@ValueAsInt=3]
   └─ NumericLiteral[@ValueAsInt=4]

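To make the left-recursive shape concrete, here is a toy sketch that folds a run of operators of equal precedence from left to right, so that each new operator takes everything parsed so far as its left operand. Node, Literal and Infix are hypothetical classes written for this example; they are not PMD's ASTInfixExpression, and the sketch deliberately ignores operator precedence.

```java
import java.util.Arrays;
import java.util.List;

// Toy model for illustration; Node, Literal and Infix are hypothetical, not PMD classes.
class LeftRecursiveSketch {

    interface Node {
        String render();
    }

    static final class Literal implements Node {
        final String text;

        Literal(String text) {
            this.text = text;
        }

        public String render() {
            return text;
        }
    }

    /** A binary node with exactly two operands, like one level of the new tree. */
    static final class Infix implements Node {
        final Node lhs;
        final String operator;
        final Node rhs;

        Infix(Node lhs, String operator, Node rhs) {
            this.lhs = lhs;
            this.operator = operator;
            this.rhs = rhs;
        }

        public String render() {
            return "(" + lhs.render() + " " + operator + " " + rhs.render() + ")";
        }
    }

    /**
     * Builds a left-nested tree from tokens that alternate operand, operator, operand.
     * Assumes all operators share one precedence level, as *, / and % do in Java.
     */
    static Node parse(List<String> tokens) {
        Node left = new Literal(tokens.get(0));
        for (int i = 1; i + 1 < tokens.size(); i += 2) {
            left = new Infix(left, tokens.get(i), new Literal(tokens.get(i + 1)));
        }
        return left;
    }

    public static void main(String[] args) {
        Node tree = parse(Arrays.asList("1", "*", "2", "*", "3", "%", "4"));
        System.out.println(tree.render()); // prints (((1 * 2) * 3) % 4)
    }
}
```

The rendered parentheses match the nesting in the new AST above: the % node is outermost and the first * node is innermost.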
Parenthesized expressions
  • What: Parentheses are not modelled in the AST anymore, but can be checked with the attributes @Parenthesized and @ParenthesisDepth.
  • Why: This keeps the tree flat while still preserving the information. The tree is the same when there are unnecessary parentheses, which makes it harder to fool rules that look at the structure of the tree.
  • Related issue: [java] Remove ParenthesizedExpr (#1872)
Parenthesized expressions Examples

Code:
a = (((1)));

Old AST (PMD 6):
└─ StatementExpression
   ├─ PrimaryExpression
   │  └─ PrimaryPrefix
   │     └─ Name "a"
   ├─ AssignmentOperator "="
   └─ Expression
      └─ PrimaryExpression
         └─ PrimaryPrefix
            └─ Expression
               └─ PrimaryExpression
                  └─ PrimaryPrefix
                     └─ Expression
                        └─ PrimaryExpression
                           └─ PrimaryPrefix
                              └─ Expression
                                 └─ PrimaryExpression
                                    └─ PrimaryPrefix
                                       └─ Literal "1"

New AST (PMD 7):
└─ ExpressionStatement
   └─ AssignmentExpression
      ├─ VariableAccess "a"
      └─ NumericLiteral[ @Parenthesized = true() ][ @ParenthesisDepth = 3 ] "1"
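The idea of recording parentheses as data on the node instead of as extra tree levels can be sketched as follows. ParenInfo and strip are hypothetical names made up for this example and work on plain strings; PMD itself exposes the information through the @Parenthesized and @ParenthesisDepth attributes mentioned above.

```java
// Toy illustration; ParenInfo and strip are hypothetical names, not PMD API.
class ParenDepthSketch {

    static final class ParenInfo {
        final String expression; // expression text without the enclosing parentheses
        final int depth;         // number of enclosing parenthesis pairs that were removed

        ParenInfo(String expression, int depth) {
            this.expression = expression;
            this.depth = depth;
        }

        boolean isParenthesized() {
            return depth > 0;
        }
    }

    /** Strips fully enclosing parenthesis pairs and records how many were removed. */
    static ParenInfo strip(String source) {
        String text = source.trim();
        int depth = 0;
        while (text.length() >= 2 && text.charAt(0) == '(' && enclosesWhole(text)) {
            text = text.substring(1, text.length() - 1).trim();
            depth++;
        }
        return new ParenInfo(text, depth);
    }

    /** True if the '(' at index 0 is closed by the ')' at the very end of the text. */
    private static boolean enclosesWhole(String text) {
        int open = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '(') {
                open++;
            } else if (c == ')') {
                open--;
                if (open == 0) {
                    return i == text.length() - 1;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        ParenInfo info = strip("(((1)))");
        System.out.println(info.expression + " depth=" + info.depth); // prints 1 depth=3
    }
}
```

An input such as (a) + (b) is left untouched with depth 0, because its first parenthesis does not enclose the whole expression.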

Language versions

  • Since all languages now have defined language versions, you can now write rules that apply only to specific versions (using minimumLanguageVersion and maximumLanguageVersion).
  • All languages have a default version. If no specific version is given on the CLI with --use-version, this default version is used. Usually the latest version is the default version.
  • The available versions for each language are listed in the CLI help message: pmd check --help.
  • See also Changed: Language versions

Migrating custom CPD language modules

This is only relevant if you are maintaining a CPD language module for a custom language.

  • Extend CpdOnlyLanguageModuleBase instead of AbstractLanguage.
  • Use TokenManager instead of AntlrTokenManager.
  • Where AntlrTokenFilter was used, also use TokenManager.
  • Where AntlrTokenFilter was extended, extend BaseTokenFilter instead.
  • CPD module discovery has changed. The service loader no longer loads src/main/resources/META-INF/services/net.sourceforge.pmd.cpd.Language but instead src/main/resources/META-INF/services/net.sourceforge.pmd.lang.Language. This is the unified language interface for both PMD and CPD capable languages. See also the subinterfaces CpdCapableLanguage and PmdCapableLanguage.
  • The documentation How to add a new CPD language has been updated to reflect these changes.
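The discovery change follows the standard JDK ServiceLoader convention, in which the registration file under META-INF/services is named after the fully qualified service interface and lists the implementing classes. The sketch below shows that convention with a hypothetical ExampleLanguage interface standing in for net.sourceforge.pmd.lang.Language, so that it compiles without PMD on the class path; the helper names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Sketch of the JDK ServiceLoader convention. ExampleLanguage is a hypothetical
// stand-in for the real service interface net.sourceforge.pmd.lang.Language.
class ServiceDiscoverySketch {

    interface ExampleLanguage {
        String getName();
    }

    /** The registration file is named after the fully qualified service interface. */
    static String registrationFile(String serviceInterfaceName) {
        return "META-INF/services/" + serviceInterfaceName;
    }

    /** Lists registered providers; the list is empty if no registration file is found. */
    static List<String> discover() {
        List<String> names = new ArrayList<>();
        for (ExampleLanguage language : ServiceLoader.load(ExampleLanguage.class)) {
            names.add(language.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // File that has to exist in a PMD 7 language module (path taken from the text above).
        System.out.println("src/main/resources/" + registrationFile("net.sourceforge.pmd.lang.Language"));
        System.out.println("providers found for the stand-in interface: " + discover());
    }
}
```

A module that still ships only the old net.sourceforge.pmd.cpd.Language file would therefore not be found, because nothing looks up that file name anymore.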

Build Tools

Ant

  • The Ant tasks PMDTask and CPDTask have been moved from the module pmd-core into the new module pmd-ant.
  • You need to add this dependency/jar file (net.sourceforge.pmd:pmd-ant) to the class path in order to import the tasks into your build file.
  • If you follow the guide Ant Task Usage, no change is needed, since the pmd-ant jar file is included in the binary distribution of PMD. It is part of PMD’s lib folder.

Maven

  • Due to some changes in PMD’s API, you can’t simply pull in the new PMD 7 dependency.
  • However, there is now a compatibility module that makes it possible to use PMD 7 with Maven. In addition to the PMD 7 dependencies documented in Upgrading PMD at Runtime, you need to add the following dependency (first available version is 7.0.0-rc4):
<dependency>
  <groupId>net.sourceforge.pmd</groupId>
  <artifactId>pmd-compat6</artifactId>
  <version>${pmdVersion}</version>
</dependency>

It is important to add this dependency as the first in the list, so that maven-pmd-plugin sees the (old) compatible versions of some classes.

This module is available beginning with version 7.0.0-rc4 and will be there at least for the first final version of PMD 7 (7.0.0). It’s not decided yet whether we will keep updating it after PMD 7 is finally released.

Note: This compatibility module only works for the built-in rules that are still available in PMD 7. You still need to review your rulesets and look out for deprecated rules and the like. See the use case I’m using only built-in rules.

As PMD 7 revamped the Java module, if you have custom rules, you need to migrate these rules. See the use case I’m using custom rules.

Gradle

  • Gradle internally uses PMD’s Ant task to execute PMD.
  • You can set toolVersion = "7.0.0-SNAPSHOT", but you also need to configure the dependencies manually for now, since the Ant task lives in its own dependency with PMD 7:
    pmd 'net.sourceforge.pmd:pmd-ant:7.0.0-SNAPSHOT'
    pmd 'net.sourceforge.pmd:pmd-java:7.0.0-SNAPSHOT'
    
  • Gradle 8.3 most likely will support PMD 7 out of the box.
  • See Support for PMD 7.0