Merge pull request #3496 from adangel:pmd7-antlr-doc
[doc] Improve Antlr documentation #3496
@@ -3,17 +3,54 @@ title: Adding PMD support for a new ANTLR grammar based language
short_title: Adding a new language with ANTLR
tags: [devdocs, extending]
summary: "How to add a new language to PMD using ANTLR grammar."
last_updated: July 21, 2019
last_updated: October 2021
sidebar: pmd_sidebar
permalink: pmd_devdocs_major_adding_new_language_antlr.html
folder: pmd/devdocs

# needs to be changed to branch master instead of pmd/7.0.x
#
# needs to be changed to branch master instead of pmd/7.0.x once pmd7 is released
# https://github.com/pmd/pmd/blob/pmd/7.0.x -> https://github.com/pmd/pmd/blob/master
#
---

{% include callout.html type="warning" content="

## 1. Start with a new sub-module.
**Before you start...**<br><br>

This is really a big contribution and can't be done with a drive by contribution. It requires dedicated passion
and long commitment to implement support for a new language.<br><br>

This step by step guide is just a small intro to get the basics started and it's also not necessarily up-to-date
or complete and you have to be able to fill in the blanks.<br><br>

Currently the Antlr integration has some basic limitations compared to JavaCC: The output of the
Antlr parser generator is not an abstract syntax tree (AST) but a parse tree. As such, a parse tree is
much more fine-grained than what a typical JavaCC grammar will produce. This means that the
parse tree is much deeper and contains nodes down to the different token types.<br><br>

The Antlr nodes themselves don't have any attributes because they are on the wrong abstraction level.
As they don't have attributes, there are no attributes that can be used in XPath based rules.<br><br>

In order to overcome these limitations, one would need to implement a post-processing step that transforms
a parse tree into an abstract syntax tree and introducing real nodes on a higher abstraction level. This
step is **not** described in this guide.<br><br>

After the basic support for a language is there, there are lots of missing features left. Typical features
that can greatly improve rule writing are: symbol table, type resolution, call/data flow analysis.<br><br>

Symbol table keeps track of variables and their usages. Type resolution tries to find the actual class type
of each used type, following along method calls (including overloaded and overwritten methods), allowing
to query sub types and type hierarchy. This requires additional configuration of an auxiliary classpath.
Call and data flow analysis keep track of the data as it is moving through different execution paths
a program has.<br><br>

These features are out of scope of this guide. Type resolution and data flow are features that
definitely don't come for free. It is much effort and requires perseverance to implement.<br><br>

" %}
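
The difference between a fine-grained parse tree and a coarser tree, as the warning above describes it, can be illustrated with a toy sketch. This is not PMD's mechanism and not the post-processing step the guide leaves out. `TreeNode` and `ParseTreeSimplifier` are hypothetical names invented for this example, and the only transformation shown is skipping chains of single-child nodes, which is one reason a parse tree ends up much deeper than an AST. A real AST would additionally introduce dedicated node classes with attributes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree node, standing in for a generated parse tree node.
final class TreeNode {
    final String label;
    final List<TreeNode> children = new ArrayList<>();

    TreeNode(String label, TreeNode... kids) {
        this.label = label;
        for (TreeNode kid : kids) {
            children.add(kid);
        }
    }

    // Number of nodes on the longest path from this node down to a leaf.
    int depth() {
        int max = 0;
        for (TreeNode child : children) {
            max = Math.max(max, child.depth());
        }
        return 1 + max;
    }
}

// Hypothetical helper: builds a shallower copy of a tree.
final class ParseTreeSimplifier {
    // Skips every node that has exactly one child, then copies the
    // remaining node and simplifies its children recursively.
    static TreeNode collapse(TreeNode node) {
        TreeNode current = node;
        while (current.children.size() == 1) {
            current = current.children.get(0);
        }
        TreeNode result = new TreeNode(current.label);
        for (TreeNode child : current.children) {
            result.children.add(collapse(child));
        }
        return result;
    }
}
```

For an input shaped like `expression -> additive -> (multiplicative -> primary -> 1, +, multiplicative -> primary -> 2)`, the collapsed copy keeps only `additive` with the three leaves as direct children.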

## 1. Start with a new sub-module
* See pmd-swift for examples.

## 2. Implement an AST parser for your language
@@ -24,7 +61,7 @@ folder: pmd/devdocs

## 3. Create AST node classes
* The individual AST nodes are generated, but you need to define the common interface for them.
* You need a need to define the supertype interface for all nodes of the language. For that, we provide
* You need to define the supertype interface for all nodes of the language. For that, we provide
[`AntlrNode`](https://github.com/pmd/pmd/blob/pmd/7.0.x/pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/impl/antlr4/AntlrNode.java).
* See [`SwiftNode`](https://github.com/pmd/pmd/blob/pmd/7.0.x/pmd-swift/src/main/java/net/sourceforge/pmd/lang/swift/ast/SwiftNode.java)
as an example.
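
The idea of a common supertype interface from step 3 might look roughly like the sketch below. To keep it compilable without PMD on the classpath, it uses a hypothetical stand-in `BaseParseNode` instead of the real `AntlrNode`, whose actual methods are not reproduced here. `MyLangNode`, `MyLangInnerNode` and `describe` are likewise made-up names for illustration. The linked `SwiftNode` remains the authoritative example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a framework-provided base node type.
// The self-referential type parameter lets children be typed as the
// language's own node interface rather than as a generic node.
interface BaseParseNode<N extends BaseParseNode<N>> {
    String getName();

    List<N> getChildren();
}

// Common supertype for all nodes of the made-up language "MyLang".
// Language-wide helper methods can live here as default methods.
interface MyLangNode extends BaseParseNode<MyLangNode> {
    default String describe() {
        return getName() + "(" + getChildren().size() + ")";
    }
}

// One concrete node class; in a real module such classes would be generated.
final class MyLangInnerNode implements MyLangNode {
    private final String name;
    private final List<MyLangNode> children = new ArrayList<>();

    MyLangInnerNode(String name) {
        this.name = name;
    }

    MyLangInnerNode add(MyLangNode child) {
        children.add(child);
        return this;
    }

    @Override
    public String getName() {
        return name;
    }

    @Override
    public List<MyLangNode> getChildren() {
        return children;
    }
}
```

Because every node implements `MyLangNode`, code that walks the tree can rely on one interface and never needs to cast children back to a language-specific type.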
@@ -52,7 +89,7 @@ folder: pmd/devdocs
## 4. Generate your parser
* Make sure, you have the property `<antlr4.visitor>true</antlr4.visitor>` in your `pom.xml` file.
* This is just a matter of building the language module. ANTLR is called via ant, and this step is added
to the phase `generate-sources`. So you can just call e.g. `./mvnw generate-source -pl pmd-swift` to
to the phase `generate-sources`. So you can just call e.g. `./mvnw generate-sources -pl pmd-swift` to
have the parser generated.
* The generated code will be placed under `target/generated-sources/antlr4` and will not be committed to
source control.
@@ -1,16 +1,40 @@
---
title: Adding PMD support for a new JAVACC grammar based language
short_title: Adding a new language with JAVACC
title: Adding PMD support for a new JavaCC grammar based language
short_title: Adding a new language with JavaCC
tags: [devdocs, extending]
summary: "How to add a new language to PMD using JAVACC grammar."
last_updated: October 5, 2019
summary: "How to add a new language to PMD using JavaCC grammar."
last_updated: October 2021
sidebar: pmd_sidebar
permalink: pmd_devdocs_major_adding_new_language_javacc.html
folder: pmd/devdocs
---

{% include callout.html type="warning" content="

## 1. Start with a new sub-module.
**Before you start...**<br><br>

This is really a big contribution and can't be done with a drive by contribution. It requires dedicated passion
and long commitment to implement support for a new language.<br><br>

This step by step guide is just a small intro to get the basics started and it's also not necessarily up-to-date
or complete and you have to be able to fill in the blanks.<br><br>

After the basic support for a language is there, there are lots of missing features left. Typical features
that can greatly improve rule writing are: symbol table, type resolution, call/data flow analysis.<br><br>

Symbol table keeps track of variables and their usages. Type resolution tries to find the actual class type
of each used type, following along method calls (including overloaded and overwritten methods), allowing
to query sub types and type hierarchy. This requires additional configuration of an auxiliary classpath.
Call and data flow analysis keep track of the data as it is moving through different execution paths
a program has.<br><br>

These features are out of scope of this guide. Type resolution and data flow are features that
definitely don't come for free. It is much effort and requires perseverance to implement.<br><br>

" %}


## 1. Start with a new sub-module
* See pmd-java or pmd-vm for examples.

## 2. Implement an AST parser for your language