From 01eacda5a6caeb778782c32357c38c3c3d57b1bd Mon Sep 17 00:00:00 2001 From: Andreas Dangel Date: Tue, 7 Sep 2021 19:14:43 +0200 Subject: [PATCH 1/3] [doc] Fix typo with mvn generate-sources --- .../major_contributions/adding_a_new_antlr_based_language.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md b/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md index 7f156054d7..babf8b5812 100644 --- a/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md +++ b/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md @@ -52,7 +52,7 @@ folder: pmd/devdocs ## 4. Generate your parser * Make sure, you have the property `true` in your `pom.xml` file. * This is just a matter of building the language module. ANTLR is called via ant, and this step is added - to the phase `generate-sources`. So you can just call e.g. `./mvnw generate-source -pl pmd-swift` to + to the phase `generate-sources`. So you can just call e.g. `./mvnw generate-sources -pl pmd-swift` to have the parser generated. * The generated code will be placed under `target/generated-sources/antlr4` and will not be committed to source control. From d82b245e53371361966e24d7043bbee8bba6b607 Mon Sep 17 00:00:00 2001 From: Andreas Dangel Date: Wed, 8 Sep 2021 20:26:11 +0200 Subject: [PATCH 2/3] Fix typo --- .../major_contributions/adding_a_new_antlr_based_language.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md b/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md index babf8b5812..90cd18a417 100644 --- a/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md +++ b/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md @@ -24,7 +24,7 @@ folder: pmd/devdocs ## 3. 
Create AST node classes * The individual AST nodes are generated, but you need to define the common interface for them. -* You need a need to define the supertype interface for all nodes of the language. For that, we provide +* You need to define the supertype interface for all nodes of the language. For that, we provide [`AntlrNode`](https://github.com/pmd/pmd/blob/pmd/7.0.x/pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/impl/antlr4/AntlrNode.java). * See [`SwiftNode`](https://github.com/pmd/pmd/blob/pmd/7.0.x/pmd-swift/src/main/java/net/sourceforge/pmd/lang/swift/ast/SwiftNode.java) as an example. From 39807f325f28184fa55b593949be9d0c66f4c176 Mon Sep 17 00:00:00 2001 From: Andreas Dangel Date: Thu, 14 Oct 2021 14:49:17 +0200 Subject: [PATCH 3/3] [doc] Add a warning box in major language contributions --- .../adding_a_new_antlr_based_language.md | 43 +++++++++++++++++-- .../adding_a_new_javacc_based_language.md | 34 ++++++++++++--- 2 files changed, 69 insertions(+), 8 deletions(-) diff --git a/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md b/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md index 90cd18a417..567db3d6c2 100644 --- a/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md +++ b/docs/pages/pmd/devdocs/major_contributions/adding_a_new_antlr_based_language.md @@ -3,17 +3,54 @@ title: Adding PMD support for a new ANTLR grammar based language short_title: Adding a new language with ANTLR tags: [devdocs, extending] summary: "How to add a new language to PMD using ANTLR grammar." 
-last_updated: July 21, 2019 +last_updated: October 2021 sidebar: pmd_sidebar permalink: pmd_devdocs_major_adding_new_language_antlr.html folder: pmd/devdocs -# needs to be changed to branch master instead of pmd/7.0.x +# +# needs to be changed to branch master instead of pmd/7.0.x once pmd7 is released # https://github.com/pmd/pmd/blob/pmd/7.0.x -> https://github.com/pmd/pmd/blob/master +# --- +{% include callout.html type="warning" content=" -## 1. Start with a new sub-module. +**Before you start...**

+ +This is a really big contribution and can't be done as a drive-by contribution. It requires dedication +and long-term commitment to implement support for a new language.

+ +This step-by-step guide is only a brief introduction to get the basics started. It is not necessarily up to date +or complete, so you have to be able to fill in the blanks.

+ +Currently, the ANTLR integration has some basic limitations compared to JavaCC: the output of the +ANTLR parser generator is not an abstract syntax tree (AST) but a parse tree. A parse tree is +much more fine-grained than the tree a typical JavaCC grammar produces. This means that the +parse tree is much deeper and contains nodes down to the individual token types.

+ +The ANTLR nodes themselves don't have any attributes because they are on the wrong abstraction level. +As a result, there are no attributes that can be used in XPath-based rules.

+ +To overcome these limitations, one would need to implement a post-processing step that transforms +the parse tree into an abstract syntax tree and introduces real nodes on a higher abstraction level. This +step is **not** described in this guide.

+ +Once basic support for a language is in place, many features are still missing. Typical features +that can greatly improve rule writing are a symbol table, type resolution, and call/data flow analysis.

+ +A symbol table keeps track of variables and their usages. Type resolution tries to find the actual class type +of each used type, following along method calls (including overloaded and overridden methods), and allows +querying subtypes and the type hierarchy. This requires additional configuration of an auxiliary classpath. +Call and data flow analysis keeps track of data as it moves through the different execution paths +of a program.

+ +These features are out of scope for this guide. Type resolution and data flow analysis are features that +definitely don't come for free. Implementing them takes a lot of effort and perseverance.

+ +" %} + +## 1. Start with a new sub-module * See pmd-swift for examples. ## 2. Implement an AST parser for your language diff --git a/docs/pages/pmd/devdocs/major_contributions/adding_a_new_javacc_based_language.md b/docs/pages/pmd/devdocs/major_contributions/adding_a_new_javacc_based_language.md index c3b470247c..7e14a56a24 100644 --- a/docs/pages/pmd/devdocs/major_contributions/adding_a_new_javacc_based_language.md +++ b/docs/pages/pmd/devdocs/major_contributions/adding_a_new_javacc_based_language.md @@ -1,16 +1,40 @@ --- -title: Adding PMD support for a new JAVACC grammar based language -short_title: Adding a new language with JAVACC +title: Adding PMD support for a new JavaCC grammar based language +short_title: Adding a new language with JavaCC tags: [devdocs, extending] -summary: "How to add a new language to PMD using JAVACC grammar." -last_updated: October 5, 2019 +summary: "How to add a new language to PMD using JavaCC grammar." +last_updated: October 2021 sidebar: pmd_sidebar permalink: pmd_devdocs_major_adding_new_language_javacc.html folder: pmd/devdocs --- +{% include callout.html type="warning" content=" -## 1. Start with a new sub-module. +**Before you start...**

+ +This is a really big contribution and can't be done as a drive-by contribution. It requires dedication +and long-term commitment to implement support for a new language.

+ +This step-by-step guide is only a brief introduction to get the basics started. It is not necessarily up to date +or complete, so you have to be able to fill in the blanks.

+ +Once basic support for a language is in place, many features are still missing. Typical features +that can greatly improve rule writing are a symbol table, type resolution, and call/data flow analysis.

+ +A symbol table keeps track of variables and their usages. Type resolution tries to find the actual class type +of each used type, following along method calls (including overloaded and overridden methods), and allows +querying subtypes and the type hierarchy. This requires additional configuration of an auxiliary classpath. +Call and data flow analysis keeps track of data as it moves through the different execution paths +of a program.

+ +These features are out of scope for this guide. Type resolution and data flow analysis are features that +definitely don't come for free. Implementing them takes a lot of effort and perseverance.

+ +" %} + + +## 1. Start with a new sub-module * See pmd-java or pmd-vm for examples. ## 2. Implement an AST parser for your language