diff --git a/docs/pages/pmd/devdocs/major_contributions/adding_new_cpd_language.md b/docs/pages/pmd/devdocs/major_contributions/adding_new_cpd_language.md
index 64bcacf81a..2cf623881e 100644
--- a/docs/pages/pmd/devdocs/major_contributions/adding_new_cpd_language.md
+++ b/docs/pages/pmd/devdocs/major_contributions/adding_new_cpd_language.md
@@ -3,7 +3,7 @@ title: How to add a new CPD language
 short_title: Adding a new CPD language
 tags: [devdocs, extending]
 summary: How to add a new language module with CPD support.
-last_updated: April 2023 (7.0.0)
+last_updated: June 2024 (7.3.0)
 permalink: pmd_devdocs_major_adding_new_cpd_language.html
 author: Matías Fraga, Clément Fournier
 ---
@@ -45,8 +45,15 @@ Use the following guide to set up a new language module that supports CPD.
      }
      ```
 
+   - If your language is case-insensitive, then you might want to overwrite `getImage(AntlrToken)`. There you can
+     change each token e.g. into uppercase, so that CPD sees the same strings and can find duplicates even when
+     the casing differs. See {% jdoc tsql::lang.tsql.cpd.TSqlCpdLexer %} for an example. You will also need a
+     "CaseChangingCharStream", so that antlr itself is case-insensitive.
    - For JavaCC grammars, place your grammar in `etc/grammar` and edit the `pom.xml` like the [Python implementation](https://github.com/pmd/pmd/blob/master/pmd-python/pom.xml) does.
      You can then subclass {% jdoc core::cpd.impl.JavaccCpdLexer %} instead of AntlrCpdLexer.
+   - If your JavaCC based language is case-insensitive (option `IGNORE_CASE=true`), then you need to implement
+     {%jdoc core::lang.ast.impl.javacc.JavaccTokenDocument.TokenDocumentBehavior %}, which can change each token
+     e.g. into uppercase. See {%jdoc plsql::lang.plsql.ast.PLSQLParser %} for an example.
    - For any other scenario just implement the interface however you can. Look at the Scala or Apex module for existing implementations.
 
 3. Create a {% jdoc core::lang.Language %} implementation, and make it implement {% jdoc core::cpd.CpdCapableLanguage %}.
diff --git a/docs/pages/release_notes.md b/docs/pages/release_notes.md
index 3efe00d5c4..8b6b88b2fe 100644
--- a/docs/pages/release_notes.md
+++ b/docs/pages/release_notes.md
@@ -39,6 +39,7 @@ See also [Maven PMD Plugin]({{ baseurl }}pmd_userdocs_tools_maven.html).
 * cli
   * [#2827](https://github.com/pmd/pmd/issues/2827): \[cli] Consider processing errors in exit status
 * core
+  * [#4396](https://github.com/pmd/pmd/issues/4396): \[core] CPD is always case sensitive
   * [#4992](https://github.com/pmd/pmd/pull/4992): \[core] CPD: Include processing errors in XML report
 * apex
   * [#4922](https://github.com/pmd/pmd/issues/4922): \[apex] SOQL syntax error with TYPEOF in sub-query
@@ -112,11 +113,18 @@ read the XML format should be updated.
   * {% jdoc !!core::cpd.CPDConfiguration#isSkipLexicalErrors() %} and {% jdoc core::cpd.CPDConfiguration#setSkipLexicalErrors(boolean) %}: Use {%jdoc core::AbstractConfiguration#setFailOnError(boolean) %} to control whether to ignore errors or fail the build.
   * {%jdoc !!core::cpd.XMLOldRenderer %} (the CPD format "xmlold").
+  * The constructor
+    {%jdoc !!core::lang.ast.impl.antlr4.AntlrToken#AntlrToken(org.antlr.v4.runtime.Token,core::lang.ast.impl.antlr4.AntlrToken,core::lang.document.TextDocument) %}
+    shouldn't be used directly. Use {%jdoc core::lang.ast.impl.antlr4.AntlrTokenManager %} instead.
 * pmd-java
   * {% jdoc !!java::lang.java.ast.ASTResource#getStableName() %} and the corresponding attribute `@StableName`.
   * {%jdoc !!java::lang.java.ast.ASTRecordPattern#getVarId() %} This method was added here by mistake. Record patterns don't declare a pattern variable for the whole pattern, but rather for individual record components, which can be accessed via {%jdoc java::lang.java.ast.ASTRecordPattern#getComponentPatterns() %}.
+* pmd-plsql
+  * {%jdoc plsql::lang.plsql.ast.PLSQLParserImpl %} is deprecated now. It should have been package-private
+    because this is an implementation class that should not be used directly.
+  * The node {%jdoc plsql::lang.plsql.ast.ASTKEYWORD_UNRESERVED %} is deprecated and is now removed from the AST.
 
 #### Breaking changes: pmd-compat6 removed
diff --git a/pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/impl/javacc/JavaccToken.java b/pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/impl/javacc/JavaccToken.java
index f8e2d1442f..70bbc59b8a 100644
--- a/pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/impl/javacc/JavaccToken.java
+++ b/pmd-core/src/main/java/net/sourceforge/pmd/lang/ast/impl/javacc/JavaccToken.java
@@ -147,7 +147,12 @@ public class JavaccToken implements GenericToken {
         return image.toString();
     }
 
-    /** Returns the original text of the token. The image may be normalized. */
+    /**
+     * Returns the original text of the token.
+     * The image may be normalized, e.g. for case-insensitive languages.
+     *
+     * @since 7.3.0
+     */
     public Chars getText() {
         return document.getTextDocument().sliceOriginalText(getRegion());
     }
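
Note on the documentation change: the idea of normalizing each token image to uppercase, so that CPD compares the same strings regardless of casing, can be sketched without any PMD dependency. The class and method names below (`CaseInsensitiveImages`, `normalizeImage`, `images`) are hypothetical stand-ins for illustration only and are not part of PMD; a real lexer would apply the same transformation in its `getImage(AntlrToken)` override or its `TokenDocumentBehavior`. Using `Locale.ROOT` is an assumption made here to keep the result independent of the JVM's default locale.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical stand-in, not a PMD class: illustrates image normalization only. */
public class CaseInsensitiveImages {

    /** Uppercases a token's text with a fixed locale so the result does not depend on the JVM default. */
    public static String normalizeImage(String originalText) {
        return originalText.toUpperCase(Locale.ROOT);
    }

    /** Maps every token text to its normalized image; the original texts are left untouched. */
    public static List<String> images(List<String> tokenTexts) {
        List<String> result = new ArrayList<>();
        for (String text : tokenTexts) {
            result.add(normalizeImage(text));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> first = List.of("select", "id", "from", "users");
        List<String> second = List.of("SELECT", "Id", "FROM", "Users");

        // The raw texts differ in casing, so comparing them directly finds no match.
        System.out.println(first.equals(second));                 // prints false
        // The normalized images are identical, so a detector comparing images would pair them.
        System.out.println(images(first).equals(images(second))); // prints true
    }
}
```

The original text stays available separately from the normalized image, which mirrors the distinction the updated `getText()` Javadoc draws between a token's original text and its possibly normalized image.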