<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="Learn how to use CPD, the copy-paste detector shipped with PMD.">
<meta name="keywords" content="cpduserdocs, ">
<title>Finding duplicated code with CPD | PMD Source Code Analyzer</title>
<link rel="stylesheet" type="text/css" href="assets/fontawesome-free-5.15.4-web/css/all.min.css">
<link rel="stylesheet" type="text/css" href="assets/bootstrap-4.5.2-dist/css/bootstrap.min.css">
<link rel="stylesheet" type="text/css" href="css/syntax.css">
<link rel="stylesheet" type="text/css" href="css/modern-business.css">
<link rel="stylesheet" type="text/css" href="css/customstyles.css">
<link rel="stylesheet" type="text/css" href="css/theme-green.css">
<link rel="stylesheet" type="text/css" href="css/pmd-customstyles.css">
<link rel="shortcut icon" href="images/logo/favicon.ico" type="image/x-icon">
<link rel="icon" href="images/logo/favicon.ico" type="image/x-icon">
<link rel="alternate" type="application/rss+xml" title="" href="feed.xml">
</head>
<body>
<!-- Content is offset by the height of the topnav bar. -->
<!-- There's already a padding-top rule in modern-business.css, but it apparently doesn't work on Firefox 60 and Chrome 67 -->
<div id="topbar-content-offset">
<!-- Navigation -->
<nav class="navbar navbar-expand-lg fixed-top navbar-dark">
<div class="container topnavlinks">
<a class="navbar-brand fas fa-home fa-lg" href="index.html">&nbsp;<span class="projectTitle"> PMD Source Code Analyzer Project</span></a>
<button class="navbar-toggler" type="button" data-toggle="collapse" data-target="#navbarSupportedContent" aria-controls="navbarSupportedContent" aria-expanded="false" aria-label="Toggle navigation">
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarSupportedContent">
<ul class="navbar-nav mr-auto mt-2 mt-lg-0"></ul>
<ul class="navbar-nav">
<!-- toggle sidebar button -->
<li class="nav-item"><a id="tg-sb-link" class="nav-link" href="#"><i id="tg-sb-icon" class="fas fa-toggle-on"></i> Nav</a></li>
<!-- entries without drop-downs appear here -->
<li class="nav-item"><a class="nav-link" href="https://github.com/pmd/pmd/releases/latest" target="_blank">Download</a></li>
<li class="nav-item"><a class="nav-link" href="https://github.com/pmd/pmd" target="_blank">Fork us on github</a></li>
<!-- entries with drop-downs appear here -->
<!-- conditional logic to control which topnav appears for the audience defined in the configuration file.-->
</ul>
<form class="form-inline my-2 my-lg-0">
<input class="form-control mr-sm-2" type="search" placeholder="search..." id="search-input">
<ul id="results-container"></ul>
</form>
</div>
</div>
</nav>
<!-- Page Content -->
<div class="container-toc-wrapper">
<div class="container">
<div class="col-lg-12">&nbsp;</div>
<!-- Content Row -->
<div class="row">
<!-- Sidebar Column -->
<div class="col-md-3" id="tg-sb-sidebar">
<ul id="mysidebar" class="nav">
<li class="sidebarTitle">PMD 7.0.0-SNAPSHOT</li>
<div class="sidebarTitleDate">Release date: ??-?????-2023</div>
<li>
<a href="#">About</a>
<ul>
<li><a href="index.html">Home</a></li>
<li><a href="pmd_release_notes.html">Release notes</a></li>
<li><a href="pmd_release_notes_pmd7.html">Release notes (PMD 7)</a></li>
<li><a href="pmd_about_help.html">Getting help</a></li>
</ul>
</li>
<li>
<a href="#">User Documentation</a>
<ul>
<li><a href="pmd_userdocs_migrating_to_pmd7.html">Migration Guide for PMD 7</a></li>
<li><a href="pmd_userdocs_installation.html">Installation and basic CLI usage</a></li>
<li><a href="pmd_userdocs_making_rulesets.html">Making rulesets</a></li>
<li><a href="pmd_userdocs_configuring_rules.html">Configuring rules</a></li>
<li><a href="pmd_userdocs_best_practices.html">Best practices</a></li>
<li><a href="pmd_userdocs_suppressing_warnings.html">Suppressing warnings</a></li>
<li><a href="pmd_userdocs_incremental_analysis.html">Incremental analysis</a></li>
<li><a href="pmd_userdocs_cli_reference.html">PMD CLI reference</a></li>
<li><a href="pmd_userdocs_report_formats.html">PMD Report formats</a></li>
<li><a href="pmd_userdocs_3rdpartyrulesets.html">3rd party rulesets</a></li>
<li class="subfolders">
<a href="#">CPD reference</a>
<ul>
<li class="active"><a href="pmd_userdocs_cpd.html">Copy-paste detection</a></li>
<li><a href="pmd_userdocs_cpd_report_formats.html">CPD Report formats</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Extending PMD</a>
<ul>
<li><a href="pmd_userdocs_extending_writing_rules_intro.html">Introduction to writing rules</a></li>
<li><a href="pmd_userdocs_extending_your_first_rule.html">Your first rule</a></li>
<li><a href="pmd_userdocs_extending_writing_xpath_rules.html">XPath rules</a></li>
<li><a href="pmd_userdocs_extending_writing_java_rules.html">Java rules</a></li>
<li><a href="pmd_userdocs_extending_designer_reference.html">Rule designer reference</a></li>
<li><a href="pmd_userdocs_extending_defining_properties.html">Defining rule properties</a></li>
<li><a href="pmd_userdocs_extending_rule_guidelines.html">Rule guidelines</a></li>
<li><a href="pmd_userdocs_extending_testing.html">Testing your rules</a></li>
<li><a href="pmd_userdocs_extending_ast_dump.html">Creating (XML) dump of the AST</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Tools / Integrations</a>
<ul>
<li><a href="pmd_userdocs_tools_maven.html">Maven PMD Plugin</a></li>
<li><a href="pmd_userdocs_tools_gradle.html">Gradle</a></li>
<li><a href="pmd_userdocs_tools_ant.html">Ant</a></li>
<li><a href="pmd_userdocs_tools_java_api.html">PMD Java API</a></li>
<li><a href="pmd_userdocs_tools_bld.html">bld PMD Extension</a></li>
<li><a href="pmd_userdocs_tools_ci.html">CI integrations</a></li>
<li><a href="pmd_userdocs_tools.html">Other Tools / Integrations</a></li>
</ul>
</li>
</ul>
</li>
<li>
<a href="#">Rule Reference</a>
<ul>
<li class="subfolders">
<a href="#">Apex Rules</a>
<ul>
<li><a href="pmd_rules_apex.html">Index</a></li>
<li><a href="pmd_rules_apex_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_apex_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_apex_design.html">Design</a></li>
<li><a href="pmd_rules_apex_documentation.html">Documentation</a></li>
<li><a href="pmd_rules_apex_errorprone.html">Error Prone</a></li>
<li><a href="pmd_rules_apex_performance.html">Performance</a></li>
<li><a href="pmd_rules_apex_security.html">Security</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">HTML Rules</a>
<ul>
<li><a href="pmd_rules_html.html">Index</a></li>
<li><a href="pmd_rules_html_bestpractices.html">Best Practices</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Java Rules</a>
<ul>
<li><a href="pmd_rules_java.html">Index</a></li>
<li><a href="pmd_rules_java_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_java_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_java_design.html">Design</a></li>
<li><a href="pmd_rules_java_documentation.html">Documentation</a></li>
<li><a href="pmd_rules_java_errorprone.html">Error Prone</a></li>
<li><a href="pmd_rules_java_multithreading.html">Multithreading</a></li>
<li><a href="pmd_rules_java_performance.html">Performance</a></li>
<li><a href="pmd_rules_java_security.html">Security</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Java Server Pages Rules</a>
<ul>
<li><a href="pmd_rules_jsp.html">Index</a></li>
<li><a href="pmd_rules_jsp_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_jsp_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_jsp_design.html">Design</a></li>
<li><a href="pmd_rules_jsp_errorprone.html">Error Prone</a></li>
<li><a href="pmd_rules_jsp_security.html">Security</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">JavaScript Rules</a>
<ul>
<li><a href="pmd_rules_ecmascript.html">Index</a></li>
<li><a href="pmd_rules_ecmascript_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_ecmascript_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_ecmascript_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Kotlin Rules</a>
<ul>
<li><a href="pmd_rules_kotlin.html">Index</a></li>
<li><a href="pmd_rules_kotlin_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_kotlin_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Maven POM Rules</a>
<ul>
<li><a href="pmd_rules_pom.html">Index</a></li>
<li><a href="pmd_rules_pom_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Modelica Rules</a>
<ul>
<li><a href="pmd_rules_modelica.html">Index</a></li>
<li><a href="pmd_rules_modelica_bestpractices.html">Best Practices</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">PLSQL Rules</a>
<ul>
<li><a href="pmd_rules_plsql.html">Index</a></li>
<li><a href="pmd_rules_plsql_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_plsql_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_plsql_design.html">Design</a></li>
<li><a href="pmd_rules_plsql_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Salesforce Visualforce Rules</a>
<ul>
<li><a href="pmd_rules_vf.html">Index</a></li>
<li><a href="pmd_rules_vf_security.html">Security</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Scala Rules</a>
<ul>
<li><a href="pmd_rules_scala.html">Index</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Swift Rules</a>
<ul>
<li><a href="pmd_rules_swift.html">Index</a></li>
<li><a href="pmd_rules_swift_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_swift_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Velocity Template Language (VTL) Rules</a>
<ul>
<li><a href="pmd_rules_vm.html">Index</a></li>
<li><a href="pmd_rules_vm_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_vm_design.html">Design</a></li>
<li><a href="pmd_rules_vm_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">WSDL Rules</a>
<ul>
<li><a href="pmd_rules_wsdl.html">Index</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">XML Rules</a>
<ul>
<li><a href="pmd_rules_xml.html">Index</a></li>
<li><a href="pmd_rules_xml_bestpractices.html">Best Practices</a></li>
<li><a href="pmd_rules_xml_errorprone.html">Error Prone</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">XSL Rules</a>
<ul>
<li><a href="pmd_rules_xsl.html">Index</a></li>
<li><a href="pmd_rules_xsl_codestyle.html">Code Style</a></li>
<li><a href="pmd_rules_xsl_performance.html">Performance</a></li>
</ul>
</li>
</ul>
</li>
<li>
<a href="#">Language-Specific Documentation</a>
<ul>
<li><a href="pmd_languages_index.html">Overview</a></li>
<li><a href="pmd_languages_configuration.html">Language configuration</a></li>
<li><a href="pmd_languages_apex.html">Apex</a></li>
<li><a href="pmd_languages_cpp.html">C/C++</a></li>
<li><a href="pmd_languages_cs.html">C#</a></li>
<li><a href="pmd_languages_coco.html">Coco</a></li>
<li><a href="pmd_languages_dart.html">Dart</a></li>
<li><a href="pmd_languages_fortran.html">Fortran</a></li>
<li><a href="pmd_languages_gherkin.html">Gherkin</a></li>
<li><a href="pmd_languages_go.html">Go</a></li>
<li><a href="pmd_languages_html.html">HTML</a></li>
<li><a href="pmd_languages_java.html">Java</a></li>
<li><a href="pmd_languages_js_ts.html">JavaScript / TypeScript</a></li>
<li><a href="pmd_languages_jsp.html">JSP</a></li>
<li><a href="pmd_languages_julia.html">Julia</a></li>
<li><a href="pmd_languages_kotlin.html">Kotlin</a></li>
<li><a href="pmd_languages_lua.html">Lua</a></li>
<li><a href="pmd_languages_matlab.html">Matlab</a></li>
<li><a href="pmd_languages_modelica.html">Modelica</a></li>
<li><a href="pmd_languages_objectivec.html">Objective-C</a></li>
<li><a href="pmd_languages_perl.html">Perl</a></li>
<li><a href="pmd_languages_php.html">PHP</a></li>
<li><a href="pmd_languages_plsql.html">PLSQL</a></li>
<li><a href="pmd_languages_python.html">Python</a></li>
<li><a href="pmd_languages_ruby.html">Ruby</a></li>
<li><a href="pmd_languages_scala.html">Scala</a></li>
<li><a href="pmd_languages_swift.html">Swift</a></li>
<li><a href="pmd_languages_tsql.html">T-SQL</a></li>
<li><a href="pmd_languages_visualforce.html">Visualforce</a></li>
<li><a href="pmd_languages_vm.html">Velocity Template Language (VTL)</a></li>
<li><a href="pmd_languages_xml.html">XML and XML dialects</a></li>
</ul>
</li>
<li>
<a href="#">Developer Documentation</a>
<ul>
<li><a href="pmd_devdocs_development.html">Developer resources</a></li>
<li><a href="pmd_devdocs_building.html">Building PMD from source</a></li>
<li><a href="https://github.com/pmd/pmd/blob/master/CONTRIBUTING.md" target="_blank">Contributing</a></li>
<li><a href="pmd_devdocs_writing_documentation.html">Writing documentation</a></li>
<li><a href="pmd_devdocs_roadmap.html">Roadmap</a></li>
<li><a href="pmd_devdocs_how_pmd_works.html">How PMD works</a></li>
<li><a href="pmd_devdocs_pmdtester.html">Pmdtester</a></li>
<li><a href="pmd_devdocs_rule_deprecation_policy.html">Rule Deprecation Policy</a></li>
<li class="subfolders">
<a href="#">Major contributions</a>
<ul>
<li><a href="pmd_devdocs_major_rule_guidelines.html">Rule Guidelines</a></li>
<li><a href="pmd_devdocs_major_adding_new_language_javacc.html">Adding a new language (JavaCC)</a></li>
<li><a href="pmd_devdocs_major_adding_new_language_antlr.html">Adding a new language (ANTLR)</a></li>
<li><a href="pmd_devdocs_major_adding_new_cpd_language.html">Adding a new CPD language</a></li>
</ul>
</li>
<li class="subfolders">
<a href="#">Experimental features</a>
<ul>
<li><a href="tag_experimental.html">List of experimental Features</a></li>
</ul>
</li>
</ul>
</li>
<li>
<a href="#">Project documentation</a>
<ul>
<li class="subfolders">
<a href="#">Trivia about PMD</a>
<ul>
<li><a href="pmd_projectdocs_trivia_news.html">PMD in the press</a></li>
<li><a href="pmd_projectdocs_trivia_products.html">Products & books related to PMD</a></li>
<li><a href="pmd_projectdocs_trivia_similarprojects.html">Similar projects</a></li>
<li><a href="pmd_projectdocs_trivia_meaning.html">What does 'PMD' mean?</a></li>
</ul>
</li>
<li><a href="pmd_projectdocs_logo.html">Logo</a></li>
<li><a href="pmd_projectdocs_faq.html">FAQ</a></li>
<li><a href="license.html">License</a></li>
<li><a href="pmd_projectdocs_credits.html">Credits</a></li>
<li><a href="pmd_release_notes_old.html">Old release notes</a></li>
<li><a href="pmd_projectdocs_decisions.html">Decisions</a></li>
<li class="subfolders">
<a href="#">Project management</a>
<ul>
<li><a href="pmd_projectdocs_committers_infrastructure.html">Infrastructure</a></li>
<li><a href="pmd_projectdocs_committers_releasing.html">Release process</a></li>
<li><a href="pmd_projectdocs_committers_merging_pull_requests.html">Merging pull requests</a></li>
<li><a href="pmd_projectdocs_committers_main_landing_page.html">Main Landing page</a></li>
</ul>
</li>
</ul>
</li>
</ul>
</div>
<!-- Content Column -->
<div class="col-md-9" id="tg-sb-content">
<header>
<div class="row">
<div class="col-lg-12">
<a href="./" role="button"
><i class="fa fa-home fa-lg"></i
></a>
» Finding duplicated code with CPD
<a
target="_blank"
href="https://github.com/pmd/pmd/blob/master/docs/pages/pmd/userdocs/cpd/cpd.md"
class="float-right"
role="button"
><i class="fab fa-github fa-lg"></i> Edit on GitHub</a
>
</div>
</div>
<hr />
</header>
<div class="post-header">
<h1 class="post-title-main">Finding duplicated code with CPD</h1>
</div>
<div class="post-content" data-github-edit-url="https://github.com/pmd/pmd/blob/master/docs/pages/pmd/userdocs/cpd/cpd.md">
<div class="summary">Learn how to use CPD, the copy-paste detector shipped with PMD.</div>
<details id="inline-toc-details">
<summary>Table of Contents</summary>
<div id="inline-toc"><!-- empty, move TOC here when screen size too small --></div>
</details>
<h2 id="overview">Overview</h2>
<p>Duplicate code can be hard to find, especially in a large project.
But PMD's <strong>Copy/Paste Detector (CPD)</strong> can find it for you!</p>
<p>CPD works with Java, JSP, C/C++, C#, Go, Kotlin, Ruby, Swift and <a href="#supported-languages">many more languages</a>.
It can be run from the <a href="#cli-usage">command line</a> or via an <a href="#ant-task">Ant task</a>.
It can also be run with Maven by using the <code class="language-plaintext highlighter-rouge">cpd-check</code> goal on the <a href="pmd_userdocs_tools_maven.html">Maven PMD Plugin</a>.</p>
<p>Is your language missing?
See <a href="pmd_devdocs_major_adding_new_cpd_language.html">how to add a new CPD language</a>.</p>
<h3 id="why-should-you-care-about-duplicates">Why should you care about duplicates?</h3>
<p>It's certainly important to know where to get CPD and how to call it, but it's worth stepping back for a moment and
asking yourself why you should care about duplicate code blocks at all.</p>
<p>Assuming duplicated blocks of code are supposed to do the same thing, any refactoring, however simple, must be repeated
in every copy. That is unrewarding grunt work, and it puts pressure on the developer to find every place where the
refactoring has to be performed. Automated tools like CPD can help with that to some extent.</p>
<p>However, failure to keep the code in sync may mean automated tools will no longer recognise these blocks as duplicates.
The task of finding duplicates to keep them in sync during subsequent refactorings can then no longer be
entrusted to an automated tool, which adds to the burden on the maintainer. Segments of code initially supposed to do the
same thing may grow apart undetected upon further refactoring.</p>
<p>Of course, if the code will never change in the future, this is not a problem.</p>
<p>Otherwise, the most viable solution is not to duplicate code in the first place. If the duplicates are already there,
they should be refactored out. We thus advise developers to use CPD to <strong>help remove duplicates</strong>, not to help keep duplicates in sync.</p>
<h3 id="refactoring-duplicates">Refactoring duplicates</h3>
<p>Once you have located some duplicates, several refactoring strategies may apply depending on the scope and extent of
the duplication. Here's a quick summary:</p>
<ul>
<li>If the duplication is local to a method or single class:
<ul>
<li>Extract a local variable if the duplicated logic is not prohibitively long</li>
<li>Extract the duplicated logic into a private method</li>
</ul>
</li>
<li>If the duplication occurs in siblings within a class hierarchy:
<ul>
<li>Extract a method and pull it up in the class hierarchy, along with common fields</li>
<li>Use the <a href="https://sourcemaking.com/design_patterns/template_method">Template Method</a> design pattern</li>
</ul>
</li>
<li>If the duplication occurs consistently in unrelated hierarchies:
<ul>
<li>Introduce a common ancestor to those class hierarchies</li>
</ul>
</li>
</ul>
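<p>As an illustration of the "extract a private method" strategy from the list above, here is a minimal Java sketch. The class <code class="language-plaintext highlighter-rouge">InvoiceReport</code>, its methods and the formatting logic are hypothetical and exist only for this example. The point is that an expression that used to be repeated in two methods now lives in a single helper.</p>

```java
public class InvoiceReport {
    // Before the refactoring, both public methods repeated the same
    // formatting expression inline; now they delegate to one helper.
    public String header(String customer, int amountCents) {
        return "Invoice for " + formatLine(customer, amountCents);
    }

    public String footer(String customer, int amountCents) {
        return "Total due: " + formatLine(customer, amountCents);
    }

    // Extracted from the two call sites above (hypothetical logic).
    private static String formatLine(String customer, int amountCents) {
        String cents = String.format("%02d", amountCents % 100);
        return customer.trim() + " (" + amountCents / 100 + "." + cents + ")";
    }
}
```

<p>With the shared expression in one place, a later change to the formatting is made once rather than in every copy.</p>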
<p>Novice and advanced readers alike may want to <a href="https://refactoring.guru/smells/duplicate-code">read more on Refactoring Guru</a>
for more in-depth strategies, use cases and explanations.</p>
<h3 id="finding-more-duplicates">Finding more duplicates</h3>
<p>For some languages, additional options are supported. For example, Java supports <code class="language-plaintext highlighter-rouge">--ignore-identifiers</code>, which
replaces all identifiers with the same placeholder value before the comparison. This helps to
identify structurally identical code that differs only in naming (different class names, different method names,
different parameter names).</p>
<p>There are other similar options: <code class="language-plaintext highlighter-rouge">--ignore-annotations</code>, <code class="language-plaintext highlighter-rouge">--ignore-literals</code>, <code class="language-plaintext highlighter-rouge">--ignore-literal-sequences</code>,
<code class="language-plaintext highlighter-rouge">--ignore-sequences</code>, <code class="language-plaintext highlighter-rouge">--ignore-usings</code>.</p>
<p>Note that these options are <em>disabled</em> by default (e.g. identifiers are <em>not</em> replaced with the same placeholder
value). By default, CPD finds only exact duplicates. When these options are enabled, the reported duplicates are no longer
necessarily identical.</p>
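<p>To make the effect of <code class="language-plaintext highlighter-rouge">--ignore-identifiers</code> more concrete, the following Java sketch mimics the idea on two tiny snippets. It is not CPD's implementation: CPD compares tokens produced by a real lexer, whereas this example relies on a regular expression and a hypothetical six-word keyword list. The names <code class="language-plaintext highlighter-rouge">IdentifierNormalizer</code> and <code class="language-plaintext highlighter-rouge">normalize</code> are made up for illustration.</p>

```java
public class IdentifierNormalizer {
    // Matches a word that is NOT one of a few keywords (assumed, incomplete list).
    private static final String IDENTIFIER =
        "\\b(?!(?:int|return|if|else|for|while)\\b)[A-Za-z_][A-Za-z0-9_]*\\b";

    // Replace every identifier with the same placeholder and collapse whitespace.
    public static String normalize(String source) {
        return source.replaceAll(IDENTIFIER, "ID").replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String a = "int total = price + tax; return total;";
        String b = "int sum = cost + fee; return sum;";
        // The raw snippets differ in naming only.
        System.out.println(a.equals(b));                        // prints false
        // After normalization both read "int ID = ID + ID; return ID;".
        System.out.println(normalize(a).equals(normalize(b)));  // prints true
    }
}
```

<p>Because every identifier maps to the same placeholder, the comparison only sees the structure of the code. That is also why matches found this way are no longer exact copies.</p>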
<h2 id="cli-usage">CLI Usage</h2>
<h3 id="cli-options-reference">CLI options reference</h3>
<table>
<tr>
<th>Option</th>
<th>Description</th>
<th>Default</th>
<th>Applies to</th>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-minimum-tokens"><code>--minimum-tokens&nbsp;&lt;count&gt;</code></a></td>
<td><span class="label label-primary">Required</span> The minimum number of tokens a duplicated block must contain to be reported.</td>
<td><code></code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-dir"><code>--dir&nbsp;&lt;path&gt;</code><br /><code>-d&nbsp;&lt;path&gt;</code></a></td>
<td>Path to a source file, or directory containing
source files to analyze. Zip and Jar files are
also supported, if they are specified directly
(archive files found while exploring a directory
are not recursively expanded). This option can
be repeated, and multiple arguments can be
provided to a single occurrence of the option.
One of <code>--dir</code>, <code>--file-list</code> or <code>--uri</code> must be
provided.</td>
<td><code></code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-file-list"><code>--file-list&nbsp;&lt;filepath&gt;</code></a></td>
<td>Path to a file containing a list of files to
analyze, one path per line. One of <code>--dir</code>,
<code>--file-list</code> or <code>--uri</code> must be provided.</td>
<td><code></code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-language"><code>--language&nbsp;&lt;lang&gt;</code><br /><code>-l&nbsp;&lt;lang&gt;</code></a></td>
<td>The source code language.
<p>See also <a href="#supported-languages">Supported Languages</a>.
Using <code>--help</code> will display a full list of supported languages.</p></td>
<td><code>java</code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-debug"><code>--debug</code><br /><code>--verbose</code><br /><code>-D</code><br /><code>-v</code></a></td>
<td>Debug mode. Prints more log output. See also <a href="#logging">Logging</a>.</td>
<td><code></code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-encoding"><code>--encoding&nbsp;&lt;charset&gt;</code><br /><code>-e&nbsp;&lt;charset&gt;</code></a></td>
<td>Specifies the character set encoding of the source code files CPD is reading.
The valid values are the standard character sets of <code>java.nio.charset.Charset</code>.</td>
<td><code>UTF-8</code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-skip-duplicate-files"><code>--skip-duplicate-files</code></a></td>
<td>Ignore multiple copies of files that have the same name and length.</td>
<td><code></code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-exclude"><code>--exclude&nbsp;&lt;path&gt;</code></a></td>
<td>Files to be excluded from the analysis.</td>
<td><code></code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-non-recursive"><code>--non-recursive</code></a></td>
<td>Don't scan subdirectories. By default, subdirectories are considered.</td>
<td><code></code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-skip-lexical-errors"><code>--skip-lexical-errors</code></a></td>
<td>Skip files which can't be tokenized due to invalid characters instead of aborting CPD.
By default, CPD analysis is stopped on the first error.</td>
<td><code></code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-format"><code>--format&nbsp;&lt;format&gt;</code><br /><code>-f&nbsp;&lt;format&gt;</code></a></td>
<td>Output format of the analysis report. The available formats
are described <a href="#available-report-formats">here</a>.</td>
<td><code>text</code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-relativize-paths-with"><code>--relativize-paths-with&nbsp;&lt;path&gt;</code><br /><code>-z&nbsp;&lt;path&gt;</code></a></td>
<td>Path relative to which directories are rendered in the report. This option allows
shortening directories in the report; without it, paths are rendered as mentioned in the
source directory (option "--dir").
The option can be repeated, in which case the shortest relative path will be used.
If the root path is mentioned (e.g. "/" or "C:\"), then the paths will be rendered
as absolute.</td>
<td><code></code></td>
<td></td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-%5Bno-%5Dfail-on-violation"><code>--[no-]fail-on-violation</code></a></td>
<td>Specifies whether CPD exits with non-zero status if violations are found.
By default CPD exits with status 4 if violations are found.
Disable this feature with <code>--no-fail-on-violation</code> to exit with 0 instead and just output the report.</td>
<td><code></code></td>
<td></td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-ignore-literals"><code>--ignore-literals</code></a></td>
<td>Ignore literal values such as numbers and strings when comparing text.
By default, literals are not ignored.</td>
<td><code></code></td>
<td>Java</td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-ignore-literal-sequences"><code>--ignore-literal-sequences</code></a></td>
<td>Ignore sequences of literals such as list initializers.
By default, such sequences of literals are not ignored.</td>
<td><code></code></td>
<td>C#, C++, Lua</td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-ignore-identifiers"><code>--ignore-identifiers</code></a></td>
<td>Ignore names of classes, methods, variables, constants, etc. when comparing text.
By default, identifier names are not ignored.</td>
<td><code></code></td>
<td>Java</td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-ignore-annotations"><code>--ignore-annotations</code></a></td>
<td>Ignore language annotations (Java) or attributes (C#) when comparing text.
By default, annotations are not ignored.</td>
<td><code></code></td>
<td>C#, Java</td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-ignore-sequences"><code>--ignore-sequences</code></a></td>
<td>Ignore sequences of identifiers and literals.
By default, such sequences are not ignored.</td>
<td><code></code></td>
<td>C++</td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-ignore-usings"><code>--ignore-usings</code></a></td>
<td>Ignore <code>using</code> directives in C# when comparing text.
By default, using directives are not ignored.</td>
<td><code></code></td>
<td>C#</td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-no-skip-blocks"><code>--no-skip-blocks</code></a></td>
<td>Do not skip code blocks matched by <code>--skip-blocks-pattern</code>.</td>
<td><code></code></td>
<td>C++</td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-skip-blocks-pattern"><code>--skip-blocks-pattern</code></a></td>
<td>Pattern to find the blocks to skip. It is a string property that consists of two parts,
separated by <code>|</code>. The first part is the start pattern, the second part is the end pattern.</td>
<td><code>#if&nbsp;0|#endif</code></td>
<td>C++</td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-uri"><code>--uri&nbsp;&lt;uri&gt;</code><br /><code>-u&nbsp;&lt;uri&gt;</code></a></td>
<td>Database URI for sources. One of <code>--dir</code>,
<code>--file-list</code> or <code>--uri</code> must be provided.</td>
<td><code></code></td>
<td>PLSQL</td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="-help"><code>--help</code><br /><code>-h</code></a></td>
<td>Print help text</td>
<td><code></code></td>
<td></td>
</tr>
</table>
<h3 id="examples">Examples</h3>
<p>Minimum required options: specify the minimum duplicate size (in tokens) and the source directory:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-basic" data-toggle="tab" href="#linux-basic" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-basic" data-toggle="tab" href="#windows-basic" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-basic" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/java
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-basic" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir src\main\java
</code></pre></figure>
</div>
</div>
</div>
<p>You can also specify the language:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-lang" data-toggle="tab" href="#linux-lang" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-lang" data-toggle="tab" href="#windows-lang" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-lang" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/cpp --language cpp
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-lang" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir src\main\cpp --language cpp
</code></pre></figure>
</div>
</div>
</div>
<p>You may wish to check sources that are stored in different directories:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-multiple" data-toggle="tab" href="#linux-multiple" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-multiple" data-toggle="tab" href="#windows-multiple" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-multiple" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/java --dir src/test/java
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-multiple" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir src\main\java --dir src\test\java
</code></pre></figure>
</div>
</div>
</div>
<p><em>There is no limit to the number of <code class="language-plaintext highlighter-rouge">--dir</code> options you may add.</em></p>
<p>You may wish to ignore identifiers so that duplications that differ only in naming are also found:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-ignore_identifiers" data-toggle="tab" href="#linux-ignore_identifiers" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-ignore_identifiers" data-toggle="tab" href="#windows-ignore_identifiers" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-ignore_identifiers" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/java --ignore-identifiers
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-ignore_identifiers" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir src\main\java --ignore-identifiers
</code></pre></figure>
</div>
</div>
</div>
<p>And if you're checking a C source tree with duplicate files in different architecture directories,
you can skip those using <code class="language-plaintext highlighter-rouge">--skip-duplicate-files</code>:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-duplicates" data-toggle="tab" href="#linux-duplicates" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-duplicates" data-toggle="tab" href="#windows-duplicates" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-duplicates" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/cpp --language cpp --skip-duplicate-files
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-duplicates" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir src\main\cpp --language cpp --skip-duplicate-files
</code></pre></figure>
</div>
</div>
</div>
<p>You can also specify the encoding to use when parsing files:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-encoding" data-toggle="tab" href="#linux-encoding" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-encoding" data-toggle="tab" href="#windows-encoding" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-encoding" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/java --encoding utf-16le
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-encoding" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir src\main\java --encoding utf-16le
</code></pre></figure>
</div>
</div>
</div>
<p>You can also specify a report format; here we're using the XML report:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-report" data-toggle="tab" href="#linux-report" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-report" data-toggle="tab" href="#windows-report" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-report" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/java --format xml
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-report" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir src\main\java --format xml
</code></pre></figure>
</div>
</div>
</div>
<p>The default format is a text report, but there are <a href="#available-report-formats">other supported formats</a>.</p>
<p>Note that CPD's memory usage increases linearly with the size of the analyzed source code; you may need to give Java more memory to run it, like this:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-memchange" data-toggle="tab" href="#linux-memchange" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-memchange" data-toggle="tab" href="#windows-memchange" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-memchange" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">export</span> PMD_JAVA_OPTS=-Xmx512m
<span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/java
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-memchange" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">set</span> PMD_JAVA_OPTS=-Xmx512m
<span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir src\main\java
</code></pre></figure>
</div>
</div>
</div>
<p>If you specify a source directory but don't want to scan the sub-directories, you can use the non-recursive option:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-nonrecursive" data-toggle="tab" href="#linux-nonrecursive" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-nonrecursive" data-toggle="tab" href="#windows-nonrecursive" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-nonrecursive" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/java --non-recursive
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-nonrecursive" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir src\main\java --non-recursive
</code></pre></figure>
</div>
</div>
</div>
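<p>To shorten the paths shown in the report, you can pass <code class="language-plaintext highlighter-rouge">--relativize-paths-with</code>.
The following is a minimal sketch, assuming a hypothetical project checked out at <code class="language-plaintext highlighter-rouge">/home/joe/project</code>
(or <code class="language-plaintext highlighter-rouge">C:\projects\app</code> on Windows); files should then be reported relative to that directory instead of by the full path given to <code class="language-plaintext highlighter-rouge">--dir</code>:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-relativize" data-toggle="tab" href="#linux-relativize" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-relativize" data-toggle="tab" href="#windows-relativize" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-relativize" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir /home/joe/project/src/main/java --relativize-paths-with /home/joe/project
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-relativize" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd --minimum-tokens 100 --dir C:\projects\app\src\main\java --relativize-paths-with C:\projects\app
</code></pre></figure>
</div>
</div>
</div>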
<h3 id="exit-status">Exit status</h3>
<p>Please note that if CPD detects duplicated source code, it exits with status 4 (since 5.0).
This behavior was introduced to ease CPD integration into scripts or hooks, such as SVN hooks.</p>
<table>
<tr><td>0</td><td>Everything is fine, no code duplications found.</td></tr>
<tr><td>1</td><td>CPD exited with an exception.</td></tr>
<tr><td>2</td><td>Usage error. Command-line parameters are invalid or missing.</td></tr>
<tr><td>4</td><td>At least one code duplication has been detected. If <code>--no-fail-on-violation</code> is set, CPD exits with 0 instead.</td></tr>
</table>
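<p>As an illustration of how a script or hook might react to these statuses, here is a minimal shell sketch.
It assumes <code class="language-plaintext highlighter-rouge">pmd</code> is on the <code class="language-plaintext highlighter-rouge">PATH</code>; the messages and the
source directory are placeholders, not part of CPD:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash">#!/bin/sh
# Run CPD and branch on the exit statuses listed in the table above.
pmd cpd --minimum-tokens 100 --dir src/main/java
status=$?

case "$status" in
  0) echo "No duplications found." ;;
  4) echo "Duplications found; see the report above." ;;
  *) echo "CPD failed with exit status $status." &gt;&amp;2 ;;
esac

# Propagate CPD's status so the calling hook can fail the build or commit.
exit "$status"
</code></pre></figure>
<p>If duplications should only be reported and never fail the script, adding <code class="language-plaintext highlighter-rouge">--no-fail-on-violation</code>
to the command makes the status-4 branch unnecessary.</p>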
<h2 id="logging">Logging</h2>
<p>PMD internally uses <a href="https://www.slf4j.org/">slf4j</a> and ships with slf4j-simple as the logging implementation.
Logging messages are printed to System.err.</p>
<p>The configuration for slf4j-simple is in the file <code class="language-plaintext highlighter-rouge">conf/simplelogger.properties</code>. There you can enable
logging of specific classes if needed. The <code class="language-plaintext highlighter-rouge">--debug</code> command line option configures the default log level
to be “debug”.</p>
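<p>Because log messages go to System.err, they can be captured separately from the report. A minimal sketch for a
POSIX shell, where <code class="language-plaintext highlighter-rouge">cpd-debug.log</code> is an arbitrary file name chosen for illustration:</p>
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd --minimum-tokens 100 --dir src/main/java --debug 2&gt; cpd-debug.log
</code></pre></figure>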
<h2 id="supported-languages">Supported Languages</h2>
<ul>
<li>C#</li>
<li>C/C++</li>
<li><a href="pmd_languages_coco.html">Coco</a></li>
<li>Dart</li>
<li>EcmaScript (JavaScript)</li>
<li>Fortran</li>
<li><a href="pmd_languages_gherkin.html">Gherkin</a> (Cucumber)</li>
<li>Go</li>
<li>Groovy</li>
<li><a href="pmd_languages_html.html">Html</a></li>
<li><a href="pmd_languages_java.html">Java</a></li>
<li><a href="pmd_languages_jsp.html">Jsp</a></li>
<li><a href="pmd_languages_julia.html">Julia</a></li>
<li><a href="pmd_languages_kotlin.html">Kotlin</a></li>
<li>Lua</li>
<li>Matlab</li>
<li>Modelica</li>
<li>Objective-C</li>
<li>Perl</li>
<li>PHP</li>
<li><a href="pmd_languages_plsql.html">PL/SQL</a></li>
<li>Python</li>
<li>Ruby</li>
<li><a href="pmd_languages_apex.html">Salesforce.com Apex</a></li>
<li>Scala</li>
<li>Swift</li>
<li>T-SQL</li>
<li><a href="pmd_languages_js_ts.html">TypeScript</a></li>
<li><a href="pmd_languages_visualforce.html">Visualforce</a></li>
<li>vm (Apache Velocity)</li>
<li><a href="pmd_languages_xml.html">XML</a>
<ul>
<li>POM (Apache Maven)</li>
<li>XSL</li>
<li>WSDL</li>
</ul>
</li>
</ul>
<h2 id="available-report-formats">Available report formats</h2>
<ul>
<li>text : Default format</li>
<li>xml (and xslt)</li>
<li>csv</li>
<li>csv_with_linecount_per_file</li>
<li>vs</li>
</ul>
<p>For details, see <a href="pmd_userdocs_cpd_report_formats.html">CPD Report Formats</a>.</p>
<h2 id="ant-task">Ant task</h2>
<p>Andy Glover wrote an Ant task for CPD; here's how to use it:</p>
<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;path</span> <span class="na">id=</span><span class="s">"pmd.classpath"</span><span class="nt">&gt;</span>
<span class="nt">&lt;fileset</span> <span class="na">dir=</span><span class="s">"/home/joe/pmd-bin-7.0.0-SNAPSHOT/lib"</span><span class="nt">&gt;</span>
<span class="nt">&lt;include</span> <span class="na">name=</span><span class="s">"*.jar"</span><span class="nt">/&gt;</span>
<span class="nt">&lt;/fileset&gt;</span>
<span class="nt">&lt;/path&gt;</span>
<span class="nt">&lt;taskdef</span> <span class="na">name=</span><span class="s">"cpd"</span> <span class="na">classname=</span><span class="s">"net.sourceforge.pmd.ant.CPDTask"</span> <span class="na">classpathref=</span><span class="s">"pmd.classpath"</span> <span class="nt">/&gt;</span>
<span class="nt">&lt;target</span> <span class="na">name=</span><span class="s">"cpd"</span><span class="nt">&gt;</span>
<span class="nt">&lt;cpd</span> <span class="na">minimumTokenCount=</span><span class="s">"100"</span> <span class="na">outputFile=</span><span class="s">"/home/tom/cpd.txt"</span><span class="nt">&gt;</span>
<span class="nt">&lt;fileset</span> <span class="na">dir=</span><span class="s">"/home/tom/tmp/ant"</span><span class="nt">&gt;</span>
<span class="nt">&lt;include</span> <span class="na">name=</span><span class="s">"**/*.java"</span><span class="nt">/&gt;</span>
<span class="nt">&lt;/fileset&gt;</span>
<span class="nt">&lt;/cpd&gt;</span>
<span class="nt">&lt;/target&gt;</span>
</code></pre></div></div>
<!-- TODO avoid duplicating the descriptions! -->
<h3 id="attribute-reference">Attribute reference</h3>
<table>
<tr>
<th>Attribute</th>
<th>Description</th>
<th>Default</th>
<th>Applies to</th>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="minimumtokencount"><code>minimumtokencount</code></a></td>
<td><span class="label label-primary">Required</span> A positive integer indicating the minimum duplicate size.</td>
<td><code></code></td>
<td></td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="encoding"><code>encoding</code></a></td>
<td>The character set encoding (e.g., UTF-8) to use when reading the source code files, but also when
producing the report. A word of warning: even if you set the encoding value properly,
say to UTF-8, the report may not be valid UTF-8 if the source files are actually encoded in another
charset such as CP1252. CPD copies pieces of source code directly into its report, so they
keep the encoding of the source files.<br />
If not specified, CPD uses the system default encoding.</td>
<td><code></code></td>
<td></td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="format"><code>format</code></a></td>
<td>The format of the report (e.g. <code>csv</code>, <code>text</code>, <code>xml</code>).</td>
<td><code>text</code></td>
<td></td>
</tr>
<!-- Row of the CLI reference table, describing an option -->
<!-- Rows can be linked to the name of the option (without leading dash) -->
<!-- Argument summary: -->
<!-- options: comma separated list of aliases for the option.-->
<!-- option_arg: optional name for the argument of the option, eg 'arg', will be formatted eg to '<arg>'-->
<!-- description: description, you can use "some" inline markdown -->
<!-- required: whether the option is required, if specified, whatever the value, it's considered required -->
<!-- languages: languages to which the option applies -->
<!-- default: default value -->
<!-- fragment id in the page -->
<tr>
<td><a style="pointer-events: none; cursor: default;" name="ignoreLiterals"><code>ignoreLiterals</code></a></td>
<td>If <code>true</code>, CPD ignores literal value differences when evaluating a duplicate
block. This means that <code>foo=42;</code> and <code>foo=43;</code> will be seen as equivalent. You may want
to run CPD with this option off to start with and then switch it on to see what it turns up.</td>
<td><code>false</code></td>
<td>Java</td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="ignoreIdentifiers"><code>ignoreIdentifiers</code></a></td>
<td>Similar to <code>ignoreLiterals</code> but for identifiers, i.e. variable names, method names, and so forth.</td>
<td><code>false</code></td>
<td>Java</td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="ignoreAnnotations"><code>ignoreAnnotations</code></a></td>
<td>Ignore annotations. Modern frameworks increasingly use annotations on classes and methods,
which can be highly repetitive and trigger CPD matches. With J2EE (CDI, transaction handling, etc.)
and Spring (everything), annotations become very redundant: classes or methods often carry the
same 5-6 lines of annotations, which causes false positives.</td>
<td><code>false</code></td>
<td>Java</td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="ignoreUsings"><code>ignoreUsings</code></a></td>
<td>Ignore using directives in C#.</td>
<td><code>false</code></td>
<td>C#</td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="skipDuplicateFiles"><code>skipDuplicateFiles</code></a></td>
<td>Ignore multiple copies of files of the same name and length in comparison.</td>
<td><code>false</code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="skipLexicalErrors"><code>skipLexicalErrors</code></a></td>
<td>Skip files which can't be tokenized due to invalid characters instead of aborting CPD.</td>
<td><code>false</code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="skipBlocks"><code>skipBlocks</code></a></td>
<td>Enables or disables skipping of blocks like a pre-processor. See also option <code>skipBlocksPattern</code>.</td>
<td><code>true</code></td>
<td>C++</td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="skipBlocksPattern"><code>skipBlocksPattern</code></a></td>
<td>Configures the pattern used to find the blocks to skip. It is a string property that consists of two parts,
separated by <code>|</code>. The first part is the start pattern, the second part is the end pattern.</td>
<td><code>#if&nbsp;0|#endif</code></td>
<td>C++</td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="language"><code>language</code></a></td>
<td>Flag to select the appropriate language (e.g. <code>c</code>, <code>cpp</code>, <code>cs</code>, <code>java</code>, <code>jsp</code>, <code>php</code>, <code>ruby</code>, <code>fortran</code>,
<code>ecmascript</code>, and <code>plsql</code>).</td>
<td><code>java</code></td>
<td></td>
</tr>
<tr>
<td><a style="pointer-events: none; cursor: default;" name="outputfile"><code>outputfile</code></a></td>
<td>The destination file for the report. If not specified, the console will be used instead.</td>
<td><code></code></td>
<td></td>
</tr>
</table>
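<p>To illustrate what <code class="language-plaintext highlighter-rouge">ignoreLiterals</code> changes, the following toy Java sketch replaces integer literals with a placeholder before comparing two statements. It is only an illustration: the class <code class="language-plaintext highlighter-rouge">LiteralNormalizer</code> and its regular expression are hypothetical, and CPD itself compares language tokens rather than applying regular expressions.</p>

```java
import java.util.regex.Pattern;

/** Toy illustration only: CPD compares language tokens, it does not use this regex. */
final class LiteralNormalizer {

    // Matches standalone integer literals such as 42 or 43 (a deliberately narrow assumption).
    private static final Pattern INT_LITERAL = Pattern.compile("\\b\\d+\\b");

    /** Replaces every integer literal with a placeholder so that literal values no longer differ. */
    static String normalizeLiterals(String code) {
        return INT_LITERAL.matcher(code).replaceAll("LITERAL");
    }

    public static void main(String[] args) {
        String a = "foo=42;";
        String b = "foo=43;";
        System.out.println(a.equals(b));                                       // false
        System.out.println(normalizeLiterals(a).equals(normalizeLiterals(b))); // true
    }
}
```

<p>With the placeholder in place, both statements become <code class="language-plaintext highlighter-rouge">foo=LITERAL;</code>, which is why they count as duplicates once literal differences are ignored. The same idea applies to <code class="language-plaintext highlighter-rouge">ignoreIdentifiers</code>, with names instead of values.</p>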
<p>You can get verbose output from the CPD task by running Ant with the <code class="language-plaintext highlighter-rouge">-v</code> flag, for example:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ant -v -f mybuildfile.xml cpd
</code></pre></div></div>
<p>You can also get an HTML report from CPD by using the XSLT script in <code class="language-plaintext highlighter-rouge">pmd/etc/xslt/cpdhtml.xslt</code>. Just run
the CPD task as usual and, right after it, invoke the Ant XSLT task like this:</p>
<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nt">&lt;xslt</span> <span class="na">in=</span><span class="s">"cpd.xml"</span> <span class="na">style=</span><span class="s">"etc/xslt/cpdhtml.xslt"</span> <span class="na">out=</span><span class="s">"cpd.html"</span> <span class="nt">/&gt;</span>
</code></pre></div></div>
<p>See <a href="pmd_userdocs_cpd_report_formats.html#xslt">section “xslt” in CPD Report Formats</a> for more examples.</p>
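<p>Outside of Ant, the same transformation can be sketched with the JDK's built-in <code class="language-plaintext highlighter-rouge">javax.xml.transform</code> API. The class name <code class="language-plaintext highlighter-rouge">CpdHtmlReport</code> is hypothetical, the paths simply mirror the Ant example above, and the sketch assumes the stylesheet can be processed by the JDK's default XSLT processor.</p>

```java
import java.nio.file.Path;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: applies an XSLT stylesheet to a CPD XML report. */
final class CpdHtmlReport {

    /** Applies the stylesheet to the XML report and writes the result to the output file. */
    static void transform(Path xmlReport, Path stylesheet, Path htmlOutput) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(stylesheet.toFile()));
        transformer.transform(new StreamSource(xmlReport.toFile()), new StreamResult(htmlOutput.toFile()));
    }

    public static void main(String[] args) throws TransformerException {
        // Assumed paths, taken from the Ant example; adjust them to your layout.
        transform(Path.of("cpd.xml"), Path.of("etc/xslt/cpdhtml.xslt"), Path.of("cpd.html"));
    }
}
```

<p>This can be handy when the XML report is produced by the CLI rather than by the Ant task.</p>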
<h2 id="gui">GUI</h2>
<p>CPD also comes with a simple GUI. You can start it through the unified CLI provided in the <code class="language-plaintext highlighter-rouge">bin</code> folder:</p>
<div class="text-left">
<ul class="nav nav-tabs" role="tablist">
<li class="nav-item" role="presentation">
<a class="nav-link active" id="linux-tab-gui" data-toggle="tab" href="#linux-gui" role="tab" aria-controls="linux" aria-selected="true">Linux / macOS</a>
</li>
<li class="nav-item" role="presentation">
<a class="nav-link" id="windows-tab-gui" data-toggle="tab" href="#windows-gui" role="tab" aria-controls="windows" aria-selected="false">Windows</a>
</li>
</ul>
<div class="tab-content border">
<div class="tab-pane fade show active" id="linux-gui" role="tabpanel" aria-labelledby="linux-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">~ $ </span><span class="s2">pmd</span> cpd-gui
</code></pre></figure>
</div>
<div class="tab-pane fade" id="windows-gui" role="tabpanel" aria-labelledby="windows-tab">
<figure class="highlight"><pre><code class="language-bash" data-lang="bash"><span class="gp">C:\&gt; </span><span class="s2">pmd.bat</span> cpd-gui
</code></pre></figure>
</div>
</div>
</div>
<p>Here's a screenshot of CPD after running on the JDK 8 java.lang package:</p>
<figure><img class="docimage" src="images/userdocs/screenshot_cpd.png" alt="CPD Screenshot after running on the JDK 8 java.lang package" /></figure>
<h2 id="suppression">Suppression</h2>
<p>Arbitrary blocks of code can be ignored through comments in <strong>Java</strong>, <strong>C/C++</strong>, <strong>Dart</strong>, <strong>Go</strong>, <strong>Groovy</strong>, <strong>JavaScript</strong>,
<strong>Kotlin</strong>, <strong>Lua</strong>, <strong>Matlab</strong>, <strong>Objective-C</strong>, <strong>PL/SQL</strong>, <strong>Python</strong>, <strong>Scala</strong>, <strong>Swift</strong> and <strong>C#</strong> by including the keywords <code class="language-plaintext highlighter-rouge">CPD-OFF</code> and <code class="language-plaintext highlighter-rouge">CPD-ON</code>.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="nc">Object</span> <span class="nf">someParameterizedFactoryMethod</span><span class="o">(</span><span class="kt">int</span> <span class="n">x</span><span class="o">)</span> <span class="kd">throws</span> <span class="nc">Exception</span> <span class="o">{</span>
    <span class="c1">// some unignored code</span>
    <span class="c1">// tell cpd to start ignoring code - CPD-OFF</span>
    <span class="c1">// mission critical code, manually unrolled loop</span>
    <span class="n">goDoSomethingAwesome</span><span class="o">(</span><span class="n">x</span> <span class="o">+</span> <span class="n">x</span> <span class="o">/</span> <span class="mi">2</span><span class="o">);</span>
    <span class="n">goDoSomethingAwesome</span><span class="o">(</span><span class="n">x</span> <span class="o">+</span> <span class="n">x</span> <span class="o">/</span> <span class="mi">2</span><span class="o">);</span>
    <span class="n">goDoSomethingAwesome</span><span class="o">(</span><span class="n">x</span> <span class="o">+</span> <span class="n">x</span> <span class="o">/</span> <span class="mi">2</span><span class="o">);</span>
    <span class="n">goDoSomethingAwesome</span><span class="o">(</span><span class="n">x</span> <span class="o">+</span> <span class="n">x</span> <span class="o">/</span> <span class="mi">2</span><span class="o">);</span>
    <span class="n">goDoSomethingAwesome</span><span class="o">(</span><span class="n">x</span> <span class="o">+</span> <span class="n">x</span> <span class="o">/</span> <span class="mi">2</span><span class="o">);</span>
    <span class="n">goDoSomethingAwesome</span><span class="o">(</span><span class="n">x</span> <span class="o">+</span> <span class="n">x</span> <span class="o">/</span> <span class="mi">2</span><span class="o">);</span>
    <span class="c1">// resume CPD analysis - CPD-ON</span>
    <span class="c1">// further code will *not* be ignored</span>
<span class="o">}</span>
</code></pre></div></div>
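<p>As a rough mental model of the <code class="language-plaintext highlighter-rouge">CPD-OFF</code> and <code class="language-plaintext highlighter-rouge">CPD-ON</code> markers, the toy sketch below drops every line between the two markers before any comparison would take place. The class <code class="language-plaintext highlighter-rouge">SuppressionSketch</code> and its line-based filtering are assumptions made for illustration; CPD itself works on tokens rather than on whole lines.</p>

```java
/** Toy line-based model of comment suppression; not how CPD is implemented. */
final class SuppressionSketch {

    /** Returns the lines that stay visible to duplicate detection, each followed by a newline. */
    static String analyzedLines(String source) {
        StringBuilder kept = new StringBuilder();
        boolean suppressed = false;
        for (String line : source.split("\n")) {
            if (line.contains("CPD-OFF")) {
                // Simplification: the marker line is assumed to hold only a comment.
                suppressed = true;
            } else if (line.contains("CPD-ON")) {
                suppressed = false;
            } else if (!suppressed) {
                kept.append(line).append('\n');
            }
        }
        return kept.toString();
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "int a = 1;",
                "// CPD-OFF",
                "goDoSomethingAwesome(x + x / 2);",
                "goDoSomethingAwesome(x + x / 2);",
                "// CPD-ON",
                "int b = 2;");
        System.out.print(analyzedLines(source));
        // prints:
        // int a = 1;
        // int b = 2;
    }
}
```

<p>Only the two statements outside the markers survive, so the repeated calls in between can no longer contribute to a reported duplication.</p>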
<p>Additionally, <strong>Java</strong> allows toggling suppression with the annotations
<strong><code class="language-plaintext highlighter-rouge">@SuppressWarnings("CPD-START")</code></strong> and <strong><code class="language-plaintext highlighter-rouge">@SuppressWarnings("CPD-END")</code></strong>.
All code between them will be ignored by CPD.</p>
<p>This approach, however, is limited to the locations where <code class="language-plaintext highlighter-rouge">@SuppressWarnings</code> is accepted.
It is a legacy mechanism, and the newer comment-based approach should be preferred.</p>
<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//enable suppression</span>
<span class="nd">@SuppressWarnings</span><span class="o">(</span><span class="s">"CPD-START"</span><span class="o">)</span>
<span class="kd">public</span> <span class="nc">Object</span> <span class="nf">someParameterizedFactoryMethod</span><span class="o">(</span><span class="kt">int</span> <span class="n">x</span><span class="o">)</span> <span class="kd">throws</span> <span class="nc">Exception</span> <span class="o">{</span>
    <span class="c1">// any code here will be ignored for the duplication detection</span>
<span class="o">}</span>
<span class="c1">//disable suppression</span>
<span class="nd">@SuppressWarnings</span><span class="o">(</span><span class="s">"CPD-END"</span><span class="o">)</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">nextMethod</span><span class="o">()</span> <span class="o">{</span>
<span class="o">}</span>
</code></pre></div></div>
<p>Other languages currently have no support for suppressing CPD reports. In the future,
the comment-based approach will be extended to those that can support it.</p>
<h2 id="credits">Credits</h2>
<p>CPD has been through three major incarnations:</p>
<ul>
<li>
<p>First we wrote it using a variant of Michael Wise's Greedy String Tiling algorithm (our variant is described
<a href="http://www.onjava.com/pub/a/onjava/2003/03/12/pmd_cpd.html">here</a>).</p>
</li>
<li>
<p>Then it was completely rewritten by Brian Ewins using the
<a href="http://dogma.net/markn/articles/bwt/bwt.htm">Burrows-Wheeler transform</a>.</p>
</li>
<li>
<p>Finally, it was rewritten by Steve Hawkins to use the
<a href="http://www.nist.gov/dads/HTML/karpRabin.html">Karp-Rabin</a> string matching algorithm.</p>
</li>
</ul>
<div class="tags">
<b>Tags: </b>
<a href="tag_userdocs.html" class="btn btn-outline-secondary navbar-btn cursorNorm" role="button">userdocs</a>
</div>
</div>
<footer>
<hr />
<div class="row">
<div class="col-lg-12 footer">
&copy;2024 PMD Open Source Project. All rights
reserved. <br />
<span>Page last updated:</span>
August 2023 (7.0.0)<br /> Site last generated: Feb 29, 2024 <br />
<p>
<img src="images/logo/pmd-logo-70px.png" alt="PMD logo"/>
</p>
</div>
</div>
</footer>
</div>
<!-- /.row -->
</div>
<!-- /.container -->
</div>
<!-- Sticky TOC column -->
<div class="toc-col">
<div id="toc"></div>
</div>
<!-- /.toc-container-wrapper -->
</div>
</div>
<script type="application/javascript" src="assets/jquery-3.5.1/jquery-3.5.1.min.js"></script>
<script type="application/javascript" src="assets/anchorjs-4.2.2/anchor.min.js"></script>
<script type="application/javascript" src="assets/navgoco-0.2.1/src/jquery.navgoco.min.js"></script>
<script type="application/javascript" src="assets/bootstrap-4.5.2-dist/js/bootstrap.bundle.min.js"></script>
<script type="application/javascript" src="assets/Simple-Jekyll-Search-1.0.8/dest/jekyll-search.js"></script>
<script type="application/javascript" src="assets/jekyll-table-of-contents/toc.js"></script>
<script type="application/javascript" src="js/tabstate.js"></script>
<script type="application/javascript" src="js/customscripts.js"></script>
</body>
</html>