-
Notifications
You must be signed in to change notification settings - Fork 39
refactor: Remove Xerces/xml-apis dependencies and modernize SAX readers #223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
refactor: Remove Xerces/xml-apis dependencies and modernize SAX readers #223
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR successfully removes the Xerces and xml-apis dependencies from the codebase, replacing them with a custom AbstractXMLReader base class that provides standard JDK XMLReader functionality. This modernization eliminates classloader conflicts, reduces security risks, and decreases artifact size by leveraging the built-in Java XML parser.
Key changes:
- Introduced
AbstractXMLReaderto centralize XMLReader interface boilerplate and provide strict SAX2 compliance - Refactored 4 custom reader classes (DelimitedReader, EDIReader, ER7Reader, NCPDPReader) to extend
AbstractXMLReaderinstead of Xerces SAXParser - Removed xercesImpl and xml-apis JARs from both client and server dependencies
Reviewed changes
Copilot reviewed 8 out of 12 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| server/src/org/openintegrationengine/engine/plugins/datatypes/AbstractXMLReader.java | New base class providing XMLReader interface implementation with handler management and SAX2 feature compliance |
| server/src/com/mirth/connect/plugins/datatypes/ncpdp/NCPDPReader.java | Refactored to extend AbstractXMLReader, added ensureHandlerSet() call, updated license header |
| server/src/com/mirth/connect/plugins/datatypes/hl7v2/ER7Reader.java | Refactored to extend AbstractXMLReader, added ensureHandlerSet() call, updated license header |
| server/src/com/mirth/connect/plugins/datatypes/edi/EDIReader.java | Refactored to extend AbstractXMLReader, added ensureHandlerSet() call, updated license header |
| server/src/com/mirth/connect/plugins/datatypes/delimited/DelimitedReader.java | Refactored to extend AbstractXMLReader, added ensureHandlerSet() call, removed unused ContentHandler import, updated license header |
| server/docs/thirdparty/THIRD-PARTY-README.txt | Removed documentation for xml-apis and Xerces dependencies |
| server/.classpath | Removed xercesImpl-2.12.2.jar and xml-apis-1.4.01.jar entries |
| client/.classpath | Removed xercesImpl-2.12.2.jar and xml-apis-1.4.01.jar entries |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return; | ||
| } | ||
| else if ("http://xml.org/sax/features/namespace-prefixes".equals(name)) { | ||
| if (value == true) { |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use value instead of value == true for consistency with boolean checks. This is the more idiomatic way to check boolean values in Java.
Change from:
if (value == true) {To:
if (value) {| if (value == true) { | |
| if (value) { |
server/src/org/openintegrationengine/engine/plugins/datatypes/AbstractXMLReader.java
Outdated
Show resolved
Hide resolved
| @Override | ||
| public void setFeature(String name, boolean value) throws SAXNotRecognizedException, SAXNotSupportedException { | ||
| if ("http://xml.org/sax/features/namespaces".equals(name)) { | ||
| if (value == false) { |
Copilot
AI
Dec 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use !value instead of value == false for consistency with the value == true check on line 221. This is the more idiomatic way to check boolean values in Java.
Change from:
if (value == false) {To:
if (!value) {| if (value == false) { | |
| if (!value) { |
mgaffigan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll approve after I can kick the tires a bit - nothing obvious apart from the bool comparisons copilot already called out.
I assume this is a breaking change for any data type plugin. Should be called out in release notes.
Yes, it would be a breaking change for anything that specifically imports from packages below I am going to have follow-up PRs which are dependent on this one to remove other outdated xml dependencies, but this one had to go first since the standalone xerces release only supports JAXP version 1.4, which is what was bundled with Java 6. The internal implementation has been at JAXP 1.6 since Java 8. With this jar on the classpath, standard To test this I created channels where the source inbound type was the target datatype and the outbound type was XML. The destination reversed this and went from xml to the target type. I created a sample message and verified the transformations to and from xml so the final output was the same as the original. I did this for delimited, hl7v2, and edi. I did not know where to find a sample message to test NCPDP, but I assume it's fine since the others all worked. The bool comparison thing I went back and forth on before I opened the PR. I decided to keep it with the equality comparison since the parameter represents a value to be set, and is not meant for flow control, so it made more sense to me to do the explicit comparison as I would do for any value in a key/val pair that wasn't a boolean. I don't have a strong preference either way, though. |
mgaffigan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested by creating type to xml channels on Mirth 4.4.0 and this branch. Compared results given same source messages.
Here's a NCPDP Telecom Sample File. Note: File includes non-printable characters.
kayyagari
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Built with Java 17 and tested using a channel with both inbound and outbound datatypes set to a combination of Hl7v2.x and XML.
NicoPiel
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the motivation includes “correctly report namespace support to downstream transformers,” can we add a regression test that runs a reader through the actual transformation path (SAXSource/Transformer) to confirm no additional feature/property probes break execution?
| */ | ||
| @Override | ||
| public Object getProperty(String name) throws SAXNotRecognizedException, SAXNotSupportedException { | ||
| throw new SAXNotRecognizedException("Property not recognized: " + name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some transformer pipelines probe standard SAX properties (e.g., lexical handler). Should we recognize a minimal set to reduce compatibility risk, even if unsupported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to LexicalHandler, support for this is optional.
If the reader does not report lexical events, it will throw a SAXNotRecognizedException when you attempt to register the handler.
| */ | ||
| @Override | ||
| public void setProperty(String name, Object value) throws SAXNotRecognizedException, SAXNotSupportedException { | ||
| throw new SAXNotRecognizedException("Property not recognized: " + name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some transformer pipelines probe standard SAX properties (e.g., lexical handler). Should we recognize a minimal set to reduce compatibility risk, even if unsupported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to LexicalHandler, support for this is optional.
If the reader does not report lexical events, it will throw a SAXNotRecognizedException when you attempt to register the handler.
5403e8f
Removes the dependency on `xercesImpl` and `xml-apis` to resolve
classloader conflicts and reduce security risks in Java 17+
environments. Refactors custom Reader classes to use the standard JDK
`XMLReader` interface instead of extending the Apache Xerces
`SAXParser`.
Changes:
- **Dependency Removal:** Removed `xercesImpl` and `xml-apis` from the
build configuration.
- **New Base Class:** Introduced `AbstractXMLReader` to centralize
common `XMLReader` logic (handler management, feature/property
compliance) and reduce code duplication across plugins.
- **Reader Refactoring:** Updated `DelimitedReader`, `EDIReader`,
`ER7Reader`, and `NCPDPReader`:
- Replaced inheritance (`extends SAXParser`) with `extends AbstractXMLReader`.
- Added standard `parse(String)` overload via the base class.
- Implemented strict SAX2 compliance for `getFeature`/`setFeature`
to correctly report namespace support to downstream transformers.
- **Code Cleanup:** Removed unused imports of
`org.apache.xerces.parsers.SAXParser`.
- **Documentation Cleanup:** Removed references to removed libraries in
`THIRD-PARTY-README.txt`
This change ensures the application uses the built-in JDK XML parser,
reducing the artifact size and aligning with modern Java best practices.
Signed-off-by: Tony Germano <tony@germano.name>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Tony Germano <39179051+tonygermano@users.noreply.github.com>
5403e8f to
bf0392d
Compare
Signed-off-by: Tony Germano <tony@germano.name>
Signed-off-by: Tony Germano <tony@germano.name>
According to the javadoc for ContentHandler.startElement, if there are no attributes, then an empty Attributes object should be passed rather than null. For both startElement and endElement if qualified names are not available then the qname parameter should be an empty string. It was already being used this way in most instances, but a few were passing null instead. Signed-off-by: Tony Germano <tony@germano.name>
bf0392d to
3a291f7
Compare
"correctly report namespace support to downstream transformers" is alluding to the fact that the XMLReader.setFeature method states:
These features affect the argument values used when the The main purpose of this PR was to remove the imports of
I pushed some additional updates that now include tests for the new class and some additional changes after reading more of the interface javadocs to make sure the plugins follow the contract (mainly passing an empty When merging I plan to squash the first 4 commits and keep the last one separate. |
Removes the dependency on
xercesImplandxml-apisto resolve classloader conflicts and reduce security risks in Java 17+ environments. Refactors custom Reader classes to use the standard JDKXMLReaderinterface instead of extending the Apache XercesSAXParser.Changes:
xercesImplandxml-apisfrom the build configuration.AbstractXMLReaderto centralize commonXMLReaderlogic (handler management, feature/property compliance) and reduce code duplication across plugins.DelimitedReader,EDIReader,ER7Reader, andNCPDPReader:extends SAXParser) withextends AbstractXMLReader.parse(String)overload via the base class.getFeature/setFeatureto correctly report namespace support to downstream transformers.org.apache.xerces.parsers.SAXParser.THIRD-PARTY-README.txtThis change ensures the application uses the built-in JDK XML parser, reducing the artifact size and aligning with modern Java best practices.