acryl-spark-lineage jar contains multiple copies of dependencies #14989

@umartin

Describe the bug
The acryl-spark-lineage jar contains both relocated and non-relocated dependencies, sometimes in multiple versions. The non-relocated dependencies can cause class-loading conflicts with other Spark extensions.

To Reproduce
Three copies of antlr are packaged in the jar:

$ zipinfo -1 metadata-integration/java/acryl-spark-lineage/build/libs/acryl-spark-lineage_2.12-1.3.0rc1-SNAPSHOT.jar | grep  'antlr*\/$'
org/antlr/
io/acryl/shaded/st4hidden/org/antlr/
antlr/
META-INF/maven/org.antlr/
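
The same check can be scripted so that it only counts directories that actually hold class files, which excludes metadata paths such as `META-INF/maven/org.antlr/`. This is a minimal sketch using only the Python standard library. The function name `find_package_copies` is hypothetical, and the demo runs against a small in-memory archive with made-up entries, not the real jar.

```python
import io
import zipfile


def find_package_copies(jar, package):
    """Return the sorted directory prefixes whose last path segment is
    `package` and that contain at least one .class file below them."""
    copies = set()
    with zipfile.ZipFile(jar) as zf:
        for name in zf.namelist():
            if not name.endswith(".class"):
                continue
            parts = name.split("/")[:-1]  # directory segments only
            for i, segment in enumerate(parts):
                if segment == package:
                    copies.add("/".join(parts[: i + 1]) + "/")
    return sorted(copies)


def build_demo_jar():
    """Build a tiny in-memory archive; the entries are illustrative only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for entry in (
            "org/antlr/v4/runtime/Parser.class",
            "io/acryl/shaded/st4hidden/org/antlr/runtime/Token.class",
            "antlr/Tool.class",
            "datahub/spark/Example.class",
        ):
            zf.writestr(entry, b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    for prefix in find_package_copies(build_demo_jar(), "antlr"):
        print(prefix)
```

To run it against a real build, pass the path of the jar instead of `build_demo_jar()`.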

A more complete listing of the jar contents:

$ zipinfo -1 metadata-integration/java/acryl-spark-lineage/build/libs/acryl-spark-lineage_2.12-1.3.0rc1-SNAPSHOT.jar | grep  '^[[:alnum:]]*\/[[:alnum:]]*\/$'
io/openlineage/
datahub/spark/
io/datahubproject/
datahub/client/
io/acryl/
org/publicsuffix/
com/linkedin/
datahub/event/
org/apache/
legacyPegasusSchemas/com/
pegasus/com/
com/datahub/
io/confluent/
software/amazon/
common/message/
datahub/spark2/
io/micrometer/
io/github/
nonapi/io/
org/eclipse/
jakarta/json/
com/google/
org/codehaus/
org/antlr/
io/swagger/
com/eclipsesource/
org/reactivestreams/
antlr/actions/
antlr/ASdebug/
antlr/build/
antlr/collections/
antlr/debug/
antlr/preprocessor/
aix/ppc64/
darwin/aarch64/
freebsd/amd64/
freebsd/i386/
linux/aarch64/
linux/amd64/
linux/arm/
linux/i386/
linux/loongarch64/
linux/mips64/
linux/ppc64/
linux/ppc64le/
linux/riscv64/
linux/s390x/
win/aarch64/
win/amd64/
win/x86/
net/jpountz/
org/xerial/
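
A listing like the one above can be narrowed to the package roots that sit outside the project's own namespaces. The sketch below is one way to do that with the Python standard library. The function name `unrelocated_roots` and the allowed prefixes passed to it are assumptions made for illustration; which prefixes really belong to the project is for the maintainers to decide. The demo archive is built in memory from made-up entries.

```python
import io
import zipfile


def unrelocated_roots(jar, allowed_prefixes, depth=2):
    """Return sorted package roots (first `depth` path segments) of .class
    entries that do not start with any of `allowed_prefixes`."""
    roots = set()
    with zipfile.ZipFile(jar) as zf:
        for name in zf.namelist():
            if not name.endswith(".class"):
                continue
            if name.startswith(tuple(allowed_prefixes)):
                continue
            segments = name.split("/")[:-1]
            if segments:  # ignore classes in the default package
                roots.add("/".join(segments[:depth]) + "/")
    return sorted(roots)


def build_demo_jar():
    """Build a tiny in-memory archive; the entries are illustrative only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for entry in (
            "org/antlr/v4/runtime/Parser.class",
            "io/acryl/shaded/org/antlr/Token.class",
            "antlr/Tool.class",
            "datahub/spark/Example.class",
            "net/jpountz/lz4/Example.class",
        ):
            zf.writestr(entry, b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Assumed allow-list: the project's own packages and its relocation prefix.
    allowed = ("datahub/", "io/acryl/")
    for root in unrelocated_roots(build_demo_jar(), allowed):
        print(root)
```

An empty result would mean every bundled class is either project code or relocated, so the function could serve as a build-time check.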

Expected behavior
Non-relocated dependencies are fine in an executable binary. For a library they are a nightmare, because the bundled dependencies are hidden from the resolution mechanisms of dependency resolvers. Class loading is not deterministic. Antlr 4.5 in acryl-spark-lineage:0.2.18 might get loaded before the Antlr version used by delta-lake, which causes unrecoverable errors.
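
Any class file path that more than one jar provides is a candidate for the kind of conflict described above. The following sketch, standard-library Python with the hypothetical names `classes_in_multiple_jars` and `make_jar`, reports such overlaps across a set of jars. The jar labels and entries in the demo are invented, and the script only detects overlap; it does not predict which copy a given class loader would pick.

```python
import io
import zipfile
from collections import defaultdict


def classes_in_multiple_jars(jars):
    """Given a dict of label -> jar path or file-like object, return a dict
    mapping each .class entry found in two or more jars to the sorted labels."""
    providers = defaultdict(list)
    for label, jar in jars.items():
        with zipfile.ZipFile(jar) as zf:
            # set() guards against an entry listed twice inside one archive
            for name in set(zf.namelist()):
                if name.endswith(".class"):
                    providers[name].append(label)
    return {
        name: sorted(labels)
        for name, labels in providers.items()
        if len(labels) > 1
    }


def make_jar(entries):
    """Build an in-memory archive with empty files at the given paths."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for entry in entries:
            zf.writestr(entry, b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    jars = {
        "lineage.jar": make_jar(
            ["org/antlr/v4/runtime/Parser.class", "datahub/spark/Example.class"]
        ),
        "other-extension.jar": make_jar(
            ["org/antlr/v4/runtime/Parser.class", "example/Other.class"]
        ),
    }
    for name, labels in sorted(classes_in_multiple_jars(jars).items()):
        print(name, "->", ", ".join(labels))
```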

A library should either package no dependencies or relocate all of its packaged dependencies.

  • OS: any
  • Version: acryl-spark-lineage 0.2.18 and also current main branch

Labels: bug

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions