This is the official graphframes-py PyPI package, a Python wrapper for the Scala GraphFrames library. The package is maintained by the GraphFrames project and is available on PyPI.
For instructions on GraphFrames, check the project README.md.
See Installation and Quick-Start for the best way to install and use GraphFrames.
pip install graphframes-py
NOTE! The Python distribution does not include the JVM core. You need to add it to your cluster or Spark Connect server!
You should use GraphFrames via the --packages argument to pyspark or spark-submit, but this package is helpful in development environments.
# Interactive Python, Spark 3.5.x
$ pyspark --packages io.graphframes:graphframes-spark3_2.12:0.9.2
# Interactive Python, Spark 4.0.x
$ pyspark --packages io.graphframes:graphframes-spark4_2.13:0.9.2
GraphFrames for PySpark chooses between the Connect and classic implementations implicitly, based on the result of is_remote().
To force the Connect-based implementation, export the environment variable SPARK_CONNECT_MODE_ENABLED=1.
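The same variable can also be set from Python rather than from the shell. The following is a minimal sketch, not the project's documented procedure: force_connect_mode is a hypothetical helper name, and the sketch assumes the variable has to be present before pyspark or graphframes perform their mode check, so it is set before those imports.

```python
import os

# Variable name and value taken from the text above.
CONNECT_FLAG = "SPARK_CONNECT_MODE_ENABLED"


def force_connect_mode(env=os.environ):
    """Hypothetical helper: request the Connect-based implementation."""
    env[CONNECT_FLAG] = "1"
    return env[CONNECT_FLAG]


force_connect_mode()

# Assumption: the flag must be set before pyspark / graphframes are
# imported, so that any mode check they perform can see it.
# from graphframes import GraphFrame
```

Setting the variable in the launching shell, as described above, has the same effect and does not depend on import order.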

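The --packages coordinates shown earlier differ only in the artifact name per Spark line. The sketch below picks the coordinate for a given Spark version string; graphframes_coordinate is a hypothetical helper, not part of the graphframes-py API, and it only knows the two combinations listed in this README (Spark 3.5.x and Spark 4.0.x with GraphFrames 0.9.2).

```python
# Coordinates copied from the pyspark commands in this README; other
# Spark versions are not listed there, so they raise an error here.
GRAPHFRAMES_VERSION = "0.9.2"
_ARTIFACTS = {
    "3.5": "graphframes-spark3_2.12",  # Spark 3.5.x, Scala 2.12
    "4.0": "graphframes-spark4_2.13",  # Spark 4.0.x, Scala 2.13
}


def graphframes_coordinate(spark_version: str) -> str:
    """Hypothetical helper: map a Spark version to a Maven coordinate."""
    # Keep only "major.minor", e.g. "3.5.1" -> "3.5".
    major_minor = ".".join(spark_version.split(".")[:2])
    if major_minor not in _ARTIFACTS:
        raise ValueError(f"No coordinate listed for Spark {spark_version}")
    return f"io.graphframes:{_ARTIFACTS[major_minor]}:{GRAPHFRAMES_VERSION}"


print(graphframes_coordinate("3.5.1"))
print(graphframes_coordinate("4.0.0"))
```

The returned string can be passed to --packages, or, when a session is created from Python in a development environment, to Spark's spark.jars.packages configuration option.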