Hi,
When computing connectedComponents using the GraphFrames algorithm, I get the following error:
File "/root/.ivy2/jars/graphframes_graphframes-0.5.0-spark2.1-s_2.11.jar/graphframes/graphframe.py", line 279, in connectedComponents
File "/usr/spark-2.1.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
File "/usr/spark-2.1.1-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 69, in deco
pyspark.sql.utils.AnalysisException: u'Unable to infer schema for Parquet. It must be specified manually.;'
Looking at the code, I guess this is caused by saving an empty checkpoint in Parquet format, since Spark has similar issues when loading empty Parquet files. So this may be more of a Spark issue; in the meantime, however, could checking whether the ee DataFrame is empty before saving it to Parquet help?
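To illustrate the kind of guard I mean, here is a rough sketch in Python. The helper name write_parquet_if_nonempty is made up for this example, and the real checkpointing code in GraphFrames may look quite different (I have not checked whether it lives on the Python or the JVM side), so treat this as the idea only, not a patch:

```python
def write_parquet_if_nonempty(df, path):
    """Write df to Parquet only if it has at least one row.

    Hypothetical helper, not part of GraphFrames or Spark.
    df is assumed to be a pyspark.sql.DataFrame.
    Returns True if the data was written, False if it was skipped.
    """
    # head(1) returns a list with at most one Row, so this avoids
    # a full count() just to test for emptiness.
    if len(df.head(1)) == 0:
        return False
    df.write.parquet(path, mode="overwrite")
    return True
```

The caller would then have to handle the False case itself, for example by treating the missing checkpoint as an empty edge set instead of reading it back from Parquet.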
Thomas.