The ConnectedComponents algorithm currently has a hard limit on the number of edges, which comes from an implementation detail: https://github.com/graphframes/graphframes/blob/master/src/main/scala/org/graphframes/lib/ConnectedComponents.scala#L371-L374
At Software Heritage, the graph on which we run this algorithm currently has more than 100B edges and is growing exponentially. We expect it to cross the 200B-edge threshold within 1-2 years.
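
For reference, here is a rough sketch of how that estimate can be reproduced. It assumes pure exponential growth, and the doubling times of 1 and 2 years are illustrative assumptions rather than measured values; `years_until_limit` is a hypothetical helper, not part of GraphFrames.

```python
import math


def years_until_limit(current_edges, limit_edges, doubling_time_years):
    """Years until an exponentially growing edge count reaches limit_edges.

    Assumes the count doubles every doubling_time_years (an assumption,
    not a measurement).
    """
    if current_edges >= limit_edges:
        return 0.0
    return doubling_time_years * math.log2(limit_edges / current_edges)


CURRENT_EDGES = 100e9  # lower bound quoted above; the real count is higher
LIMIT_EDGES = 200e9    # threshold quoted above

for doubling in (1.0, 2.0):  # assumed doubling times, in years
    years = years_until_limit(CURRENT_EDGES, LIMIT_EDGES, doubling)
    print(f"doubling every {doubling} y: limit reached in about {years:.1f} y")
```

Since the graph is already above 100B edges, these figures are upper bounds under the stated assumptions.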
Would it be possible to increase this limit so that the algorithm is able to handle larger graphs?
Thanks!