feat: change the return result for PageRank from GraphFrame to DataFrame #627

@SemyonSinchenko

Description

Is your feature request related to a problem? Please describe.
At the moment, all the other algorithms in the library return a DataFrame; only PageRank returns a new GraphFrame whose vertices have an additional column.

Describe the solution you would like

  1. Add an argument such as legacy_mode: bool to the PySpark wrappers; it is true by default to avoid breaking changes
  2. Based on this argument, PageRank returns a GraphFrame when legacy_mode is true, and only a DataFrame of scores when it is false
  3. Raise a deprecation warning when legacy_mode is true, with a message that in a future version PageRank will return only a DataFrame of scores
  4. Add the same argument to the Scala parts
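
The steps above could be sketched roughly as follows. This is only an illustration of the proposed switch, not the library's code: `wrap_pagerank_result` and `FakeGraph` are hypothetical names, and plain Python values stand in for the Spark DataFrame and GraphFrame objects so the example runs without Spark.

```python
import warnings
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class FakeGraph:
    """Hypothetical stand-in for a GraphFrame (vertices plus edges)."""
    vertices: Any
    edges: Any


def wrap_pagerank_result(
    scores: Any, edges: Any, legacy_mode: bool = True
) -> Union[FakeGraph, Any]:
    """Hypothetical helper: pick the return shape based on legacy_mode.

    `scores` stands in for the DataFrame of PageRank scores.
    """
    if legacy_mode:
        # Default path: keep the old behaviour, but warn about the change.
        warnings.warn(
            "PageRank currently returns a GraphFrame; a future version will "
            "return only a DataFrame of scores. "
            "Pass legacy_mode=False to get the new behaviour now.",
            DeprecationWarning,
            stacklevel=2,
        )
        return FakeGraph(vertices=scores, edges=edges)
    # New path: return only the scores.
    return scores
```

Keeping `legacy_mode=True` as the default means existing callers see no change in behaviour apart from the warning, while new code can opt in early with `legacy_mode=False`.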

Component

  • Scala Core Internal
  • Scala API
  • Spark Connect Plugin
  • Infrastructure
  • PySpark Classic
  • PySpark Connect

Additional context

We can plan this for the future 1.0 release, in keeping with semantic versioning. Feel free to ask me about the details.

Are you planning on creating a PR?

  • I'm willing to make a pull-request
