3 of 3: Documentation cleanup and update. Added a motif finding tutorial.#511
3 of 3: Documentation cleanup and update. Added a motif finding tutorial.#511
Conversation
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## rjurney/motif-tutorial-code-min #511 +/- ##
==================================================================
Coverage ? 90.73%
==================================================================
Files ? 18
Lines ? 896
Branches ? 103
==================================================================
Hits ? 813
Misses ? 83
Partials ? 0 ☔ View full report in Codecov by Sentry. |
…s.txt and split out requirements-dev.txt. Version bumps.
…ney/build-upgrades
…ney/build-upgrades
…mes/graphframes into rjurney/motif-tutorial-docs
|
Okay @bjornjorgensen @SauronShepherd @SemyonSinchenko I am done here. I would maybe suggest reviewing it by reading docs/README.md and running |
…graphframes into rjurney/motif-tutorial-docs
|
@SemyonSinchenko this one is hard to review... lots of text changes. I have double checked it, but appreciate any review you can give :) |
SemyonSinchenko
left a comment
There was a problem hiding this comment.
LGTM overall, thank you! The only important comment is about licenses. Other comments are minor.
| API for motif finding. The user also benefits from DataFrame performance optimizations | ||
| within the Spark SQL engine. | ||
| This is a package for DataFrame-based graphs on top of Apache Spark. Users can write highly expressive queries by leveraging the DataFrame API, combined with a new API for network motif finding. The user also benefits from DataFrame performance optimizations within the Spark SQL engine. GraphFrames works in Java, Scala, and Python. | ||
|
|
There was a problem hiding this comment.
Let's add a note about deprecation of GraphX in Spark 4.0? Something like "With deprecation of GraphX in Apache Spark Core GraphFrames is the recommended approach for the distributed graphs processing."?
There was a problem hiding this comment.
@SemyonSinchenko for the moment - and we can get your own PR to do this if you want - I prefer to pretend that didn't happen. It isn't good for the project and we have a ways to go before merging with Spark.
There was a problem hiding this comment.
Where is this one taken? I can assume that it is from a scientific paper that may have a license like Creative Commons or a proprietary one. I think it is important to maintain IP clearance if we are going to move GraphFrames to the Spark core one day. Could you please add a source or even better to add a NOTICE.md file to the root of project with mentioning of the source of that picture and how is it licensed?
There was a problem hiding this comment.
@SemyonSinchenko It's from this paper, which the caption and image itself lead to if you click: https://www.computer.org/csdl/journal/tb/2017/06/07501518/13rRUxcsYKf
The thing is... there is no open license image of graphlets, let alone directed graphlets. What if I include it now and seek permission in the meanwhile? Then I can figure out some alternative and in the meanwhile people can still learn about graphlets? Worst case I can generate them all in networkx or something.
There was a problem hiding this comment.
I reached out to the authors, asking for permission. It would be best to secure their permission. In reading this guide to fair use, I believe this image may qualify: https://www.copyrightlaws.com/simple-guide-fair-use/
There was a problem hiding this comment.
I got permission to use the image :)
Co-authored-by: Sem <ssinchenko@apache.org>
Co-authored-by: Sem <ssinchenko@apache.org>
This is a documentation only change, the docs part of #473. The docs build and look okay when I serve them.
What changes were proposed in this pull request?
I added a network motif tutorial and updated / cleaned the docs generally. I added the new google group and the new logo to the images folder.
Why are the changes needed?
The docs were wrong in a few ways and there weren't even instructions in the README, which stopped new users from onboarding.