Implement Graph Data Structures and Search Iterators #299

Closed
lafita wants to merge 7 commits into biojava:master from lafita:minor

@lafita

lafita commented Jul 14, 2015

Member

Since there was already a Graph interface and an implementation for undirected graphs in biojava (symmetry project), and I was using graph searching algorithms and directed graphs a lot for symmetry, I created some new classes to represent different types of graphs, plus the two basic searching algorithms as iterators over them. New classes:

  • DirectedGraph
  • DirectedAcyclicGraph (DAG)
  • Depth First Searching (DFS) iterator
  • Breadth First Searching (BFS) iterator
  • Tests for their correctness

The idea is to have a basic and simple set of graph support classes, without importing a whole external library. The code is now in structure.symmetry.utils; would it make sense to create a new package for it? It is general enough that it doesn't have to be inside the structure package.
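
To make the proposal concrete, here is a minimal sketch of what a directed graph with a breadth-first iterator could look like. It is not the code from this pull request: the class names `SketchDirectedGraph` and `SketchBfsIterator` and all method names are hypothetical, and the real classes implement the existing Graph interface, which is not reproduced here.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical minimal directed graph backed by adjacency lists. */
class SketchDirectedGraph<V> {
    // Maps each vertex to the vertices reachable through one outgoing edge.
    private final Map<V, List<V>> successors = new LinkedHashMap<>();

    void addVertex(V v) {
        successors.computeIfAbsent(v, k -> new ArrayList<>());
    }

    /** Adds the directed edge from -> to, creating both vertices if needed. */
    void addEdge(V from, V to) {
        addVertex(from);
        addVertex(to);
        successors.get(from).add(to);
    }

    List<V> successorsOf(V v) {
        List<V> out = successors.get(v);
        return out == null ? new ArrayList<>() : out;
    }
}

/** Hypothetical BFS iterator: yields vertices in order of increasing edge distance from the start. */
class SketchBfsIterator<V> implements Iterator<V> {
    private final SketchDirectedGraph<V> graph;
    private final Deque<V> queue = new ArrayDeque<>();
    private final Set<V> seen = new HashSet<>();

    SketchBfsIterator(SketchDirectedGraph<V> graph, V start) {
        this.graph = graph;
        queue.add(start);
        seen.add(start);
    }

    @Override
    public boolean hasNext() {
        return !queue.isEmpty();
    }

    @Override
    public V next() {
        if (queue.isEmpty()) {
            throw new NoSuchElementException();
        }
        V current = queue.poll();
        // Enqueue each successor the first time it is seen, so cycles cannot loop forever.
        for (V n : graph.successorsOf(current)) {
            if (seen.add(n)) {
                queue.add(n);
            }
        }
        return current;
    }
}
```

Exposing the search as an `Iterator` lets callers stop early or consume vertices lazily in a plain `while (it.hasNext())` loop, which is presumably the appeal of the iterator-based design described above.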

Another possible extension for graph features would be to include Weighted Graphs, but I haven't done anything about that since I haven't needed them until now.

@sbliven

sbliven commented Jul 14, 2015

Member

I'm ambivalent about including Graph code in BioJava. There are many packages for this already (e.g. jgrapht) that provide much more complete and feature-rich implementations. On the other hand, I am reluctant to add another big dependency, even if just to the structure module.

Are we OK with the status quo (SimpleGraph, buried in the symmetry package but slowly gaining additional features such as this pull)? Or should we at least make a separate package for it?

@andylaw

andylaw commented Jul 14, 2015

Playing devil's advocate, what is the aversion to "importing a whole external library" when that external library will have solved your problem already? Why re-invent this wheel?

Later,

Andy

On 14 Jul 2015, at 16:51, Aleix Lafita notifications@github.com wrote:

Commit Summary

  • Test all String output Writer methods of MultipleAlignment
  • Implementation for a Directed Graph following Graph interface
  • Fix bugs and extend DirectedGraph implementation
  • Implement a DAG graph data structure
  • Implement a depth first search DFS iterator for graphs
  • Implement breath first search BFS iterator for graph search
  • Test for all new Graph implementations

File Changes

  • M biojava-structure-gui/src/main/java/org/biojava/nbio/structure/gui/util/SelectMultiplePanel.java (2)
  • M biojava-structure/src/main/java/org/biojava/nbio/structure/align/multiple/mc/MultipleMcOptimizer.java (12)
  • M biojava-structure/src/main/java/org/biojava/nbio/structure/align/util/RotationAxis.java (28)
  • A biojava-structure/src/main/java/org/biojava/nbio/structure/symmetry/utils/DirectedAcyclicGraph.java (62)
  • A biojava-structure/src/main/java/org/biojava/nbio/structure/symmetry/utils/DirectedGraph.java (283)
  • M biojava-structure/src/main/java/org/biojava/nbio/structure/symmetry/utils/Edge.java (18)
  • M biojava-structure/src/main/java/org/biojava/nbio/structure/symmetry/utils/Graph.java (191)
  • A biojava-structure/src/main/java/org/biojava/nbio/structure/symmetry/utils/GraphIteratorBFS.java (67)
  • A biojava-structure/src/main/java/org/biojava/nbio/structure/symmetry/utils/GraphIteratorDFS.java (105)
  • M biojava-structure/src/main/java/org/biojava/nbio/structure/symmetry/utils/SimpleGraph.java (52)
  • M biojava-structure/src/test/java/org/biojava/nbio/structure/align/multiple/TestMultipleAlignmentWriter.java (259)
  • M biojava-structure/src/test/java/org/biojava/nbio/structure/align/multiple/TestSampleGenerator.java (46)
  • A biojava-structure/src/test/java/org/biojava/nbio/structure/symmetry/TestGraphIterators.java (161)
  • A biojava-structure/src/test/resources/testMSTA1.fasta (6)
  • A biojava-structure/src/test/resources/testMSTA1.fatcat (14)
  • A biojava-structure/src/test/resources/testMSTA1.transforms (26)
  • A biojava-structure/src/test/resources/testMSTA1.xml (74)
  • A biojava-structure/src/test/resources/testMSTA1_alnres.tsv (44)
  • A biojava-structure/src/test/resources/testMSTA2.fasta (8)
  • A biojava-structure/src/test/resources/testMSTA2.fatcat (16)
  • A biojava-structure/src/test/resources/testMSTA2.transforms (34)
  • A biojava-structure/src/test/resources/testMSTA2.xml (76)
  • A biojava-structure/src/test/resources/testMSTA2_alnres.tsv (48)

Patch Links:

https://github.com/biojava/biojava/pull/299.patch
https://github.com/biojava/biojava/pull/299.diff

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

@andreasprlic

Member

We recently added the vecmath dependency to the structure modules. Why not also add a dependency for a graph library?

@andylaw

andylaw commented Jul 14, 2015

If there is a genuine aversion to extra dependencies, make them "optional". Then the end-user can include them explicitly if she/he needs them.

Later,

Andy

@sbliven

sbliven commented Jul 14, 2015

Member

@andylaw The most straightforward way of executing a downstream tool requires listing all dependencies in the classpath. So adding a dependency means one more jar to ship with the tool, and one more -cp entry at the command line (yes, I am aware there are easier ways of distributing, but I believe that is still the recommended procedure).

@andreasprlic Vecmath is different because it comes bundled with most JREs (either in java3d or as part of the core JRE)

@sbliven

sbliven commented Jul 14, 2015

Member

@andylaw Is there a clean way of making dependencies optional? Other than just omitting the jar and hoping you don't get a ClassNotFound at runtime?
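
For context on the runtime side of this question, the following is a rough sketch of how code could probe for an optional library before touching it, instead of failing later with a ClassNotFoundException. The helper class `OptionalDependencyCheck` is hypothetical and not part of BioJava; `org.jgrapht.Graph` is used only as an example of a class name that an optional graph library would provide.

```java
/** Hypothetical helper for guarding code paths that need an optional dependency. */
class OptionalDependencyCheck {

    /** Returns true if the named class can be located on the current classpath. */
    static boolean isAvailable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers.
            Class.forName(className, false, OptionalDependencyCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The jar is missing, or present but unusable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Example class name only; any class shipped by the optional jar would do.
        String probe = "org.jgrapht.Graph";
        if (isAvailable(probe)) {
            System.out.println("Graph library found; graph-based features enabled.");
        } else {
            System.out.println("Graph library not on the classpath; graph-based features disabled.");
        }
    }
}
```

This only helps if every class that references the optional library is reached behind such a check. It does not remove the need to declare the dependency correctly in the build.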

@andylaw

andylaw commented Jul 14, 2015

I believe that <optional>true</optional> in the dependency declaration in the pom file is the way to do it.

Personally, I'd just slap the dependency in as a normal one in any case and leave those who are worried about their jar sizes to exclude it if it causes them problems.

Later,

Andy

@heuermh

heuermh commented Jul 14, 2015

Member

I have no problems with external dependencies as long as they are available from the Maven Central repository.

For graphs I often use the Blueprints APIs
https://github.com/tinkerpop/blueprints/wiki

so that the implementation is pluggable (e.g., in-memory, Neo4j, Titan). Such might be overkill for this specific use case though.

@lafita

lafita commented Jul 14, 2015

Member Author

I think it would be great to have some sort of Graph library for biojava, even if it is a basic one, since most problems are fundamentally graph searching problems. So, speaking from my ignorance of the dependency problems, I vote to add a graph library.

My idea was to put the most basic graph algorithms in one place, as simple as possible, so that they can be reused, since I was rewriting them in every method that needed a graph.
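
As an illustration of the kind of reusable building block meant here, this is a minimal depth-first iterator sketch. It is not the GraphIteratorDFS class from this pull request: the class name `SketchDfsIterator` is hypothetical, and the graph is assumed to be given as a plain adjacency map rather than through the Graph interface.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;

/** Hypothetical DFS iterator over a graph given as vertex -> list of successors. */
class SketchDfsIterator<V> implements Iterator<V> {
    private final Map<V, List<V>> adjacency;
    private final Deque<V> stack = new ArrayDeque<>();
    private final Set<V> visited = new HashSet<>();

    SketchDfsIterator(Map<V, List<V>> adjacency, V start) {
        this.adjacency = adjacency;
        stack.push(start);
    }

    @Override
    public boolean hasNext() {
        // Drop stack entries that were already visited through another path.
        while (!stack.isEmpty() && visited.contains(stack.peek())) {
            stack.pop();
        }
        return !stack.isEmpty();
    }

    @Override
    public V next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        V current = stack.pop();
        visited.add(current);
        List<V> successors = adjacency.getOrDefault(current, Collections.emptyList());
        // Push in reverse so the first listed successor is explored first.
        for (int i = successors.size() - 1; i >= 0; i--) {
            V n = successors.get(i);
            if (!visited.contains(n)) {
                stack.push(n);
            }
        }
        return current;
    }
}
```

Because the visited set is kept inside the iterator, callers in different methods can share this one class instead of re-implementing the bookkeeping each time.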

@josemduarte

Contributor

I'd definitely support using an external graph library. Personally, I've used jgrapht (http://jgrapht.org/) a lot lately and I'm very happy with it. It can be used directly through Maven Central. Besides, graph data structures and algorithms are likely to be needed elsewhere in BioJava sooner or later.

Outsourcing things to an external library has only advantages in my view: clean, well-thought-out interfaces, well-tested and debugged code, more people contributing to it... Dependencies are easy to sort out with Maven, so that shouldn't be an issue.

@pwrose

pwrose commented Jul 14, 2015

Member

If jgrapht can easily replace the current implementation, then I'm all for it.

Peter Rose, Ph.D.
Site Head, RCSB Protein Data Bank West (http://www.rcsb.org)
San Diego Supercomputer Center (http://bioinformatics.sdsc.edu)
University of California, San Diego
+1-858-822-5497

@lafita

lafita commented Aug 14, 2015

Member Author

I am closing this because the decision was to include a graph library as a dependency. I will keep the branch for some time just in case, and I will open an issue as a TODO for the graph library.
