Commit d106b03

dicknetherlands authored and andreasprlic committed
Change to wiki page
1 parent f705f31 commit d106b03

2 files changed: 7 additions and 58 deletions

_wikis/BioJava3_Proposal.md

Lines changed: 5 additions & 42 deletions
@@ -33,48 +33,27 @@ Proposal
 
 - Analyse how BioJava is being used by the community. See the
 [UsageAnalysis](UsageAnalysis "wikilink") page.
-
-<!-- -->
-
 - To start from scratch, creating a number of smaller jars as
 sub-projects within an umbrella BioJava3 project. Each jar would
 provide tools for a specific purpose. Additional jars would provide
 cross-purpose tools such as format converters or text-to-object
 interfaces.
-
-<!-- -->
-
 - Although starting from scratch, much existing code could be reused
 or refactored to suit the new design.
-
-<!-- -->
-
 - We would take full advantage of Java 6, including generics,
 (@)annotations, the built-in property change support. Everything
 would be a bean - absolutely everything.
-
-<!-- -->
-
 - We would aim to be fully Java EE compliant, with the majority of
 components fully reusable as a bean in any other application, just
 like Spring's components are.
-
-<!-- -->
-
 - We would write a JUnit test for every single class, writing the test
 first then the class afterwards. If other test frameworks are out
 there we could investigate these too - one suggestion is
 [TestNG](http://testng.org/doc/). We would also write documentation
 for every single class with additional full documentation for each
 separate jar.
-
-<!-- -->
-
 - We would adhere rigidly to a common coding style and heavily comment
 the code.
-
-<!-- -->
-
 - We should make it able to focus on any aspect the user requires and
 keep its efficiency, removing its dependency on everything being
 sequence-related.
@@ -92,54 +71,38 @@ Data structure
 etc. etc.. It has a RecordFormat which reads/writes Records to/from
 the RecordSource. It provides an iterator over Records which match a
 given RecordSearch.
-
-<!-- -->
-
 - A RecordFormat is version-specific to the format, as are the Record
 objects it produces.
-
-<!-- -->
-
 - RecordSearch defines search criteria to be applied to a RecordSource
 (or group thereof). It provides an iterator which returns all the
 combined Records from all RecordSources the RecordSearch was applied
 to. It uses RDF or something similar to map fields between different
 kinds of Records and the search parameters.
-
-<!-- -->
-
 - Record is a piece of data in any format, as a bean. It should be as
 lightweight as possible - lazyloading of all non-key data would be
 ideal. Each different kind of Record has an object structure
 suitably matched to the RecordFormat that produced it - e.g. Genbank
 Record objects should be structured internally in almost exactly the
 same way as the Genbank file. This allows minimal loss of
 information and maximum flexibility.
-
-<!-- -->
-
 - RecordConverters convert Record objects between different formats,
 e.g. Genbank Record to FASTA Record. They allow sensible defaults to
 be provided where one format does not supply enough info to satisfy
 the minimum requirements of another. Some kind of bean conversion
 system based on RDF would be suitable for this.
-
-<!-- -->
-
 - A set of tools for converting flat data (e.g. sequence strings,
 taxononmy strings) into BioJava-like objects (e.g. SymbolLists,
 NCBITaxon). These BioJava-like objects could then be used for more
-advanced applications.
-
-<!-- -->
-
+advanced applications. One possible candidate would be
+[Dozer](http://dozer.sourceforge.net/).
 - A set of tools for manipulating the BioJava-like objects.
 
 Action plan
 -----------
 
-1. Please modify this page as you see fit in order to flesh out details
-and/or make new points.
+1. Please modify this page and the [Talk
+page](Talk:BioJava3_Proposal "wikilink") as you see fit in order to
+flesh out details and/or make new points.
 2. Tentative Singapore meeting to get the ball rolling on the final
 design and initial coding front.

_wikis/BioJava3_Proposal.mediawiki

Lines changed: 2 additions & 16 deletions
@@ -14,44 +14,30 @@ It is suggested that development stop on the existing BioJava/BioJavaX/BioJava2
 * The only database support is for BioSQL, which uses Hibernate but not in a fully flexible manner (i.e. cannot connect to more than one db at a time).
 * It is sequence-focused. Users have moved on.
 
-
 ==Proposal==
 
 * Analyse how BioJava is being used by the community. See the [[UsageAnalysis]] page.
-
 * To start from scratch, creating a number of smaller jars as sub-projects within an umbrella BioJava3 project. Each jar would provide tools for a specific purpose. Additional jars would provide cross-purpose tools such as format converters or text-to-object interfaces.
-
 * Although starting from scratch, much existing code could be reused or refactored to suit the new design.
-
 * We would take full advantage of Java 6, including generics, (@)annotations, the built-in property change support. Everything would be a bean - absolutely everything.
-
 * We would aim to be fully Java EE compliant, with the majority of components fully reusable as a bean in any other application, just like Spring's components are.
-
 * We would write a JUnit test for every single class, writing the test first then the class afterwards. If other test frameworks are out there we could investigate these too - one suggestion is [http://testng.org/doc/ TestNG]. We would also write documentation for every single class with additional full documentation for each separate jar.
-
 * We would adhere rigidly to a common coding style and heavily comment the code.
-
 * We should make it able to focus on any aspect the user requires and keep its efficiency, removing its dependency on everything being sequence-related.
 
 * SymbolLists and Alphabets to be rethought as these are the most common stumbling block.
 
 ==Data structure==
 
 * RecordSource is an object which provides data. It can represent a file, a directory of files, a database, a web search engine, etc. etc. etc.. It has a RecordFormat which reads/writes Records to/from the RecordSource. It provides an iterator over Records which match a given RecordSearch.
-
 * A RecordFormat is version-specific to the format, as are the Record objects it produces.
-
 * RecordSearch defines search criteria to be applied to a RecordSource (or group thereof). It provides an iterator which returns all the combined Records from all RecordSources the RecordSearch was applied to. It uses RDF or something similar to map fields between different kinds of Records and the search parameters.
-
 * Record is a piece of data in any format, as a bean. It should be as lightweight as possible - lazyloading of all non-key data would be ideal. Each different kind of Record has an object structure suitably matched to the RecordFormat that produced it - e.g. Genbank Record objects should be structured internally in almost exactly the same way as the Genbank file. This allows minimal loss of information and maximum flexibility.
-
 * RecordConverters convert Record objects between different formats, e.g. Genbank Record to FASTA Record. They allow sensible defaults to be provided where one format does not supply enough info to satisfy the minimum requirements of another. Some kind of bean conversion system based on RDF would be suitable for this.
-
-* A set of tools for converting flat data (e.g. sequence strings, taxononmy strings) into BioJava-like objects (e.g. SymbolLists, NCBITaxon). These BioJava-like objects could then be used for more advanced applications.
-
+* A set of tools for converting flat data (e.g. sequence strings, taxononmy strings) into BioJava-like objects (e.g. SymbolLists, NCBITaxon). These BioJava-like objects could then be used for more advanced applications. One possible candidate would be [http://dozer.sourceforge.net/ Dozer].
 * A set of tools for manipulating the BioJava-like objects.
 
 ==Action plan==
 
-# Please modify this page as you see fit in order to flesh out details and/or make new points.
+# Please modify this page and the [[Talk:BioJava3_Proposal|Talk page]] as you see fit in order to flesh out details and/or make new points.
 # Tentative Singapore meeting to get the ball rolling on the final design and initial coding front.
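
The "Data structure" bullets in the proposal describe RecordSource, RecordSearch, Record and RecordConverter only in prose. The sketch below is one possible reading of those bullets in Java and is not part of this commit. Every method name, the `InMemorySource` class, the field layout of the Genbank and FASTA records, and the `"unknown"` default are assumptions made for illustration. RecordFormat, lazy loading and the RDF-based field mapping are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** A piece of data in one format. The single key accessor is an assumption. */
interface Record {
    String getKey();
}

/** Search criteria applied to a RecordSource (hypothetical shape). */
interface RecordSearch {
    boolean matches(Record record);
}

/** Provides data: a file, a directory, a database, a web search, and so on. */
interface RecordSource<R extends Record> {
    /** Returns an iterator over the records that match the given search. */
    Iterator<R> search(RecordSearch search);
}

/** Converts records of one format into records of another. */
interface RecordConverter<A extends Record, B extends Record> {
    B convert(A input);
}

/** Simplified stand-in for a Genbank record; the fields are illustrative only. */
final class GenbankRecord implements Record {
    private final String accession;
    private final String definition; // may be null in this sketch
    private final String sequence;

    GenbankRecord(String accession, String definition, String sequence) {
        this.accession = accession;
        this.definition = definition;
        this.sequence = sequence;
    }

    public String getKey() { return accession; }
    public String getDefinition() { return definition; }
    public String getSequence() { return sequence; }
}

/** Simplified stand-in for a FASTA record: one header line plus a sequence. */
final class FastaRecord implements Record {
    private final String header;
    private final String sequence;

    FastaRecord(String header, String sequence) {
        this.header = header;
        this.sequence = sequence;
    }

    public String getKey() { return header; }
    public String getHeader() { return header; }
    public String getSequence() { return sequence; }
}

/** Hypothetical source backed by a list, standing in for a file or database. */
final class InMemorySource<R extends Record> implements RecordSource<R> {
    private final List<R> records;

    InMemorySource(List<R> records) {
        this.records = new ArrayList<>(records);
    }

    public Iterator<R> search(RecordSearch search) {
        List<R> hits = new ArrayList<>();
        for (R record : records) {
            if (search.matches(record)) {
                hits.add(record);
            }
        }
        return hits.iterator();
    }
}

/** Genbank to FASTA conversion with a default where the definition is missing. */
final class GenbankToFasta implements RecordConverter<GenbankRecord, FastaRecord> {
    public FastaRecord convert(GenbankRecord input) {
        String description = input.getDefinition() == null ? "unknown" : input.getDefinition();
        return new FastaRecord(input.getKey() + " " + description, input.getSequence());
    }
}
```

A caller would build a source, pass it a `RecordSearch` (for example a lambda that tests `getKey()`), and feed each hit through `GenbankToFasta`. Keeping the converter separate from the record classes follows the proposal's idea that each Record mirrors its own format and that cross-format defaults live in the RecordConverter.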
