Some more bibliography improvements #2294
base: master
…sform
Edge case: an article title (say, for a review) may contain a book title
in italics (About _reviewed book_ by John Doe), and that book title may
start with double quotes ("Something": Other).
It is rare, but some authors do start titles with such quotations.
With a style using quotes=true, quotes are added and the existing quotes
have to be shifted (“About _‘Something’: Other_ by John Doe”).
…files
Fields 'ids', 'keywords', 'related' and 'xdata' can consist of a single value or of multiple comma-separated values. We now always resolve them to a list, for easier use afterwards. We also check at entry resolution that the 'related' field points to existing keys, and warn if it does not.
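As an illustration of the list resolution described above, here is a minimal Python sketch. It is not the package's Lua code: the function names, the dictionary-based entry layout and the warning text are all hypothetical, chosen only to show the idea of normalizing the four fields and warning about dangling 'related' keys.

```python
# Hypothetical sketch: names and data layout are assumptions, not SILE's API.
LIST_FIELDS = ("ids", "keywords", "related", "xdata")


def split_list_field(value):
    """Resolve a single value or comma-separated values to a list of trimmed items."""
    if value is None:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def resolve_entry(entry, known_keys, warn=print):
    """Return a copy of `entry` with the list fields normalized to lists.

    Warns (without failing) when 'related' points to a key that is not known.
    """
    resolved = dict(entry)
    for field in LIST_FIELDS:
        if field in resolved:
            resolved[field] = split_list_field(resolved[field])
    for key in resolved.get("related", []):
        if key not in known_keys:
            warn(f"related key '{key}' does not exist")
    return resolved
```

With this shape, later code can iterate over `entry["related"]` without caring whether the source file held one key or several.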
BREAKING CHANGE: The new CSL-based bibliography engine was introduced in SILE 0.15.7 at the end of Nov. 2024, foreshadowing the eventual deprecation of the earlier home-grown legacy implementation of (a subset of) the Chicago author-date style. That legacy implementation is now removed, so we can move forward, simplifying the code base and extending the package with new features. The bibtex.style setting is kept, however, so as not to break documents or modules that set it (to 'csl' or 'chicago'), but it now does nothing, and the CSL implementation is always used.
It started as a refactor, but ends up as a feature: bibliography processing is now available in the CslProcessor class. The latter may be used outside of SILE's regular processing flow (that is, using SILE as a Lua-on-steroids toolkit with ICU, etc., but without going through the typesetter). For instance, I am using it to generate HTML versions of my bibliographies. It's still rough, but it paves the way for better abstractions. The package itself now just implements commands for SILE as a typesetting engine, delegating to that processor and containing only the code layer specific to processing in documents.
3e75e8b to 4b4290c
Some simple built-in filters are provided, allowing one to filter the complete bibliography by entry type, keywords or issue date. N.B. Some code was also refactored, and notably the in-code LDoc comments and annotations have been improved.
…locks
Introducing <bibParagraph> wrapping entries, instead of a <par> in-between. The way it's implemented does not change anything as far as SILE processing is concerned, but it makes things easier for other output processors (e.g. to HTML, where one would want explicit start-end constructs to mark paragraphs and translate them to anything suitable, <p> or <div>, with appropriate CSS).
… entries
It's fairly interesting to be able to list related entries in an indented block under the main entries: typically reviews, translations, etc.
5858d0b to 5188da2
Introduces a push/pop stack to preserve state across multiple bibliography invocations. This avoids rare issues with author substitution (e.g., dashes) in per-chapter bibliographies (e.g. when the same authors end one and start the next), or more frequent issues when "related" references are enabled and disrupt sequences due to their "nested" behavior. There weren't many internal variables (mainly the current authors) to save and restore, so doing it by hand would have been enough. But using a stack is cleaner and more future-proof.
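The push/pop idea can be sketched as follows, in Python rather than the package's Lua. The class name and the `previous_authors` field are hypothetical stand-ins for whatever internal variables are actually saved; the point is only that a nested or subsequent bibliography run starts from a fresh state and the outer state is restored afterwards.

```python
# Hypothetical sketch: class and field names are assumptions, not SILE's API.
class BibliographyState:
    """Keeps per-run state (e.g. the authors of the previous entry) on a stack."""

    def __init__(self):
        self._stack = []
        self.current = self._fresh()

    @staticmethod
    def _fresh():
        # State tracked for author substitution (dashes for repeated authors).
        return {"previous_authors": None}

    def push(self):
        """Save the current state and start from a clean one."""
        self._stack.append(self.current)
        self.current = self._fresh()

    def pop(self):
        """Discard the current state and restore the saved one."""
        self.current = self._stack.pop()
```

A nested "related" block would then call `push()` before rendering its entries and `pop()` afterwards, so the main sequence does not see the authors of the nested entries.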
…iographies
We had the "superfolding" on terms implemented in the CSL engine class, but not on ordinal numbers (whose terms were left unprocessed). The code is refactored to move that logic earlier, into the CSL locale class, so that it applies to all term values.
…rder
There were cases where, depending on the citation style, the sorting function could lead to identical keys and fail to provide a strict weak ordering, resulting in a random exception. The code is also slightly refactored and simplified, as PR 2105 has been merged since SILE 0.15.6.
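To illustrate what a strict weak ordering requires when sort keys can be identical, here is a small Python sketch. It is not the actual fix: the comparator, the decorated-pair layout and the tie-break on original position are assumptions made for the example. (In Lua, `table.sort` may raise an "invalid order function for sorting" error when the comparator is inconsistent, which is presumably the kind of exception mentioned above.)

```python
from functools import cmp_to_key


def compare(a, b):
    """Compare two (sort_keys, original_index) pairs.

    Returns a negative, zero or positive number. Entries whose keys are all
    equal are ordered by original position, so two distinct entries never
    compare as "both less than each other".
    """
    keys_a, index_a = a
    keys_b, index_b = b
    for key_a, key_b in zip(keys_a, keys_b):
        if key_a != key_b:
            return -1 if key_a < key_b else 1
    # All keys identical: deterministic tie-break on the original position.
    return index_a - index_b


def sort_entries(entries, key_fn):
    """Sort entries by the tuple of keys returned by `key_fn` (hypothetical helper)."""
    decorated = [(key_fn(entry), index) for index, entry in enumerate(entries)]
    decorated.sort(key=cmp_to_key(compare))
    return [entries[index] for _, index in decorated]
```

The tie-break is the important part: without it, a comparator that answers inconsistently for equal keys gives the sort routine no valid ordering to work with.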
…liographies
Enabled by default: these fields might be problematic for justification, and the default line-breaking algorithm may not recognize these dashes as word boundaries when they occur in the middle of the number.
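The general idea of allowing breaks at dashes can be sketched as below, in Python. This is only an illustration on plain strings: inserting a zero-width space (U+200B) after each dash is one generic way to mark a break opportunity, and it is an assumption here, not necessarily how the package does it within the typesetter. The function name is hypothetical.

```python
# Hypothetical sketch: SILE may well use its own break mechanism instead.
ZERO_WIDTH_SPACE = "\u200b"


def allow_breaks_at_dashes(number):
    """Insert a break opportunity after each dash of an identifier such as an ISBN."""
    return number.replace("-", "-" + ZERO_WIDTH_SPACE)


marked = allow_breaks_at_dashes("978-3-16-148410-0")
# The visible text is unchanged once the zero-width spaces are removed.
assert marked.replace(ZERO_WIDTH_SPACE, "") == "978-3-16-148410-0"
```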
Correct bib-box-for-indent support (used with styles such as ieee.csl, with second field alignment).
…ographies
In the commits in PR 2230, which went into SILE 0.15.13, I tried to fix an issue regarding the suppression of empty macros (impacting substitutes in names). It turned out to cause other issues (2283) and to badly break some previously working styles (Chicago, MLA) with some entries... This is an attempt at repairing them. But something still eludes me in the CSL specifications, and while it does seem to work now with the styles we previously tested, using the new Chicago styles recently updated in the CSL repository shows that something is still amiss. These updated styles have a lot more nested macros with conditionals than previously, and I haven't pinpointed where the misinterpretation or raw bug could lie. Yet the code as now refactored makes more sense, with fewer side effects and better scoping, so at least it's worth having, albeit perhaps as a partial fix...
For the record:
It allows one to skip entries in grouped citations, while keeping them in the bibliography of cited works, without much rewriting.
Since the removal of the legacy bibliography implementation, these two tests were deprecated. One (bug-2054.bib) is clearly obsolete (it checked a side effect of fluent in bibliography internationalization, but the new implementation uses CSL). The other leads to a wholly different output (really following Chicago Author-Date now) but would have been hard to adapt, and was also fairly incomplete. A different testing approach would be required, as we move to a more layered and modular approach.
The newer CSL-based implementation uses, by definition, the terms and rules from the bibliography style and locale.
Main objectives achieved, and PR moved to Ready. (I had two other very secondary objectives in my wish-list. I'll probably open dedicated issues for them. This PR here is already a nice baby.) (Le Dragon de Brume will soon communicate regarding a book update and a website. Both were made with SILE and the code from this PR.) I am suggesting this should go in 0.15.14, nearly one year after the work that eventually went in 0.15.7 was initiated. True, it's slightly breaking, assuming some people might have relied on the legacy bibtex implementation for anything serious... I cannot believe such users really exist; and even so, the update should be quite transparent, or at least easy and understandable.
Work in progress - progressively back-porting upstream some things from my (private) fork of the bibliography components.
An Overview Of The Changes So Far...
(Slightly) Breaking changes
The legacy bibliography implementation is removed.
The CSL-based bibliography engine, introduced in SILE 0.15.7 at the end of Nov. 2024, foreshadowed the eventual deprecation of the earlier home-grown legacy implementation of a subset (very limited, and wrong in places) of the Chicago author-date style.
It's now the only one supported, so the code base can be simplified and extended with new features (see below).
This also encompasses the removal of the i18n keys (in language files) that were used by the legacy implementation.
Features
- `\printbibliography` supports new cool options:
  - `filter`: A space-separated list of filters to apply to the bibliography, when `cited` is false.
    It allows filtering by entry type (e.g. `type-book` or `not-type-book`), by keywords (e.g. `keyword-linguistics`), or by issue date (e.g. `issued-2020` for entries issued in 2020, or `issued-2023-2025` for entries issued between 2023 and 2025).
    Entry types and keywords are to be understood in terms of CSL here (not BibTeX).
    Besides those simple built-in filters (which, in my experience at least, cover most use cases), a low-level (and somewhat internal) Lua API is provided to allow users to implement their own named filters.
    (These filters are allowed only when `cited` is false because they are meant to filter the complete bibliography. Filtering cited entries would make tracking and resetting more complex, for little gain in normal use cases, in my opinion, at least for now, with things still in flux.)
  - `related`: A boolean option to include "related" entries in the bibliography (in an indented block under the main entries).
    With a Bib(La)TeX bibliography, it corresponds to entries which have a `related` field (a comma-separated list of entry keys). So, say, if you have a book B and an article A (e.g. a review, commentary, etc.) with a `related` field containing the key of B, A will be listed nested under B in the bibliography (independently of being cited or not on its own).
- By default, ISBNs and ISSNs in bibliographies can now break at dashes, which is useful for justification and line breaking (as these elements are quite long).
  This can be disabled by setting the `breakISBN` option to false in the bibliography options.
- Bib(La)TeX support improvements:
  - The `ids` field for citation key aliasing is now supported.
  - Several fields (`ids`, `keywords`, `related`, and `xdata`) are interpreted as lists of comma-separated values, as they should be.
- `\nocite` elements are now accepted in `\cites` constructs for grouped citations, e.g., `\cites{\cite[key1]\nocite[key2]}`. It's a convenience, allowing one to "skip" entries in grouped citations, while keeping them in the bibliography of cited works, without much rewriting of that block.
Fixes
Sorting is more consistent in its attempt at a strict weak ordering, avoiding random exceptions.
Superscript "folding" is now applied to ordinal numbers in bibliographies, as it was already done for other terms.
State is better handled across multiple bibliography runs, avoiding issues with author substitution (e.g., dashes) in per-chapter bibliographies (or when the new "related" references are enabled).
Smart straight quotes transformation plays better with italic text (when this extension is enabled, which is the default).
Rework how sorting and substitutions are done in bibliographies. This is an attempt at fixing issue #2283 (Bibliography CSL "substitute" element does not always work as expected), but it still seems that something is amiss with certain styles. The fix might be partial, either due to a misinterpretation of the specification or to a real bug. I haven't been able to pinpoint it, despite hours of attempts 😿. But it does repair something that I broke in SILE 0.15.13, and the code should be much better as now refactored.
Other changes
Welcome the new `CslProcessor` class.
It's a big refactor, splitting most of the CSL processing logic out of the `bibtex` SILE package, which now just implements the command "layer" for SILE as a typesetting engine, delegating the work to that processor for the bibliography processing.
This also allows using the CSL processing logic outside of SILE's regular processing flow (e.g., using SILE as a "Lua-on-steroids" toolkit with ICU, etc., but without going through the typesetter, PDF output, etc.) [1]
Improved in-code documentation, with better LDoc annotations.
Closes #2126
Closes #2085
Regarding #2283 - the changes here might be a partial / imperfect fix, as noted above, but the code makes more sense now.
Footnotes
1. For instance, I am using it to generate HTML versions of my bibliographies, soon to be released. It's still a bit rough, but it paves the way for better abstractions.