fwkoch
left a comment
Overall, the approach to citations defined in this RFC is great.
Regarding your open questions, I already addressed the Reference node naming.
For CiTO vocabulary, there are ~100 existing intents. That is a lot, but narrowing to a subset feels a little arbitrary. I'm inclined to allow all CiTO intents; they are just descriptive strings that help readers understand and follow citations. I don't see big differences in behavior between different intents.
For ORCID, I agree inclusion is useful, and there isn't an exceptionally slick solution. My immediate thought is to allow oxa metadata on the Reference, alongside csl. This would be structured identically to article metadata, so we are not adding a new schema for OXA renderers. Ideally a citation would include full csl and full oxa representation, but it could also have only one or the other or partial versions of each... (I think this is close to your "separate mapping" suggestion.)
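As a rough illustration of the "oxa alongside csl" idea, here is a minimal sketch. Only the idea of `csl` and `oxa` as sibling fields comes from the comment above; the `OxaMetadata` and `OxaAuthor` shapes, the `orcidsOf` helper, and the sample ORCID are all assumptions, since the article-metadata schema is not spelled out here.

```typescript
// Trimmed-down CSL-JSON item; only the fields this sketch touches.
interface CslName { family?: string; given?: string }
interface CslItem { 'citation-key'?: string; title?: string; author?: CslName[] }

// Assumed stand-in for article-style metadata; field names are hypothetical.
interface OxaAuthor { name: string; orcid?: string }
interface OxaMetadata { title?: string; authors?: OxaAuthor[] }

interface Reference {
  type: 'Reference';
  identifier: string;
  csl?: CslItem;     // may be full, partial, or absent
  oxa?: OxaMetadata; // may be full, partial, or absent
}

// Collect whatever ORCIDs the oxa block carries; empty if there is no oxa block.
function orcidsOf(ref: Reference): string[] {
  return (ref.oxa?.authors ?? [])
    .map((a) => a.orcid)
    .filter((id): id is string => id !== undefined);
}

const both: Reference = {
  type: 'Reference',
  identifier: 'jones2022',
  csl: { 'citation-key': 'jones2022', author: [{ family: 'Jones' }] },
  oxa: { authors: [{ name: 'Jones', orcid: '0000-0002-1825-0097' }] }, // placeholder ORCID
};

const cslOnly: Reference = {
  type: 'Reference',
  identifier: 'smith2020',
  csl: { 'citation-key': 'smith2020' },
};
```

A renderer that already understands article metadata could reuse the same code path for `oxa` on a `Reference`, which is the appeal of not adding a new schema.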
| prefix?: [Inline]; | ||
| suffix?: [Inline]; | ||
| display?: 'author' | 'date' | 'full'; | ||
| locator?: string; |
| locator?: string; | |
| locator?: [Inline]; |
locator is for human-readable display text. This feels equivalent to prefix/suffix (as opposed to fields like xref/intent/url which only work as strings).
| children?: [Inline]; | ||
| prefix?: [Inline]; | ||
| suffix?: [Inline]; | ||
| display?: 'author' | 'date' | 'full'; |
Two concerns with the display field.
First, the description of this field states that the default behavior (i.e. no display) falls back to default rendering, either author-date or numeric. I propose that implicit behavior should have an explicit counterpart, i.e. leaving display off is the same as using display: 'author-date', as suggested above.
| display?: 'author' | 'date' | 'full'; | |
| display?: 'author' | 'date' | 'full' | 'author-date'; |
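One way a consumer might treat the omitted case and the explicit value as the same thing is sketched below. The `CiteDisplay` alias and `resolveDisplay` helper are hypothetical names, and a numeric-style renderer may still map `'author-date'` to a number.

```typescript
type CiteDisplay = 'author' | 'date' | 'full' | 'author-date';

// Treat a missing `display` as the explicit default value.
function resolveDisplay(display?: CiteDisplay): CiteDisplay {
  return display ?? 'author-date';
}
```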
| - `'author'` — abbreviated author only, e.g. "Jones et al." (equivalent to natbib `\citeauthor`, CSL `author-only`) | ||
| - `'date'` — date only, e.g. "1990" (equivalent to natbib `\citeyear`, CSL `suppress-author`) | ||
| - `'full'` — display the full bibliographic reference inline, as it would appear in a reference list; useful for first-mention expansions or in-body reference rendering without requiring the reader to navigate to the bibliography | ||
| - If omitted, the default citation rendering applies — `author-date` (e.g. "Jones et al., 1990" or "Jones et al. (2022)" depending on narrative/parenthetical context), or numeric (e.g. "[1]") depending on the citation style of the renderer. |
Second, maybe less of a concern, but a point that should be emphasized:
We are relying on the renderer to understand and respect the intricacies of the OXA spec. Now, I would expect most "numeric" citation renderers to just see a Cite node and replace it with the number. That doesn't work for the author and date display values.
e.g. for something like:
{type: Cite, display: author} stated one fact in {type: Cite, display: date}; {type: Cite, display: undefined} disagreed.
authors would need to rely on their content getting rendered like so:
_Koch_ stated one fact in _2026_; [2] disagreed
rather than the lazier:
[1] stated one fact in [1]; [2] disagreed
Is this our expectation? Setting the bar higher for everyone involved?
(I think the alternative is to put the burden entirely on the authors to understand how their content will be rendered and adjust the wording and citation structure accordingly.)
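To make the two outcomes concrete, here is a small sketch of a numeric-style renderer written both ways, assuming the field in question is `display` on `Cite`. The `refs` lookup table, its keys, and the `renderLazy` / `renderAware` names are hypothetical; a real renderer would resolve `xref` against `Reference` nodes and their `csl` data.

```typescript
type Display = 'author' | 'date' | 'full';
interface Cite { type: 'Cite'; xref: string; display?: Display }
interface RefInfo { number: number; author: string; year: string }

// Hypothetical stand-in for resolved Reference nodes.
const refs: Record<string, RefInfo> = {
  koch2026: { number: 1, author: 'Koch', year: '2026' },
  jones1990: { number: 2, author: 'Jones', year: '1990' },
};

function lookup(xref: string): RefInfo {
  const ref = refs[xref];
  if (!ref) throw new Error(`unknown xref: ${xref}`);
  return ref;
}

// The "lazy" numeric renderer: every Cite becomes its number.
function renderLazy(cite: Cite): string {
  return `[${lookup(cite.xref).number}]`;
}

// A renderer that respects `display` even in a numeric style.
// 'full' is not handled in this sketch and falls back to the number.
function renderAware(cite: Cite): string {
  const ref = lookup(cite.xref);
  switch (cite.display) {
    case 'author': return ref.author;
    case 'date': return ref.year;
    default: return `[${ref.number}]`;
  }
}

const a: Cite = { type: 'Cite', xref: 'koch2026', display: 'author' };
const b: Cite = { type: 'Cite', xref: 'koch2026', display: 'date' };
const c: Cite = { type: 'Cite', xref: 'jones1990' };

const sentence = (render: (cite: Cite) => string): string =>
  `${render(a)} stated one fact in ${render(b)}; ${render(c)} disagreed.`;

console.log(sentence(renderAware));
console.log(sentence(renderLazy));
```

The difference between the two functions is small, but every numeric-style renderer would need to implement the second one for authors to be able to rely on it.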
|
|
||
| Using inline arrays for `children`, `prefix`, and `suffix` (rather than plain strings) allows rich formatting — for example, italicized titles or emphasized author names within citation text. | ||
|
|
||
| `Cite` is an inline node and can appear directly within a `Paragraph` or other inline container. A standalone `Cite` node (not wrapped in a `CiteGroup`) represents a **narrative** citation — one that is grammatically part of the sentence (equivalent to `\citet` in natbib or a bare `@key` in Pandoc). |
A standalone Cite node is completely equivalent to a narrative CiteGroup with a single Cite child. It feels a little bad to have two different ways to do the same thing, but I think the only way to avoid that is to require Cite nodes to only exist inside CiteGroups. This gives us consistency but extra boilerplate for the simplest case.
My personal preference is consistency, i.e. always require a CiteGroup. This also makes narrative vs. parenthetical always explicit (since those are only defined on CiteGroup).
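The equivalence can be sketched as a normalization step. This is only an illustration under the RFC's current rules, with trimmed-down node types and a hypothetical `normalize` helper; if `CiteGroup` became mandatory, this step would disappear.

```typescript
interface Cite { type: 'Cite'; xref: string }
interface CiteGroup {
  type: 'CiteGroup';
  kind: 'narrative' | 'parenthetical';
  children: Cite[];
}

// A standalone Cite is treated as a narrative CiteGroup with one child.
function normalize(node: Cite | CiteGroup): CiteGroup {
  return node.type === 'CiteGroup'
    ? node
    : { type: 'CiteGroup', kind: 'narrative', children: [node] };
}

const standalone: Cite = { type: 'Cite', xref: 'jones2022' };
const wrapped = normalize(standalone);
```

A renderer that normalizes first only has to handle `CiteGroup`, which is roughly the consistency argument made above.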
|
|
||
| #### Why `kind` Over Boolean Flags | ||
|
|
||
| Using `kind: 'narrative' | 'parenthetical'` rather than `parenthetical: boolean` allows for future extension. Additional citation styles may be introduced (e.g. `'numeric'`, `'note'`) without breaking the existing schema. The nomenclature follows established citation-studies terminology. |
Regarding note - it's probably worth a small peek ahead to how OXA may handle footnotes.
In JATS, citations use xref -> ref structures, where xref has a specific @xref-type and ref is differentiated by the content it holds (e.g. mixed-citation). There is also note which is an element for non-citation content in a bibliographic ref (JATS best practice is to avoid using this element - https://jats.nlm.nih.gov/archiving/tag-library/1.4/element/note.html). Finally, there are fn footnotes. These use the same xref node in the text, but have a different @xref-type and point to fn nodes in a fn-group.
I like the proposal that Cite will extend a more-generic, not-yet-defined CrossReference node. I think Footnote will be defined similarly, as another extension of CrossReference? Then, there will also be the node type these xref fields point to (FootnoteDefinition or something).
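A rough sketch of how that could fit together, with the caveat that `CrossReference` is not yet defined and that `FootnoteReference`, `FootnoteDefinition`, `expectedTarget`, and `resolve` are all hypothetical names used only for illustration:

```typescript
// Hypothetical base type for anything that points at another node.
interface CrossReference { xref: string }

interface Cite extends CrossReference { type: 'Cite' }
interface FootnoteReference extends CrossReference { type: 'FootnoteReference' }

// The block-level nodes the xref fields point to.
interface Reference { type: 'Reference'; identifier: string }
interface FootnoteDefinition { type: 'FootnoteDefinition'; identifier: string }
type Target = Reference | FootnoteDefinition;

// Which target type each in-line node is expected to resolve to.
const expectedTarget = {
  Cite: 'Reference',
  FootnoteReference: 'FootnoteDefinition',
} as const;

function resolve(node: Cite | FootnoteReference, targets: Target[]): Target | undefined {
  return targets.find(
    (t) => t.identifier === node.xref && t.type === expectedTarget[node.type],
  );
}

const targets: Target[] = [
  { type: 'Reference', identifier: 'jones2022' },
  { type: 'FootnoteDefinition', identifier: 'fn1' },
];
```

This mirrors the JATS arrangement described above, where the same in-text pointer is distinguished by type and resolves to a different kind of target.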
|
|
||
| ```typescript | ||
| interface Reference extends Parent { | ||
| type: 'Reference'; |
Naming for this node is challenging. In isolation, "Reference" makes sense - these are the article's references.
However, in the context of other nodes, it gets muddy: CrossReferences "reference" other nodes, but CiteGroup "references" a "Reference" ...
When things get hairy like this, I'm always inclined to introduce patterns that make node types more verbose but make relationships clearer. For example, if we extend the thinking to generic xrefs, citations, and footnotes, we could have:
In-line reference -> block-level target
---------------------------------------
CrossReference -> <any node>
CiteReference -> CiteDefinition
FootnoteReference -> FootnoteDefinition
Or maybe just Cite and Footnote for the in-line nodes.
All that said, I'm also perfectly fine declaring that "Reference" (capitalized noun) is this specific node type, regardless of how much we use the verb "reference" elsewhere. However - if we do leave this as Reference, we need to consider the footnote case... is it just FootnoteDefinition and we don't worry about similar naming conventions for these similar nodes?
| **Fields:** | ||
|
|
||
| - `identifier` — the unique key used by `Cite` nodes via `xref` to reference this entry (e.g. `"jones2022"`). The node's `identifier` MUST match the `citation-key` field in the `csl` object. | ||
| - `children` — optional inline content for the rendered display of this reference (e.g. `[{ type: 'Text', value: '1' }]` for numeric styles, or formatted author-year text); if omitted, renderers generate display text from `csl` |
I'm a little confused by children: There's the rendered representation of the entire Reference (i.e. text in the reference list) and there's the short representation when it is referenced by Cite.
Here, "inline content" implies we are talking about the latter case? However, don't Cite nodes themselves define inline rendering? Is Reference.children just another fallback if Cite.display and Cite.children are both undefined and the renderer is unopinionated? I feel like the Reference should not concern itself with inline rendering.
However, maybe my understanding is wrong and children refers to the representation of the full reference. In that case, I understand and agree; we just need to change the wording here.
| - `csl` — a single CSL-JSON item object conforming to the [CSL-JSON schema](https://resource.citationstyles.org/schema/v1.0/input/json/csl-data.json). This is a well-established, widely-supported standard for representing bibliographic data, used by Zotero, Mendeley, Pandoc, Typst, CrossRef, and most citation-processing tools. | ||
|
|
||
| #### CSL-JSON |
|
|
||
| The node's `identifier` and the CSL item's `citation-key` MUST be kept in sync; they are the shared key that connects `Cite` nodes (via `xref`) to their bibliographic data. | ||
|
|
||
| `Reference` nodes are typically collected in a reference list at the end of a document, analogous to the `<ref-list>` in JATS or a bibliography section in LaTeX. This RFC does not describe where these are collected or placed in a document similar to JATS `<back>` section; that may be introduced in a future RFC. |
👍 I agree we need not prescribe a collection where References are listed. They are simply block-level content that can be handled at render time, in whatever manner is appropriate for the context.
| kind: 'parenthetical', | ||
| children: | ||
| [ | ||
| { type: 'Cite', xref: 'jones2022', prefix: [{ type: 'Text', value: 'see ' }] }, |
I wonder if there's any benefit to prefix/suffix at the CiteGroup level... The prefix in this example, "see", applies to both citations in the group. I think this is probably unnecessary complexity, and prefix/suffix on Cite only is fine.
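For what it's worth, a prefix on the first `Cite` already reads as group-level text once the group is rendered. Here is a minimal sketch, assuming `Text`-only inline content, a hypothetical `labels` lookup for display text, and a made-up second key `smith2020`:

```typescript
interface Text { type: 'Text'; value: string }
interface Cite { type: 'Cite'; xref: string; prefix?: Text[]; suffix?: Text[] }
interface CiteGroup {
  type: 'CiteGroup';
  kind: 'narrative' | 'parenthetical';
  children: Cite[];
}

// Hypothetical display labels; a real renderer would derive these from csl.
const labels: Record<string, string> = {
  jones2022: 'Jones 2022',
  smith2020: 'Smith 2020',
};

const flatten = (nodes: Text[] = []): string => nodes.map((n) => n.value).join('');

function renderGroup(group: CiteGroup): string {
  const parts = group.children.map(
    (c) => `${flatten(c.prefix)}${labels[c.xref] ?? c.xref}${flatten(c.suffix)}`,
  );
  const joined = parts.join('; ');
  return group.kind === 'parenthetical' ? `(${joined})` : joined;
}

const group: CiteGroup = {
  type: 'CiteGroup',
  kind: 'parenthetical',
  children: [
    { type: 'Cite', xref: 'jones2022', prefix: [{ type: 'Text', value: 'see ' }] },
    { type: 'Cite', xref: 'smith2020' },
  ],
};

console.log(renderGroup(group));
```

Under these assumptions the output reads "(see Jones 2022; Smith 2020)", so the per-`Cite` prefix covers the common case without a group-level field.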
Introduces citation node types.
Some open questions on this:
- Is `Reference` the best name, or should it be `BibliographicEntry`, `CitationRecord`, or something closer to schema.org's `CreativeWork`? `Reference` is concise and maps naturally to JATS `<ref>`, but could be confused with general cross-references in some contexts.
- `xref` as an initial convention
- `orcid`. Should ORCIDs and similar identifiers be stored in a separate mapping outside the `csl` object, or should they be added to the CSL name objects as extensions?