📄 Citations, Groups and References by rowanc1 · Pull Request #5 · oxa-dev/rfc

rowanc1 · 2026-03-25T18:53:17Z

Introduces citation node types.

Some open questions on this:

Reference node naming — Is Reference the best name, or should it be BibliographicEntry, CitationRecord, or something closer to schema.org's CreativeWork? Reference is concise and maps naturally to JATS <ref>, but could be confused with general cross-references in some contexts.
xref as an initial convention
CiTO vocabulary — Should we define a recommended subset of CiTO intents, or leave it open? A closed set improves interoperability; an open set accommodates domain-specific needs.
ORCID and extended author metadata — CSL-JSON name objects do not natively include fields like orcid. Should ORCIDs and similar identifiers be stored in a separate mapping outside the csl object, or should they be added to the CSL name objects as extensions?

github-actions · 2026-03-25T18:53:41Z

Curvenote Preview

| Directory | Preview | Checks | Updated (UTC) |
| --- | --- | --- | --- |
| content/RFC0005 | 🔍 Inspect | ✅ 22 checks passed (2 optional) | Mar 25, 2026, 7:04 PM |

fwkoch

Overall, the approach to citations defined in this RFC is great.

Regarding your open questions, I already addressed the Reference node naming.

For CiTO vocabulary, there are ~100 existing intents. That is a lot, but narrowing to a subset feels a little arbitrary. I'm inclined to allow all CiTO intents; they are just descriptive strings that help readers understand and follow citations. I don't see big differences in behavior between different intents.

For ORCID, I agree inclusion is useful, and there isn't an exceptionally slick solution. My immediate thought is to allow oxa metadata on the Reference, alongside csl. This would be structured identically to article metadata, so we are not adding a new schema for OXA renderers. Ideally a citation would include full csl and full oxa representation, but it could also have only one or the other or partial versions of each... (I think this is close to your "separate mapping" suggestion.)
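A minimal sketch of what "oxa metadata alongside csl" could look like. Only `identifier` and `csl` come from the RFC text; the `oxa` field, the `OxaAuthor` shape, the `orcidFor` helper, and the all-zero ORCID placeholder are assumptions made for illustration. The real structure would mirror whatever the OXA article metadata ends up being.

```typescript
// Trimmed-down CSL-JSON shapes; real CSL items carry many more fields.
interface CslName {
  family?: string;
  given?: string;
}

interface CslItem {
  'citation-key'?: string;
  title?: string;
  author?: CslName[];
}

// Assumed OXA-style author metadata; the actual article schema may differ.
interface OxaAuthor {
  name: string;
  orcid?: string;
}

interface Reference {
  type: 'Reference';
  identifier: string;
  csl?: CslItem; // full or partial CSL-JSON
  oxa?: { authors?: OxaAuthor[] }; // full or partial OXA metadata (assumption)
}

// Look up an ORCID on the OXA side, leaving the CSL object untouched.
function orcidFor(ref: Reference, authorName: string): string | undefined {
  return ref.oxa?.authors?.find((a) => a.name === authorName)?.orcid;
}

const ref: Reference = {
  type: 'Reference',
  identifier: 'jones2022',
  csl: { 'citation-key': 'jones2022', author: [{ family: 'Jones', given: 'A.' }] },
  // Placeholder ORCID, not a real identifier.
  oxa: { authors: [{ name: 'A. Jones', orcid: '0000-0000-0000-0000' }] },
};
```

Keeping the ORCID outside `csl` means the CSL object still validates against the CSL-JSON schema, at the cost of matching authors across the two representations.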

fwkoch · 2026-03-25T21:49:35Z

content/RFC0005/index.md

+  prefix?: [Inline];
+  suffix?: [Inline];
+  display?: 'author' | 'date' | 'full';
+  locator?: string;


Suggested change

-  locator?: string;
+  locator?: [Inline];

locator is for human-readable display text. This feels equivalent to prefix/suffix (as opposed to fields like xref/intent/url which only work as strings).
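As a rough sketch of this suggestion, the snippet below types `locator` like `prefix`/`suffix` and flattens it for plain-text renderers. The two-member `Inline` union and the `toPlainText` helper are stand-ins invented for illustration. The RFC writes `[Inline]`; `Inline[]` is used here on the assumption that an array is meant.

```typescript
// Minimal stand-ins for OXA inline nodes (assumed for illustration).
type Inline =
  | { type: 'Text'; value: string }
  | { type: 'Emphasis'; children: Inline[] };

interface Cite {
  type: 'Cite';
  xref: string; // stays a plain string: it is a key, not display text
  prefix?: Inline[];
  suffix?: Inline[];
  locator?: Inline[]; // proposed: same shape as prefix/suffix
}

// Flatten inline content to plain text, e.g. for renderers without rich output.
function toPlainText(nodes: Inline[] = []): string {
  return nodes
    .map((n) => (n.type === 'Text' ? n.value : toPlainText(n.children)))
    .join('');
}

const cite: Cite = {
  type: 'Cite',
  xref: 'jones2022',
  locator: [
    { type: 'Text', value: 'ch. 3, ' },
    { type: 'Emphasis', children: [{ type: 'Text', value: 'passim' }] },
  ],
};
```

A renderer that only supports strings can still call `toPlainText(cite.locator)`, so moving to inline content should not take anything away from simpler consumers.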

fwkoch · 2026-03-25T22:03:31Z

content/RFC0005/index.md

+  children?: [Inline];
+  prefix?: [Inline];
+  suffix?: [Inline];
+  display?: 'author' | 'date' | 'full';


Two concerns with the display field.

First, the description of this field states that the default behavior (i.e. no display) falls back to the renderer's default rendering, either author-date or numeric. I propose that implicit behavior should have an explicit counterpart, i.e. leaving display off is the same as using display: 'author-date', as in the suggested change below.

Suggested change

-  display?: 'author' | 'date' | 'full';
+  display?: 'author' | 'date' | 'full' | 'author-date';
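A small sketch of the "explicit counterpart" idea under this proposal (not the RFC as currently written). The `CiteDisplay` alias and the `resolveDisplay` helper are hypothetical names.

```typescript
type CiteDisplay = 'author' | 'date' | 'full' | 'author-date';

interface Cite {
  type: 'Cite';
  xref: string;
  display?: CiteDisplay;
}

// Under the proposal, omitting `display` means exactly `display: 'author-date'`.
function resolveDisplay(cite: Cite): CiteDisplay {
  return cite.display ?? 'author-date';
}

const implicitCite: Cite = { type: 'Cite', xref: 'jones2022' };
const explicitCite: Cite = { type: 'Cite', xref: 'jones2022', display: 'author-date' };
```

With this in place, `implicitCite` and `explicitCite` resolve to the same value, so tooling can normalize documents without changing how they render.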

fwkoch · 2026-03-25T22:36:31Z

content/RFC0005/index.md

+  - `'author'` — abbreviated author only, e.g. "Jones et al." (equivalent to natbib `\citeauthor`, CSL `author-only`)
+  - `'date'` — date only, e.g. "1990" (equivalent to natbib `\citeyear`, CSL `suppress-author`)
+  - `'full'` — display the full bibliographic reference inline, as it would appear in a reference list; useful for first-mention expansions or in-body reference rendering without requiring the reader to navigate to the bibliography
+  - If omitted, the default citation rendering applies — `author-date` (e.g. "Jones et al., 1990" or "Jones et al. (2022)" depending on narrative/parenthetical context), or numeric (e.g. "[1]") depending on the citation style of the renderer.


Second, maybe less of a concern, but a point that should be emphasized:

We are relying on the renderer to understand and respect the intricacies of the OXA spec. I would expect most "numeric" citation renderers to just see a Cite node and replace it with the number. This doesn't work for the author/date display values.

e.g. for something like:

{type: 'Cite', display: 'author'} stated one fact in {type: 'Cite', display: 'date'}; {type: 'Cite', display: undefined} disagreed.

authors would need to rely on their content getting rendered like so:

_Koch_ stated one fact in _2026_; [2] disagreed

rather than the lazier:

[1] stated one fact in [1]; [2] disagreed

Is this our expectation? Setting the bar higher for everyone involved?

(I think the alternative is to put the burden entirely on the authors to understand how their content will be rendered and adjust the wording and citation structure accordingly.)
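To make the two expectations concrete, here is a hedged sketch of a numeric-style renderer that honours the `display` field from the RFC diff, next to the "lazier" one. `ResolvedReference`, both function names, the `[?]` fallback, and the sample data are assumptions for illustration only.

```typescript
type CiteDisplay = 'author' | 'date' | 'full';

interface Cite {
  type: 'Cite';
  xref: string;
  display?: CiteDisplay;
}

// Hypothetical pre-resolved data a renderer might hold per reference.
interface ResolvedReference {
  number: number; // position in the reference list
  author: string; // abbreviated author, e.g. "Koch"
  year: string;
  full: string; // full bibliographic entry text
}

type ReferenceTable = Record<string, ResolvedReference | undefined>;

// Numeric-style renderer that still honours an explicit `display`.
function renderCiteNumeric(cite: Cite, refs: ReferenceTable): string {
  const ref = refs[cite.xref];
  if (!ref) return '[?]'; // unresolved xref; the placeholder is an arbitrary choice
  switch (cite.display) {
    case 'author':
      return ref.author;
    case 'date':
      return ref.year;
    case 'full':
      return ref.full;
    default:
      return `[${ref.number}]`;
  }
}

// The "lazier" renderer: every Cite becomes its number, ignoring `display`.
function renderCiteLazy(cite: Cite, refs: ReferenceTable): string {
  const ref = refs[cite.xref];
  return ref ? `[${ref.number}]` : '[?]';
}

// Placeholder data mirroring the example sentence above.
const refs: ReferenceTable = {
  koch2026: { number: 1, author: 'Koch', year: '2026', full: 'Koch (2026). Placeholder entry.' },
  other: { number: 2, author: 'Other', year: '2020', full: 'Other (2020). Placeholder entry.' },
};
```

The first function yields the "Koch ... 2026 ... [2]" rendering, the second the "[1] ... [1] ... [2]" rendering, which is the gap authors would otherwise have to write around.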

fwkoch · 2026-03-26T07:39:29Z

content/RFC0005/index.md

+
+Using inline arrays for `children`, `prefix`, and `suffix` (rather than plain strings) allows rich formatting — for example, italicized titles or emphasized author names within citation text.
+
+`Cite` is an inline node and can appear directly within a `Paragraph` or other inline container. A standalone `Cite` node (not wrapped in a `CiteGroup`) represents a **narrative** citation — one that is grammatically part of the sentence (equivalent to `\citet` in natbib or a bare `@key` in Pandoc).


A standalone Cite node is completely equivalent to a narrative CiteGroup with a single Cite child. It feels a little bad to have two different ways to do the same thing, but I think the only way to avoid that is to require Cite nodes to only exist inside CiteGroups. This gives us consistency but extra boilerplate for the simplest case.

My personal preference is consistency, i.e. always require a CiteGroup. This also makes narrative vs. parenthetical always explicit (since those are only defined on CiteGroup).
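If standalone Cite nodes remain allowed, consumers could still get the consistent shape by normalizing on read. The sketch below wraps every bare `Cite` in a single-child narrative `CiteGroup`; the trimmed-down node types and the `normalizeCites` name are assumptions, not part of the RFC.

```typescript
interface Cite {
  type: 'Cite';
  xref: string;
}

interface CiteGroup {
  type: 'CiteGroup';
  kind: 'narrative' | 'parenthetical';
  children: Cite[];
}

// Stand-in for every other inline node type.
interface Text {
  type: 'Text';
  value: string;
}

type Inline = Cite | CiteGroup | Text;

// Wrap each standalone Cite in a single-child narrative CiteGroup;
// everything else passes through unchanged.
function normalizeCites(nodes: Inline[]): Array<CiteGroup | Text> {
  return nodes.map((node): CiteGroup | Text =>
    node.type === 'Cite'
      ? { type: 'CiteGroup', kind: 'narrative', children: [node] }
      : node,
  );
}

const paragraph: Inline[] = [
  { type: 'Cite', xref: 'jones2022' },
  { type: 'Text', value: ' showed this first.' },
];
const normalized = normalizeCites(paragraph);
```

After normalization a renderer only needs a `CiteGroup` code path, which is the consistency argument made above without the extra boilerplate for authors.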

fwkoch · 2026-03-26T08:02:17Z

content/RFC0005/index.md

+
+#### Why `kind` Over Boolean Flags
+
+Using `kind: 'narrative' | 'parenthetical'` rather than `parenthetical: boolean` allows for future extension. Additional citation styles may be introduced (e.g. `'numeric'`, `'note'`) without breaking the existing schema. The nomenclature follows established citation-studies terminology.


Regarding note - it's probably worth a small peek ahead to how OXA may handle footnotes.

In JATS, citations use xref -> ref structures, where xref has a specific @ref-type and ref is differentiated by the content it holds (e.g. mixed-citation). There is also note, which is an element for non-citation content in a bibliographic ref (JATS best practice is to avoid using this element - https://jats.nlm.nih.gov/archiving/tag-library/1.4/element/note.html). Finally, there are fn footnotes. These use the same xref node in the text, but have a different @ref-type and point to fn nodes in a fn-group.

I like the proposal that Cite will extend a more-generic, not-yet-defined CrossReference node. I think Footnote will be defined similarly, as another extension of CrossReference? Then, there will also be the node type these xref fields point to (FootnoteDefinition or something).
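A speculative sketch of that layering, assuming `CrossReference` carries only `type` and `xref`. `Footnote`, `FootnoteDefinition`, `expectedTarget`, and `resolveTarget` are hypothetical names; none of this is defined by the RFC yet.

```typescript
// Hypothetical base node; the RFC mentions CrossReference but does not define it.
interface CrossReference {
  type: string;
  xref: string; // identifier of the target node
}

interface Cite extends CrossReference {
  type: 'Cite';
}

// Assumed in-line footnote node, defined the same way as Cite.
interface Footnote extends CrossReference {
  type: 'Footnote';
}

interface Reference {
  type: 'Reference';
  identifier: string;
}

interface FootnoteDefinition {
  type: 'FootnoteDefinition';
  identifier: string;
}

type Target = Reference | FootnoteDefinition;

// Each in-line node resolves only to its own kind of block-level target,
// much like JATS typed xrefs point at either ref or fn.
const expectedTarget = {
  Cite: 'Reference',
  Footnote: 'FootnoteDefinition',
} as const;

function resolveTarget(node: Cite | Footnote, targets: Target[]): Target | undefined {
  return targets.find(
    (t) => t.identifier === node.xref && t.type === expectedTarget[node.type],
  );
}
```

Keying the lookup on both identifier and expected type means a citation and a footnote could share an identifier without colliding, though the RFC may well choose to forbid that instead.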

fwkoch · 2026-03-26T08:19:54Z

content/RFC0005/index.md

+
+```typescript
+interface Reference extends Parent {
+  type: 'Reference';


Naming for this node is challenging. In isolation, "Reference" makes sense - these are the article's references.

However, in the context of other nodes, it gets muddy: CrossReferences "reference" other nodes, but CiteGroup "references" a "Reference" ...

When things get hairy like this, I'm always inclined to introduce patterns that make node types more verbose but make relationships clearer. For example, if we extend the thinking to generic xrefs, citations, and footnotes, we could have:

In-line reference -> block-level target
---------------------------------------
CrossReference    -> <any node>
CiteReference     -> CiteDefinition
FootnoteReference -> FootnoteDefinition

Or maybe just Cite and Footnote for the in-line nodes.

All that said, I'm also perfectly fine declaring that "Reference" (capitalized noun) is this specific node type, regardless of how much we use the verb "reference" elsewhere. However - if we do leave this as Reference, we need to consider the footnote case... is it just FootnoteDefinition and we don't worry about similar naming conventions for these similar nodes?

fwkoch · 2026-03-26T08:28:22Z

content/RFC0005/index.md

+**Fields:**
+
+- `identifier` — the unique key used by `Cite` nodes via `xref` to reference this entry (e.g. `"jones2022"`). The node's `identifier` MUST match the `citation-key` field in the `csl` object.
+- `children` — optional inline content for the rendered display of this reference (e.g. `[{ type: 'Text', value: '1' }]` for numeric styles, or formatted author-year text); if omitted, renderers generate display text from `csl`


I'm a little confused by children: There's the rendered representation of the entire Reference (i.e. text in the reference list) and there's the short representation when it is referenced by Cite.

Here, "inline content" implies we are talking about the latter case? However, don't Cite nodes themselves define inline rendering? Is Reference.children just another fallback if Cite.display and Cite.children are both undefined and the renderer is unopinionated? I feel like the Reference should not concern itself with inline rendering.

However, maybe my understanding is wrong and children refers to the representation of the full reference. In that case, I understand and agree; we just need to change the wording here.

fwkoch · 2026-03-26T08:29:45Z

content/RFC0005/index.md

+- `children` — optional inline content for the rendered display of this reference (e.g. `[{ type: 'Text', value: '1' }]` for numeric styles, or formatted author-year text); if omitted, renderers generate display text from `csl`
+- `csl` — a single CSL-JSON item object conforming to the [CSL-JSON schema](https://resource.citationstyles.org/schema/v1.0/input/json/csl-data.json). This is a well-established, widely-supported standard for representing bibliographic data, used by Zotero, Mendeley, Pandoc, Typst, CrossRef, and most citation-processing tools.
+
+#### CSL-JSON


👍 Leveraging CSL-JSON is great.

fwkoch · 2026-03-26T08:32:45Z

content/RFC0005/index.md

+
+The node's `identifier` and the CSL item's `citation-key` MUST be kept in sync; they are the shared key that connects `Cite` nodes (via `xref`) to their bibliographic data.
+
+`Reference` nodes are typically collected in a reference list at the end of a document, analogous to the `<ref-list>` in JATS or a bibliography section in LaTeX. This RFC does not describe where these are collected or placed in a document similar to JATS `<back>` section; that may be introduced in a future RFC.


👍 I agree we need not prescribe a collection where References are listed. They are simply block-level content that can be handled at render time, in whatever manner is appropriate for the context.
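Since References can live anywhere in the document, the MUST-level sync rule quoted above is something a validator would check over the collected nodes rather than a fixed section. A minimal sketch, assuming trimmed-down node shapes and a hypothetical `checkCitations` helper:

```typescript
interface Reference {
  type: 'Reference';
  identifier: string;
  // Only `citation-key` matters here; other CSL fields are left opaque.
  csl: { 'citation-key'?: string; [field: string]: unknown };
}

interface Cite {
  type: 'Cite';
  xref: string;
}

// Collect human-readable problems instead of throwing (an arbitrary choice).
function checkCitations(cites: Cite[], references: Reference[]): string[] {
  const problems: string[] = [];
  const known = new Set<string>();

  for (const ref of references) {
    known.add(ref.identifier);
    const key = ref.csl['citation-key'];
    // The node identifier and the CSL citation-key MUST match.
    if (key !== ref.identifier) {
      problems.push(`Reference "${ref.identifier}" has csl citation-key "${String(key)}"`);
    }
  }

  for (const cite of cites) {
    // Every Cite xref should point at a known Reference identifier.
    if (!known.has(cite.xref)) {
      problems.push(`Cite xref "${cite.xref}" has no matching Reference`);
    }
  }

  return problems;
}
```

The caller is assumed to have already gathered all `Cite` and `Reference` nodes from the tree, wherever they appear.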

fwkoch · 2026-03-26T08:36:24Z

content/RFC0005/index.md

+        kind: 'parenthetical',
+        children:
+          [
+            { type: 'Cite', xref: 'jones2022', prefix: [{ type: 'Text', value: 'see ' }] },


I wonder if there's any benefit to prefix/suffix at the CiteGroup level... The prefix in this example, "see", refers to both citations in the group. I think this is probably unnecessary complexity, and prefix/suffix on Cite only is fine.

📄 Citations, Groups and References

7a977ec

rowanc1 added the active label (An active RFC) Mar 25, 2026

rowanc1 added 2 commits March 25, 2026 12:56

non-failing URL

a4bd98d

formatting

f3c362f

oxa-dev deleted a comment from github-actions bot Mar 25, 2026

fwkoch reviewed Mar 26, 2026

rowanc1 wants to merge 3 commits into main from rfc-citations


2 participants
