Citations Online
Architecture Development

J. WILLIAMS, Ph.D. | BIO | UPDATED date

The objective of this research is to derive a convenient architecture for citations and references in online documents that

Terminology  

An online document is a human-readable document with an advertised URL. Some of an online document's pages may be restricted, for example, by placing them behind a paywall. eBooks are available online but are not online documents in that they do not have advertised URLs and thus cannot be linked to.

The discussion centers on a citing document, that is, a document whose citations and references are under study. Citations (in-text citations) occur in the citing document's main text. Its references occur in its reference section (works cited section). Each citation points to a reference that in turn points to cited content. Typically, the cited content is on a web page that also includes uncited content such as advertisements.

A reader is someone who views Internet content. A reader need is anything that helps readers assess and absorb content.

Contributors are people involved in creating, presenting, and maintaining content, including any research involved in creating content. Contributors are conveniently divided into content providers and secondary contributors. Authors are content providers who accept responsibility for content.

A contributor need is anything that contributes to the successful creation, organization, and presentation of content to readers. Contributor needs in this sense may differ from requirements imposed by publishers.

Method  

This work addresses five questions. The first two have to do with existing style guides. Question 3 moves beyond existing guidance to address additional communication needs. Question 4 requests a citation-and-reference format that meets the identified needs. This request has different answers depending on whether reader needs take precedence over provider needs. The final question asks how to bridge the gap between reader needs and provider needs.

  1. What guidance is provided by existing style guides?
  2. What communication needs are addressed by existing guidance?
  3. What additional needs exist in the online environment?
  4. What citation and reference formats address the needs of readers and providers?
  5. How can we automatically transform a provider format into a reader format?  

1. Existing Guidance  

The following style guides were selected for study:

In addition to studying these guides, we will also look briefly at webliographies.

The order of the elements in these guides differs but the topics covered are largely the same. However, the organization of the guides varies significantly.

The NLM guide splits immediately into 25 chapters covering individual cases (books, journals, databases, and so forth). The NLM stands alone in not devoting any chapters to general principles.

The Chicago guide covers 100 general points before breaking out into special cases — books, journals, and so forth.

In the APA guide, references are broken out into four main elements and each is discussed prior to getting into special cases — religious works, works in a foreign language, and so forth.

The main body of the MLA is devoted entirely to general principles. Special cases are relegated to a voluminous Appendix 2.

Citations

The selected guides provide four basic approaches to in-text citations.

Author, date, and page range. The APA syntax looks like this: (Henry 2027, p. 14)[APA, citation]. The Chicago author-date system looks like this: (Henry 2027, 14)[cmos, author-date]. The NLM also mentions a similar name-year system[NLM, intro].

Author and page range. The MLA syntax looks like this: (Henry 14)[MLA, citation-guide].

Citation number and page range. The Chicago notes-and-bibliography system uses in-text superscripts along with footnotes that can provide page ranges[cmos, note-bibliography].

Citation number. The NLM citation-sequence system uses a numeric superscript: 1, as does the NLM citation-name system[NLM, intro].

Authors, Roles, and Affiliations

Guidance on how many authors to list varies significantly. The NLM says to try to list every author[NLM, authors-article]. The APA says to list up to twenty authors[APA, authors]. The Chicago guide says up to ten[cmos, authors]. The MLA says list up to two[MLA, authors].

Interestingly, in the guides selected for study, if a referenced document fails to specify authorship, the author element is simply omitted. This convention differs, for example, from date elements where missing information is explicitly noted as "n. d."

All four guides provide for secondary contributor roles. Role names always follow contributor names[NLM, contributors] [cmos, contributors] [APA, roles] [MLA, author-roles]. However, only a limited set of roles is mentioned, primarily the author, editor, illustrator, and translator roles. The MLA goes farther, listing conductor, creator, director, narrator, and performer as additional roles[MLA, author-roles].

The NLM guide asks for authors' institutional affiliations[NLM, affiliation-books] [NLM, affiliation-articles] [NLM, affiliation-web-pages].

Personal Communications

The guides are mainly silent on how to cite personal communications. The NLM says to mention recipients when referencing emails[NLM, email]. The APA guide says to mention personal communications only in the main text if readers do not have a way of accessing them[APA, personal]. The Chicago style says to treat the recipient as part of the title when a personal communication is listed in the bibliography[cmos, personal].

Titles, Portions, and Editions

All guides provide for giving the title's language if it is not English, together with an English translation, in effect providing two titles in this case.

The guides provide conventions for titling a portion of a more extensive work — for example, an article section. In the NLM, a reference to a perceived whole usually precedes the reference to the part[NLM, article-part] but not always[NLM, listservs]. The MLA says to cite the part first[MLA, parts]. The APA says to cite only the parent document because that is what readers would retrieve to find the part[APA, citation]. The Chicago style places the chapter title before the book title[cmos, chapter].

The MLA considers the case of nested containers where one text is contained in a larger body of work which may be contained in a third[MLA, containers]. It gives a dozen container examples[MLA, figure-5.29].

Titles are often accompanied by edition/version information. Examples include the weekend edition, revised edition, British edition, sixth edition. Version designations immediately follow the title[NLM, book-edition] [MLA, version]. Similar auxiliary information occurs in connection with journals[NLM, date-augmentation] [NLM, volume-augmentation] [NLM, issue-augmentation] [APA, edition]. References to newspaper articles may include a column number; a section letter, number, or name; a part title; or a page range[NLM, newspaper].

Publishers

Guidance varies as to whether to include the publisher's city. The MLA says to include the city only if the book was published before 1900[MLA, 1900].

In addition to traditional book publishing houses, there are several variants.

Newspapers: Substitute the name of the newspaper for the publisher[NLM, newspaper].

Journal articles: Substitute the name of the journal.

Web pages: Substitute the page's website[NLM, website-publisher] [APA, web-page-reference]. Substitute the title and owner of the web page's website[cmos, websites] [cmos, citing-websites]. Substitute the website's title[MLA, containers].

A publication can have multiple publishers, and a publisher's name can change over time. The Chicago guidance is to use the name of the publisher of the work at the time it was published[cmos, publisher].

Dates

All style guides provide for the date of the last update or revision as well as a publication date. The NLM style also requires the date when an author most recently accessed the referenced page[NLM, accessed-date]. The APA guide also includes this requirement under various circumstances[APA, retrieval-date]. Secondary guidance includes examples of dates that do not merit inclusion.

URLs

Guidance on URLs varies considerably. The NLM advises the use of permalinks. Several guides advise using DOIs in place of URLs. Only the NLM guide entertains the possibility of providing more than one URL for a source. It suggests using either one or else all available URLs[NLM, URLs-books] [NLM, URLs-journals] [NLM, URLs-databases].

eBooks

The Chicago style specifically calls out eBooks as requiring additional attention because special software is needed to read them[cmos, online-formats].

Content Type

The NLM style provides for an optional "content type" or "article type"[NLM, article-type-journal] [NLM, content-type-book] [NLM, content-type-bibliography]. Some of the examples are about content, while others are about how it is presented. Content examples include book review, editorial, interview, demographic map, and bibliography. Presentation examples include poster, email, letter, and conversation.

Notes

The NLM guide includes a notes section used to indicate error notices, retractions, epub dates, accompanying auxiliary materials, and so forth[NLM, notes-books] [NLM, notes-reports].

Webliographies

Webliography references differ substantially from what the guides call for. The most familiar webliographies are those produced automatically by search engines. Google News provides attractive curated webliographies. Website home pages sometimes include webliographies that present the site's principal web pages. Since webliographies are not from the ink-and-paper world, they offer insights into how the reference concept will evolve.

All webliographies contain a title and a link to the referenced page. They normally include a description of the page's content. They may also have a favicon calling attention to the publisher, hosting website, or the page's corresponding root page. Favicons are the small icons seen at the top of browser tabs in a web browser. Webliographies sometimes include an image indicative of the page's content. The most common ordering of webliography elements is this:

Notice that the first two required elements are the title and a link to the referenced page. Webliographies include annotations because they are stand-alone documents; in-text citations are not available to stimulate interest in webliography entries.

2. Needs Addressed  

Little has been published on needs addressed by the existing guidelines aside from the need to avoid plagiarism. So this question is answered by inference from the guidance provided. Sixteen communication needs are implicated.

Needs Related to Citations

All of the studied styles meet the need to link citations and references. But they vary with regard to three other needs.

Readability of the Main Text. In-text citations need to avoid interfering with reading the main text. The numeric superscript citations excel in this regard, as does the Chicago notes-and-bibliography system. However, the popular NLM citation-sequence system complicates the authoring process. To add a new reference that's needed somewhere in the middle of the citing document, an author must renumber all later citations and their corresponding references.

Mnemonic Value of Citations. A citation that includes an author's name may remind the reader of which reference is being cited. The footnotes of the Chicago notes-and-bibliography system have excellent mnemonic value. By contrast, the NLM citation-sequence system has no mnemonic value; the curated presentations put out by PubMed correct for this NLM deficiency by showing the corresponding reference when the reader mouses over a citation. The NLM mouseover notes take the place of the notes in the Chicago notes-and-bibliography system.

Page Ranges. Each of the non-numeric citation styles provides a way to include page ranges in the citation. The Chicago notes-and-bibliography style stands alone in providing both numeric superscripts and a way to specify page ranges.

We have a discrepancy. The NLM numeric citations are good for the readability of the main text but not for authoring or including page ranges. This discrepancy is the starting point for developing a two-format approach to citations and references — a provider format for content providers and a transformed reader's view for readers.

Needs Related to Reference Sections

Of the needs relating to references, three have to do with the overall organization of references. Ten more have to do with specific reference elements.

Coherence. For example, in the NLM guide, reference elements of primary interest to the reader are scattered in various locations. The title is the 8th element, and URLs are the 17th. But, judging from the webliography style, these are the essential elements. Title-related elements are not necessarily placed together. For example, a part number may follow the date rather than the title[NLM, date-augmentation]. Booksellers place title-related information directly after the title. There needs to be a strongly defensible ordering of the reference elements.

Punctuation. As one guide put it: "Punctuation, dates, and page numbers depend on the type of reference cited, so follow the examples with care[IEEE]." Content providers need to focus on content. The APA straightforwardly requires that most elements in a reference end with a period[APA, punctuation]. There needs to be a simple, consistent approach to punctuation.

Abbreviations. All of the studied guides provide detailed guidance on abbreviations. The NLM, for example, devotes five appendices to the subject. This may have made sense for print documents because paper is incomparably more expensive than bits and bytes. But extensive use of abbreviations can detract from readability, and finding the approved abbreviation for a term can be time-consuming. There is no pressing need to use abbreviations.

Authors Worth Mentioning. How many authors should be listed? Up to two, up to ten, up to twenty, or perhaps every author? Is there any objective evidence showing that historians, psychologists, and medical researchers legitimately differ in how many authors they need to see? Perhaps. Medical research papers are often too obscure to attract a broad audience. So the likely readership of the citing document may well consist largely of the authors of the referenced articles. In this case, listing every author makes sense. There needs to be flexibility in the number of authors to list.

Citing Subsidiary Content. Existing citation guidance recognizes subsidiary content in at least two cases. Page ranges are subsidiary to the book. Individual works are subsidiary to the anthology. But none of the cited styles make a provision for citing portions of web pages. There is no reason to assume that a reader's interest in specific content found on a web page implies an interest in other co-occurring content on the same page. There need to be additional provisions for referencing subsidiary content.

Author Credibility. The NLM guide specifies the inclusion of an author affiliation. This requirement may assist in judging the credibility of authors in a research environment. However, in a world with 28 million web developers[galov] and many more content providers, an institutional affiliation is not a reliable indication of author credibility. The ability to assess author credibility is essential to the goal of supporting evidence-based communication. Additional ways of documenting author credibility are needed.

Personal Communications. The APA guide says not to cite personal communications unless they have a citable source[APA, personal]. However, some personal communications do have independent confirmation, as in the case of an 1803 letter from Thomas Jefferson to James Madison[jeff-letter]. In addition, genealogies often contain personal letters that get posted online. The treatment of personal communications needs to be addressed.

What is Publishing? To carry this element forward into an online world, we must ask what constitutes an act of publishing. Public libraries distribute content to the public but do not publish; they often have an online presence, as with the Library of Congress. YouTube, Twitter, and Reddit provide platforms for unedited contributions, which they also do not publish. Used book stores do not publish. What do these examples lack?

Newspapers and journals publish in the above sense. These examples suggest that it is appropriate for a reference to designate as publishers the institutions that carry out the above-described publishing functions. For example, the publishing function for The Brave by Nicholas Evans[brave] was performed by G. P. Putnam's Sons, which is an imprint of Penguin Random House rather than an actual publisher[putnam]. The need addressed by knowing who performs the publishing function is for the reader to better understand the published information's validity.

What remains is the issue of who publishes online content. The guides studied tend to assume that content found on a web page is published by the page's website, even if the content was published a hundred years ago, as in the case of Charles Dickens' A Christmas Carol[dickens]. The usual guidance is to use the website's title in place of a traditionally recognized publisher. The NLM differs in requiring what is essentially a complete reference for the page's website. And the MLA asks for the website's publisher, though there is no evidence that the site's publisher is actively involved in publishing cited content. The publishing function for a web page needs to be identified according to the above three publishing criteria.

At issue is the fact that the Web holds great quantities of self-published material. Facebook posts and Twitter tweets are self-published. Personal websites are frequently devoted to self-published material. The closest analogues in the print world to these websites are the commercial book printers such as BookBaby, Barnes & Noble Press, and Techniprint. Facebook, in particular, lacks some publishing characteristics. Self-published material needs to be clearly identified.

Which Publishers to List in a Reference. It can easily happen that there have been multiple publishers for the same work. The publisher most involved in bringing a work to fruition is the one that is most relevant to the work's credibility. In the case of multiple publishers, there may be a need to look at which publisher had the most to do with a referenced work's credibility.

Date Utility. Several dates might be associated with cited content, some more valuable than others. Suppose there is a discrepancy between the date of the page and the date of its content. In that case, the content date is more relevant to the citing document. For example, an appropriate date for A Christmas Carol might be 1911[dickens]. The date on which the Library of Congress acquired its online copy is less relevant to understanding that novel's credibility. However, in some cases, the date of a web page can be a fair estimate of the date of its primary content.

Some of the guides still require an access date, but there are good reasons not to include this.

Permalinks and DOIs. DOIs are not interpreted by web browsers. They require an extra translation step. To translate a DOI to a URL, prefix the DOI with http://doi.org/[DOI-article]. Many organizations prefer to develop their own systems of permalinks, perhaps due in part to cost. More examples of how to get a permalink are needed.
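The DOI translation step described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed method: the function name doi_to_url, the handling of a leading "doi:" label, and the sample DOI 10.1000/182 are assumptions made for the example. The resolver prefix is the one given in the text.

```python
from urllib.parse import quote

# Resolver prefix as given in the text above.
DOI_RESOLVER = "http://doi.org/"

def doi_to_url(doi: str) -> str:
    """Turn a bare DOI such as '10.1000/182' into a URL a browser can follow."""
    doi = doi.strip()
    # Assumption: tolerate a leading 'doi:' label in front of the identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode characters that are unsafe in a URL path; keep '/' as is.
    return DOI_RESOLVER + quote(doi, safe="/")

print(doi_to_url("10.1000/182"))  # http://doi.org/10.1000/182
```

A reference would then carry the resulting URL as the target of a hyperlink rather than showing the raw DOI.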

Hyperlinks Versus URLs. When the Internet was young, URLs tended to be readable, typable, and less than one line in length, but none of this tends to be true today. Reader convenience demands the use of hyperlinks rather than URLs. A reader who wants the actual URL can use the browser's "copy link" feature.

Title Appendages. Some types of title information are unimportant in the case of online works. Classifying articles by volume, issue, and page range is relevant to finding articles in the library stacks, but not so much online.

There is also less need for advertised versions. Why? Before the Internet, digital information was printed on physical media and went through advertised versions in much the same way as print documents. Then in 1991, Tim Berners-Lee introduced the World Wide Web[WWW]. It soon became a royal highway for malware and other malevolent content. In response, vendors now release frequent updates distributed over the web without fanfare, making the release of advertised versions and editions less frequent than was once the case. The guide's handling of this "appendage" information needs to be simple.

3. Additional Needs  

This section identifies twelve additional needs that must be addressed to provide a broadly applicable reference architecture that supports evidence-based communication.

Simplicity and Ease of Learning. The first sentence of this document lists the goal of being widely applicable beyond research communities. Consequently, the developed guide needs to be easy to use by content providers who have little or no experience writing citations and references. This need strongly affects what does not go into the guide.

Any decision that can be safely left to authors does not belong. It is reasonable to expect authors to achieve consistency within their documents. But it is not practical to develop a guide that achieves consistency across all documents meeting the guide's requirements.

It is easier to learn a general principle than to learn many special cases, even if the general principle has exceptions. This is true even if it is necessary to introduce new terminology to state the general principle. These facts call for a uniform approach to such things as syntax, for example. The organization of the Citations Online guide needs to rely on general principles — the exact opposite of how the NLM guide is organized.

Why Readers Read References. From the reader's perspective, there are two primary goals: first, to assess whether the referenced content is of interest, and second, to assess the danger of being misled. References need to be written in such a way that these twin goals are easily distinguished.

Author Credibility. An answer to Question 2 pointed out that, in a general setting, an author's institutional affiliation is not a strong indication of author credibility. Better indications of author credibility are needed.

Honorific titles such as Ph.D., MD, or LCSW help readers to assess author credibility[schema, honorific-suffix]. For example, an author of an article on effective marriage counseling might be an LMFT. The guide should not inhibit the use of such titles.

An emerging trend in online articles is to provide a bio. Having a biography helps to assess the authors' credibility. Bios can also make for enjoyable reading. There needs to be provision for including links to bios.

Content as Opposed to Presentation. The same content can have many different presentations. This is especially true on the Internet where the same article may be formatted differently on different devices or on different websites. So a guide that relates to content can be more useful than a guide that pertains solely to presentation.

From the perspective of evidence-based communication, original documents do have some additional claim to credibility. But outside of a research environment, there is relatively little interest in original sources. For example, it is generally agreed that $E = mc^2$ even though few have read Einstein's famous paper[einstein].

Consequently, a reference needs to be about referenced content, as opposed to presentations of content. At the same time, hyperlinks in a reference do link to presentations of content. There can be several different presentations of the same content.

Overall Structure of Reference Sections. Imagine a paper that consisted of fifty one-sentence paragraphs. Nobody writes this way. But that's precisely what one sees in a traditional reference section. Such an organization provides a disincentive to reading reference sections. There is a need for additional structure. This is a reader need rather than a provider need.

Color and Visual Appeal. It stands to reason that improving the visual appeal of a reference section increases its usefulness by increasing its probability of being read. A related fact is that readers of web content rely heavily on visual appeal in judging a website's credibility[credibility1] [credibility2].

The importance of color can be judged by looking at how it has influenced other areas of communication. Color TV became popular in the 1960s[color-TV]. Color in newspapers became popular in the 1980s[color-news]. Favicons add color to web browsers. Bharat Shyam invented them as a last-minute addition to Internet Explorer 5 in 1999[favicon] [ie5].

Linked Citations. From the reader's perspective, the most convenient way to get from a citation to its corresponding reference is to follow a hyperlink from one to the other. But it is inconvenient for a content provider to provide such a service. This is another key discrepancy between reader needs and provider needs.

Hyperlinks from citations to references accomplish the main purpose of citations, which is to associate a reference with a particular spot in the main text. From this perspective, all that is needed is to place a mark in the main text and hyperlink it to the cited reference.

Subsidiary Content. An answer to Question 2 pointed out that traditional guides provide two mechanisms for citing a portion of a more extensive work. One is to use page ranges, and the other is to create a separate reference to the portion itself. For online works, there are three possibilities.

Page ranges and fragment links both describe portions of a referenced document. Fragment links and subsidiary-document links both take the reader directly to presentations of subsidiary content. There needs to be greater support for handling subsidiary content.

Online Books and eBooks. From the standpoint of writing references, a crucial distinction needs to be made as to whether parts of a document can be directly referenced.

The guide Citing Medicine is an online book[NLM]. The Chicago Manual of Style is available as an online book with restricted pages[cmos]. The Publication Manual of the American Psychological Association is available as an eBook[APA]. The MLA Handbook 9th is available as a Kindle eBook[MLA]. The NLM and Chicago guides qualify as online books. The other two guides do not.

The new guide needs to clearly distinguish between documents that are online and documents that are merely available online.

Portrayals of Alternate Content. In addition to hyperlinks that point to presentations of content, there are other portrayals — reviews, abstracts, corrections, clarifications, addenda. These portrayals are important for offline documents because there can be no hyperlink to the documents themselves. There needs to be support for portrayal links, and there needs to be a way to distinguish between portrayals and presentations.

Reference Clarification. This need is alluded to already in the NLM notion of a content type. A reference may need clarification in either of the following ways:

Format: Some readers avoid PDF files or PowerPoint documents.

Content: It helps to know whether a referenced work is a book review, clinical trial, consensus document, exposition, large-scale study, literature review, news article, and so forth.

Multiple Reachable Hyperlinks. Readability demands that a referenced work be at least partially available to and readable by the author's intended audience. There are several issues to consider.

Impermanence. A post that has gone viral may have a lengthy persistence even though its persistence in any given location is far from assured.

Access Cost. Availability may hinge partly on the reader's wealth or influence — an article that's free to professors may cost $60 to others. In many cases, a free abstract contains all the information a reader needs.

Language Barrier. If the same work is published in several languages, several URLs may be needed to achieve the desired availability.

In the case where multiple links are needed, it makes sense to link to the most credible representation first.

Provider Roles and Contributor Roles. An answer to Question 1 mentioned limited types of author roles. Having enough roles to accurately describe authors and other contributors is central to the use of roles in assessing author credibility.

The relative lack of roles in the studied guides conflicts with other guidance on user roles. The CASRAI project has identified fourteen contributor roles related to scientific, scholarly output[CASRAI]. Matthew Childers has described roles that are prominent in the creation of movies and comic books. His article convincingly illustrates the distinction between roles and role names[Childers]. Schema.org[schema] identifies some content-provider roles[schema, role] [schema, honorific-suffix]. So does the U.S. patent office[patents]. IOP Publishing discusses five authoring roles associated with scientific research[IOPScience]. The International Committee of Medical Journal Editors mentions roles in addressing the distinction between author and non-author contributors[ICMJE]. Collectively, these sources identify or suggest many roles involved in the authoring process.

Content Providers
Secondary Contributors

4. Architecture  

We are now at the point of designing an architecture whose complexity is commensurate with the needs of readers and content providers. The need for simplicity and learnability demands a strong emphasis on uniform structure and general principles. Our tasks are to design

The following subsections have an alternating paragraph structure. A paragraph recalls motivations gleaned in answering Questions 2 and 3. And then an indented paragraph presents the resulting guidance.

Citations

An answer to Question 2 notes a tension between the need to cite subsidiary content and the need to keep the main text readable, and it notes the superiority of the Chicago notes-and-bibliography system in this regard. Another answer to Question 2 notes that PubMed's mouseover notes serve a similar purpose.

An answer to Question 3 notes that there are now three significant ways in which one referenced work can be subsidiary to another. The subsidiary portion could be a page range, a named section, or another page. In all three cases, the subsidiary portion can be specified by citing the root reference and providing a designation of subsidiary content. These observations suggest the following citation architecture.

Citations can point to references in one of two ways.

A root citation identifies content that, within the context of the citing document, is not part of some larger cited body of work.

A subsidiary citation identifies content that is logically subsidiary to content that is identified by a root citation.

The organization of cited content into root content and subsidiary content is performed at the discretion of the content providers. The subsidiary content may be found in a section of the cited root document, another document, or a section of another document. The other document may have an unrelated URL.

To resolve the discrepancy between authoring needs and reader needs, each of these two kinds of citation has both a provider format and a reader's view. The provider format must be machine parsable so that the reader's view can be generated from it automatically.

A provider root citation has the form [root], where the root identifier is chosen at the convenience of content providers. A provider subsidiary citation has the form [root, part-name], where the part-name is a short name for the subsidiary content. To prevent a bracketed passage from being treated as a citation, escape its left bracket with a backslash, like so: \[ protected passage ].
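Because the provider format has to be machine parsable, a small parser helps make the syntax concrete. The sketch below is one possible reading, not a reference implementation. It assumes that a citation is a bracketed root identifier optionally followed by a comma and a part name, and that a backslash directly before the left bracket protects a passage. The names Citation, CITATION_RE, and find_citations are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

class Citation(NamedTuple):
    root: str             # root identifier chosen by the content provider
    part: Optional[str]   # part name, or None for a root citation

# A left bracket not preceded by a backslash, then a root identifier, then an
# optional ", part-name", then the closing bracket.
CITATION_RE = re.compile(r"(?<!\\)\[\s*([^\[\],]+?)\s*(?:,\s*([^\[\]]+?)\s*)?\]")

def find_citations(text: str) -> List[Citation]:
    """Return every provider citation in text, skipping escaped brackets."""
    return [Citation(m.group(1), m.group(2)) for m in CITATION_RE.finditer(text)]

sample = r"One style[APA, citation] differs from another[NLM]; \[ an aside ] is not a citation."
for citation in find_citations(sample):
    print(citation)
```

In this sketch an unescaped bracketed aside would be read as a root citation, which is why the escape convention matters.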

The provider-to-reader transformation converts each citation into the form " † ", where the dagger is the anchor of a hyperlink to its corresponding reference.

The provider-to-reader transformation also equips the citation with a mouseover note that displays its reference. In the case of a subsidiary reference, the mouseover note also contains the title of the corresponding root reference.
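The citation rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the reference implementation: the function name to_reader_view, the shape of the references dictionary, the "ref-" anchor scheme, and the choice to leave unrecognized brackets untouched are all hypothetical.

```python
import html
import re

HAIR_SPACE = "\u200a"  # pads the dagger so it is easier to hit with a mouse

# Matches either an escaped bracket (\[) or a citation of the form
# [root] or [root, part-name]. Listing the escape first means the
# bracket after a backslash is never read as the start of a citation.
TOKEN = re.compile(r"\\\[|\[([^\[\],]+)(?:,\s*([^\[\]]+))?\]")


def to_reader_view(text, references):
    """Turn provider citations into dagger links with mouseover notes.

    `references` is a hypothetical mapping from a citation key such as
    "APA" or "APA, elements" to a dict with "title" and "text" entries.
    """
    def render(match):
        if match.group(0) == "\\[":
            return "["  # protected passage: drop the backslash only
        root = match.group(1).strip()
        part = match.group(2)
        key = root if part is None else f"{root}, {part.strip()}"
        if key not in references:
            return match.group(0)  # assumption: unknown brackets stay as typed
        note = references[key]["text"]
        if part is not None and root in references:
            # A subsidiary note also carries the title of its root reference.
            note = f'{references[root]["title"]}: {note}'
        anchor = "ref-" + re.sub(r"[^\w-]+", "-", key)
        return (f'<a href="#{anchor}" title="{html.escape(note)}">'
                f"{HAIR_SPACE}†{HAIR_SPACE}</a>")

    return TOKEN.sub(render, text)
```

The title attribute stands in for the mouseover note here; a real page might use a richer tooltip.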

The Reference Section

An answer to Question 3 noted a lack of structure in reference sections. A noticeable improvement is to collect all references with the same root identifier into a paragraph or subsection beginning with the root reference for that root identifier.
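As a rough illustration of this grouping, the following Python sketch collects reference identifiers into segments that each begin with their root reference. The function name group_by_root is hypothetical, and ordering segments by the first appearance of each root identifier is an assumption of the sketch rather than a rule stated here.

```python
def group_by_root(identifiers):
    """Group reference identifiers into segments keyed by root identifier.

    Each segment lists its root reference first, followed by subsidiary
    references in their original order. Segment order follows the first
    appearance of each root identifier (an assumption of this sketch).
    """
    segments = {}
    for identifier in identifiers:
        identifier = identifier.strip()
        # "NLM, intro" has root "NLM"; a root identifier is its own root.
        root = identifier.split(",", 1)[0].strip()
        segment = segments.setdefault(root, [])
        if identifier == root:
            segment.insert(0, root)   # the root reference leads its segment
        else:
            segment.append(identifier)
    return segments


print(group_by_root(["NLM, intro", "APA", "NLM", "APA, elements"]))
```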

For the reader, references are automatically ordered as follows:

Root Reference Structure

Overall Design of the Reference Elements. An answer to Question 2 talks about the partially met need for coherence. An answer to Question 3 talks about the need for simplicity. Another answer to Question 3 states that readers have two primary needs, suggesting that reference elements be organized into two categories. The first category helps the reader decide whether the referenced content is of interest, and the second helps the reader determine whether the referenced content is credible.

A root provider reference has five reference elements — seven counting the optional favicon and the root identifier.

Assessment of interest

Favicon. Optionally, an indicated icon that helps identify the page or website containing the cited work.

Root identifier. An identifier used to identify citations pointing to the reference.

Title element. A compound element consisting of the work's title and other title-related information.

Link element. A list of hyperlinks.

Assessment of credibility

Contributor element. A list of key contributor names and associated credibility-related information.

Publishing element. The organization or organizations that performed publishing activities for the referenced work.

Date element. The date the work was published or updated.

The five core elements of this design are comparable to the four elements of the APA style[APA, elements]. The most significant difference is the addition of hyperlinks as a new core element.

Core Element    APA Element
Title           Title
Links           Source
Contributors    Authors
Publishers      Source
Date            Date

Punctuation. The ability to transform provider references into reader references demands that references be machine parsable. An answer to Question 2 notes that with existing guidance, getting the punctuation right may require careful study. An answer to Question 3 noted the need for simplicity and, where appropriate, uniformity.

Each reference element ends with sentence-ending punctuation followed by white space. If a period, question mark, or exclamation point followed by white space is needed inside a reference element, escape it with a backslash. For example, the provider title "Yo\! Yes?" is displayed to readers as Yo! Yes?, the title of Chris Raschka's story[yo-yes].
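The punctuation rule lends itself to a short parsing sketch. The Python below is a minimal illustration, not the reference implementation: the function name split_elements is hypothetical, and dropping a closing period while keeping a closing question mark or exclamation point is an assumption made so that titles such as questions keep their punctuation.

```python
import re

# An unescaped period, question mark, or exclamation point that is
# followed by white space (or the end of the text) closes an element.
ELEMENT_END = re.compile(r"(?<!\\)[.?!](?=\s|$)")
ESCAPED_MARK = re.compile(r"\\([.?!])")


def split_elements(reference):
    """Split one provider reference into its reference elements."""
    elements = []
    start = 0
    for match in ELEMENT_END.finditer(reference):
        raw = reference[start:match.end()].strip()
        start = match.end()
        if raw.endswith("."):
            raw = raw[:-1]  # assumption: a closing period is not content
        elements.append(ESCAPED_MARK.sub(r"\1", raw))  # remove escapes
    tail = reference[start:].strip()
    if tail:
        elements.append(ESCAPED_MARK.sub(r"\1", tail))
    return elements


print(split_elements(r"niaid. Anthony S\. Fauci, M.D., NIAID Director. Exposition."))
```

Note that the periods in "M.D.," need no escape because neither is followed by white space.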

Abbreviations. An answer to Question 2 identifies the use of abbreviations as detracting from readability, contributing excess complexity, and being motivated by a need to economize on paper in print documents.

There are no required abbreviations. This convention differs from the ink-and-paper world, where there is a need to conserve paper. As an example, the JoVE is better expressed as the Journal of Visualized Experiments.

Omitted Elements. The importance of the five core reference elements does not hinge on what providers choose to share about their work. If the providers of a referenced work omit any of the five main reference elements, the omission is as significant to the reader as its inclusion would have been.

Titles. Any work worthy of mentioning is worthy of a short titular summary, even if the citing document must provide it.

URLs. If a work genuinely has no presence on the Internet, the reader needs to know that looking for it could be a fool's errand.

Contributors. Any work that can be shared has somehow been generated. Simply omitting this information would compromise the overall structure of a reference and could conceal from the reader that helpful information is missing.

Publishing. If a generated work can be known to both authors and readers, it has been conveyed somehow. Information on how that happened gives the reader an additional way to assess credibility and perhaps to find similar content. Outside of the dark web, this information is always visible to some extent.

Dates. Omitting the date is disastrous because the date is what allows readers to know if a referenced page has been rewritten after the citing document was written.

Reference Elements

Favicon. An answer to Question 3 notes that the visual appeal of traditional reference sections could stand improvement. The answer further states that some webliographies address visual appeal by including graphical content. The favicon addresses the lack of visual content with the following guidance, guidance that is given more completely in the favicon section of the guide.

By default, the transformation algorithm gets a favicon from the reference's first linked web page. This default can be overridden by explicitly specifying the icon, like this: icon image-path. The algorithm then uses the provided image path.
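A minimal Python sketch of this choice follows. The function name choose_icon is hypothetical, and falling back to /favicon.ico at the root of the first linked site is an assumption: it is a common convention, but a page may declare its icon elsewhere, and the actual algorithm may locate it differently.

```python
from urllib.parse import urljoin


def choose_icon(first_element, first_link_url=None):
    """Return the icon path or URL to display beside a root reference.

    `first_element` is the first element of the provider reference,
    which is either an explicit "icon image-path" or the root identifier.
    """
    prefix = "icon "
    if first_element.startswith(prefix):
        # An explicit icon overrides the default.
        return first_element[len(prefix):].strip()
    if first_link_url:
        # Assumed default: the conventional favicon location of the site
        # that hosts the first linked page.
        return urljoin(first_link_url, "/favicon.ico")
    return None  # an unlinked reference with no explicit icon


print(choose_icon("icon icons/mla.png"))
```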

Reference Identifier.

Reference identifiers are modeled directly after the citations. For example, the root citation [twar] refers to the reference whose identifier is twar.

Title Element. An answer to Question 1 notes that a work in a language other than English effectively has two titles, the second being an English translation. 

An answer to Question 2, in discussing coherence, suggests that all title-related information be placed with the title. Another answer to Question 2 observes that some of a title's "appendage" information may no longer interest readers in an online world.

An answer to Question 3 notes that titles may need clarification with respect to both presentation and content.

An answer to Question 2 notes the need for specifying the recipient of a personal communication.

The first answer to Question 3 notes the need for simplicity. This need is clearly implicated in the rules for the use of italics. The rules are somewhat intricate, are not entirely consistent across guidelines, and relate to esthetics rather than content. A uniform approach to italicization is needed.

These observations motivate the following guidance, guidance that is presented more completely in the Title Element section of the guide:

Provide a preferred title; optional alternate titles; any needed title appendages such as version, edition, or series numbers; and optional title clarifications.

An alternate title is appropriate in the following cases:

an English translation of a foreign work,
a popular title different from the original,
an opaque title in need of rephrasing

For the reader's convenience, when there are multiple titles, the first title should be the one readers are most likely to appreciate.

A title appendage is anything needed to clarify what the title refers to. Examples include version, build, volume, issue, part, release, edition, and supplement.

Title clarifications are placed in square brackets. Typical clarifications include: book review, clinical trial, consensus document, exposition, large-scale study, literature review, news article.  The recipient of a personal communication may be treated as a title clarification. A title clarification can also be used to mention the film that is the subject of a film review.

The provider-to-reader transformation italicizes the titles of root references by default. An author who wishes to override the default can do so by making explicit use of italics. The use of italics in titles is a matter of taste.

Link Element. The studied guides emphasize the importance of permalinks and advise the use of DOIs as a superior alternative. An answer to Question 2 points out that DOIs don't substitute for URLs and identifies the solution. Another answer to Question 2 points out that explicit URLs are relics of times past.

An answer to Question 3 mentions the need to distinguish between presentation links and portrayal links. Another answer to Question 3 argues the need to provide multiple links for the same reference and suggests that the first link should be to the most credible representation. Yet another answer to Question 3 emphasizes the crucial distinction between online books and eBooks.

These answers lead to the following guidance, guidance that is presented much more completely in the Link Element section of the guide:

Provide a list of one or more links to representations of the referenced content. 

The representations are of two kinds: presentations of referenced content and portrayals of related content. A link's clickable anchor is responsible for distinguishing presentations from portrayals.

Anchors for presentations include article, blog post, conversation, email, English translation, online book, painting, PDF file. 

Anchors for portrayals include abstract, comment, commentary, errata, related text, review, vendor.

Put presentations before portrayals. Put more credible representations before less credible representations. For added clarity, presentations are separated from portrayals by a semicolon if both are present.
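The ordering rules can be pictured as a sort followed by a join. The Python sketch below is illustrative only: the Link class, its numeric credibility score, and the comma used between anchors of the same kind are assumptions (the comma follows the pattern of entries such as "Boardbook, hardcover; review" in this paper's own reference list).

```python
from dataclasses import dataclass


@dataclass
class Link:
    anchor: str            # clickable text, for example "online book"
    is_presentation: bool  # True for presentations, False for portrayals
    credibility: int       # hypothetical score; higher is more credible


def link_element(links):
    """Build the link element text from a list of links."""
    # Presentations first, then portrayals; more credible first in each.
    ordered = sorted(links, key=lambda l: (not l.is_presentation, -l.credibility))
    presentations = [l.anchor for l in ordered if l.is_presentation]
    portrayals = [l.anchor for l in ordered if not l.is_presentation]
    groups = [", ".join(g) for g in (presentations, portrayals) if g]
    # A semicolon separates the two kinds when both are present.
    return "; ".join(groups) if groups else "unlinked"


print(link_element([Link("review", False, 1), Link("online book", True, 2)]))
```

In a real reference each anchor would carry its URL; the URLs are left out here to keep the ordering logic visible.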

Try to choose URLs that qualify as permalinks.

Facebook posts have permalinks associated with them. To get the permalink, click on the date following the name of the post author. The permalink will appear in the browser's address bar.

For Twitter, click on the tweet itself.

Translate DOIs (Digital Object Identifiers) to permalinks by prefixing 'https://doi.org/' to the DOIs[DOI-article].
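In code, the DOI translation is a one-line prefix operation. The sketch below uses a hypothetical function name, doi_to_permalink, and as an added convenience (an assumption, not part of the guidance) it also accepts a DOI written with a leading "doi:" label.

```python
DOI_RESOLVER = "https://doi.org/"


def doi_to_permalink(doi):
    """Turn a bare DOI such as "10.1000/182" into a resolvable URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()  # assumption: tolerate a leading "doi:" label
    return DOI_RESOLVER + doi


print(doi_to_permalink("10.1000/182"))  # https://doi.org/10.1000/182
```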

If a reference has no links, use "unlinked" for the link element. The provider-to-reader transformation sets such references in a gray color.

Contributor Element. An answer to Question 2 suggests that the number of authors to list is related to the referenced work's likely audience. An answer to Question 3 addresses author credibility by introducing biographies and author pedigrees as welcome supplements to an author's institutional affiliation. Another answer to Question 3 provides a detailed analysis of contributor roles, a subject that is only weakly addressed in existing guides. The need for simplicity motivates giving a uniform treatment to authors and contributors.

Putting these thoughts together, we have the following guidance, guidance that is given more completely in the Contributor Element section of the guide.

List key contributors with authors and other content providers first. A contributor's name may include honorific titles or pedigrees, for example, LCSW, MD, librarian. A contributor can be an individual, a group, or an organization.

The number of contributors to list depends on the likely audience for the citing document. For example, if the likely audience consists mainly of authors of the referenced works, list as many authors as possible. If not every key contributor is listed, end the contributor list with "and others" or "et al."

If no person or organization acknowledges responsibility for creating the work's content, use "unattributed" for the contributor element.

Each contributor is optionally followed by a parenthesized list of contributor attributes such as a contributor's role or roles, organizational affiliations, or a bio. Give enough role information to distinguish authors from other contributors. An included bio, history, or about page can be given via a hyperlink. 
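A contributor entry of this shape can be separated into a name and its attributes with a small pattern. The following Python is a hedged sketch: the function name parse_contributor is hypothetical, and treating both commas and semicolons inside the parentheses as attribute separators is an assumption based on entries such as "Matthew Childers (author, illustrator, writer; bio)" in this paper's reference list.

```python
import re

# A name, optionally followed by one parenthesized attribute list.
CONTRIBUTOR = re.compile(r"(.*?)\s*(?:\(([^()]*)\))?")


def parse_contributor(entry):
    """Split one contributor entry into (name, list of attributes)."""
    entry = entry.strip()
    match = CONTRIBUTOR.fullmatch(entry)
    if match is None:            # for example, an entry spanning several lines
        return entry, []
    name, attributes = match.group(1), match.group(2)
    if attributes is None:
        return name, []
    parts = re.split(r"\s*[,;]\s*", attributes.strip())
    return name, [part for part in parts if part]


print(parse_contributor("Matthew Childers (author, illustrator, writer; bio)"))
```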

Publishing Element. An answer to Question 2 addresses what constitutes publishing and concludes that newspapers, journals, websites, publishers, and their imprints all may engage in acts of publishing. 

Another answer to Question 2 suggests that, in the case of multiple publishers, the preferred publisher to list is the one that's most credible. And because of the way we constructed the link element, that would be the publisher of the first representation. We have the following guidance, guidance that is given more completely in the Publishing Element section of the guide.

Give the organization or organizations involved in publishing activities such as review, acceptance, final editing, marketing, and distribution to an intended audience. Likely examples include traditional publishers or their independently operating imprints, newspapers, journals, and websites. In each case, the title of the publishing entity suffices. If the work was published before 1900, include the publisher's city. In many cases, the cited content is contemporaneous with the web page on which it appears, and the likely publisher is the page's website.

If there are several publishers, use the publishers of the first link in the link element or use "multiple publishers" for the publishing element. If a work has been placed online without significant publishing activities or if no publisher can be found, list the title of the hosting website as the publishing element. The website's title is the title of its home page.

Date Element. An answer to Question 2 provides four reasons as to why a date of access is never appropriate. The same answer indicates that the date of interest to the reader is probably the date of the referenced content rather than the date of the page itself. Another answer to Question 2 points out that online platforms such as Twitter and Facebook do not really publish posts and tweets. These answers lead to the following guidance, guidance that is given more completely in the Date Element section of the guide.

Say how and when the content has been created or placed online. Regarding how, there are several possibilities:

Posted: Posted without significant publishing activities.

Published: Established as part of a publishing process.

Updated: Modified and reposted.

It is acceptable to use more than one of the above choices. For example, a document may have been posted online before being officially published. If content has both online and offline publication dates, the earlier date is more appropriate.

In addition to the above, there are several dates that are typically not included in a reference because they are not properties of the cited content: a document review date, dates of comments on the cited content, the date on which a content provider last accessed the cited content, the date on which a website posted the content. In the case of books, the copyright date is a plausible though inaccurate guess as to the date of publication.

The format of the date element is up to the authors of the citing document.

Subsidiary References

Subsidiary references are like root references but with a few key differences.

A subsidiary reference omits the page icon.

The first element of a subsidiary reference consists of a root identifier and a part name. The part name is a brief indication of the subsidiary work's title.

Because a subsidiary reference will become part of a segment that begins with the corresponding root reference, there is no reason for a subsidiary reference to include redundant information from its root reference. Replace redundant elements with tildes.

Finally, a subsidiary reference must point to subsidiary content. Ideally, this can be done with a URL. If not, the reference for the root content must be supplemented with a page range or a title that defines the subsidiary content within the root content.
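The tilde convention can be illustrated by expanding a subsidiary reference against its root. This Python sketch is an assumption-laden illustration: the field names and the function name resolve_tildes are hypothetical, and each reference is taken to be already split into its elements.

```python
FIELDS = ("identifier", "title", "links", "contributors", "publishing", "date")


def resolve_tildes(root, subsidiary):
    """Fill each tilde in a subsidiary reference from its root reference."""
    return {
        field: root[field] if subsidiary[field] == "~" else subsidiary[field]
        for field in FIELDS
    }


root = dict(zip(FIELDS, [
    "NLM", "Citing Medicine, 2nd edition", "Open-access online book",
    "Karen Patrias, Dan Wendling (technical editor)",
    "National Library of Medicine (US)",
    "Published 2007, updated 2020 August 5",
]))
part = dict(zip(FIELDS, [
    "NLM, intro", "Introduction", "Introduction", "~", "~",
    "Updated 2016 April 18",
]))
print(resolve_tildes(root, part)["publishing"])  # National Library of Medicine (US)
```

The expanded form is useful wherever a subsidiary reference is shown apart from its root, such as in a mouseover note.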

5. Transformation Algorithm  

Good practice in software development involves the identification of design requirements, provision of worked examples, a proposed design, a reference implementation, and documentation.

Design Requirements

The algorithm must translate provider citations and references to reader citations and references. The transformation must perform in real time; that is, when a document created in the provider format moves from an editor to a browser, it automatically opens in the format designed for readers.

The transformation algorithm resolves several esthetic questions:

A technical note: The daggers are padded with hair spaces to improve mouse-ability: " † " as opposed to "†".

A Worked Example

The following example is given in the provider format. Click on the blue button to see the corresponding reader's view.

Citations

Katalin Karikó devoted forty years to developing the mRNA vaccine technology[Karikó]. Yet, she remains relatively unknown[wealth]. Doctor Anthony Fauci has led the public fight against infectious diseases for forty years[niaid]. His accolades include the well-publicized threat of a public beheading[beheading].

References

Karikó. Katalin Karikó’s research led to COVID-19 vaccines. Exposition. @ambasciatausa. U.S\. Embassy & Consulates in Italy. Updated 2021 March 9.

wealth. This scientist’s decades of mRNA research led to both COVID-19 vaccines. News article. Dana Kennedy (bio). The New York Post. Updated 2020 December 5.

beheading. Twitter bans Steve Bannon for video suggesting violence against Fauci, FBI Director Wray. News article. Jaclyn Peiser (about). Seattle Times. Updated 2020 November 6.

niaid. Anthony S\. Fauci, M.D., NIAID Director. Exposition. No author. National Institute of Allergy and Infectious Diseases. No date.

Example 1. A Worked Example

As this example illustrates, it is possible for a document to have multiple reference sections. None of the references in this example are found in this paper's main reference section.

Algorithm Design

The transformation that translates provider forms into corresponding reader forms does the following:

The handling of inconsistencies is accomplished as follows:

Using the Reference Implementation

A typical document structure is depicted below. The reference heading separates the main text from the reference section and shows its location relative to the overall document structure. The leading and trailing portions are separated from the portion of the document being actively processed. The reference section does not include any following sections.

<body...>
<div...> leading
<div...>
<p>epigraph</p> <h*>main heading</h*> ... <h*>References*</h*> ... <h*>following sections</h*>..
</div>
trailing
</div>
</body>

Conclusion  

There are seven crucial differences between traditional style guides and the approach growing out of this research.

The document Citations Online: Preliminary Guide for Content Providers presents the guidance and provides information on finding the various reference elements in a web page.

References  

WWW. The World Wide Web. Article section. History.com editors. history.com. Updated 2019 October 28.

IEEE. How to Cite References: IEEE Documentation Style. PDF examples. Unattributed. IEEE DataPort. Undated.

NLM. Citing Medicine, 2nd edition: the NLM style guide for authors, editors, and publishers. Open-access online book. Karen Patrias, Dan Wendling (technical editor). National Library of Medicine (US). Published 2007, updated 2020 August 5.

NLM, intro. Introduction. Introduction. ~. ~. Updated 2016 April 18.

NLM, article-type-journal. Article Type for Journal Articles on the Internet (optional). Chapter section. ~. ~. Updated 2018 May 18.

NLM, content-type-book. Content Type for Entire Books on the Internet (optional). Chapter section. ~. ~. Updated 2015 August 11.

NLM, newspaper. Newspaper Articles. Chapter 8. ~. ~. Updated 2015 August 11.

NLM, content-type-bibliography. Content Type for Bibliographies (optional). Chapter section. ~. ~. Updated 2015 August 11.

NLM, article-part. Citation Rules with Examples for Parts of Journal Articles. Chapter section. ~. ~. Updated 2018 May 18.

NLM, listservs. Citation Rules with Examples for Listservs and Similar Discussion Lists. Chapter section. ~. ~. Updated 2015 August 11.

NLM, affiliation-books. Author Affiliation for Entire Books. Chapter section. ~. ~. Updated 2015 August 11.

NLM, affiliation-articles. Author Affiliation for Journal Articles on the Internet. Chapter section. ~. ~. Updated 2018 May 18.

NLM, affiliation-web-pages. Citation Rules with Examples for Parts of Web Sites. Chapter section. ~. ~. Updated 2015 August 11.

NLM, authors-article. Author for Journal Articles on the Internet. Chapter section. ~. ~. Updated 2018 May 18.

NLM, contributors. Editor and other Secondary Authors for Entire Books. Chapter section. ~. ~. Updated 2015 August 11.

NLM, date-augmentation. Supplement/Part/Special Number to a Date for Journal Articles. Chapter section. ~. ~. Updated 2018 May 18.

NLM, volume-augmentation. Supplement/Part/Special Number to a Volume for Journal Articles. Chapter section. ~. ~. Updated 2018 May 18.

NLM, issue-augmentation. Supplement/Part/Special Number to an Issue for Journal Articles. Chapter section. ~. ~. Updated 2018 May 18.

NLM, book-edition. Edition for Entire Books. Chapter section. ~. ~. Updated 2015 August 11.

NLM, URLs-books. Multiple Book URLs (Box 67 Multiple URLs). Breakout. ~. ~. Updated 2015 August 11.

NLM, URLs-journals. Multiple Journal URLs (Box 59 Multiple URLs). Breakout. ~. ~. Updated 2018 May 18.

NLM, URLs-databases. Multiple Database URLs (Box 62 Multiple URLs). Breakout. ~. ~. Updated 2018 May 18.

NLM, newspaper. Citation Rules with Examples for Newspaper Articles. Chapter section. ~. ~. Updated 2015 August 11.

NLM, notes-reports. Citation Rules with Examples for Entire Reports. Chapter section. ~. ~. Updated 2015 August 11.

NLM, accessed-date. Date of Citation for Homepages (required). Chapter section. ~. ~. Updated 2015 August 11.

NLM, notes-books. Notes for Entire Books (optional). Chapter section. ~. ~. Updated 2015 August 11.

NLM, website-publisher. Sample Citation and Introduction to Citing Homepages. Chapter section. ~. ~. Updated 2015 August 11.

NLM, email. Citation Rules with Examples for Electronic Mail. Chapter section. ~. ~. Updated 2015 August 11.

APA. Publication Manual of the American Psychological Association, 7th edition. eBook vendors. American Psychological Association. Self-published. Published 2019 October 1, copyright 2020.

APA, authors. 9.8 Format of the Author Element. Page 286; related APA blog. ~. ~. ~.

APA, punctuation. 9.5 Punctuation Within Reference List Entries. Page 283; related guidance. ~. ~. ~.

APA, personal. 8.9 Personal Communications. Page 259; similar APA text. ~. ~. ~.

APA, citation. 8.10 Author–Date Citation System. Page 261; APA article. ~. ~. ~.

APA, roles. 9.10 Identification of Specialized Roles. Page 287; related guidance. ~. ~. ~.

APA, retrieval-date. 9.16 Retrieval Dates. Page 290; APA article section. ~. ~. ~.

APA, elements. 9.4 Four Elements of a Reference. Page 283; APA article. ~. ~. ~.

APA, web-page-reference. Web Page on a Website References. APA examples. Unattributed. American Psychological Association. Published 2020.

APA, edition. 9.19 Format of the Title Element. Page 291; APA article section. ~. ~. ~.

icon icons/mla.png. MLA. MLA Handbook 9th. Kindle eBook. The Modern Language Association of America. Self-published. Published 2021 April 22.

MLA, authors. 5.8 Three or more authors. Page 168; related guidance. ~. ~. ~.

MLA, author-roles. 5.44 Labels describing the contributor's role. Pages 216-217; related guidance. ~. ~. ~.

MLA, parts. 5.101 Works in One Container. Page 278; related work. ~. ~. ~.

MLA, version. 5.50 Version: How to Style It. Page 225; related guidance. ~. ~. ~.

MLA, 1900. 5.67 City of publication. Page 249; related guidance. ~. ~. ~.

MLA, containers. 5.31 Title of Container: What it is. Page 195; related guidance. ~. ~. ~.

MLA, figure-5.29. Examples of containers (figure 5.29). Page 196; related infographic. ~. ~. ~.

MLA, citation-guide. 6\. Citing Sources in the Text. Page 314; related guidance. ~. ~. ~.

cmos. The Chicago Manual of Style Online. Online book. Multiple authors (history). University of Chicago Press. Published 2017 September 5.

cmos, author-date. 15.5: The author-date system — overview. Chapter section; related guidance. ~. ~. ~.

cmos, chapter. 14.106: Chapter in a single-author book. Chapter section. ~. ~. ~.

cmos, note-bibliography. 14.19: Notes and bibliography — an overview. Chapter section; related guidance. ~. ~. ~.

cmos, authors. 14.76: Two or more authors (or editors). Chapter section; related guidance. ~. ~. ~.

cmos, contributors. 14.105: Other contributors listed on the title page. Chapter section; related guidance. ~. ~. ~.

cmos, online-formats. 14.159: Books requiring a specific application or device (e-books). Chapter section; related guide. ~. ~. ~.

cmos, personal. 14.214: Personal communications. Chapter section. ~. ~. ~.

cmos, websites. 8.191: Titles of websites and web pages. Chapter section; related guide. ~. ~. ~.

cmos, citing-websites. 14.207: Citing web pages and websites. Chapter section; related guide. ~. ~. ~.

cmos, publisher. 14.133: Preferred form of publisher’s name. Chapter section. ~. ~. ~.

CASRAI. CRediT — Contributor Roles Taxonomy. Proposal. Unattributed. CASRAI.org. Undated.

schema. Welcome to Schema.org. Website. Multiple corporate authors. w3.org. Updated 2021 July 7.

schema, role. Role: A Schema.org Type. Web page. ~. ~. ~.

schema, honorific-suffix. honorificSuffix: A Schema.org Property. Web page. ~. ~. ~.

IOPScience. Author roles and responsibilities default. Opinion. Unattributed. IOP Science. Undated.

ICMJE. Defining the Role of Authors and Contributors. Opinion. Unattributed. International Committee of Medical Journal Editors. Undated.

patents. Patent basics. Exposition. Unattributed. United States Patent and Trademark Office. Updated 2021 July 1.

Childers. Different Roles in Comic Books. Resource. Matthew Childers (author, illustrator, writer; bio). matthewChilders.com. Updated 2020 November 4.

credibility1. How Do People Evaluate a Web Site’s Credibility? Large-scale study. B J Fogg, PhD (bio); Cathy Soohoo; David Danielson; et al. simson.net. Updated 2002 November 11.

credibility2. How do users evaluate the credibility of Web sites? Abstract. B J Fogg, PhD (bio); Cathy Soohoo; David Danielson; et al. Association for Computing Machinery. Published 2003 June.

galov. A Dive Into the Ocean of Web Design Statistics. Literature survey. Nick Galov (bio). Hosting Tribunal. Updated 2021 January 20.

jeff-letter. Letter to James Madison. Photocopy. Thomas Jefferson (bio). Library of Congress. Written 1803 March 22.

brave. The Brave. Synopsis, book review. Nicholas Evans (bio). The Good Book Company. Published 2010 January 1.

yo-yes. Yo\! Yes?, multiple editions. Boardbook, hardcover; review. Chris Raschka (author, illustrator; bio). Multiple publishers. Published 1993 March 1 through 2020 April 21.

einstein. On the electrodynamics of moving bodies (Zur Elektrodynamik bewegter Körper[German]). English translation. Albert Einstein (bio). Annalen der Physik. 1905.

putnam. G P Putnam's Sons. Imprint description. Unattributed. Penguin. Undated.

icon icons/globeIcon.png. color-TV. The History of Television (or, How Did This Get So Big?). Course materials. Phoebe Sengers (bio). Cornell University. Undated.

color-news. The Trend Toward Color: Some Newspapers Want to Stay Just Plain Read. News article. David Shaw. Los Angeles Times. Published 1986 March 14.

favicon. How We Got the Favicon. Historical essay. Jay Hoffmann. History of the Web. Published 2017 July 24.

ie5. Internet Explorer 5. Article. WikiProjects Computing / Software, Microsoft, et al. Wikipedia. Updated 2021 October 11.

DOI-article. Factsheet: DOI® Resolution Documentation. Exposition. Unattributed. doi.org. Updated 2020 July 4.

icon icons/dickens.png. dickens. A Christmas Carol. Online book. Charles Dickens (bio). Hodder and Stoughton. Published 1911.