Vendor documentation for the HighWire DTD (arthw.dtd)

The bulk of this documentation is divided into the typical parts of an article--the art element, copyright and document head/topics/subjects, front matter, body matter, back matter and response. If you are looking for a particular element, you can use the list below to quickly jump to the correct spot:


 TBL 1
 TBL 2

You can view a copy of the most recent DTD here. There is also a change log for viewing. We assume you know how to read a DTD and understand the function and format of SGML, and simply want additional information on how the various elements of the HighWire DTD are interpreted.

Content delivered in HighWire SGML can include either "HighWire model" tables or CALS model tables, as you have likely already discussed with your content developer. We only discuss HighWire table models in this document (if you are required to use CALS, there are better sources than us out there to explain this standard).

Another useful external document is the PDF documentation for version 4.1.0 of the Elsevier Science Full-Length Article DTD, on which the HighWire DTD was originally based. Throughout this document, if you see the words "Elsevier DTD", that's what we mean. If you are already familiar with the Elsevier DTD, you will find some similarities in the HighWire DTD--and some reinterpretations/additions of elements, so be sure to double-check before using the elements.

Finally, an entity set referenced in the DTD is ISOhw, which can be seen here.




Description: Article

art is the top-level element for every article in the HighWire DTD. At the highest level, an article can be split into the following parts: copyright information, document head, document topic(s), document subject(s), front matter, body matter, back matter, and response(s). Only the copyright element is mandatory.

The art element has a number of attributes that specify basic information about the article. These are: version (version number), jid (journal identifier), aid (article identifier), vol (volume number), issue (issue number), fpage (first-page number), and lpage (last-page number), all of which are required and should be self-explanatory.

version is a fixed attribute denoting the version of the DTD that is to be used with the article in question.

doi is an optional attribute to contain the DOI (digital object identifier) for the article, if one is available.

pmid is an optional attribute to contain a PubMed ID for the article, if one is already known.

sici is an optional attribute to contain the SICI (serial item and contribution identifier) for the article, if one is available.

The journal identifier jid must contain the lowercase HighWire code for the journal in question, exactly as it has been given to you. The article identifier aid, while not normally used by the parser, should match the name of the SGML file (without the file extension). The fpage attribute may contain a "subpage" letter, e.g. 10-a, in the case of multiple articles beginning on the same page.

The language attribute can be used to specify the main language of the piece. If none is specified, the default is en (English). Other options are de (German), es (Spanish), fr (French), it (Italian), pt (Portuguese), or ru (Russian).

Example of an <art> line:

<art version="4.2.12HW" jid="sjrnl" aid="1241197" vol="124" issue="6" fpage="1197" lpage="1207">
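The attribute rules above lend themselves to a quick pre-delivery check. The following is only a minimal sketch in Python, not part of the HighWire parser: the function names (parse_attrs, check_art) are hypothetical, and it assumes that every attribute value is written in double quotes, as in the example line.

```python
import re

# Attributes the text describes as required on <art>.
REQUIRED = ("version", "jid", "aid", "vol", "issue", "fpage", "lpage")
# Language codes listed for the language attribute.
LANGUAGES = {"en", "de", "es", "fr", "it", "pt", "ru"}


def parse_attrs(start_tag):
    """Return the name/value pairs of one start tag (double-quoted values only)."""
    return dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', start_tag))


def check_art(start_tag):
    """Return a list of problems found in an <art ...> start tag."""
    attrs = parse_attrs(start_tag)
    problems = [f"missing required attribute: {name}"
                for name in REQUIRED if name not in attrs]
    jid = attrs.get("jid", "")
    if jid != jid.lower():
        problems.append("jid must be lowercase")
    # The default language is en when the attribute is absent.
    lang = attrs.get("language", "en")
    if lang not in LANGUAGES:
        problems.append(f"unsupported language: {lang}")
    return problems


example = ('<art version="4.2.12HW" jid="sjrnl" aid="1241197" vol="124" '
           'issue="6" fpage="1197" lpage="1207">')
print(check_art(example))  # prints []
```

A check of this kind cannot confirm that jid is the code you were actually given, or that aid matches the SGML filename; those still need to be verified against the delivery itself.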




Description: Top-level article subelements

These seven elements appear at the start of the top-level art element, immediately preceding the front matter. They augment the art element by providing basic information about the article.

The copyright element specifies the issue date via its attributes, and (optionally) the copyright holder via its contents. In addition to the year attribute, the HighWire DTD adds the month attribute, which is optional but should contain the one-or-two-digit month in most cases, and the day attribute, which should contain the one-or-two-digit day if and only if the journal is published more than once a month. "Publish ahead of print" articles should not contain any of these attributes.

The optional coverdate element contains the cover date information. Currently, it should be used only as an enhancement to the usual copyright element (which takes a single number for month). For example, if an issue shows multiple months on its cover, such as "May/June 2004", or designates a seasonal issue, such as "Spring 2004", this additional information can be captured by using coverdate:

<coverdate mo="May/June" yr="2004">May/June 2004</coverdate>


<coverdate season="Spring" yr="2004">Spring 2004</coverdate>

The dochead element contains the article header or "type" that is often printed above the title of the document, e.g. "Rapid communication". While this element is optional, it should almost always be included, since it serves multiple important purposes.

If the article appears in a journal that is sectionalized by topic, the value of the doctopic element is the name of the section of the journal in which the article should appear, e.g. "Particles and fields." Multiple doctopic elements are supported by the DTD, but are used only rarely, for multi-level TOCs.

The HighWire DTD adds the docsubj element, which can be used to specify one or more subject names or codes for the article. docsubj elements are similar to keywords, except that they are generally drawn from a smaller list of more "coarse-grained" terms, and they are not displayed within the fulltext (as keywords generally are).

The sertitle element identifies a "series title" to which the article relates. The parser now supports this tag, placing it in $SERTITLE_txt if and only if $addSertitle <> 0. The information is typically stored and used to generate a different TOC appearance or treatment for these "series title" articles.

If this has not yet been discussed, and you are unsure as to how a particular title or article should break out into dochead/doctopic/docsubj/sertitle, please consult with your content developer.

The optional pdf element associates a PDF with an SGML file (helpful for journals that use different filenaming styles for PDF and SGML files, or for journals that have continuous make-up sections, where one common PDF is used for many SGML files). The element contains only one other element (link). The name of the PDF, without the ".pdf" file extension, will be the value of the locator attribute within link.


Front Matter



Description: Front Matter container

Within the front matter area, about a dozen elements are allowed. These include: supptitle (the title given to an issue or supplement issue), addart (which allows you to create a relationship between articles), atl (the title of the article), aug (the author group), re/rv/acc (the information on when the article was received/revised/accepted), abs (the abstract of the article), kwdg (the keyword or abbreviation list), and others. They are described in further detail in the following "pages".



Description: Marks access control removal for article

The type attribute with a value of "free" is used when a single article needs to be freely available while the rest of the issue is under access control. The type attribute with a value of "openaccess" is used for an open access article. This is a subtle but important distinction.



Description: Front matter elements

addart (additional article reference), the first of a dozen allowable front matter elements in the DTD, comes to HighWire via the Atypon Keton DTD. It allows the specification of another article in the same journal (though not necessarily the same issue) to which the current article refers. Possible uses for this feature include errata (with the required type attribute set to err) and "in this issue" references (with type set to iti). The third type is rel, which is used by the journal system to create general "related" functions. The last type is ret, which can be used to create links to article retractions. The optional vol and iss attributes specify the volume and issue numbers of the target article. At least one of the following identifying attributes should be used in every addart: pg (the first page number of the target article), doi (the DOI of the target article), or pii (the other form of article identifier for the target article, if first page or DOI do not apply).

When addart appears within the front matter, it should be left empty. However, note that addart is also allowed within the body text. In this context, if it contains a type iti link, then the parser will create a link from the contents of the element to the target article. For example: See <addart type="iti" pg="103">p. 103</addart> for more information (if vol is omitted, the current volume is implied).
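The addart rules described above (a required type drawn from four values, plus at least one identifying attribute) can be sketched as a small check. This is an illustrative Python sketch only; check_addart is a hypothetical name, and the sketch assumes double-quoted attribute values.

```python
import re

# The four type values named in the text.
ADDART_TYPES = {"err", "iti", "rel", "ret"}
# At least one of these should identify the target article.
IDENTIFYING = ("pg", "doi", "pii")


def check_addart(start_tag):
    """Return a list of problems found in an <addart ...> start tag."""
    attrs = dict(re.findall(r'([\w-]+)\s*=\s*"([^"]*)"', start_tag))
    problems = []
    if attrs.get("type") not in ADDART_TYPES:
        problems.append("type must be one of err, iti, rel, ret")
    if not any(attrs.get(name) for name in IDENTIFYING):
        problems.append("one of pg, doi, or pii should be present")
    return problems


print(check_addart('<addart type="iti" pg="103">'))  # prints []
print(check_addart('<addart type="err" vol="12">'))
```

The second call reports the missing identifier, since vol alone does not identify a target article.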

The atl (article title) element consists of text and an optional sbt (subtitle) subelement.

The supptitle (supplement title) element can be used to describe the title of a special issue.



Description: Front matter elements

According to the Elsevier documentation, the prs (presented by) element contains a statement identifying the presenter of the article, the ded (dedication) contains the dedicatory text of the article, and the misc (miscellaneous history) element contains extra information such as the communicating editor. In practice, HPS places no formal restrictions on the usage of these elements, which are therefore available (via the formatting variables bearing their names) as containers for the various esoteric types of front matter that certain journals seem to love dishing out. For instance, some journal parsers have "borrowed" the misc element as a container for citation information within book reviews.

The remark element is more specific than misc and is used for any introductory text that may appear at the top of an article but which is clearly not part of the body of the article and which may require its own formatting for visual clarity and distinction.

The abs element contains the article abstract, if any. Its language attribute can be used to specify the main language of the abstract. If none is specified, the default is en (English). Other options are de (German), es (Spanish), fr (French), it (Italian), pt (Portuguese), or ru (Russian).

In most cases, while multiple abstracts per article body are allowed by the DTD, they are not currently supported by the parser. If a vendor discovers a case where multiple abstracts may be appropriate, they should raise this item with the content developer.

Note that certain journals conventionally end each abstract with a "slugline," which is a brief citation identifying the article. PubMed has asked that sluglines not be included within the abstracts we send them. Therefore, it has become the HighWire policy to request that the vendor eliminate sluglines from the SGML before sending the files.



Description: Author group

The aug (author group) element, which may occur more than once, consists of one or more "author blocks," followed by zero or more affiliation addresses. An "author block" consists of a collaboration or an author, followed by zero or more cross-references (typically to footnotes or affiliation elements), and an optional correspondence address.

The presentation of authors within the author group can be flexible. Following are a few examples; these are not meant to necessarily be followed exactly, but rather can help serve as a jumping off point for discussions with your content developer as to what style would be suited for the journal.

Here's an example of a single author group:

<au><fnm>Damien</fnm><snm>Sternberger</snm></au><cross-ref refid="AFF1" type="aff">1</cross-ref>
<au><fnm>Thierry</fnm><snm>Maisonobele</snm></au><cross-ref refid="AFF2" type="aff">2</cross-ref>
<au><fnm>Sophia</fnm><snm>Nicole</snm></au><cross-ref refid="AFF3" type="aff">3</cross-ref>
<au><fnm>Erika</fnm><snm>Langlay</snm></au><cross-ref refid="AFF1" type="aff">1</cross-ref>
<au><fnm>Nacira</fnm><snm>Tabtitto</snm></au><cross-ref refid="AFF3" type="aff">3</cross-ref>
<au><fnm>Bernard</fnm><snm>Hainquere</snm></au><cross-ref refid="AFF1" type="aff">1</cross-ref>
<au><fnm>Bertrand</fnm><snm>Fontaignere</snm></au><cross-ref refid="AFF3" type="aff">3</cross-ref>
<cor>Professor B. Fontaignere, INSERM U546, Faculté de Médecine Pitié-Salpêtrière, 105 Bd Hôpital, 75013 Paris, France E-mail: <inter-ref locator="" locator-type="email"></inter-ref></cor>
<aff id="AFF1"><no>1</no>Service de Biochimie BAP-HP, </aff>
<aff id="AFF2"><no>2</no>Laboratoire de Neuropathologie, Groupe Hospitalier Pitié-Salpêtrière, and </aff>
<aff id="AFF3"><no>3</no>Fédération de Neurologie and INSERM U546, Groupe Hospitalier and Faculté de Médecine Pitié-Salpêtrière</aff>

And here's how it might look once processed:

Damien Sternberger1, Thierry Maisonobele2, Sophia Nicole3, Erika Langlay1, Nacira Tabtitto3, Bernard Hainquere1 and Bertrand Fontaignere3

1 Service de Biochimie BAP-HP, 2 Laboratoire de Neuropathologie, Groupe Hospitalier Pitié-Salpêtrière, and 3 Fédération de Neurologie and INSERM U546, Groupe Hospitalier and Faculté de Médecine Pitié-Salpêtrière

Correspondence to: Professor B. Fontaignere, INSERM U546, Faculté de Médecine Pitié-Salpêtrière, 105 Bd Hôpital, 75013 Paris, France E-mail:

Example of using multiple author groups:

<aff>Addenbrooke's Hospital, Cambridge, UK</aff>
<aff>Institute of Neurology, London</aff>

And here's how it might look once processed:

E.A. Walburtonian

Addenbrooke's Hospital, Cambridge, UK

C.J. Moorehen, R.S.J. Frackiazowiak and K.J. Fristonpop

Institute of Neurology, London



Description: Author group subelements

The au (author) element may consist of the following elements, in this order: fnm (first name), snm (surname), jr (name suffix), and roles (roles or job titles). The degs (degrees) element may also appear any number of times in any position.

The collab (collaboration) element specifies a named group or cooperation. In addition to the obvious distinction between an individual author and a collaboration, note that individual authors are added to the citation file (and therefore used by the journal system to construct the table of contents and author index), while collaborations are not. This is desirable for collaborations that follow author names, such as those that begin "on behalf of...", but not so desirable if a collaboration is the sole author of an article, since in that case no author will be identified in the TOC or author index. In this rare case, the element corpauthor (corporate author) can be used, as in "<corpauthor>The Society for Better SGML</corpauthor>". This "author" will be inserted into the citation file, and into the PubMed abstract.

The cor (correspondence address) element specifies the correspondence address for the article, or otherwise identifies the corresponding author. Usually there is only one cor element in the entire front matter, though more are supported. The aff (affiliation) element consists of an optional number which contains the label of the affiliation, followed by the text of the affiliation address. Both cor and aff have an identifier attribute, id, which can be used for cross-referencing by means of cross-ref.

The oid (organizational id) element consists of a required ID. When an affiliation string consists of multiple affiliations, but the string cannot be separated easily into distinct <aff> tags, multiple oid tags can appear in a single aff element to provide ID targets for the author cross-refs. For example, suppose an article has two authors from different departments of the same university. Typically, the two affiliations are combined into something like "Departments of Biology and Chemistry, Stanford University". Using oid, this would be tagged:


<au>...</au><cross-ref refid="AFF1" type="aff">1</cross-ref>

<au>...</au><cross-ref refid="AFF2" type="aff">2</cross-ref>


<aff>Departments of <oid id="AFF1"><no>1</no>Biology and <oid id="AFF2"><no>2</no>Chemistry, Stanford University</aff>

In short, oid allows one to put multiple IDs within a single aff datum.



Description: Number

Each of the objects that can be the target of a cross-reference (cross-ref) can have a no (number) element. This is used for capturing the label—the number (and often a prefix that indicates the object type)—as assigned to the object by the author of the document. The type text, e.g. "Fig", is tagged together with the identifier, e.g. "5a-c". Some examples of the no element in use (from page 31 of the Elsevier documentation):

<fig id="fig4"><no>Fig. 4</no>...</fig>

<fig id="fig5"><no>Fig. 5a–c</no>...</fig>

<fig id="dia17"><no>Diagram Q</no>...</fig>

<fig id="pla11"><no>Plate XI</no>...</fig>

<fn id="fn2"><no>2</no>...</fn>

<tblfn id="tblfn2"><no>**</no>...</tblfn>

<fd id="fd3"><no>(2′)</no>...</fd>

<bib id="bib7"><no>7</no>...</bib>

There are many elements in the DTD which technically may contain no elements. At this time, the parser is guaranteed to support no correctly only within the elements to which it currently supports cross-references. These include aff, cor, kwd, bib, fd, fig, fn, tbl, and tblfn. The remaining five elements, for which no may generate unexpected behavior, are dl, enun, l, sec, and textbox.

The HighWire parser expects that any desired formatting (such as boldface) will be indicated explicitly within the element. There are a few exceptions to this rule:

  • The contents of the no element will be superscripted automatically within the aff, fn and tblfn elements. Therefore, explicit superscripting does not need to be added.

  • By default, the bibliography is represented as an ordered list online, so that the contents of no will become the value attribute of an HTML li element. Since most browsers will accept only a numeric value for this attribute, all non-numeric characters (including leading and trailing punctuation) should be removed from the element. This also means that "subreference" numbers such as "7a" will not display as intended.

  • If a bibliography should not be a numbered list, but rather a list ordered by author name, then a no element should not be included.

  • In general, it is preferable to specify the formatting/include ending punctuation in the SGML for no within a fig or tbl element. For example:

    If a figure has a label of "Figure 1.", we would like to see the following in the SGML:

    <fig id="F1"><no><b>Figure 1.</b></no><caption><p>...

    If this turns out to be difficult for some reason, there are options for this no to be formatted post-process, but please consult with your content developer.

  • A space will be added automatically between the contents of the no element and the text that follows.
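The bibliography rule in the list above (only numeric content in no within bib) can be illustrated with a short cleanup step. This is a minimal Python sketch of a vendor-side cleanup, not the parser's own behavior; numeric_label is a hypothetical name.

```python
import re


def numeric_label(label):
    """Drop every non-digit character from a bibliography label."""
    return re.sub(r"[^0-9]", "", label)


for raw in ("7.", "[12]", "7a"):
    print(repr(raw), "->", repr(numeric_label(raw)))
```

Note that "7a" collapses to "7" and would then collide with a plain reference 7, which is one way to see why "subreference" numbers do not display as intended in a numbered list.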


<RE>, <RV>, <ACC>, <EPUB>

Description: Received/revised/accepted dates and electronic publication date

re (received date), rv (revised date), acc (accepted date) and epub (electronic publication) are empty front-matter elements. Each includes three required numeric attributes—day, mo, and yr—which specify the corresponding dates. These are optional elements in the purest sense, in that these dates may also be specified as regular footnotes, acknowledgment text, etc. with no loss of functionality (though placement options clearly will be more limited in that case).

The exact wording, capitalization, and punctuation surrounding the re/rv/acc dates, which are specified during internal processing of the SGML, also vary considerably between journals. HighWire has already encountered a journal which uses "Received... Revised... Second revision on... Accepted..." In other words, multiple rv elements are present, formatted with different text. In this case the supplier would use one re element, two rv elements in the order as presented by the text, and one acc element.

Example of a standard received/revised/accepted line:

<re day="08" mo="11" yr="2000"><rv day="31" mo="01" yr="2001"><acc day="08" mo="02" yr="2001">

And here's how it might look once processed:

Received November 8, 2000. Revised January 31, 2001. Accepted February 8, 2001.

The electronic publication date is not often used, but can be put into the SGML to identify the electronic publication date for an article if one exists and should be listed. The format is the same as the above elements: <epub day="22" mo="10" yr="2002">
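To make the relationship between the received/revised/accepted example and its processed form concrete, here is a minimal Python sketch. It is not the HighWire parser: history_line is a hypothetical name, and the wording and punctuation are fixed to match the example output, whereas in practice they vary by journal.

```python
import re

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")
# Assumed wording, taken from the processed example; real journals differ.
LABELS = {"re": "Received", "rv": "Revised", "acc": "Accepted"}


def history_line(sgml):
    """Format re/rv/acc elements, in document order, as one line of text."""
    parts = []
    for name, attr_text in re.findall(r"<(re|rv|acc)\b([^>]*)>", sgml):
        attrs = dict(re.findall(r'(\w+)="([^"]*)"', attr_text))
        month = MONTHS[int(attrs["mo"]) - 1]
        # int() drops the leading zero from values such as "08".
        parts.append(f"{LABELS[name]} {month} {int(attrs['day'])}, {attrs['yr']}.")
    return " ".join(parts)


example = ('<re day="08" mo="11" yr="2000"><rv day="31" mo="01" yr="2001">'
           '<acc day="08" mo="02" yr="2001">')
print(history_line(example))
# Received November 8, 2000. Revised January 31, 2001. Accepted February 8, 2001.
```

Because the elements are read in document order, two rv elements would each produce their own "Revised" sentence in the order supplied.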



Description: Paragraph

The p (paragraph) element is the basic building block for running text within an article. While that aspect needs little additional discussion, it is useful to note that the parser acts at the paragraph level to introduce important side effects to the running text, one of which is:

  • The parser will attempt to create links into the GenBank accession number database. GenBank accession numbers consist of either one uppercase letter followed by five digits, or two uppercase letters followed by six digits. Since certain other identifiers (such as grant numbers) have the same format, the parser will link only identifiers that appear within the same sentence (i.e. with no intervening period) as the case-insensitive word "GenBank".
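The GenBank rule above can be sketched as follows. This is an illustrative Python approximation rather than the parser's actual code: genbank_candidates is a hypothetical name, and "same sentence" is taken literally as "no intervening period", so the text is simply split on periods.

```python
import re

# One uppercase letter + five digits, or two uppercase letters + six digits.
ACCESSION = re.compile(r"\b(?:[A-Z]\d{5}|[A-Z]{2}\d{6})\b")
GENBANK = re.compile(r"\bgenbank\b", re.IGNORECASE)


def genbank_candidates(paragraph):
    """Return accession-like identifiers that share a sentence with "GenBank"."""
    found = []
    for sentence in paragraph.split("."):
        if GENBANK.search(sentence):
            found.extend(ACCESSION.findall(sentence))
    return found


text = ("Sequences were deposited in GenBank under accession numbers U12345 "
        "and AF123456. This work was supported by grant R01234.")
print(genbank_candidates(text))  # ['U12345', 'AF123456']
```

The grant number R01234 has the same shape as an accession number but is skipped, because its sentence does not mention GenBank.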



Description: Keyword group and keyword

The kwdg (keyword group) element consists of one or more kwd (keyword) elements. kwdg has one optional attribute, class, which identifies the type of keyword. kwd has one optional attribute, id, which allows keywords to be used as a target of the cross-ref element for the purpose of creating a glossary. Two classes are recognized: kwd (uncontrolled keyword, the default) and abr (abbreviation).

For the abr class, each abbreviation and definition should appear as part of a single kwd element, separated by some sort of appropriate punctuation.

Example of kwd class tagging:

<kwdg class="kwd"><kwd>basal ganglia</kwd><kwd>conversion</kwd><kwd>hysteria</kwd><kwd>neuroimaging</kwd><kwd>thalamus</kwd></kwdg>

And here's how it might look once processed:

Keywords: basal ganglia; conversion; hysteria; neuroimaging; thalamus

Example of abr class tagging:

<kwdg class="abr"><kwd>AOI = area of interest</kwd><kwd>BA = Brodmann area</kwd><kwd>ECD = ethylenecysteinate dimer</kwd><kwd>rCBF = regional cerebral blood flow</kwd><kwd>ROI = region of interest</kwd><kwd>SPECT = single photon emission computerized tomography</kwd></kwdg>

And here's how it might look once processed:

Abbreviations: AOI = area of interest; BA = Brodmann area; ECD = ethylenecysteinate dimer; rCBF = regional cerebral blood flow; ROI = region of interest; SPECT = single photon emission computerized tomography
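The processed forms shown above follow a simple pattern: a heading chosen by class, then the kwd contents joined with semicolons. A minimal Python sketch of that pattern follows; render_kwdg is a hypothetical name, the headings are copied from the processed examples, and the real output is set per journal during internal processing.

```python
import re

# Headings taken from the processed examples; kwd is the default class.
HEADINGS = {"kwd": "Keywords", "abr": "Abbreviations"}


def render_kwdg(sgml):
    """Render one kwdg element as 'Heading: item; item; ...'."""
    match = re.search(r'<kwdg(?:\s+class="(\w+)")?\s*>', sgml)
    cls = match.group(1) or "kwd"
    # "<kwd" followed by "g" is the group tag, so it is not matched here.
    items = re.findall(r"<kwd(?:\s[^>]*)?>(.*?)</kwd>", sgml, re.DOTALL)
    return f"{HEADINGS[cls]}: " + "; ".join(items)


example = ('<kwdg class="kwd"><kwd>basal ganglia</kwd><kwd>conversion</kwd>'
           '<kwd>hysteria</kwd><kwd>neuroimaging</kwd><kwd>thalamus</kwd></kwdg>')
print(render_kwdg(example))
# Keywords: basal ganglia; conversion; hysteria; neuroimaging; thalamus
```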


Body Matter


<BDY>, <SEC>, <ST>

Description: Body matter container, section, section title

Within the bdy area you find the main article. Often this will be divided into multiple sections, and the sections will have section titles (such as "Methods", "Discussion", etc.). To create different levels of section headings, nest the sec/st elements within each other. The formatting for the different levels, such as bold or italics, can be set automatically through the internal processing at HighWire. The exception is "partial" formatting: if an st should use small caps (or any other formatting that applies to only part of the title), it must be delivered with the appropriate tagging in the SGML file.

For example, in order to get the output of "Materials and Methods" set in small caps with a full-size initial "M" on each main word:


You would need to include the scp tags within st:

<sec><st>M<scp>aterials and </scp>M<scp>ethods</scp></st>

Currently the parser programs will allow nesting of sec up to four levels deep; if the vendor discovers an article which contains further levels, they should alert the content developer.
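A vendor could flag articles that exceed the four-level limit before delivery. The following Python sketch is one possible way to do so; max_sec_depth is a hypothetical name, and the sketch assumes that every sec has an explicit end tag in the delivered file.

```python
import re

MAX_SEC_DEPTH = 4  # nesting limit stated in the text


def max_sec_depth(sgml):
    """Return the deepest level of sec nesting found in the text."""
    depth = deepest = 0
    # Each match yields "/" for an end tag and "" for a start tag.
    for closing in re.findall(r"<(/?)sec\b[^>]*>", sgml):
        if closing:
            depth -= 1
        else:
            depth += 1
            deepest = max(deepest, depth)
    return deepest


body = "<sec><st>Methods</st><sec><st>Subjects</st></sec></sec>"
print(max_sec_depth(body))  # 2
print(max_sec_depth(body) <= MAX_SEC_DEPTH)  # True
```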



Description: Miscellaneous text structures

The qd (displayed quotation) element contains an exact reproduction or paraphrase of a part of a document that is to be set off from the rest of the text. The enun (enunciation) element typically is used for theorems, proofs, definitions, and the like. The parser formats both elements via an HTML <BLOCKQUOTE>. As of this writing, parser support for enun is minimal: while its optional title is supported, its optional number is not. This may change in the future as HighWire receives new journals that make frequent use of enunciations. The enun element contains an optional id attribute.

The textbox element contains one or more sections or paragraphs of text that are to be displayed within a box set off from the main article. The default formatting after internal SGML processing is a bordered gray box, but this can be changed on a per-journal basis. Its optional number and caption subelements, and its optional id attribute, are not explicitly supported by the parser at this time. All textboxes currently are assumed to be "fixed", so there is no variable to control location.

The fn (footnote) element contains a note that documents the text, and corresponds to a reference, e.g. a number, in the text. Footnotes consist of an optional no element which contains the footnote mark, and one or more paragraphs. The id attribute allows the footnote to be the target of a cross-ref, which should also contain the footnote mark. The id attribute is optional, which may be useful in the case of certain front-matter text that may reasonably be represented as a footnote, but that is not linked to one specific point in the text. However, in most cases, id should be included, in which case the parser will automatically create a "backward" link from the footnote to its reference. If id is omitted, it generally makes sense to omit the no subelement as well.

Whenever one or more footnotes are present in an article, the parser will automatically create a footnote section. This footnote section can have a title specified within the parser, so it is not necessary for the vendor to supply a title.

The tblfn (table footnote) element uses exactly the same conventions as a regular footnote, except that it is linked to the table in whose scope it appears.



Description: Cross-reference

The cross-ref (cross-reference) element is a reference to an element within the same document. The required refid attribute specifies an identifier, which corresponds to an element elsewhere in the document. The HighWire implementation adds the type attribute, which specifies the type of the target element. Supported types include fig, tbl, table, bib, fd, aff, sec, box, cor, kwd, fn, and tblfn. The type attribute is required in the DTD.

While the parser generally uses the contents of the cross-ref element as the text of the HTML hyperlink (when one is needed), there are a few cases in which additional formatting may be added. First, in the case of type aff, fn, and tblfn, superscripting will be added automatically if it is not present in the SGML. Second, within the author group, commas will be added between multiple adjacent cross-ref elements of type aff and/or fn. Examples of these special cases, and an example of the variety possible in processing SGML per journal, are included below:

Example SGML: See <cross-ref refid="T4" type="tbl">Table 4</cross-ref> for ...
Example HTML: See Table 4 for ...
Notes: internal flag of $useArrows ignored

Example SGML: See Fig. 4<cross-ref refid="F4" type="fig"></cross-ref> for ...
Example HTML: See Fig. 4 for ...
Notes: internal flag of $useArrows=1

Example SGML: <p>Consider the following figure: <cross-ref refid="F4" type="fig"></cross-ref></p>
Example HTML: Consider the following figure:
Notes: internal flag of $useArrows=0. Note that no visible link appears, but Figure 4 is inserted immediately following the paragraph.

Example SGML: <cross-ref refid="R24" type="bib">Smith et al. (1996)</cross-ref> showed ...
Example HTML: Smith et al. (1996) showed ...
Notes: internal flag of $useRefArrows=0

Example SGML: Smith et al. <cross-ref refid="R24" type="bib"></cross-ref> showed ... or <cross-ref refid="R24" type="bib">Smith et al.</cross-ref> showed ...
Example HTML: Smith et al. showed ...
Notes: internal flag of $useRefArrows=1

Example SGML: <au><fnm>John</fnm><snm>Smith</snm></au><cross-ref refid="AFF1" type="aff">a</cross-ref><cross-ref refid="FN1" type="fn">1</cross-ref>
Example HTML: John Smitha,1 (the "a,1" is superscripted)
Notes: Note automatic addition of superscripting and punctuation

One-to-many mappings, i.e. from a single cross-ref element to two or more objects, are not permitted.

cross-ref has an important side effect in online presentation for elements of type fig and tbl. These elements are considered to be "floating," and the parser automatically places them wherever they appear in the SGML, or immediately following the first paragraph which contains a reference to them, whichever comes first.



Description: Link to graphical element

The link element usually specifies that a local external object, i.e. an image file, should be "inserted" at this point in the document. It is used for figure images, scanned tables, and inline figures such as nonstandard "dingbat" characters and scanned formulas. When it occurs within the element pdf, it is a reference to the PDF file that should be associated with the particular article. The element is declared empty, i.e. it has only a start tag and no end tag. It has one attribute, locator, which should match the name of a graphic or PDF file (minus the file extension). Every effort should be made to match the filename exactly, including uppercase and lowercase letters, in order to avoid problems when the SGML is processed.



Description: Inter-document reference

The inter-ref element is an inter-document reference, in other words a reference to an external object that is not under control of the publisher. (See also addart, which should be used for references to other articles within the same journal.) It has two attributes, both required in the DTD: locator and locator-type.

The locator-type attribute specifies the type of locator used. The currently supported locator-types are: url (uniform resource locator), email (electronic mail address), pirdb (Protein Information Resource), mmdb (NCBI's structure database, Molecular Modeling Database), ec (Enzyme Collection), sprot (Swiss Protein Database, ExPASy Molecular Biology Server), pdb (Protein Data Bank, formerly Brookhaven Data Bank), gen (NCBI's nucleotide database), genpept (NCBI's protein database), omim (NCBI's Online Mendelian Inheritance in Man), ncbigeo (NCBI's GEO, Gene Expression Omnibus), and emblalign (EMBL-ALIGN).

The locator attribute specifies the target document: a URL, email address, or identifier for a protein, enzyme, etc. The parser supports URLs via a WWW hyperlink, email addresses via a "mailto:" link, and the remaining types via a call to a HighWire script. (In the case of URLs, the prefix "http://" will be added unless it or another protocol type has already been specified.) If the inter-ref element is not empty, its contents will be used as the link text; otherwise, the locator attribute will be used.
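The link-building rules in the paragraph above can be sketched in Python as follows. This is only an approximation under stated assumptions: resolve_inter_ref is a hypothetical name, "another protocol type" is read as any prefix of the form scheme://, and the database types return no address because the HighWire script that handles them is not described here.

```python
import re


def resolve_inter_ref(locator, locator_type, contents=""):
    """Return (href, link_text); href is None for the database locator types."""
    # Non-empty element contents win; otherwise the locator is the link text.
    text = contents or locator
    if locator_type == "url":
        # Leave the locator alone if it already names a protocol.
        has_protocol = re.match(r"[A-Za-z][A-Za-z0-9+.-]*://", locator)
        href = locator if has_protocol else "http://" + locator
    elif locator_type == "email":
        href = "mailto:" + locator
    else:
        href = None  # handled by a HighWire script outside this sketch
    return href, text


print(resolve_inter_ref("www.example.org", "url"))
# ('http://www.example.org', 'www.example.org')
print(resolve_inter_ref("ftp://ftp.example.org/pub", "url", "FTP site"))
# ('ftp://ftp.example.org/pub', 'FTP site')
```

The example.org addresses are placeholders chosen for illustration.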

The external database link formats are provided mainly for the use of vendors who have the ability to create these links automatically. More information on external database linking at HighWire can be found here.



Description: CME (Continuing Medical Education) quiz information reference. Used only for those journals with CME and associated quizzes.

The exref tags surround the area of text that will be highlighted in the CME quiz (the correct answer). This piece of text is intended to be short and to the point - to show where the correct answer is found in context, not to show multiple paragraphs of information. If more than a few sentences are highlighted, the user is not helped any more than if they just went back to the article and scrolled through.

The element has an attribute that matches a link in the quiz (usually a question number). Example:

<exref qq="6">Text to be highlighted</exref>

In use, that means the link the user sees appearing near question #6 in the quiz will open a new window, find this section of the article text, and display it as highlighted.

Rules for using exref in the article:

  • Do not nest exref tags inside of each other. Nested tags will not parse correctly as hyperlinks in the final HTML file that the user sees in the browser, because HTML does not allow overlapping hyperlinks.

  • Do not overlap other tags that become hyperlinks with the exref (such as figure or table cross-references, or any other tags that will become hyperlinks).

Exception: reference links seem to usually work when found inside the exref tags, but it's not a good idea. Different browsers could treat this differently, so avoid it if at all possible.

Example of overlapping hyperlinks (incorrect):

<exref qq="6">but calcium has been found to protect against adenomas (see <cross-ref type="tbl" refid="T1">Table 1</cross-ref>.)</exref>

Corrected tagging:

<exref qq="6">but calcium has been found to protect against adenomas</exref>

  • The exref tag cannot be inside a table. You can, however, embed the exref tags inside the caption to a figure or table.

  • Do not designate more than one exref in the article per link in the quiz (e.g., qq="6" can only appear once).

By the same token, each question should have only one link element in the quiz. Quiz tagging is handled separately; consult the journal manager for details.
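Two of the rules above (no nested exref tags, and no repeated qq value) lend themselves to a quick check before delivery. The following is a minimal sketch, assuming the article is available as a plain string and that simple pattern matching is good enough for a first pass. The function name check_exrefs is hypothetical, and the sketch does not check the table or cross-reference rules.

```python
import re
from collections import Counter

_EXREF_TAG = re.compile(r"<(/?)exref\b([^>]*)>", re.IGNORECASE)
_QQ = re.compile(r"""qq\s*=\s*["']?([^"'\s>]+)""", re.IGNORECASE)


def check_exrefs(sgml):
    """Return a list of problems: nested exref tags and repeated qq values."""
    problems = []
    depth = 0          # number of exref elements currently open
    seen = Counter()   # how often each qq value occurs
    for match in _EXREF_TAG.finditer(sgml):
        closing, attrs = match.group(1), match.group(2)
        if closing:
            depth = max(depth - 1, 0)
            continue
        if depth > 0:
            problems.append("nested exref")
        depth += 1
        qq = _QQ.search(attrs)
        if qq:
            seen[qq.group(1)] += 1
    for value, count in seen.items():
        if count > 1:
            problems.append("qq=%s used %d times" % (value, count))
    return problems
```

An empty list means neither problem was found; it does not mean the tagging is otherwise valid against the DTD.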


<L>, <LI>, <DL>, <DT>, <DD>

Description: List and definition list elements

The l (list) element consists of an optional number, an optional heading, and one or more list items. In practice, HighWire has not yet seen the number or heading in use, and the parser may handle them unpredictably.

The l element contains a required type attribute that specifies the type of the list, and an optional id attribute. Each of the standard HTML list types has a counterpart in the HighWire DTD. A sample item from each type of list follows:

  • unord (unordered, un-numbered, item label determined by nesting level)
    tab (no item label, only tabbing)
  • disc
  ▪ square
  ○ circle
  1. ord (ordered, numbered)
  a. letter (lowercase alpha)
  A. letterupper (uppercase alpha)
  i. roman (lowercase roman)
  I. romanupper (uppercase roman)

The li (list item) element consists of one or more paragraphs. Since paragraphs may in turn contain lists, nested lists are allowed.

The dl (definition list) element is a variation of the regular list, one which should be familiar to HTML users. It may begin with an optional number and/or heading, though again, the parser does not yet include explicit support for this formatting. The body of a definition list consists of one or more list items. Each list item in turn consists of a dt (definition term) element, which contains text, and an optional dd (definition description) element, which contains one or more paragraphs. The id attribute can be used for the dl element but not with the dt element.

Example of a tagged list in SGML:

   <l type=unord>
   <li><p>This is item one on the list.</p></li>
   <li><p>This is item two on the list.</p></li>
   <li><p>This is item three on the list.</p></li>
   </l>

And this is how it would look once processed:

  • This is item one on the list.
  • This is item two on the list.
  • This is item three on the list.



Description: Table and subelements

This is part one of the documentation for tbl, which describes the structure of SGML tables in the HighWire DTD. For additional information about table presentation, see part two (below).

A tbl (table) element consists of an optional number element (containing the name of the table), an optional caption (containing one or more paragraphs of text), and one or more table bodies or links to external entities (i.e. scanned table images). Table footnotes can occur anywhere within the table.

The table has an identifier, which is given by the attribute id, and which can be referenced with cross-ref. In the DTD, the identifier is required for all tables, whether or not a corresponding cross-ref is present. The optional attribute loc is supported; see part two of the tbl documentation for an important note about using it to specify "displayed" tables.

The body of a table can be regarded as a rectangular object, consisting of cells arranged in rows and columns. In the DTD it is described as consisting of rows, where each row consists of cells. The tblbdy (table body) element consists of one or more r (row) elements, and has two optional attributes that determine the column stubs. (These are the rows which would be repeated if the table were split along a horizontal line in print, i.e. the "header" and "footer" rows.) top-stubs is the number of rows, counted from the top of the table, that constitute the top column stubs. bottom-stubs is the number of rows, counted from the bottom of the table, that constitute the bottom column stubs. The parser currently ignores bottom-stubs, but it adds a horizontal rule spanning the entire width of the table body under the last top column stub as indicated by top-stubs.

Each r element consists of one or more c (cell) elements. All rows must be of equal length, though this requirement can be interpreted in two different ways once spanning cells are taken into account (see the note about $killCells in part two of the tbl documentation). In principle, every cell can have the same content as a paragraph of text. Syntactically, a table cell consists of two optional border specifications, top-border and bottom-border, followed by the actual cell content. top-border and bottom-border are empty elements which, if present, tell the parser to add a horizontal rule at the top and/or bottom of the cell.

The c element has four optional attributes. cspan is the number of spanned columns, while rspan is the number of spanned rows. ca is the column alignment, which can be l (left, the default), c (center), or r (right). j (justified) and vmk (vertical markers present) cannot be represented in HTML and have been removed from the DTD. d (decimal) also cannot be represented in HTML, but is preserved in the DTD for possible future use. For now, specifying decimal alignment will result in centered alignment. ra is the row alignment, which can be t (top, the default), m (middle), or b (bottom). vj (justified) cannot be represented in HTML and has been removed from the DTD.
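The attribute rules for c can be summarized as a small lookup. The sketch below is illustrative only: the function name cell_html_attrs is hypothetical, and the HTML attribute names (align, valign, colspan, rowspan) are an assumption about the output rather than a description of the actual parser. The one behavior taken directly from the text is that d (decimal) currently falls back to centered alignment.

```python
# Assumed HTML attribute values for the ca and ra attributes of c.
_CA_TO_ALIGN = {"l": "left", "c": "center", "r": "right",
                "d": "center"}  # decimal is rendered as centered for now
_RA_TO_VALIGN = {"t": "top", "m": "middle", "b": "bottom"}


def cell_html_attrs(ca="l", ra="t", cspan=1, rspan=1):
    """Return HTML-style attributes for a c element (hypothetical helper)."""
    attrs = {"align": _CA_TO_ALIGN[ca], "valign": _RA_TO_VALIGN[ra]}
    # Spans of 1 are the default and need no attribute.
    if cspan > 1:
        attrs["colspan"] = cspan
    if rspan > 1:
        attrs["rowspan"] = rspan
    return attrs
```

Values that have been removed from the DTD (j, vmk, vj) are deliberately absent from the lookup tables, so passing them raises a KeyError.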


<TBL> (part two)

Description: Table

This is part two of the documentation for tbl, which describes the presentation of HighWire tables by the HighWire parsers. For additional information about SGML table structure, see part one.

The tbl element's optional loc attribute specifies the table type. With the default value of float, which specifies a "floating" table, position of the table callout is determined by the position of the tbl element, or by the placement of the first corresponding cross-ref element, whichever comes first. The display value may also be used to insert the table at exactly the point in the document where it occurs, but note that this will result in the entire table body (rather than a callout with a link to an expansion file) being inserted into the fulltext. Typically, the display type should be used only for acronym tables, and other small tables whose information is very closely linked to the fulltext.

There are some environment variables related to tables that are set during processing of the SGML at HighWire, which your content developer may adjust depending on the table format you choose. The information is contained here so you can understand the adjustments possible for table presentation.

The most important environment variable related to table presentation is $killCells, which is a flag set during the processing of the SGML. In some DTDs, it is assumed that cells which are "overlapped" by spanning cells will not be included explicitly as empty cells within the SGML. Other DTDs make the opposite assumption, so that the total number of cells in each row remains constant even when spanning cells are present. The HighWire parser supports both. With the default value of $killCells=1, the parser assumes that overlapped cells are included, and "kills" them. If $killCells=0, the parser assumes that overlapped cells are not included, and translates the table structure directly into HTML (which makes the same assumption).
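To make the $killCells=1 case concrete, here is a minimal sketch of how overlapped cells might be removed. It assumes each row is supplied with one entry per grid column, so a cell's position in its row is also its column; the tuple layout (text, cspan, rspan) and the function name kill_overlapped_cells are hypothetical and are not taken from the HighWire parser.

```python
def kill_overlapped_cells(rows):
    """Drop cells that sit under an earlier cell's cspan/rspan.

    rows is a list of rows; each row is a list of (text, cspan, rspan)
    tuples with one entry per grid column, as assumed for $killCells=1.
    """
    covered = set()  # (row, col) positions hidden by a spanning cell
    result = []
    for r, row in enumerate(rows):
        kept = []
        for c, (text, cspan, rspan) in enumerate(row):
            if (r, c) in covered:
                continue  # placeholder cell under a span: "kill" it
            kept.append((text, cspan, rspan))
            # Mark every position this cell spans, except its own.
            for dr in range(rspan):
                for dc in range(cspan):
                    if (dr, dc) != (0, 0):
                        covered.add((r + dr, c + dc))
        result.append(kept)
    return result
```

The output has the shape that the $killCells=0 case expects as input: only the cells that are actually displayed remain in each row.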

Another important table-related environment variable is $addSpanRules. While some source SGML explicitly specifies border information for table cells (either via attributes or via elements), other source SGML does not, and it is left to the processing application to create attractive (or at least legible) tables despite the missing information. By default, the parser addresses this issue by adding a horizontal rule beneath the header rows based on the value of top-stubs (see part one of the tbl documentation), and also beneath all cells that span columns. This works surprisingly well in most cases, but if the parser developer determines that it is resulting in too many extraneous rules, the spanning-cell rules may be "turned off".



Description: Figure, inline figure, and caption

The fig (figure) element consists of an optional number (containing the name of the figure, e.g. "Fig. 1", "Plate IV", "Diagram A"), an optional caption, and one or more links to external images (figure bodies). The use of the fig element normally results in the creation of a figure expansion file, coupled with a figure callout in the online presentation. The position of the callout online is determined by the position of the fig element, or by the placement of the first corresponding cross-ref element, whichever comes first.

The fig element can also contain the attribute loc, which is used when image-expansion files are not desired. There are four settings for loc. The setting loc="sidebar" is defined to allow text-wrapping around the figure whereas loc="wide" specifies that text will not wrap. The third setting loc="display" also allows text to wrap around the figure, but it is defined for use with a small image, such as a headshot. The position of the aforementioned non-expanding figures is determined by the placement of the fig element. See examples below.

Example SGML: <fig loc="display"><caption><p>Chris Smith, M.D.</p></caption><link locator="Smith"></fig>
Notes: Example of a headshot.

Example SGML: <fig id="s1" loc="sidebar"><no><b>Scheme 1.</b></no><link locator="scheme1"></fig>
Notes: Note from the two examples that id, no, and caption are optional.

The fourth setting loc="thumb" is to be used for non-expanding figures that retain the same display and layout as the expanding figure types.

Please note two things with the HighWire implementation of fig: First, nested fig elements are not supported. (Multi-image figures should simply use multiple link elements.) Second, the id attribute is required, whether or not a corresponding cross-ref is present. One exception to this rule exists: display figures need not include an id attribute. Nevertheless, this attribute is important enough to standard figures that the HighWire DTD lists it as required. The few vendors whose journal styles require display figures may modify their local copies of the DTD to relax this restriction as needed.

The inline-fig element specifies that an inline figure should be inserted at precisely the point in the document where it occurs, with no expansion link. It contains a single link element. It is commonly used for displayed formulas which cannot be represented in SGML and must be sent as scans, or for "dingbat" characters which cannot be represented as standard entities. Conceptually, inline and display figures may be distinguished by the fact that inline figures must appear at a precise point within the fulltext for the content to be comprehensible, while it is sufficient for display figures to appear in the vicinity of related text which will be wrapped around them.

The caption element simply consists of one or more paragraphs describing the parent element. It is also used within tables and textboxes.



Description: Displayed formula

Related math elements: F, FEN, LIM/NU/DE...

The fd (displayed formula) element encloses a formula which should be set off from the running text. A formula may contain the same elements as a paragraph, but in most cases it will contain a high percentage of mathematical constructions. The parser translates the entire contents of every fd element from SGML through a variety of programs in order to obtain a GIF image for the formula. If a link element is found within the formula (i.e. nested within an inline-fig element), the parser assumes that the entire formula has been provided as a scan; it includes the scan in the HTML and ignores any other text within the fd element.

Partly for historical reasons, the fd element exhibits an exception to the rule that no item labels are generated automatically. If the optional id attribute is set, the parser will automatically label the formula with the text obtained by removing the leading "FD" or "E" identifier from the attribute, so that a formula with id "fd1" will be labeled "(1)". This behavior may be overridden by specifying an explicit no (number) element within the formula, which may also be done to label formulas with no id attribute. The content developer is also able to adjust display so that no labels at all will be generated online, even though you have provided id or no elements. This is useful if the printer supplies displayed formulas as scans and includes the formula label within the scan.
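The labeling rule can be sketched as follows. This is an illustration under stated assumptions, not the parser's code: the function name formula_label is hypothetical, the prefix match is treated as case-insensitive because the example above uses the lowercase id "fd1", and the text of an explicit no element is returned unchanged because the document does not say whether the parser decorates it.

```python
import re


def formula_label(fd_id=None, no_text=None):
    """Return the label for a displayed formula, or None if there is none."""
    if no_text:
        # An explicit no element overrides any label derived from the id.
        return no_text
    if not fd_id:
        return None
    # Strip a leading "FD" or "E" identifier (assumed case-insensitive).
    number = re.sub(r"^(fd|e)", "", fd_id, flags=re.IGNORECASE)
    return "(%s)" % number
```

The sketch does not model the content developer's option to suppress labels entirely.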

There is a fixed limit to the width of displayed formula images that may be rendered by our process. If this limit is exceeded, resulting in truncated formulas, it usually means that multi-line formulas are being provided within a single fd element (or equivalent). In this case, the formulas should be split at the line breaks and provided in multiple fd elements.



Description: Inline formula

Related math elements: FD, FEN, LIM/NU/DE...

The f (inline formula) element encloses a formula which appears within the running text, rather than being set off in the manner of a displayed formula. It may contain the same elements as a paragraph, but in most cases it will contain a high percentage of mathematical constructions. Under normal circumstances, the parser translates the entire contents of every f element from SGML through a variety of programs in order to obtain a GIF image for the formula.

f is perhaps the most subtle element in the HighWire DTD, in that it is not always obvious when it should be used. For instance, should every textual instance of the mathematical variable "n" be tagged as a formula? The last sentence of the above paragraph suggests that the answer should be "no," since this could result in hundreds of GIFs being created per article, the vast majority of which could be represented with simple HTML. But where should the line be drawn?

A simple example may help. Consider the following equation:

x = <lim><op>&sum;</op><ll>n=1</ll><ul>5</ul></lim> n

If f is not used in this example, the parser will output the "x =" as plain text. Then it will encounter the limit construction, realize that a GIF is necessary to represent it, and jump automatically into GIF-processing mode. When the matching </lim> tag is found, it will output the GIF and re-enter plain text mode before outputting the final "n". Thus the formula will be displayed as

x = [GIF of the limit construction] n

which is not exactly aesthetically pleasing. On the other hand, if the entire formula is tagged using f, the parser will enter GIF-processing mode earlier and leave it later, so that the entire formula will be displayed as a single GIF.

In other words, f ideally should be used for any formula which contains any elements which cannot be represented as plain HTML.

On the other hand, it should not be used too often, to avoid the aforementioned "hundreds of GIFs per article".


<A>, <AC>

Description: Accent construction

The a (accent construction) element consists of two sub-elements ac. The first sub-element is the accented character (one character only), and the second sub-element is the accent (one accent or mark only), which most often is an entity reference for a floating accent, e.g. &circ; for the circumflex accent.

If there is an ISO standard entity that can be used instead of an accent construction (for example, "&eacute;"), please use that entity.

Nested accent constructions are not allowed.


<FEN>, <CP>

Description: Fence construction

Related math elements: F, FD, LIM/NU/DE...

Characters such as parentheses (), square brackets [], or curly braces {}, that are used to set off parts of a formula, are collectively called fences. The fen construct is provided in order to enable automatic adjustment of their height to match the dimensions of the material between the fences. The delimiting symbol is specified by the cp (fence post) element, which can appear at the beginning or end of the fence construction, or any number of times inside the fence construction. There should be at least one cp element in each fence construction.

The fen element does not generate any output itself, but only delimits a scope. All delimiters that occur within this scope should be tagged as cp. The height of a delimiter is then determined by the maximum height of the contents of the enclosing fen element.

In practice, the only way for HPS to support this automatic height adjustment in all cases would be to enter GIF-processing mode and generate a GIF whenever it encountered a fen element. Due to the desire to use plain HTML to represent mathematics whenever possible, this is not actually done; in fact, the parser ignores the fen element entirely. However, when the parser is already in GIF-processing mode (because it is within an f, fd, or mathematical element that requires it), proper use of the cp element will allow the GIF-processing program to adjust the height of the delimiters correctly.

For example:

<f><fen><cp type=lpar> ISE=<fr shape="built"> <nu>∫<inf>D</inf> (y(<rm><b>x</b></rm>) &minus;<a><ac>y</ac><ac>&circ;</ac></a> (<rm><b>x</b></rm>))<sup>2</sup>d<rm><b>x</b></rm></nu> <de>var<inf>D</inf>y(<rm><b>x</b></rm>)</de></fr> <cp type=rpar></fen></f>

Would look like:

The cp element is empty, with two attributes, type and style. The required type attribute identifies the delimiting symbol and can have the values listed in the table below (from page 35 of the Elsevier documentation):

The optional style attribute can have the values s (single), d (double), and t (triple).


<LIM>, <OP>, <LL>, <UL>, <FR>, <NU>, <DE>, <RAD>, <RCD>, <RDX>, <AR>

Description: Mathematical elements

Related math elements: F, FEN, FD...

The elements described on this page are used to generate limit constructions, fractions, radicals (roots), and arrays. All of them result in an automatic jump into GIF-processing mode, i.e. they all generate a GIF.

The lim (limit) element consists of a mandatory op (operator) element, an optional ll (lower limit) element, and an optional ul (upper limit) element.

Here's an example of lim tagging:


And here's what it would look like after processing:

The fr (fraction) element consists of a nu (numerator) element and a de (denominator) element, both mandatory. (The fraction bar itself is not tagged; it is implicit.)

The optional shape attribute can have the values built (built-up, the default) or sol (solidus).

Here's a fun example of fraction tagging, within an equation:

<fd id="eq1.1"><no>1.1</no>p(<rm><b>t</b></rm><inf>N</inf>|<rm><b>X</b></rm><inf>N</inf>,<b>&thetas;</b>)=<fr shape="built"><nu><rm>exp</rm>(&minus;<fr shape="built"><nu>1</nu><de>2</de></fr><rm><b>t</b></rm><inf>N</inf><sup>T</sup> <rm><b>C</b></rm><inf>N</inf><sup>&minus;1</sup><rm><b>t</b></rm><inf>N</inf>)</nu><de>(2&pi;)<sup><fr shape="built"><nu>N</nu><de>2</de></fr></sup>&verbar;<rm><b>C</b></rm><inf>N</inf>&verbar;<sup><fr shape="built"><nu>1</nu><de>2</de></fr></sup></de></fr>,</fd>

What this will end up looking like:


The rad (radical) element consists of a mandatory rcd (radicand) element and an optional rdx (index) element.

The ar (array) element represents a rectangular scheme, consisting of one or more r (row) elements. As with tables, each row consists of one or more c (cell) elements, but the parser does not support the span or alignment attributes or border subelements for arrays as it does for tables. Arrays may be used in combination with fences to create matrices, and may even be used to represent certain chemical diagrams.

Arrays are also useful for generating "stack" constructions involving overscripts, underscripts, superscripts and subscripts that must be placed vertically with respect to each other. While limit and accent constructions can generate some of these, HighWire has observed that they are insufficient in some cases. Since the vertical spacing of traditional arrays is too wide to be usable for stack constructions, the HighWire DTD adds the shape attribute, which can have the values standard (the default), close, or mixed. close arrays look much like standard arrays, but with tighter spacing. mixed arrays use tight spacing as well, but they also use a smaller font size for the top character of two, or the top and bottom characters of three. Both close and mixed arrays must consist of either two or three rows, with exactly one column per row. The table below contains several examples of each array shape.
[Example images not reproduced: one 2-row and one 3-row array for each array shape.]
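The size restriction on close and mixed arrays is easy to check mechanically. The sketch below assumes an array has already been read into a list of rows, each a list of cell strings; the function name array_shape_problems is hypothetical.

```python
def array_shape_problems(shape, rows):
    """Check the row/column limits for close and mixed arrays.

    rows is a list of rows, each a list of cell strings.
    """
    if shape not in ("standard", "close", "mixed"):
        return ["unknown shape: " + shape]
    problems = []
    if shape in ("close", "mixed"):
        # Stack constructions: two or three rows, one column each.
        if len(rows) not in (2, 3):
            problems.append("needs two or three rows")
        if any(len(row) != 1 for row in rows):
            problems.append("needs exactly one column per row")
    return problems
```

Standard arrays are not restricted in this way, so the function reports nothing for them.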


<SUP>, <INF>, <B>, <IT>, <OF>, <SC>, <GE>, <SSF>, <TY>, <SCP>, <RM>, <UNL>, <OVL>

Description: Character formatting elements

The basic use of character formatting elements should be familiar from HTML, which uses constructs that are quite similar to those in the HighWire DTD. (The table at the end of this section contains some examples for the uninitiated.) The DTD and parser support these character formatting elements: sup (superscript), inf (inferior or subscript), b (boldface), it (italic), ty (typewriter), scp (small caps), rm (roman), ssf (sans-serif, not currently supported), unl (underline), and ovl (overline). The of (openface), sc (script), and ge (german) elements remain in the DTD but are not yet supported by the parser.

Font changes can be used anywhere in the document, with one exception: rm, which can only be used in formulae. In normal text, all letters, latin and greek, have the default shape "upright" (roman). In most journals, all letters inside formulae, latin and greek, have the default shape "slanted" (italic). Therefore, the font change rm is used in formulae only, to generate letters or words in roman font. It is useless in normal running text, since running text is printed in a roman font by default.

Due to the difficulty of representing small caps within HTML, HighWire places some severe restrictions on the content allowed within the scp element. Specifically, SGML elements (including nested font constructs), special-character entities, and uppercase letters must not occur. If a string of text is to contain a combination of "standard" and "small" caps, the "standard" caps must appear outside the scp element, not as uppercase characters within the element. For example, to obtain the string "MATLAB", one currently must write M<scp>atlab</scp>, not <scp>Matlab</scp>.
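The scp restrictions can be checked with a few pattern tests. This is a rough sketch, assuming the element's content is available as a string; the function name scp_problems and the wording of the messages are hypothetical, and the entity pattern is a simplification of SGML entity-reference syntax.

```python
import re

_TAG = r"<[^>]+>"
_ENTITY = r"&[A-Za-z][A-Za-z0-9.]*;"


def scp_problems(content):
    """List the restrictions violated by the content of an scp element."""
    problems = []
    if re.search(_TAG, content):
        problems.append("contains SGML elements")
    if re.search(_ENTITY, content):
        problems.append("contains entity references")
    # Ignore tags and entity names when looking for uppercase letters.
    text = re.sub(_TAG + "|" + _ENTITY, "", content)
    if any(ch.isupper() for ch in text):
        problems.append("contains uppercase letters")
    return problems
```

Applied to the MATLAB example, scp_problems("atlab") reports nothing, while scp_problems("Matlab") reports the uppercase letter.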


Back Matter



Description: Back matter container

Within the back matter you'll find the acknowledgments section (ack) and the bibliography (bibl); both are described below.




The acknowledgments section contains the information pertaining to who sponsored the study, who the authors are thanking, etc. If this section has a specific, consistent heading in the print (such as "Acknowledgements"), it's not necessary to include this heading in the SGML. The parser program will recognize the appearance of an ack and is able to assign a heading automatically. If, however, there are inconsistent headings for this section, then your content developer will likely ask you to insert the text of the heading within ack.



Description: Bibliography and subelements

The top-level bibl element contains an optional section title, followed by one or more bib (bibliographic reference) elements. Each bib element in turn contains an optional no (number) element, followed by an other-ref element containing the citation text. bib has an id attribute, which allows it to be used as the target of a cross-reference.

The other-ref element citation text can contain the following: title (the journal title, not the article title), month (publication month), day (publication day), date (publication year), volume-nr (volume number), first-page (first page of the citation), doinum (DOI number), and pmid (PubMed ID). Four items--title, date, volume-nr, and first-page--are necessary for Medline matching, and they should be tagged for all citations to journal articles. doinum and pmid are to be used as available in the content. month, day, and issue will improve the match rate and should be tagged as well when present.
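A vendor could check journal citations for the four tags needed for Medline matching before delivery. The following is a minimal sketch, assuming the content of each other-ref element is available as a string and that a simple start-tag search is sufficient; the function name missing_match_tags is hypothetical.

```python
import re

# The four items the text names as necessary for Medline matching.
REQUIRED_FOR_MATCHING = ("title", "date", "volume-nr", "first-page")


def missing_match_tags(other_ref):
    """Return the required tags that do not occur in an other-ref string."""
    missing = []
    for name in REQUIRED_FOR_MATCHING:
        # Look for a start tag with exactly this name.
        pattern = r"<%s(?=[\s>])" % re.escape(name)
        if not re.search(pattern, other_ref, re.IGNORECASE):
            missing.append(name)
    return missing
```

The check only makes sense for citations to journal articles; a citation to a book, for instance, would be expected to fail it.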

Another optional item for a citation is to tag the first author, to further differentiate a citation where multiple matches may be possible. This is done by using firstauthor, with appropriate first name and last name elements included. For example, <firstauthor><snm>Doe</snm> <fnm>J</fnm></firstauthor>.

By default, the bibliography is represented as an ordered list. (See no for an important note about the contents of that element in this context.) If no is not included (i.e. citations are unnumbered), this representation is not appropriate.

The further-reading element is similar to the bibl element. Use further-reading for secondary references: items like Suggested Reading, Additional References, Further Reading, etc. Like bibl, further-reading contains an optional section title followed by one or more bib elements.

Example of bibl tagging:

<bibl><bib id="R1"><no>1</no><other-ref>Vial, T. and Descotes, J. <date>1995</date>. Clinical cytotoxicity of cytokines used as haemopoietic growth factors. <it><title>Drug Saf.</title></it> <volume-nr>13</volume-nr>:<first-page>371</first-page>.</other-ref></bib><bib id="R2"><no>2</no><other-ref>Nohria, A. and Rubin, R. H. <date>1994</date>. Cytokines as potential vaccine adjuvants. <it><title>Biotherapy</title></it> <volume-nr>7</volume-nr>:<first-page>261</first-page>.</other-ref></bib><bib id="R3"><no>3</no><other-ref>Ramshaw, I. A., Ramsay, A. J., Karupiah, G., Rolph, M. S., Mahalingam, S. and Ruby, J. C. <date>1997</date>. Cytokines and immunity to viral infections. <it><title>Immunol. Rev.</title></it> <volume-nr>159</volume-nr>:<first-page>119</first-page>.</other-ref></bib></bibl>

And here's what it might look like once processed:


1. Vial, T. and Descotes, J. 1995. Clinical cytotoxicity of cytokines used as haemopoietic growth factors. Drug Saf. 13:371.
2. Nohria, A. and Rubin, R. H. 1994. Cytokines as potential vaccine adjuvants. Biotherapy 7:261.
3. Ramshaw, I. A., Ramsay, A. J., Karupiah, G., Rolph, M. S., Mahalingam, S. and Ruby, J. C. 1997. Cytokines and immunity to viral infections. Immunol. Rev. 159:119.





Description: Response section

The response section comes at the end of an article, and is often used when there is a "Letter to the Editor" and a "Response" within one article. The Letter to the Editor is tagged as usual, then the Response is identified with the response element. response can contain the same types of elements you would find in a regular article, such as the front matter/body matter/back matter.

The response element can also be used in cases where there are multiple letters in one file, multiple book reviews that fall under one theme title, or even multiple editorials/articles that fall under one title/theme, and are authored by different people.

If there is not a specific content title for the response, but rather a heading like "Author's Reply", this can be used as the title (atl) within the response.