Chapter 8. Markup Language Support

It's true that many of the people who use Emacs are developers, writing code, tweaking it, recompiling it, and just generally enjoying the services of an amazingly extensible work environment. A variety of people, including developers, need to produce text for publication, whether internally, online, or in book format. This chapter describes the markup language support that Emacs offers, a topic relevant to both information publishers and developers, as more and more development work uses variants of the Extensible Markup Language, XML.

Choosing a format for producing documents isn't all that straightforward these days, especially if you eschew Microsoft Word. Some people write HTML, and Emacs offers a few options for this. HTML gives you some control over formatting but displays differently on various browsers. Of course, it is important as the lingua franca of the Web.

Other text publishing options include the TeX family. TeX (pronounced "tek") is a formatter that was developed by Donald Knuth for generating books. LaTeX (pronounced "lay-tek") is a set of TeX commands created by Leslie Lamport. With TeX and LaTeX, you can produce very precisely formatted text with equations, interesting fonts, graphics, headers and footers, and the like. Whether using filters or features of the program itself, you can publish TeX documents in a variety of formats.

Another option for publishing text, as well as programming, is XML. XML, when combined with a Document Type Definition (DTD) or schema, enables you to write text once and publish it in a variety of formats. Extensible Stylesheet Language (XSL) is also important in this regard. Because the standards are still being defined, organizations involved in document production may choose an established XML dialect, such as DocBook, as their publication format. XML at this point provides less precise control over format, but maximizes flexibility.

XML bridges the programming and publishing worlds, and what you do with XML will in part determine what tools you use and what support you need. We discuss a few options for writing XML in Emacs, including psgml mode and James Clark's nxml mode, which uses RELAX NG schemas rather than DTDs for validation.

Some word processors and other tools integrate formatting and editing. These tools are often called WYSIWYG (what you see is what you get) tools. What's the advantage of using Emacs versus a WYSIWYG tool? Well, whether you're writing LaTeX, XML, or HTML, you can be crystal clear about what's in the file and how it's structured if you use Emacs. Save a Microsoft Word file as HTML and then open the resulting file in Emacs. Word bloats the file with additional tags and formatting that is not strictly required. In terms of output, the streamlined and straightforward code you picture in your mind's eye when viewing a page is definitely not what you get, an ironic consequence of using a WYSIWYG tool like Word to create markup files. Chances are, if you've read this far, you're planning to use Emacs anyway, so we won't belabor the point.

In this chapter, we talk about these markup modes:

• For writing HTML, Emacs HTML mode (a subset of SGML mode) and the add-on HTML helper mode are discussed.

• For writing XML, Emacs SGML mode and the add-on modes psgml mode and nxml mode are described in brief.

• For writing LaTeX documents, Emacs LaTeX mode is discussed.

These major modes help you insert formatting commands, or markup, into your text. While the amount of help that Emacs offers varies, using the mode designed for your text formatter will streamline your work.

At this point we must insert a caveat. We provide a barebones introduction to the markup modes described in this chapter. What we say here will get you started, but not much more than that. Entire books could be and have been written about using each of the markup tools described here. Now that that's out of the way, let's talk about a few features that are important in all the modes: comment handling and font-lock mode.

8.1 Comments

All the modes described in this chapter share a feature with the programming language modes such as Java mode and Lisp mode, which we discuss in Chapter 9. All these modes understand comments and use a single command, M-; (for indent-for-comment) to insert the appropriate comment syntax. Table 8-1 lists the comment syntax for the tools in this chapter.


Table 8-1. Comments in markup modes

If you type M-; in: | Emacs inserts:
HTML mode | <!-- -->
HTML helper mode | <!-- -->
SGML mode | <!-- -->
nxml mode | <!-- -->
psgml mode | <!-- -->
LaTeX mode | %% (on blank lines); % (on lines with content)

8.2 Font-Lock Mode

Font-lock mode is discussed primarily in Chapter 9; it's designed for coloring code to make it easier to read. But the fact is that it works well in other modes too, like the Buffer List (Chapter 4), Dired (Chapter 5), and in all the markup modes described in this chapter.

To turn on font-lock mode, choose Syntax Highlighting from the Options menu. If you decide you want to turn it on for every session, select Save Options from the Options menu and Emacs writes the setting to your .emacs file.
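
If you would rather edit .emacs by hand than use Save Options, a line like the following should have a similar effect. This is a minimal sketch, not necessarily what the menu command writes for you; it relies only on the standard global-font-lock-mode command.

```elisp
;; Turn on font-lock mode in every buffer whose major mode supports it.
(global-font-lock-mode 1)
```

Recent versions of Emacs enable syntax highlighting by default, in which case the line is harmless but unnecessary.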

For more details on font-lock mode, see Chapter 9.

8.3 Writing HTML

Without doubt, the most commonly used markup language today is hypertext markup language (HTML), used for creating web pages. HTML consists of text with tags that define characteristics about the text. HTML is not hard to write, and you could use Emacs or any other editor to write the tags and the text. An HTML tag generally looks like this:

<tagname>text being tagged</tagname>

For your convenience, several modes are available for writing HTML in Emacs, including HTML mode, HTML helper mode, html menus, and a variety of SGML[42] tools including sgml mode and psgml mode. Of these tools, we've chosen to describe HTML mode, a variant of sgml mode, which is included in GNU Emacs, and HTML helper mode, which is a popular add-on. If you are writing XHTML, a stricter version of HTML that can be validated, you should consider XHTML mode, described briefly in this section, or psgml mode, covered later in the XML section of this chapter.

Serious web developers may want to investigate some of the cutting edge development going on to make Emacs even more powerful. Check out HTMLModeDeluxe (http://www.emacswiki.org/cgi-bin/wiki/HtmlModeDeluxe) and the Emacs WebDev Environment by Darren Brierton (http://www.dzr-web.com/people/darren/projects/emacs-webdev). Both of these tools support mmm mode (where mmm stands for "multiple major modes"). Using this feature, the cursor changes major mode depending on the section of the page you are editing. When you edit a script, the mode changes automatically to support that type of authoring. Both are excellent tools for building complex web pages.

In the following sections, we are not going to teach you to write HTML. (For more information on writing HTML, see HTML and XHTML: The Definitive Guide by Chuck Musciano and Bill Kennedy, O'Reilly.) Rather, we're going to teach you the rudiments of using HTML mode and HTML helper mode to help you create HTML documents.

8.3.1 Using HTML Mode

To start HTML mode, type M-x html-mode (or simply open an HTML file). Most authors use a standard template when they write HTML. You may already have one. If you don't, HTML mode is happy to supply one for you. Simply start by typing C-c C-t (for sgml-tag) or by selecting Insert Tag from the SGML menu. If you enter the <html> tag that signifies the start of an HTML document, Emacs inserts a basic template in your buffer.


Type: C-c C-t html Enter

Emacs prompts for a title.


Type: A Tale of Two Cities Enter

Emacs inserts an HTML template.


Note that Emacs automatically creates a first-level header that is equal to the title you entered. It also inserts a hyperlink so that readers can email you. Depending on your spam tolerance, you may want to delete that line. Also, Emacs is just guessing at your name and email address. You can set these explicitly by adding two lines to your .emacs file. Change Mr. Dickens' information to settings appropriate for you.

(setq user-mail-address "cdickens@great-beyond.com")

(setq user-full-name "Charles Dickens")

You could approach HTML mode in a couple of ways. You could learn the key bindings for various tags, or you could simply use the sgml-tag command for everything. It depends how many bindings you want to learn. A mixed approach may be best, where you learn keystrokes for the most common tags and use sgml-tag for less common tags.

Key bindings are intuitive in HTML mode. Like most specialized editing modes, many functions are bound to C-c C- something. We've seen C-c C-t to insert a tag. You won't be too surprised to find that to move forward to the next tag you type C-c C-f and to move back to the previous tag you type C-c C-b. To insert an href (hyperlink) tag, type C-c C-h. You see what we mean.

HTML mode is designed for writing HTML, not XHTML. XHTML is stricter, requiring all tags to have a closing tag. The common <p> tag is a salient example. HTML authors would never use the closing tag </p> that XHTML requires. HTML mode inserts a lone <p> tag even when given a command, such as sgml-tag, that normally inserts a tag pair. If you want to write XHTML, use XHTML mode instead. Emacs starts this mode itself if your file contains a reference to an XHTML document type definition. Other than completion of tags, XHTML mode is very similar to HTML mode described here.[43]

Being able to hide the tags is a helpful feature. To hide HTML tags, type C-c Tab; use the same command to display the tags again. Let's say that we've inserted some of our dickens file into the dickens.html file we were just working with.


Initial state:

dickens.html with tags showing.


Type: C-c Tab

Emacs hides the tags.


You can keep typing text, concentrating on what you're writing rather than being distracted by the markup. Emacs protects you from deleting tags when you're writing by making hidden text read-only. If you move the cursor onto a hidden tag, Emacs displays it in the minibuffer.

Of course, the whole purpose of writing HTML is to display it in a web browser. Typing C-c C-v (for browse-url-of-buffer) opens the default web browser to view the web page you're writing.
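
Which program counts as the "default web browser" is controlled by the browse-url library. If C-c C-v opens the wrong one, a sketch like the following in .emacs may help. The program name "firefox" is only an assumption for illustration; substitute whatever browser command exists on your system.

```elisp
;; Ask browse-url to launch a specific external program.
;; "firefox" is a placeholder, not a recommendation.
(setq browse-url-browser-function 'browse-url-generic
      browse-url-generic-program "firefox")
```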

If you'd like to look at the file in a web browser each time you save, you can turn on a function called html-autoview-mode, invoked by pressing C-c C-s. When you save the file, Emacs automatically opens it in the default browser.
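
If you want this behavior in every HTML buffer rather than toggling it by hand, one possible approach is to switch it on from a mode hook. This is a sketch that assumes html-autoview-mode behaves as a standard minor mode, which it does in the versions of sgml-mode.el we are aware of.

```elisp
;; Enable html-autoview-mode whenever HTML mode starts,
;; so each save opens the file in the browser.
(add-hook 'html-mode-hook
          (lambda ()
            (html-autoview-mode 1)))
```

Be aware that with this in place every save launches or refreshes the browser, which can be distracting during heavy editing.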

8.3.1.1 Character encoding in HTML mode

What if you want to include special characters or characters from other character sets in your web page? The short answer is that you can enter a character's encoding explicitly. For example, to enter a capital Ü with an umlaut, you can type &#220;. Many characters can also be represented as named entities, which are certainly easier to remember than numbers. For example, the named entity for a capital Ü with an umlaut is &Uuml;.

But HTML mode does provide more support than this. We'll take the simplest case first. Let's say you can create a character with your keyboard; for a common case, take the ampersand, a character that must be encoded since it has a special meaning in HTML. Type C-c C-n & Enter. Emacs inserts the entity for an ampersand, &amp;. You can insert entities for a wide variety of keyboard characters this way.

But let's say that you are inserting characters that are not on your keyboard. For example, perhaps you are in the U.S. writing up a list of contributors from Europe and many of their names have accent marks. The ISO Latin-1 character set will handle this.

If you have a keyboard that already emits Latin-1 characters and Latin-1 is your default coding system for keyboard input, inserting such characters is relatively straightforward. Simply press C-c 8 to turn on a minor mode called SGML name entity mode. Emacs says "sgml name entity mode is now on".[44] C-c 8 toggles this state. Type Latin-1 characters as you normally would and Emacs inserts the named entities associated with those characters.

For those of us with other keyboard encodings, however, there's a bit more to do. To get bindings to insert entities into your HTML file, we discuss two options. The first is ISO accents mode. This mode provides support, as the name implies, for accented text. Whether you're typing umlauts, cedillas, circumflexes, acute, or grave marks, ISO accents mode is up to the task. The other option is to use the C-x 8 prefix to insert a wide range of entities, including currency signs, mathematical symbols, and copyright signs (as well as all the accented characters ISO accents mode supports).

8.3.1.1.1 Using ISO accents mode

To use ISO accents mode to insert entities in your file, type C-c 8 to turn on SGML name entity mode, then M-x iso-accents-mode Enter to turn on that mode. In ISO accents mode, certain characters (including /, ~, ', ", `, and ^) are interpreted as prefixes to create accented characters. SGML name entity mode captures these keystrokes and automatically inserts the appropriate HTML entity. For example, typing 'a produces the HTML entity for á, &aacute;. For specific key bindings, see Table 8-2.

8.3.1.1.2 Using the C-x 8 prefix

You can also insert a wide range of entities using C-x 8 after you do some setup.[45] First enter SGML name entity mode by typing C-c 8. Next specify Latin-1 as your character set by typing C-x Enter k latin-1 Enter. You can then enter a large number of entities by typing commands prefixed with C-x 8. For example, to insert the entity for a yen symbol, type C-x 8 Y. Watch the minibuffer. The literal character will appear in the minibuffer as the entity is inserted. Both ISO accents mode and the C-x 8 prefixes allow you to type a single undo command (C-_) to translate the entity back into the literal character.
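
If you find yourself typing C-x Enter k latin-1 Enter at the start of every session, the same setting can be made from .emacs. A minimal sketch, assuming Latin-1 really is the encoding you want for keyboard input:

```elisp
;; Equivalent of C-x RET k latin-1 RET, applied at startup.
(set-keyboard-coding-system 'latin-1)
```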

Table 8-2 provides a list of accented characters and the bindings that help insert them. Table 8-3 lists other named entities including punctuation marks and symbols.


Table 8-2. Bindings for inserting entities for accented characters[46]

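C-x 8 prefix keystrokes | ISO accents mode shortcut | Character entity | Character displayed in browser
C-x 8 ' ' | ' ' | &acute; | ´
C-x 8 ' a | ' a | &aacute; | á
C-x 8 ' A | ' A | &Aacute; | Á
C-x 8 ' e | ' e | &eacute; | é
C-x 8 ' E | ' E | &Eacute; | É
C-x 8 ' i | ' i | &iacute; | í
C-x 8 ' I | ' I | &Iacute; | Í
C-x 8 ' o | ' o | &oacute; | ó
C-x 8 ' O | ' O | &Oacute; | Ó
C-x 8 ' u | ' u | &uacute; | ú
C-x 8 ' U | ' U | &Uacute; | Ú
C-x 8 ' y | ' y | &yacute; | ý
C-x 8 ' Y | ' Y | &Yacute; | Ý
C-x 8 ` a | ` a | &agrave; | à
C-x 8 ` A | ` A | &Agrave; | À
C-x 8 ` e | ` e | &egrave; | è
C-x 8 ` E | ` E | &Egrave; | È
C-x 8 ` i | ` i | &igrave; | ì
C-x 8 ` I | ` I | &Igrave; | Ì
C-x 8 ` o | ` o | &ograve; | ò
C-x 8 ` O | ` O | &Ograve; | Ò
C-x 8 ` u | ` u | &ugrave; | ù
C-x 8 ` U | ` U | &Ugrave; | Ù
C-x 8 ^ a | ^ a | &acirc; | â
C-x 8 ^ A | ^ A | &Acirc; | Â
C-x 8 ^ e | ^ e | &ecirc; | ê
C-x 8 ^ E | ^ E | &Ecirc; | Ê
C-x 8 ^ i | ^ i | &icirc; | î
C-x 8 ^ I | ^ I | &Icirc; | Î
C-x 8 ^ o | ^ o | &ocirc; | ô
C-x 8 ^ O | ^ O | &Ocirc; | Ô
C-x 8 ^ u | ^ u | &ucirc; | û
C-x 8 ^ U | ^ U | &Ucirc; | Û
C-x 8 " " | " " | &uml; | ¨
C-x 8 " a | " a | &auml; | ä
C-x 8 " A | " A | &Auml; | Ä
C-x 8 " e | " e | &euml; | ë
C-x 8 " E | " E | &Euml; | Ë
C-x 8 " i | " i | &iuml; | ï
C-x 8 " I | " I | &Iuml; | Ï
C-x 8 " o | " o | &ouml; | ö
C-x 8 " O | " O | &Ouml; | Ö
C-x 8 " u | " u | &uuml; | ü
C-x 8 " U | " U | &Uuml; | Ü
C-x 8 " s | " s | &szlig; | ß
C-x 8 " y | " y | &yuml; | ÿ
C-x 8 " Y | " Y | &Yuml; | Ÿ
C-x 8 ~ ~ |  | &not; | ¬
C-x 8 ~ a | ~ a | &atilde; | ã
C-x 8 ~ A | ~ A | &Atilde; | Ã
C-x 8 ~ d | ~ d | &eth; | ð
C-x 8 ~ D | ~ D | &ETH; | Ð
C-x 8 ~ n | ~ n | &ntilde; | ñ
C-x 8 ~ N | ~ N | &Ntilde; | Ñ
C-x 8 ~ o | ~ o | &otilde; | õ
C-x 8 ~ O | ~ O | &Otilde; | Õ
C-x 8 ~ t | ~ t | &thorn; | þ
C-x 8 ~ T | ~ T | &THORN; | Þ
C-x 8 / / |  | &divide; | ÷
C-x 8 o | / / | &deg; | °
C-x 8 / a | / a | &aring; | å
C-x 8 / A | / A | &Aring; | Å
C-x 8 / e | / e | &aelig; | æ
C-x 8 / E | / E | &AElig; | Æ
C-x 8 / o | / o | &oslash; | ø
C-x 8 / O | / O | &Oslash; | Ø
C-x 8 , , | ~~ | &cedil; | ¸
C-x 8 , c | ~c | &ccedil; | ç
C-x 8 , C | ~C | &Ccedil; | Ç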

Table 8-3. Bindings for inserting entities for punctuation and symbols

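C-x 8 prefix keystrokes | Character entity | Character displayed in browser
C-x 8 1 / 2 | &frac12; | ½
C-x 8 1 / 4 | &frac14; | ¼
C-x 8 3 / 4 | &frac34; | ¾
C-x 8 SPC | &nbsp; | nonbreaking space
C-x 8 ! | &iexcl; | ¡
C-x 8 $ | &curren; | ¤
C-x 8 + | &plusmn; | ±
C-x 8 - | &shy; | soft hyphen
C-x 8 . | &middot; | ·
C-x 8 < | &laquo; | «
C-x 8 = | &macr; | ¯
C-x 8 > | &raquo; | »
C-x 8 ? | &iquest; | ¿
C-x 8 | | &brvbar; | ¦
C-x 8 c | &cent; | ¢
C-x 8 C | &copy; | ©
C-x 8 L | &pound; | £
C-x 8 P | &para; | ¶
C-x 8 R | &reg; | ®
C-x 8 S | &sect; | §
C-x 8 u | &micro; | µ
C-x 8 x | &times; | ×
C-x 8 Y | &yen; | ¥
C-x 8 ^ 1 | &sup1; | ¹
C-x 8 ^ 2 | &sup2; | ²
C-x 8 ^ 3 | &sup3; | ³
C-x 8 _ a | &ordf; | ª
C-x 8 _ o | &ordm; | º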

Table 8-4 lists HTML mode commands.


Table 8-4. HTML mode commands

Keystrokes | Command name | Action
(none) | html-mode | Enter HTML mode.
C-c C-t (SGML → Insert Tag) | sgml-tag | Inserts a tag, prompting for attributes. If you enter html as the tag name, inserts a template HTML file.
C-c Tab (SGML → Toggle Tag Visibility) | sgml-tags-invisible | Hides or shows the tags in the file.
C-c C-v (SGML → View Buffer Contents) | browse-url-of-buffer | Display buffer in default browser.
C-c C-s | html-autoview-mode | If this mode is on (this command toggles it), display file in browser each time it is saved in Emacs.
C-c 8 | sgml-name-8bit-mode | If turned on, certain keystrokes for inserting Latin-1 characters are captured and replaced with the appropriate entities. See "Character encoding in HTML mode" for details.
C-c C-f (SGML → Forward Tag) | sgml-skip-tag-forward | Move forward to the next tag of the same level.
C-c C-b (SGML → Backward Tag) | sgml-skip-tag-backward | Move backward to previous tag of the same level.
C-c Del or C-c C-d (SGML → Delete Tag) | sgml-delete-tag | With cursor on or before a tag, deletes tag or tag pair.
C-c 1 | html-headline-1 | Insert an <h1>.
C-c 2 | html-headline-2 | Insert an <h2>.
C-c 3 | html-headline-3 | Insert an <h3>.
C-c 4 | html-headline-4 | Insert an <h4>.
C-c 5 | html-headline-5 | Insert an <h5>.
C-c 6 | html-headline-6 | Insert an <h6> (useful for footnote text).
C-c Enter | html-paragraph | Insert <p> tag.
C-c C-c h (HTML → Href Anchor) | html-href-anchor | Insert a hyperlink.
C-c C-c n (HTML → Name Anchor) | html-name-anchor | Insert an anchor so that a link can be created to the anchored part of the page.
C-c C-c u (HTML → Unordered List) | html-unordered-list | Create a bulleted list.
C-c C-c o (HTML → Ordered List) | html-ordered-list | Create a numbered list.
C-c C-c l (HTML → List Item) | html-list-item | Add an item to a list.
C-c C-c i (HTML → Image) | html-image | Insert an image tag (<img>) and position cursor for you to enter filename of image.
C-c C-j (HTML → Line Break) | html-line | Insert a line break (<br>).
C-c C-c - (HTML → Horizontal Rule) | html-horizontal-rule | Insert a horizontal rule (<hr>).
C-c C-c r | html-radio-buttons | Insert a group of radio buttons. Emacs prompts for a name for the group, then repeatedly for value, whether it should be checked, and associated text. Press C-g to complete the group.
C-c C-c c (HTML → Checkboxes) | html-checkboxes | Insert a group of checkboxes. Emacs prompts for a name for the group, then repeatedly for value, whether it should be checked, and associated text. Press C-g to complete the group.
C-c ? (SGML → Describe Tag) | sgml-tag-help | Provide brief verbal description of tag at cursor position.

8.3.2 Using HTML Helper Mode

HTML helper mode, written by Nelson Minar and now maintained by Gian Uberto Lauri, offers great flexibility in writing HTML. You can enable various hand-holding features depending on your level of expertise and preferences.

Why would you choose HTML helper mode over Emacs's own HTML mode? Although HTML mode makes it easy to write basic HTML, it provides little support for programmatic, interactive web pages. HTML helper mode supports ASP, JSP (and JDE, the Java Development Environment, discussed in Chapter 9), and PHP, to name a few more advanced features. If you're writing HTML in Emacs, you're likely to be a developer of such pages rather than a more text-oriented author. For this reason, HTML helper mode continues to be popular among Emacs users.

HTML helper mode is not part of Emacs by default. You can download it from its homepage at http://www.nongnu.org/baol-hth. Download the file into a directory such as ~/elisp, move to that directory, and then type:

% tar xvzf html-helper-mode.tar.gz

The system unpacks the tar file for you. (Of course, if you are installing on Windows, you can simply use WinZip to decompress and unpack the file.) The tar file contains several components, including:

html-helper-mode.el—the Lisp file for HTML helper mode

hhm-changelog—changes that have been made

hhm-config.el—a Lisp file that allows Emacs customization to work[47]

8.3.2.1 Starting HTML helper mode

Before you can start HTML helper mode, you have to load it into Emacs. (For a complete discussion of this topic, see "Building Your Own Lisp Library" in Chapter 11; we describe it briefly here.) Begin by typing M-x load-file Enter. Emacs asks which file to load and you enter ~/elisp/html-helper-mode.el and press Enter, adjusting the path to reflect the location where you installed html-helper-mode.el. You enter the mode by typing M-x html-helper-mode Enter.

HTML helper appears on the mode line.

Making HTML helper mode part of your startup is easier. Put the following lines in your .emacs file:

(setq load-path (cons "~/elisp" load-path))

(autoload 'html-helper-mode "html-helper-mode" "Yay HTML" t)

In the first line, insert the complete path for the directory in which html-helper-mode.el is located in quotation marks, replacing ~/elisp with the correct value for your system. The second line tells Emacs to load HTML helper mode automatically the first time you invoke the html-helper-mode command.
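
A slightly more defensive variant of the same setup is sketched below. It assumes, as in the running example, that the file lives in ~/elisp; the message text is our own invention. It registers the autoload only if Emacs can actually find the library, so a mistyped path shows up as a message at startup rather than as an error the first time you open an HTML file.

```elisp
;; Add the (assumed) install directory to the library search path.
(add-to-list 'load-path (expand-file-name "~/elisp"))

;; Register the autoload only when the library can be located.
(if (locate-library "html-helper-mode")
    (autoload 'html-helper-mode "html-helper-mode"
      "Major mode for editing HTML." t)
  (message "html-helper-mode.el not found on load-path"))
```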

If you want to use HTML helper mode for editing HTML files by default, add this line to .emacs as well:

(setq auto-mode-alist (cons '("\\.html?$" . html-helper-mode) auto-mode-alist))

If you edit other types of files with HTML helper mode, you may want to add lines to include all the types of files you edit. Adding more lines is the easiest way. For example, to make HTML helper mode the default for PHP files, add this line to .emacs:

(setq auto-mode-alist (cons '("\\.php$" . html-helper-mode) auto-mode-alist))
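
If the list of file types grows, one regular expression can cover several extensions at once. The following is a sketch of that alternative, not a replacement you must adopt; the set of extensions shown is just the two used above.

```elisp
;; Use HTML helper mode for .htm, .html, and .php files.
;; \\' anchors the match at the end of the filename.
(add-to-list 'auto-mode-alist
             '("\\.\\(html?\\|php\\)\\'" . html-helper-mode))
```

Adding one line per file type, as shown earlier, remains the easiest form to read and to edit later.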

8.3.2.2 A brief tour of HTML helper mode

The main reason people like HTML helper mode is that it provides easy menu access to a wide variety of options. Realizing that having a crowded menu with many submenus could overwhelm new users, the authors created an option called Turn on Novice Menu. Selecting this option from the HTML menu provides a barebones menu, as shown in Figure 8-1. Novice HTML writers can use these options to create a basic HTML document without worrying about what forms, JSPs, PHP, and the like mean.


Figure 8-1. HTML helper mode's Novice menu (Mac OS X)


Selecting Turn on Expert Menu from the HTML menu returns the larger menu with its numerous submenus, as shown in Figure 8-2.


Figure 8-2. HTML helper mode's Expert menu (Mac OS X)

8.3.2.3 Inserting an HTML template

HTML helper mode inserts a template for you every time you create a new HTML file.


Type: C-x C-f new.html

HTML helper mode inserts a template with all the basic elements needed for a valid HTML document (Windows).


The template contains all the basic HTML elements. The entire document is surrounded by <html> tags. Then the head and the body are separated. Following an <hr> tag that tells the browser to insert a horizontal line, called a horizontal rule, the <address> tag leaves a place for the author to put in his or her email address. In these days of spam, it's unlikely you'll want to do that. (You can leave the <address> tag blank or delete it.)

If you do want to include an email address, enter a line like this in your .emacs file (substituting your own email address, of course):

(setq html-helper-address-string
  "<a href=\"mailto:cdickens@great-beyond.com\">Charles Dickens</a>")


Type: C-x C-f newfile.html

Emacs inserts the HTML template, including the address.


Normally you begin filling out the template by entering a title and a level-one header (these are often the same). You can then begin writing paragraphs of text. Before you start typing, press M-Enter. Emacs inserts <p></p> and positions the cursor between them. You can see from the ending paragraph tag that HTML helper mode is working toward XHTML compliance.


Type: M-Enter

Emacs positions the cursor between <p> and </p> so you can start inserting text.

8.3.2.4 Putting tags around a region

When editing HTML files, you often spend a lot of time marking up existing text. If you preface any of the tag commands with C-u, Emacs inserts the tags around a region rather than putting them at the cursor position.[48] To demonstrate, we'll start a new HTML file and insert text from our dickens file.


Type: C-x C-f ataleoftwocities.html


Emacs inserts the HTML template.


Move the cursor past the tag pair and type C-x i dickens.

Emacs inserts the dickens text file, to which we can add HTML tags.


If you were really doing this properly, you'd type something like "A Tale of Two Cities, Chapter 1" as the title and the first-level header. But for now, you just want to see how to mark up a region of existing text. Begin by marking the Dickens paragraph as a region and type C-u M-Enter.


Type: M-h C-u M-Enter.

Emacs inserts opening and closing paragraph tags.

8.3.2.5 Using completion

HTML helper mode supports completion. You type the beginning of a tag and press M-Tab (for tempo-complete-tag).[49] If there's more than one possibility, a window of possible completions appears. Let's say you are working on a bulleted list.


Type: <ul M-Tab

Emacs inserts the tags to begin and end the list and the tag for one list item.


Note, however, that completion is sometimes case-sensitive.

The following HTML helper mode bindings insert specific elements:

Keystrokes | Command name
C-c C-f b (HTML → Insert Form Elements → Button) | tempo-template-html-button
C-c C-f m (HTML → Insert Form Elements → Submit Form) | tempo-template-html-submit-form
C-c C-f s (HTML → Insert Form Elements → Selections) | tempo-template-html-selections
C-c C-f o (HTML → Insert Form Elements → Option) | tempo-template-html-option
C-c C-f v (HTML → Insert Form Elements → Option with Value) | tempo-template-html-option-with-value
C-c C-f i (HTML → Insert Form Elements → Image Field) | tempo-template-html-input-image-field
C-c C-f r (HTML → Insert Form Elements → Radiobutton) | tempo-template-html-input-radiobutton
C-c C-f c (HTML → Insert Form Elements → Checkbox) | tempo-template-html-checkbox
C-c C-f p (HTML → Insert Form Elements → Text Area) | tempo-template-html-text-area
C-c C-f f (HTML → Insert Form Elements → Form) | tempo-template-html-form
C-c C-f t (HTML → Insert Form Elements → Text Field) | tempo-template-html-text-field
C-c C-f h (HTML → Insert Form Elements → Hidden Field) | tempo-template-html-hidden-field
C-c M-l s (HTML → Insert Logical Styles → Strong) | tempo-template-html-strong
C-c M-l e (HTML → Insert Logical Styles → Emphasized) | tempo-template-html-emphasized
C-c M-l b (HTML → Insert Logical Styles → Blockquote) | tempo-template-html-blockquote
C-c M-l p (HTML → Insert Logical Styles → Preformatted) | tempo-template-html-preformatted
C-c C-p s (HTML → Insert Physical Styles → Strikethru) | tempo-template-html-strikethru
C-c C-p f (HTML → Insert Physical Styles → Fixed) | tempo-template-html-fixed
C-c C-p u (HTML → Insert Physical Styles → Underline) | tempo-template-html-underline
C-c C-p i (HTML → Insert Physical Styles → Italic) | tempo-template-html-italic
C-c C-p b (HTML → Insert Physical Styles → Bold) | tempo-template-html-bold
C-c C-p c (HTML → Insert Physical Styles → Center) | tempo-template-html-center
C-c C-p l (HTML → Insert Physical Styles → Spanning Class) | tempo-template-html-spanning-class
C-c C-p 5 (HTML → Insert Physical Styles → Spanning Style) | tempo-template-html-spanning-style
C-c C-s a (HTML → Insert Logical Styles → Address) | tempo-template-html-address
C-c M-l d (HTML → Insert Logical Styles → Definition) | tempo-template-html-definition
C-c M-l v (HTML → Insert Logical Styles → Variable) | tempo-template-html-variable
C-c M-l k (HTML → Insert Logical Styles → Keyboard Input) | tempo-template-html-keyboard
C-c M-l r (HTML → Insert Logical Styles → Citation) | tempo-template-html-citation
C-c M-l x (HTML → Insert Logical Styles → Sample) | tempo-template-html-sample
C-c M-l c (HTML → Insert Logical Styles → Code) | tempo-template-html-code
C-c C-h b (HTML → Insert Structural Elements → Base) | tempo-template-html-base
C-c C-h l (HTML → Insert Structural Elements → Link) | tempo-template-html-link
C-c C-h m (HTML → Insert Structural Elements → Meta Name) | tempo-template-html-meta-name
C-c C-h n (HTML → Insert Structural Elements → Nextid) | tempo-template-html-nextid
C-c C-h i (HTML → Insert Structural Elements → Isindex) | tempo-template-html-isindex
C-c C-h B (HTML → Insert Structural Elements → Body) | tempo-template-html-body
C-c C-h H (HTML → Insert Structural Elements → Head) | tempo-template-html-head
C-c C-t t (HTML → Insert Tables → Table) | tempo-template-html-table
C-c C-t p (HTML → Insert Tables → html table caption) | tempo-template-html-html-table-caption
C-c C-t d (HTML → Insert Tables → Table Data) | tempo-template-html-table-data
C-c C-t h (HTML → Insert Tables → Table Header) | tempo-template-html-table-header
C-c C-t r (HTML → Insert Tables → Table Row) | tempo-template-html-table-row

8.4 Writing XML

Writing XML involves entering structured information that complies with a document type definition or schema. Even within Emacs, the XML support you receive varies. At the low end of the spectrum, there is plain vanilla Fundamental mode, which simply provides a screen where you type. Specialized modes like SGML mode provide support for entering tags, as we saw earlier in our discussion of HTML mode, a derivative of SGML mode. But neither of these approaches helps you parse or validate XML (SGML mode has a command for validating, but it is tricky to set up correctly). More advanced Lisp packages, though currently not included in Emacs, are available to provide these functions. These add-on packages provide validation against DTDs or schemas, parsing capabilities, and, typically, an array of standard DTDs and schema definitions. In Emacs, these tools primarily work in conjunction with one of two major modes. psgml mode validates XML (and SGML) against DTDs. The newer nxml mode validates against RELAX NG schemas. We cover both of these options in this section. Before we go into detail on those modes, however, let's look briefly at what Emacs has built in with SGML mode.

8.4.1 Writing XML with SGML Mode

Emacs's own SGML mode provides support for entering tags. We covered much of this earlier under HTML mode, so we provide just one brief example here. Inserting, hiding, and showing tags are especially helpful features provided by SGML mode.

Let's look at a chapter on enumerated types by Java in a Nutshell author David Flanagan. This chapter uses the DocBook DTD.


Initial state:

Editing a document that uses the DocBook DTD (Mac OS X).


Note that Emacs displays XML on the mode line. XML mode in this context is a subset of SGML mode. Actually, despite this name, all the commands in this mode start with sgml, not xml. The menu of relevant commands is called SGML as well. Emacs doesn't pretend to have extensive XML support.

We want to insert a paragraph before the first paragraph.


Add a blank line following the title and type: C-c C-t

Emacs inserts an open angle bracket and prompts for the tag name (Mac OS X).


Type: para Enter

Emacs inserts opening and closing paragraph tags (Mac OS X).


Note that Emacs is not following our indentation style. We can correct it by moving to the beginning of the line and pressing Tab. See Table 8-4 earlier in this chapter for details on SGML mode commands.
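
If the mismatch is simply a matter of how many columns each nesting level is indented, the step that Tab uses can be adjusted. A minimal sketch for .emacs, assuming your version of SGML mode provides the sgml-basic-offset variable; the value 4 is purely illustrative.

```elisp
;; Indent each nested element by four columns in SGML-derived modes.
;; The value 4 is an example, not a recommendation.
(setq sgml-basic-offset 4)
```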

8.4.2 TEI Emacs: XML Authoring for Linux and Windows

The Text Encoding Initiative (TEI) wanted an XML authoring environment for Emacs, so it created (the somewhat misleadingly named) TEI Emacs.[50] Despite its name, TEI Emacs does not include Emacs itself. Rather, it creates an authoring environment for writing XML using nxml mode or psgml mode. It incorporates XSLT tools, along with most of the standard DTDs, such as the three forms of XHTML DTDs (strict, frameset, and transitional), DocBook DTDs, and more. Naturally, the TEI's own DTDs and schemas are also included.

The active development of this tool and its careful packaging led us to describe this tool despite the fact that it is limited to Linux and Windows at this writing.[51] You should have Emacs 21.3 already installed before you install this tool. Installing TEI Emacs is trivial. The Windows version has an installer, and Linux users follow simple instructions at http://www.tei-c.org/Software/tei-emacs/, the web site for downloading TEI Emacs.

8.4.3 Writing XHTML Using nxml Mode

James Clark, an XML pioneer, wrote nxml mode to provide Emacs support for his schema standard RELAX NG. For details on the standard, visit http://www.relaxng.org/ or pick up a copy of RELAX NG by Eric van der Vlist (O'Reilly). The important thing about nxml mode is that it validates text as you type instead of making validation and debugging separate steps.

If you did not install TEI Emacs, you can download nxml mode and its schemas from http://thaiopensource.com/download/. If you decide to become an active nxml mode user, you may want to join a related Yahoo Group discussion list (see http://groups.yahoo.com/group/emacs-nxml-mode/).
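If you install nxml mode by hand, your .emacs file has to load it and, optionally, associate file extensions with it. The lines below are a minimal sketch, not the official installation procedure. The directory ~/nxml-mode is a hypothetical unpack location, rng-auto.el is the loader file we assume ships with the download, and the list of extensions is only a suggestion.

```elisp
;; Hypothetical location of a hand-installed nxml mode; adjust it to
;; wherever you unpacked the download.
(let ((loader (expand-file-name "~/nxml-mode/rng-auto.el")))
  ;; Load the distribution's autoload definitions only if the file exists,
  ;; so this is harmless on systems where nxml mode is already available.
  (when (file-exists-p loader)
    (load loader)))

;; Open XML-flavored files in nxml mode automatically.
(add-to-list 'auto-mode-alist
             '("\\.\\(xml\\|xsl\\|rng\\|xhtml\\)\\'" . nxml-mode))
```

With these lines in place, visiting a file such as dickens.xhtml should start nxml mode without typing M-x nxml-mode.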

In this section, we change our running HTML example to XHTML, first using a RELAX NG schema and nxml mode. Open dickens.html, then enter nxml mode.


Type: C-x C-f dickens.html Enter M-x nxml-mode Enter

Editing dickens.html in nxml mode.


nxml mode tells you what schema it is using in the minibuffer. It's smart enough to know that its XHTML schema is best for this purpose.

The mode line tells us that this file is currently invalid. Emacs highlights errors with red underscores. Let's deal with these errors one at a time.


Move the cursor to the red underscore at the end of the html tag.

The minibuffer describes what's missing.


Editing XHTML with a schema requires a namespace definition in the <html> tag. nxml mode knows what we need. This is a good time to use nxml's completion feature to let it supply the details for us. C-Enter completes the current tag.


Type: Space xmlns=" C-Enter

Emacs inserts the rest of the namespace declaration.


The mode line tells us that this file is still invalid. Moving to the underlined address tag gives us a fairly cryptic reason; it says "Element not allowed in this context". Let's move down to the closing body tag to see if that error provides any more insight into the problem.


Move to </body>.

The minibuffer says: Missing end-tag "p".


This message provides a clue. Although HTML authors are not accustomed to adding closing tags to paragraphs, XHTML requires them. Let's insert a closing tag after our paragraph.


Move to the line following the Dickens paragraph and type: </

Emacs inserts a closing tag.


Note that just typing </ was adequate to insert a closing tag for the current element. We don't need to type C-Enter to invoke completion. That's because in nxml mode, slash is bound to nxml-electric-slash. It automatically completes the nearest open element, another shortcut for us.

A similar command is C-c C-f (for nxml-finish-element). With C-c C-f, you don't have to type anything; it inserts the relevant closing tag for you.

Look at the mode line now. It says valid. Using nxml mode, it's not too tough to take an HTML file and change it to valid XHTML.

Validating text as you type it is a key feature of nxml mode. It's validating against a schema. To specify a different schema, type C-c C-s (for rng-set-schema-and-validate). The minibuffer prompts for the file where the schema resides. A number of schemas can be found online at http://www.relaxng.org/#schemas. You can also convert DTDs to schemas using tools listed on that page.

Your menus vary depending on whether you install nxml mode directly or whether you use TEI's version. TEI provides support for encoded characters using the UniChar menu. It also provides extensive XSLT support. TEI's NXML menu includes some TEI skeletons as well as nxml mode options. Nxml mode installed from thaiopensource.com includes an XML menu with options for setting the schema and customizing the mode. Table 8-7 lists some of the commands available in nxml mode.


Table 8-7. Nxml mode commands

| Keystrokes | Command name | Action |
| --- | --- | --- |
| C-Enter | nxml-complete | Complete the current tag. |
| / | nxml-electric-slash | Add a closing tag for the last open element. |
| C-c C-n | rng-next-error | Move to the next error. |
| C-c C-l | rng-save-schema-location | Creates (or updates) a file called schemas.xml in your home directory. This file associates schemas with files. |
| C-c C-s | rng-set-schema-and-validate | Set the schema and validate against it. |
| C-c C-a | rng-auto-set-schema | Set the schema automatically according to the contents of the file. |
| C-c C-w | rng-what-schema | Show in the minibuffer the current schema associated with this file. |
| C-c C-v | rng-validate-mode | Toggles whether the mode line indicates that the file is valid or invalid. |
| C-c C-u | nxml-insert-named-char | Insert a named character; press Tab to see a list. |
| (none) | nxml-insert-xml-declaration | Insert an XML declaration at the beginning of the file. |
| C-c Tab | nxml-balanced-close-start-tag-inline | Insert the ending tag for the starting tag you are typing, putting the ending tag on the current line. |
| C-c C-b | nxml-balanced-close-start-tag-block | Insert the ending tag for the starting tag you are typing, putting the ending tag on a separate line. |
| C-c C-f | nxml-finish-element | Finish the current element. |
| M-h | nxml-mark-paragraph | Mark the current paragraph. |
| M-} | nxml-forward-paragraph | Move forward one paragraph. |
| M-{ | nxml-backward-paragraph | Move back one paragraph. |
| C-M-p | nxml-backward-element | Move back one element. |
| C-M-n | nxml-forward-element | Move forward one element. |
| C-M-d | nxml-down-element | Move down one element (if nested). |
| C-M-u | nxml-backward-up-element | Move up one element (if nested). |

8.4.4 Using psgml Mode

Lennart Staflin's psgml mode has been around for a while. It is more robust than Emacs's own SGML mode, but, like any add-on, you have to install it in order to use it. Either install TEI Emacs as described earlier or download psgml mode from http://www.lysator.liu.se/projects/about_psgml.html and follow the installation instructions there. TEI Emacs includes a functioning psgml mode, so if you've installed TEI Emacs, your setup work is done.
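For a standalone installation, the setup in .emacs usually amounts to telling Emacs where psgml lives and which commands it provides. The following is a sketch under stated assumptions, not a substitute for the instructions that come with the download: the directory ~/elisp/psgml is hypothetical, and we assume the package's main library is named psgml.

```elisp
;; Hypothetical install directory; needed only if psgml is not already
;; somewhere on your load-path.
(add-to-list 'load-path (expand-file-name "~/elisp/psgml"))

;; Load psgml on demand the first time either mode is requested.
;; Note that this makes M-x sgml-mode and M-x xml-mode refer to psgml's
;; versions rather than the SGML mode that comes with Emacs.
(autoload 'sgml-mode "psgml" "Major mode to edit SGML files." t)
(autoload 'xml-mode "psgml" "Major mode to edit XML files." t)
```

Because the modes are autoloaded, Emacs starts no more slowly; psgml is read in only when you first type M-x xml-mode or M-x sgml-mode.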

psgml mode consists of two parts: sgml-mode for writing SGML and xml-mode for writing XML (and in our case XHTML).


To start psgml mode to edit our XHTML file, type M-x xml-mode.

XML appears on the mode line and an *SGML LOG* window opens. If you are using TEI Emacs, XSLT appears on the mode line along with XML.


The *SGML LOG* window displays messages about this session. (If it doesn't appear immediately, click on the first character in the file.) The log buffer complains that it could not find an external entity called html. This file has been changed to work with the XHTML RELAX NG schema. psgml mode expects it to conform to an XHTML DTD. To get started with the (minimal) work needed to undertake the transformation from a schema-based file to a DTD-based file, we ask psgml to normalize the buffer.


Type: M-x sgml-normalize or select Normalize from the Modify menu

psgml mode eliminates the namespace declaration in the <html> tag.


More needs to be done, however. The first statements in an XHTML file include an XML statement and a DOCTYPE entry that identifies the DTD this document should be validated against. One of the nice things about TEI Emacs is that it includes a variety of DTDs. (Users of standard psgml mode don't have this feature; sorry.[52])


At the beginning of the file, select DTD → Insert DTD → XHTML Transitional.

Emacs inserts the two required elements for us.


That's all it takes to make this file a well-formed XHTML file. psgml mode allows for validation against the DTD. Let's validate it using C-c C-v to make sure it's okay.


Type: C-c C-v

psgml mode inserts the default validate command in the minibuffer; press Enter to run it.


Press Enter and type y to save the buffer when prompted

The *compilation* buffer indicates (somewhat cryptically) that the document is valid.


Of course, typical documents are far more complex than this one. Options on the View menu provide selective hiding and showing of elements, including an option to hide all tags, allowing you to focus on the content of the file instead.

psgml mode also offers numerous options. If you are running TEI Emacs, you'll find the File Options and User Options submenus on the XML/SGML menu. If you've installed psgml mode standalone, you'll find them on the SGML menu. Table 8-8 summarizes some of the psgml commands.


Table 8-8. Bindings in psgml mode

| Keystrokes | Command name | Action |
| --- | --- | --- |
| C-M-Space | sgml-mark-element | Mark the current element. |
| M-Tab | sgml-complete | Complete the current tag. |
| C-M-t | sgml-transpose-element | Transpose two elements. |
| C-M-h | sgml-mark-current-element | Mark the current element. |
| C-M-k; Modify → Kill Element | sgml-kill-element | Delete the current element (and any child elements). |
| C-M-u; Move → Backward Up Element | sgml-backward-up-element | Move up to the parent element for this element. |
| C-M-d; Move → Down Element | sgml-down-element | Move down to the next child element. |
| C-M-b; Move → Backward Element | sgml-backward-element | Move to the previous element. |
| C-M-f; Move → Forward Element | sgml-forward-element | Move to the next element. |
| C-M-e; Move → End of Element | sgml-end-of-element | Move to the end of the current element. |
| C-M-a; Move → Beginning of Element | sgml-beginning-of-element | Move to the beginning of the current element. |
| C-c C-w; SGML → What Element | sgml-what-element | Similar to sgml-position but describes hierarchy in terms of tags versus content (for example, start-tag in title in head in html). |
| C-c C-v; SGML → Validate | sgml-validate | Insert validation command in the minibuffer so you can modify it if necessary before pressing Enter to execute it. |
| C-c C-t; SGML → List Valid Tags | sgml-list-valid-tags | List tags that are valid in the current context. |
| C-c C-q; Modify → Fill Element | sgml-fill-element | Fill element according to the mode's indentation rules. |
| C-c C-o; Move → Next Trouble Spot | sgml-next-trouble-spot | Find the next problem spot and display the problem in the minibuffer. |
| C-c C-n; Move → Up Element | sgml-up-element | Move to the parent element. |
| C-c Enter | sgml-split-element | Split current element. |
| C-c C-l; SGML → Show/Hide Warning Log | sgml-show-or-clear-log | Display or delete the *SGML LOG* buffer (menu option name is misleading). |
| C-c C-k; Modify → Kill Markup | sgml-kill-markup | Delete current tag. |
| C-c /; Markup → End Current Element | sgml-insert-end-tag | Insert closing tag for current tag. |
| C-c -; Modify → Untag Element | sgml-untag-element | Delete the current tag pair. |
| C-c #; Modify → Make Character Reference | sgml-make-character-reference | Change character under the cursor to the equivalent entity. |
| C-c C-f C-e; View → Fold Element | sgml-fold-element | Hide the current element and its children if any. |
| C-c C-u C-e; View → Unfold Element | sgml-unfold-element | Show the current element and its children if any. |
| C-c C-f C-s; View → Fold Subelement | sgml-fold-subelement | Hide subelements. |
| C-c C-f C-r; View → Fold Region | sgml-fold-region | Hide the region. |
| C-c C-u C-a; View → Unfold All | sgml-unfold-all | Show all hidden tags and text. |

8.5 Marking up Text for TEX and LATEX

GNU Emacs provides excellent support for marking up TEX files. Most people today use LATEX, which is written in TEX and provides more control over formatting. As a result, we'll talk about LaTeX mode here.

Before we launch into this discussion, we assume that you have set up LATEX on your platform. On Red Hat Linux, it's set up by default. Windows and Mac OS X users must install and configure LATEX before proceeding.[53]

Emacs attempts to guess whether you're editing a TEX or LATEX file and enter the appropriate mode. You can force LaTeX mode if Emacs doesn't enter it automatically by typing M-x latex-mode Enter.
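If Emacs guesses wrong often enough to be annoying, you can take the guessing out of the picture in your .emacs file. This is a minimal sketch that assumes every file ending in .tex on your system is a LATEX file; if you also edit plain TEX files, you probably don't want it.

```elisp
;; Always open .tex files in LaTeX mode instead of letting Emacs guess
;; from the file's contents.
(add-to-list 'auto-mode-alist '("\\.tex\\'" . latex-mode))
```

To force the mode for a single file instead, you can put a comment such as % -*- mode: latex -*- on its first line; Emacs reads the mode from that line when it visits the file.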

8.5.1 Matching Braces

LATEX commands often take the form \keyword{text}. LaTeX mode doesn't try to figure out if you're using the "right" keywords since the language is extensible and you may have defined your own keywords. It does, however, provide support for avoiding the most common error: mismatched curly braces and dollar signs.

In LATEX, curly braces ({}) and dollar signs ($$) should always appear in pairs; Emacs checks to make sure that each opening brace or dollar sign has a counterpart. When you type a closing brace or dollar sign, the cursor moves quickly to its counterpart (provided that it is on the screen; it shows the context in the minibuffer if it is not), then back again.

Emacs generates braces in matching pairs. The command C-c { inserts opening and closing braces and positions the cursor for typing between the braces.

Typing C-c } moves you past the right brace. It always finds the correct closing brace, given your current position. If there is no closing brace, you get an error message that says "Scan error: Unbalanced parentheses". You also get this error message if you type C-c } while the cursor is in a section that is not surrounded by braces, which can be a little confusing.

To check for mismatched curly braces and dollar signs, type M-x tex-validate-buffer Enter. This command checks the entire buffer for unbalanced parentheses, curly braces, dollar signs, and the like. (If you have a large file, you might want to validate a region instead using M-x tex-validate-region Enter.) If it finds any errors, Emacs displays an *Occur* buffer with Mismatches: at the top and a list of lines on which it found errors. You can then easily move to each line that contains an error with M-x goto-line.

Sometimes a mismatched parenthesis early in the buffer can start a chain reaction of "errors" through the rest of the file. If you suspect that one of the corrections you make may have fixed most of the remaining errors, simply run tex-validate-buffer again.

When you're stepping through errors, C-c } provides a good way to check where the closing brace for a given opening brace is. Position the cursor right after the opening brace and press C-c }.

8.5.2 Quotation Marks and Paragraphing

LaTeX mode also has features for handling quotation marks and paragraph separation. Typing a quotation mark (") causes Emacs to simulate left and right quotation marks. Left quotation marks are represented as two backtick characters (``) while right quotation marks are represented as two apostrophes (''). (Left and right quotation marks are not part of the standard ASCII character set.) If you need to type a literal quotation mark for any reason, simply use the quote-character command preceding the quotation mark, like this: C-q ".
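The strings that Emacs inserts for a typed quotation mark are held in two variables, so you can change them if your documents use a different quoting convention. The sketch below simply restates the values described above; substitute other strings as needed. We are assuming the variable names tex-open-quote and tex-close-quote from the TeX mode that ships with GNU Emacs.

```elisp
;; What a typed " becomes at the start of a quotation.
(setq tex-open-quote "``")

;; What a typed " becomes at the end of a quotation.
(setq tex-close-quote "''")
```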

8.5.3 Command Pairs

LaTeX mode provides support for inserting command pairs. To insert a command pair, type C-c C-o (for latex-insert-block). Emacs prompts for the block name, and then for associated options. For example, type C-c C-o Enter document Enter Enter (the second Enter indicates no options). Emacs inserts the command pair and positions the cursor between them:

\begin{document}

_

\end{document}

You can use this command to mark up a text file after you write it. If you mark a region, you can type C-c C-o to wrap a command pair around that region.

A related command is C-c C-e (for latex-close-block). In this case, you type an opening command, press C-c C-e, and Emacs inserts the corresponding closing command.

These commands work with any keyword, regardless of what it is. Emacs can't check to make sure that it's a valid LATEX keyword or even that it's been defined. For example, if you type \begin{eating} C-c C-e, Emacs inserts \end{eating}. It's up to you to make sure you use valid keywords.

8.5.4 Processing and Printing Text

In addition to marking up files for LATEX, you can process files, see your errors (if any), and invoke a viewer, all without leaving Emacs. To process a file, just type C-c C-f (for tex-file).[54] Emacs saves the file before processing it. Messages that would appear on screen are channeled to a buffer called *tex-shell*, which Emacs displays on your screen. If the buffer isn't on the screen, typing C-c C-l (for tex-recenter-output-buffer) automatically displays it.

To demonstrate, let's try processing dickens.tex, a very basic file indeed.


Type: C-c C-f

Processing a LATEX file displays a special *tex-shell* buffer.


This command generates a .dvi file, which is an intermediate, device-independent file. You can view the resulting file by typing C-c C-v. On Linux, the default viewer is xdvi. Pressing C-c C-v displays the output in an xdvi window.


Type: C-c C-v

Output displayed by xdvi.


To print the .dvi file, give the command C-c C-p (for tex-print); this formats the .dvi file and sends it to your default printer. C-c C-q (tex-show-print-queue) displays the print queue so you know when to go to the printer to look for your processed output.

Two important variables tell Emacs how to print a TEX file. You need to know about them if C-c C-p or C-c C-q doesn't work correctly; if these commands don't work, the configuration of TEX on your system may be nonstandard, or the print and print queue commands may be slightly different. The variable tex-dvi-print-command determines the command that is used to print a .dvi file; its default is lpr -d. The command used to show the print queue is controlled by the tex-show-queue-command variable, which is set to lpq by default.
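As an illustration, here is how these two variables might be set in .emacs to send output to a particular printer. This is only a sketch: the printer name laser is hypothetical, and we assume that your lpr and lpq commands accept the usual -P option for selecting a printer.

```elisp
;; Command used by C-c C-p (tex-print). "lpr -d" is the default;
;; "-Plaser" selects a hypothetical printer named "laser".
(setq tex-dvi-print-command "lpr -d -Plaser")

;; Command used by C-c C-q (tex-show-print-queue), for the same printer.
(setq tex-show-queue-command "lpq -Plaser")
```

If your system uses different print commands altogether, substitute them for lpr and lpq in the strings.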

Table 8-9 summarizes TeX and LaTeX mode commands.


Table 8-9. TeX and LaTeX mode commands

| Keystrokes | Command name | Action |
| --- | --- | --- |
| (none) | tex-mode | Enter TeX or LaTeX mode according to file's contents. |
| (none) | plain-tex-mode | Enter TeX mode. |
| (none) | latex-mode | Enter LaTeX mode. |
| C-j | tex-terminate-paragraph | Insert two hard returns (standard end of paragraph) and check syntax of paragraph. |
| C-c { | tex-insert-braces | Insert two braces and put cursor between them. |
| C-c } | up-list | If you are between braces, position the cursor following the closing brace. |
| (none); TeX → Validate Buffer | tex-validate-buffer | Check buffer for syntax errors. |
| (none); TeX → Validate Region | tex-validate-region | Check the region for syntax errors. |
| C-c C-f; TeX → TeX File | tex-file | Saves the current file, then processes it. |
| C-c C-b; TeX → TeX Buffer | tex-buffer | Process buffer.[55] |
| C-c C-l; TeX → TeX Recenter | tex-recenter-output-buffer | Put the message shell on the screen, showing (at least) the last error message. |
| C-c C-k; TeX → TeX Kill | tex-kill-job | Kill processing. |
| C-c C-p; TeX → TeX Print | tex-print | Print output. |
| C-c C-q; TeX → Show Print Queue | tex-show-print-queue | Show print queue. |
| C-c C-e | latex-close-block | Provide closing element of a command pair. |
| (none) | tex-close-latex-block | Provide closing element of a command pair. |
| C-c Tab; TeX → BibTeX File | tex-bibtex-file | Process the current file using BibTeX, a system for creating bibliographies automatically. |
| C-c C-v; TeX → TeX View | tex-view | View .dvi output. |
| (none); TeX → TeX Print (alt printer) | tex-alt-print | Print .dvi file using an alternative printer defined by the variable tex-alt-dvi-print-command. |
| C-c C-o | latex-insert-block | Insert a block (prompts for block name and options). |
| C-c C-u | tex-goto-last-unclosed-latex-block | Look backward in the file to find the nearest unclosed block and move the cursor there. |
| M-Enter | latex-insert-item | Insert \item. |
| (none) | latex-split-block | Insert an end to the current block and the beginning of a new one. |
| " | tex-insert-quote | Insert TeX-style quotation marks. |