As many programmers know, the task of programming usually breaks down into a cycle of think-write-debug. If you have used Unix (or various other operating systems) for programming, you have probably become accustomed to using separate tools for each phase of the cycle, for example, a text editor for writing, a compiler for compiling, and the operating system itself for running programs. You would undoubtedly find an environment much more productive if the boundaries between the cycle phases—and the tools that support them—were erased.
Emacs provides considerable support for writing, running, and debugging programs written in a wide variety of languages, and it integrates this support into a smooth framework. You never have to leave Emacs when developing programs, so you will find it easier to concentrate on the actual programming task (i.e., the "think" part of the cycle) because you won't have to spend lots of time going from one tool to another.
When you write code, you can use one of Emacs's programming language modes; these turn Emacs into a spiffy syntax-directed or language-sensitive editor that knows about the syntax of the language. That makes it easier for you to write code in a uniform, easy-to-read, customizable style. Language modes exist for several different programming languages.
Emacs also supports running and debugging programs. Shell mode (see Chapter 5) and multiple windows (see Chapter 4) allow you to run your code while editing it. Emacs has a powerful facility for interfacing to many compilers and the Unix make command: Emacs can interpret compilers' error messages and visit files where errors occur at the appropriate line number. Indeed, many tools (such as the Java build tool, ant) include command-line options to format their output in an Emacs-friendly way.
In this chapter, we cover the features of language modes in general such as compiling and debugging programs, comments, indentation, and syntax highlighting. We also spend a bit of time upfront looking at the etags facility, which is a great help to programmers who work on large, multifile projects. These features apply to all language modes. We then delve into Emacs's support for various languages, including C, C++, Java, Perl, SQL, and Lisp.
Emacs provides a number of features that appeal to developers. You can edit code quickly with font support and auto-completion of function and variable names; you can compile the program and even run a debugger all without leaving your "editor." While you don't have some of the graphical tools commonly found in commercial integrated development environments (IDEs), almost every other feature of those IDEs can be found in Emacs—for every language you could imagine working in.
Of course, there will always be occasions when you need to view your documents without the bells and whistles some language modes attach. You can always switch to plain text (M-x text-mode) or, more to the point, fundamental mode (M-x fundamental-mode).
As mentioned at the beginning of this chapter, Emacs's support for programmers does not end when you are done writing the code. A typical strategy for using Emacs when working on a large programming project is to log in, go to the directory where your source files reside, and invoke Emacs on the source files (e.g., emacs Makefile myproj*.[ch] for C programmers). While you are editing your code, you can compile it using the commands described later—as you will see, you need not even worry about saving your changes. You can also test your compiled code in a shell using shell mode (see Chapter 5). The bottom line is that you should rarely—if ever—have to leave Emacs throughout your session.
Emacs provides an interface to compilers and the Unix make utility that is more direct and powerful than shell mode. At the heart of this facility is the command M-x compile Enter. This command causes a series of events to occur. First, it prompts you for a compilation command. The default command is make -k,[56] but if you type another command, that new command becomes the default for subsequent invocations during your Emacs session. You can change the default by setting the variable compile-command in your .emacs file. For example, to use the Java build tool ant as your default compile command, just add this line:
(setq compile-command "ant -emacs")
After you have typed the command, Emacs offers to save all unsaved file buffers, thus relieving you of the responsibility of making sure your changes have been saved. It then creates a buffer called *compilation* and an associated window. It runs the compilation command (as a subprocess, just like the shell in shell mode), with output going to the *compilation* buffer. While the command runs, the minibuffer says Compiling: run; it says exit when the compile job finishes.
Now the fun begins. If the compilation resulted in an error, you can type C-x ` (for next-error; this is a backquote, not a single quote). Emacs reads the first error message, figures out the file and line number of the error, and visits the file at that line number. After you have corrected the error, you can type C-x ` again to visit subsequent error locations. Each time you type C-x `, Emacs scrolls the *compilation* window so that the current error message appears at the top.
To start at the first error message again, type C-x ` with a prefix argument (i.e., C-u C-x `). A nice thing about C-x ` is that you can use it as soon as an error is encountered; you do not have to wait for the compilation to finish.
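If you compile often, you may want these two commands on keys of their own. The following is a minimal sketch for a .emacs file; the choice of F5 and F6 is only an assumption, and any keys that are free on your system would do.

```elisp
;; Assumed key choices; pick any keys you do not already use.
(global-set-key [f5] 'compile)      ; prompt for a command and run it
(global-set-key [f6] 'next-error)   ; the same command as C-x `
```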
The mode of the *compilation* buffer (compilation mode) supports a few other useful commands for navigating through the error messages, as summarized in Table 9-1.
Table 9-1. Compilation mode commands
Keystrokes | Command name | Action |
---|---|---|
C-x ` | next-error | Move to the next error message and visit the corresponding source code. |
M-n | compilation-next-error | Move to the next error message. |
M-p | compilation-previous-error | Move to the previous error message. |
C-c C-c | compile-goto-error | Visit the source code for the current error message. |
Space | scroll-up | Scroll down one screen. |
Del | scroll-down | Scroll up one screen. |
Space and Del are handy screen-scrolling commands found in various read-only Emacs modes.
Note that M-n and M-p do not visit the source code corresponding to the error message; they simply allow you to move easily through error messages that may take up more than one line each. However, you can visit the source code from any error message by typing C-c C-c.
How does Emacs interpret the error message? It uses the variable compilation-error-regexp-alist, which is a list of regular expressions designed to match the error messages of a wide variety of C and C++ compilers and the lint C code checking program.[57] It should also work with compilers for languages for which Emacs has language modes, such as Java, Fortran, Ada, and Modula-2. Emacs tries to parse (analyze) an error message with each of the regular expressions in the list until it finds one that extracts the filename and line number where the error occurred.
There is a chance that the error message parser won't work with certain compilers, especially if you are using Emacs on a non-Unix system. You can find out by trying M-x compile on some code that you know contains an error; if you type C-x `, and Emacs claims that there are no more errors, the next-error feature does not work with your compiler.
If the parser doesn't work for you, you may want to try adding a regular expression to compilation-error-regexp-alist that fits your compiler's error message format. We'll show you an example of this in Chapter 11.
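To give a rough idea of what such an addition looks like, here is a sketch for a hypothetical compiler that reports errors as "Error in FILE at line N". The message format is invented for illustration; you would need to adapt the regular expression to what your compiler actually prints.

```elisp
(require 'compile)   ; make sure compilation-error-regexp-alist is defined

;; Hypothetical message format:  Error in parse.c at line 42: missing ';'
;; Each entry is (REGEXP FILE-GROUP LINE-GROUP): group 1 captures the
;; filename and group 2 captures the line number.
(add-to-list 'compilation-error-regexp-alist
             '("^Error in \\([^ \n]+\\) at line \\([0-9]+\\)" 1 2))
```

With an entry like this in place, C-x ` should be able to pull the filename and line number out of messages in that format.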
The compile package also includes similar support for the Unix grep (search files) command, thus effectively giving Emacs a multifile search capability. If you type M-x grep, you are prompted for arguments to send to grep—that is, a search pattern and filename(s). Emacs runs grep with the -n option, which tells it to print filenames and line numbers of matching lines.[58] The same happens as with M-x compile; you can type C-x ` to have Emacs visit the next matched line in its file.
We have already seen various examples of Emacs modes, including text mode (see Chapter 2) and shell mode (see Chapter 5). Special functionality like the buffer list (see Chapter 4) and Dired (see Chapter 5) are actually modes as well. All modes have two basic components: an Emacs Lisp package that implements the mode and a function that invokes it.
The version of Emacs on which this book is based (21.3.5) comes with language modes for Ada, assembly, awk, C, C++, Common Lisp, Fortran, ICON, Java, Lisp, MIM, Modula-2, Objective-C, Pascal, Pike, Perl, PROLOG, Python, Scheme, SGML, Simula, and SQL; future versions will undoubtedly add more. Many—but not all—of the language modes are "hooked" into Emacs so that if you visit a file with the proper filename suffix, you will automatically be put in the correct mode. To find out whether Emacs does this for the language you use, look up your language in the table of Emacs Lisp packages in Appendix B. If one or more suffixes is listed in the right-hand column, Emacs invokes the mode for files with those suffixes.
However, if no suffix is listed (or if your compiler supports a different suffix than the ones listed), you can set up Emacs to invoke the mode automatically when you visit your source files. You need to do two things: first, look again at the right-hand column in the package table entry for your language, and you will find the name of the function that invokes the mode (e.g., ada-mode, modula-2-mode). Second, you insert code in your .emacs file that tells Emacs to automatically load the proper package whenever you visit a file with the suffix for the language in question.
You need to write two lines of code for this customization. The first uses the autoload function, which tells Emacs where to look for commands it doesn't already know about. It sets up an association between a function and the package that implements the function so that when the function is invoked for the first time, Emacs loads the package to get the code. In our case, we need to create an association between a function that invokes a language mode and the package that implements the mode. This shows the format of autoload:
(autoload 'function "filename" "description" t)
Note the single quote preceding function and the double quotes around filename and description; for more details on this Lisp syntax, see Chapter 11. If you are a PHP programmer, for example, you can grab the latest Emacs PHP mode from http://sourceforge.net/projects/php-mode/ online. You would then put the following line in your .emacs file:
(autoload 'php-mode "php-mode" "PHP editing mode." t)
This tells Emacs to load the PHP package when the function php-mode is invoked for the first time.
The second line of code completes the picture by creating an association between the suffix for source files in your language and the mode-invoking function so that the function is automatically invoked when you visit a file with the proper suffix. This involves the Emacs global variable auto-mode-alist, covered in Chapter 10; it is a list of associations that Emacs uses to put visited files in modes according to their names. To create such an association for PHP mode so that Emacs puts all files with the suffix .php in that mode, add this line to your .emacs file:
(setq auto-mode-alist (cons '("\\.php$" . php-mode) auto-mode-alist))
This Lisp code sets up the following chain of events when you visit a file whose suffix indicates source code in your programming language. Let's say you visit the file pgm.php. Emacs reads the file, then finds an entry corresponding to the .php suffix in the auto-mode-alist and tries to invoke the associated function php-mode. It notices that the function php-mode doesn't exist, but that there is an autoload association between it and the PHP package. It loads that package and, finding the php-mode command, runs it. After this, your buffer is in PHP mode.
For some interpreted languages like Perl and Python, you will also want to update the interpreter-mode-alist global variable:
(setq interpreter-mode-alist
(cons '("python" . python-mode) interpreter-mode-alist))
If your script file begins with the Unix interpreter prefix #!, Emacs checks that line to determine what language you are using. That can be especially helpful when the script file does not have a telltale extension like .py or .pl.
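The pieces described above can be gathered in one place. This sketch uses add-to-list, an alternative to the setq and cons idiom that avoids adding a duplicate entry if your .emacs file is loaded twice. The "php" interpreter entry is an assumption about how your scripts begin (for example, with #!/usr/bin/php).

```elisp
;; Load the PHP package the first time php-mode is called.
(autoload 'php-mode "php-mode" "PHP editing mode." t)

;; Files whose names end in .php go into PHP mode.
(add-to-list 'auto-mode-alist '("\\.php\\'" . php-mode))

;; Assumed: scripts whose #! line names an interpreter called "php".
(add-to-list 'interpreter-mode-alist '("php" . php-mode))
```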
Although language modes differ in exact functionality, they all support the same basic concepts. The most important of these involves knowledge of the syntax of the language in question—its characters, vocabulary, and certain aspects of its grammar. We have already seen that Emacs handles some syntactic aspects of human language. When you edit regular text, Emacs knows about words, sentences, and paragraphs: you can move the cursor and delete text with respect to those units. It also knows about certain kinds of punctuation, such as parentheses: when you type a right parenthesis, it "flashes" the matching left parenthesis by moving the cursor there for a second and then returning.[59] This is a convenient way of ensuring that your parentheses match correctly.
Emacs has knowledge about programming language syntax that is analogous to its knowledge of human language syntax. In general, it keeps track of the following basic syntactic elements:
• Words, which correspond to identifiers and numbers in most programming languages.
• Punctuation, which includes such things as operators (e.g., +, -, <, and >) and statement separators (e.g., semicolons).
• Strings, which are strings of characters to be taken literally and surrounded by delimiters (such as quotation marks).
• Parentheses, which can include such things as square brackets ([ and ]) and curly braces ({ and }) as well as regular parentheses.
• Whitespace, such as spaces and tabs, which are to be ignored.
• Comments, which are strings of characters to be ignored and surrounded by delimiters that depend on the language (e.g., /* and */ for C, // and a newline for C++ and Java, or semicolon (;) and a newline for Lisp).
Emacs keeps this information internally in the form of syntax tables; like keymaps (as described in Chapter 10), Emacs has a global syntax table used for all buffers, as well as a local table for each buffer, which varies according to the mode the buffer is in. You can view the syntax table for the current buffer by typing C-h s (for describe-syntax). In addition, language modes know about more advanced language-dependent syntactic concepts like statements, statement blocks, functions, subroutines, Lisp syntactic expressions, and so on.
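Syntax tables can be changed as well as viewed. As a minimal sketch, and assuming you would like word commands such as M-f to treat an identifier like num_docs as a single word, the following hook changes the syntax class of the underscore in C mode buffers:

```elisp
(add-hook 'c-mode-hook
          (lambda ()
            ;; "w" is the word-constituent syntax class. With no table
            ;; argument, the current buffer's syntax table is modified.
            (modify-syntax-entry ?_ "w")))
```

Whether this is desirable is a matter of taste; it is shown here only to illustrate how a mode's table can be adjusted.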
All programming languages have comment syntax, so Emacs provides a few features that deal with comments in general; these are made language-specific in each language mode. The universal comment command for all language modes is M-; (for indent-for-comment).[60] When you type M-;, Emacs moves to a column equal to the value of the variable comment-column; if the text on the line goes past that column, it moves to one space past the last text character. It then inserts a comment delimiter (or a pair of opening and closing delimiters, as in /* and */ for C) and puts the cursor after the opening delimiter.
For example, if you want to add a comment to a statement, put the cursor anywhere on the line containing that statement and type M-;. The result is
result += y;                            /* _ */
You can then type your comment in between the delimiters. If you were to do the same thing on a longer line of code, say,
q_i = term_arr[i].num_docs / total_docs;
the result would be
q_i = term_arr[i].num_docs / total_docs; /* _ */
You can customize the variable comment-column, of course, by putting the appropriate code in your .emacs file. This is the most useful way if you want to do it permanently. But if you want to reset comment-column temporarily within the current buffer, you can just move the cursor to where you want the comment column to be and type C-x ; (for set-comment-column). Note that this command affects only the value of comment-column in the current buffer; its value in other buffers—even other buffers in the same mode—is not changed.
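For the permanent route, something along these lines could go in your .emacs file. The column numbers are arbitrary example values, and the C mode override is only an illustration of setting a different value for one language.

```elisp
;; Default comment column for all buffers (40 is an example value).
(setq-default comment-column 40)

;; Hypothetical per-mode override: a wider column in C buffers only.
;; comment-column becomes local to the buffer when set this way.
(add-hook 'c-mode-hook
          (lambda () (setq comment-column 48)))
```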
When you are typing a comment and want to continue it on the next line, M-j (for indent-new-comment-line) does it. This command starts a new comment on the next line (though some language modes allow you to customize it so that it continues the same comment instead). Say you have typed in the text of the comment for this statement, and the cursor is at the end of the text:
result += y;                            /* add the multiplicand_ */
You want to extend the comment to another line. If you type M-j, you get the following:
result += y;                            /* add the multiplicand */
                                        /* */
You can type the second line of your comment. You can also use M-j to split existing comment text into two lines. Assume your cursor is positioned like this:
result += y;                            /* add the _multiplicand */
If you type M-j now, the result is:
result += y;                            /* add the */
                                        /* multiplicand */
If you want to comment out a section of your code, you can use the comment-region command (not bound to keystrokes except in certain language modes). Assume you have code that looks like this:
this = is (a);
section (of, source, code);
that += (takes[up]->a * number);
of (lines);
If you define a region in the usual way and type M-x comment-region, the result is:
/* this = is (a); */
/* section (of, source, code); */
/* that += (takes[up]->a * number); */
/* of (lines); */
You can easily get rid of single-line comments by typing M-x kill-comment Enter, which deletes any comment on the current line. The cursor does not have to be within the comment. Each language mode has special features relating to comments in the particular language, usually including variables that let you customize commenting style.
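Since comment-region has no global key binding, you may want to give it one. This is a sketch only; the keys C-c c and C-c u are assumptions (C-c followed by a letter is the range Emacs sets aside for your own bindings).

```elisp
;; Assumed key choices in the user-reserved C-c LETTER range.
(global-set-key "\C-cc" 'comment-region)
(global-set-key "\C-cu" 'uncomment-region)
```

If you prefer not to bind a second key, typing C-u before M-x comment-region removes the comment delimiters from the region instead of adding them.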
In addition to syntactic knowledge, Emacs language modes contain various features to help you produce nicely formatted code. These features implement standards of indentation, commenting, and other aspects of programming style, thus ensuring consistency and readability, getting comments to line up, and so on. Perhaps more importantly, they relieve you of the tiresome burden of supplying correct indentation and even of remembering what the current indentation is. The nicest thing about these standards is that they are usually customizable.
We have already seen that, in text mode, you can type C-j instead of Enter, at the end of a line, and Emacs indents the next line properly for you. This indentation is controlled by the variable left-margin, whose value is the column to indent to. Much the same thing happens in programming language modes, but the process is more flexible and complex.
As in text mode, C-j indents the next line properly in language modes. You can also indent any line properly after it has been typed by pressing Tab with the cursor anywhere on the line.
Some language modes have extra functionality attached to characters that terminate statements—like semicolons or right curly braces—so that when you type them, Emacs automatically indents the current line. Emacs documentation calls this behavior electric. Most language modes also have sets of variables that control indentation style (and that you can customize).
Table 9-2 lists a few other commands relating to indentation that work according to the rules set up for the language in question.
Table 9-2. Basic indentation commands
Keystrokes | Command name | Action |
---|---|---|
C-M-\ | indent-region | Indent each line between the cursor and mark. |
M-m | back-to-indentation | Move to the first nonblank character on the line. |
M-^ | delete-indentation | Join this line to the previous one. |
The following is an example of what C-M-\ does. This example is in C, and subsequent examples refer to it. The concepts in all examples in this section are applicable to most other languages; we cover analogous Lisp and Java features in the sections on modes for those languages.
Suppose you have the following C code:
int times (x, y)
int x, y;
{
int i;
int result = 0;
for (i = 0; i < x; i++)
{
result += y;
}
}
If you set mark at the beginning of this code, put the cursor at the end, and type C-M-\, Emacs formats it like this:
int times (x, y)
     int x, y;
{
  int i;
  int result = 0;
  for (i = 0; i < x; i++)
    {
      result += y;
    }
}
C-M-\ is also handy for indenting an entire file according to your particular indentation style: you can just type C-x h (for mark-whole-buffer) followed by C-M-\.
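If you reindent whole files regularly, the two keystrokes can be wrapped in a small command of your own. The sketch below is one possible way to do it; the name my-indent-whole-buffer is made up for this example.

```elisp
(defun my-indent-whole-buffer ()
  "Reindent every line of the current buffer using the mode's rules."
  (interactive)
  (save-excursion
    ;; A nil third argument asks for the mode's own indentation
    ;; rather than a fixed column.
    (indent-region (point-min) (point-max) nil)))
```

Unlike C-x h followed by C-M-\, this version leaves the cursor and the mark where they were.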
M-m is handy for moving to the beginning of the actual code on a line. For example, assume your cursor is positioned like this:
int result = 0;
If you type M-m, it moves to the beginning of the int:
int result = 0;
As an example of M-^, let's say you want the opening curly brace for the for statement to appear on the same line as the for. Put the cursor anywhere on the line with the opening curly brace, type M-^, and the code looks like this:
  for (i = 0; i < x; i++) {
      result += y;
    }
Language modes usually provide additional indentation commands that relate to specific features of the language. Having covered the general language mode concepts, we want to show you a few other general utilities: etags and font-lock mode. The etags facility helps programmers who work on large, multifile programs. All language modes can also take advantage of font-lock mode to make development more efficient.
Another general feature of Emacs that applies to programmers is the etags facility.[61] etags works with code in many other languages as well, including Fortran, Java, Perl, Pascal, LaTeX, Lisp, and many assembly languages. If you work on large, multifile projects, you will find etags to be an enormous help.
etags is basically a multifile search facility that knows about C and Perl function definitions as well as searching in general. With it, you can find a function anywhere in an entire directory without having to remember in which file the function is defined, and you can do searches and query-replaces that span multiple files. etags uses tag tables, which contain lists of function names for each file in a directory along with information on where the functions' definitions are located within the files. Many of the commands associated with etags involve regular expressions (see Chapter 11) in search strings.
To use etags, you must first invoke the separate etags program in your current directory to create the tag table. Its arguments are the files for which you want tag information. The usual way to invoke it is etags *.[ch], that is, building a tag table from all files ending in .c or .h. (That's for you C programmers; other languages would use their appropriate extensions, of course.) You can run etags from shell mode or with the command M-! (for shell-command). The output of etags is the file TAGS, which is the tag table. When you are writing code, you can update your tag table to reflect new files and function definitions by invoking etags again.
After you have created the tag table, you need to make it known to Emacs. To do this, type M-x visit-tags-table Enter. This prompts you for the name of the tag table file; the default is TAGS in the current directory, as you would expect. After you execute this step, you can use the various Emacs tags commands.
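Both steps can be set up ahead of time in your .emacs file. The following is a minimal sketch; the directory ~/src/myproj and the command name my-rebuild-tags are hypothetical and stand in for your own project.

```elisp
;; Hypothetical project directory; a directory entry means the file
;; named TAGS inside it.
(setq tags-table-list '("~/src/myproj"))

(defun my-rebuild-tags ()
  "Run etags on the C sources in the current buffer's directory."
  (interactive)
  (shell-command "etags *.[ch]"))
```

With the first line in place, the tags commands know where to look without a separate M-x visit-tags-table; the second gives you a quick way to refresh the table after adding functions.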
The most important tag command is M-. (for find-tag). This command prompts you for a string to use in searching the tag table for a function whose name contains the string. Supply the search string, and Emacs visits the file containing the matching function name in the current window and goes to the first line of the function's definition. A variation of M-. is C-x 4 . (for find-tag-other-window), which uses another window instead of replacing the text in your current window.
A nice feature of M-. is that it picks up the word the cursor is on and uses it as the default search string. For example, if your cursor is anywhere on the string my_function, M-. uses my_function as the default. Thus, when you are looking at a C statement that calls a function, you can type M-. to see the code for that function.
If you have multiple functions with the same name, M-. finds the function in the file whose name comes first in alphabetical order. To find the others, you can use the command M-, (for tags-loop-continue) to find the next one (or complain if there are no more). This feature is especially useful if your directory contains more than one program, that is, if there is more than one function called main. M-, also has other uses, as we will see.
You can use the tag table to search for more than just function definitions. The command M-x tags-search Enter prompts for a regular expression; it searches through all files listed in the tag table (such as all .c and .h files) for any occurrence of the regular expression, whether it is a function name or not. This capability is similar to the grep facility discussed earlier in this chapter. After you have invoked tags-search, you can find additional matches by typing M-,.
There is also an analogous query-replace capability. The command M-x tags-query-replace Enter does a regular expression query-replace (see Chapter 3) on all files listed in the tag table. As with the regular query-replace-regexp command, if you precede tags-query-replace with a prefix argument (i.e., C-u M-x tags-query-replace Enter), Emacs replaces only matches that are whole words. This feature is useful, for example, if you want to replace occurrences of printf without disturbing occurrences of fprintf. If you exit a tags-query-replace with Esc or C-g, you can resume it later by typing M-,.
The command M-x tags-apropos rounds out the search facilities of etags. If you give it a regular expression argument, it opens a *Tags List* buffer that contains a list of all tags in the tag table (including names of files as well as functions) that match the regular expression. For example, if you want to find out the names of output routines in a multiple-file C program, you could invoke tags-apropos with the argument print or write.
Finally, you can type M-x list-tags Enter to list all the tags in the table (that is, all the functions) for a given C file. Supply the filename at the prompt, and you get a *Tags List* buffer showing the names of functions defined in that file along with their return types (if any). Note that if you move your cursor to this list, you can use M-. to look at the actual code for the function. M-. picks up the word the cursor is on as the default function name, so you can just move the cursor to the name of the function you want to see and press M-. followed by Enter to see it.
There's one last common feature to mention. The use of fonts to help present code is very popular—so popular, in fact, that it is now universal. Unlike the indentation and formatting supported by the various language modes, nothing in the code itself changes. But when you're in font-lock mode, your program certainly looks different.
You can turn on this feature for any language mode with M-x font-lock-mode to see for yourself. Keywords get a particular color; comments get a different color and are often italicized; strings and literals get yet another color. It can aid quick browsing of code. Many people come to depend on it much the way they rely on proper indentation. If you become one of those people, you'll want to make it the default for all language sessions. You can add the following line to your .emacs file to achieve this aim:
;; Turn on font-locking globally
(global-font-lock-mode t)
The colors and styles used are customizable if you don't like the defaults. M-x list-faces-display produces a list of the named faces Emacs knows about. You'll see something similar to the screen shown in Figure 9-1.
Figure 9-1. Fonts available for customization in Emacs
Of course, in real life, the colors and bold and whatnot should be more pronounced. You'll also see quite a few more faces. You can modify any of those faces with either M-x modify-face (a simple prompted "wizard" approach) or M-x customize-face (the big fancy interactive approach). You can also add lines to your .emacs file for your favorite customizations. Here's an example:
(custom-set-faces
 '(font-lock-comment-face
   ((((class color) (background light))
     (:foreground "Firebrick" :slant italic)))))
Note that not all displays support all of the possible variations of bold, italic, underline, colors, and so on. This is a classic case of "your mileage may vary." Still, with the ability to customize it all yourself, you should be able to find a combination that works well on your system.
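If you would rather set individual attributes directly than write a full face specification, a sketch like the following may be enough. The color name and the choice of faces are assumptions for illustration, and whether they take effect depends on what your display supports.

```elisp
(require 'font-lock)   ; make sure the font-lock faces are defined

;; Example values only; substitute colors your display can show.
(set-face-foreground 'font-lock-string-face "DarkGreen")
(set-face-attribute 'font-lock-keyword-face nil :weight 'bold)
```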
The remaining sections in this chapter deal with several of the language-specific modes including JDEE, a suite of packages devoted to the world of Java development in Emacs.
You need not read all of these sections if you are interested in only one or two of the languages. If you program in another language for which Emacs has a mode, you may want to read one of the following sections to get the "flavor" of a language mode; all language modes have the same basic concepts, so this should get you off to a good start. Indeed, many language modes use another mode as a base. For example, Java mode is really just an extension of C mode.
Emacs automatically enters C mode when you visit a file whose suffix is .c, .h, .y (for yacc grammars), or .lex (lex specification files). Emacs invokes C++ mode when you visit a file whose suffix is .C, .H, .cc, .hh, .cpp, .cxx, .hxx, .c++, or .h++. You can also put any file in C mode manually by typing M-x c-mode Enter. Similarly, you can use c++-mode to put a buffer into C++ mode.
Both C and C++ modes are implemented in the same Emacs Lisp package, called cc-mode,[62] which also includes a mode for the Objective-C language used in Mac OS X. C mode understands both ANSI C and the older Kernighan and Ritchie C syntax. We describe C mode functions, but you should assume that everything also applies to C++ mode. C++ mode has a small number of additional features, which we describe at the end of this section.
We should also note that the Emacs mode for Perl is derived from an older version of C mode. If you program in Perl, you will find that virtually all of the motion, indentation, and formatting commands in C mode apply equally to Perl mode, with perl- replacing c- in their names. Emacs invokes Perl mode on files with suffix .pl. (However, to be honest we prefer CPerl mode, discussed later in this chapter.)
In C mode, Emacs understands the syntax elements described earlier in this chapter. The characters semicolon (;), colon (:), comma (,), curly braces ({ and }), and pound sign (#, for C preprocessor commands) are all electric, meaning that Emacs automatically indents the current line when you type them. It also actively uses the font options when you have font-lock mode turned on.
In addition to the standard Emacs commands for words and sentences (which are mainly useful only inside multiline comments), C mode contains advanced commands that know about statements, functions,[63] and preprocessor conditionals. A summary of these commands appears in Table 9-3.
Table 9-3. Advanced C motion commands
Keystrokes | Command name | Action |
---|---|---|
M-a | c-beginning-of-statement | Move to the beginning of the current statement. |
M-e | c-end-of-statement | Move to the end of the current statement. |
M-q | c-fill-paragraph | If in comment, fill the paragraph, preserving indentations and decorations. |
C-M-a | beginning-of-defun | Move to the beginning of the body of the function surrounding the point. |
C-M-e | end-of-defun | Move to the end of the function. |
C-M-h | c-mark-function | Put the cursor at the beginning of the function, the mark at the end. |
C-c C-q | c-indent-defun | Indent the entire function according to indentation style. |
C-c C-u | c-up-conditional | Move to the beginning of the current preprocessor conditional. |
C-c C-p | c-backward-conditional | Move to the previous preprocessor conditional. |
C-c C-n | c-forward-conditional | Move to the next preprocessor conditional. |
Notice that the statement motion commands have the same key bindings as backward-sentence and forward-sentence, respectively. In fact, they act as sentence commands if you use them within a C comment.
Similarly, M-q is normally the fill-paragraph command; C mode augments it with the ability to preserve indentations and decorative characters at the beginnings of lines. For example, if your cursor is anywhere in this comment:
/* This is
* a
* comment paragraph with wildly differing right
* margins.
* It goes on for a while,
* then stops. */
typing M-q has this result:
/* This is a comment paragraph with wildly differing right margins.
* It goes on for a while, then stops. */
You will find that the preprocessor conditional motion commands are a godsend if you have to slog through someone else's voluminous code. Especially if you're faced with code built to run on a variety of systems—like Emacs itself—often the most important question you need answered is, "What code is actually compiled?"
With C-c C-u, you can tell instantly what preprocessor conditional governs the code in question. Consider this code block:
#define LUCYX
#define BADEXIT -1
#ifdef LUCYX
...
*ptyv = open ("/dev/ptc", O_RDWR | O_NDELAY, 0);
if (fd < 0)
return BADEXIT;
...
#else
...
fprintf (stderr, "You can't do that on this system!");
...
#endif
Imagine that the ellipses (...) represent hundreds of lines of code. Now suppose you are trying to determine under what conditions the file /dev/ptc is opened. If your cursor is on that line of code, you can type C-c C-u, and the cursor moves to the line #ifdef LUCYX, telling you that the code is compiled if you're on a LUCYX system. If you want to skip the code that would not be compiled and go directly to the end of the conditional, type C-c C-n. We will see another command that is useful for dealing with C preprocessor code later in this section.
C statement and statement block delimiter characters are bound to commands that, in addition to inserting the appropriate character, also provide proper indentation. These characters are {, }, ;, and : (for labels and switch cases). For example, if you are closing out a statement block or function body, you can press C-j (or Enter) and type }, and Emacs lines it up with its matching {. This eliminates the need for you to scroll back through the code to find out what column the { is in.
Because } is a parenthesis-type character, Emacs attempts to "flash" a matching { when you type }. If the matching { is outside of the text displayed in your window, Emacs instead prints the line containing the { in the minibuffer. Furthermore, if only whitespace (blanks or tabs) follows the { on its line, Emacs also prints a ^J (for C-j) followed by the next line, thus giving a better idea of the context of the {.
Recall the "times" example earlier in this chapter. Let's say you are typing in a } to end the function, and the { that begins the function body is off-screen. There is no code on the line following the beginning {, so you see the following in the minibuffer after you type }:
Matches {^J int i;
Coding style in C—or any programming language for that matter—is a very personal thing. C programmers learn from various books or by referring to various different pieces of other people's code; eventually they evolve a personal style that may or may not conform to those that they learned from.
C mode provides a rich set of features for customizing its indentation behavior that mirrors this way of learning the language. At the simplest level, you can choose a coding style by name. Then, if you're not satisfied, you can customize your chosen style or even create your own from scratch. The latter tasks, however, require a fair amount of Emacs Lisp programming knowledge (see Chapter 11) and perhaps a bit of bravery.
You can choose a named coding style with the command M-x c-set-style. This command prompts you for the name of the style you want. The easiest thing to do at this point is to type Tab, the completion character (see Chapter 14), which brings up a *Completions* window that lists all of the choices. Type one of them and press Enter to select it.
By default, Emacs comes loaded with the styles shown in Table 9-4.
Table 9-4. Built-in cc-mode indentation styles
Style | Description |
---|---|
bsd | Style used in code for BSD-derived versions of Unix. |
cc-mode | The default coding style, from which all others are derived. |
ellemtel | Style used in C++ documentation from Ellemtel Telecommunication Systems Laboratories in Sweden. |
gnu | Style used in C code for Emacs itself and other GNU-related programs. |
java | Style used in Java code (the default for Java mode). |
k&r | Style of the classic text on C, Kernighan and Ritchie's The C Programming Language. |
linux | Style used in C code that is part of the Linux kernel. |
python | Style used in C code for Python extension modules. |
stroustrup | C++ coding style of the standard reference work, Bjarne Stroustrup's The C++ Programming Language. |
user | Customizations you make to .emacs or via Custom (see Chapter 10). All other styles inherit these customizations if you set them. |
whitesmith | Style used in Whitesmith Ltd.'s documentation for their C and C++ compilers. |
To show how some of these styles work, let's start with the C function example from earlier in this chapter:
int times (x, y)
int x, y;
{
int i;
int result = 0;
for (i = 0; i < x; i++)
{
result += y;
}
}
If you define a region around this code and you type C-M-\ (for indent-region), Emacs reformats the code in the default style like this:
int times (x, y)
int x, y;
{
int i;
int result = 0;
for (i = 0; i < x; i++)
{
result += y;
}
}
If you type C-c . (for c-set-style), enter k&r, and then repeat the reformatting, the code looks like this:
int times (x, y)
int x, y;
{
     int i;
     int result = 0;
     for (i = 0; i < x; i++)
     {
          result += y;
     }
}
Or, if you want to switch to GNU-style indentation, choose the style gnu and reformat. The result is:
int times (x, y)
     int x, y;
{
  int i;
  int result = 0;
  for (i = 0; i < x; i++)
    {
      result += y;
    }
}
Once you decide on a coding style, you can set it up permanently by putting a line in your .emacs file that looks like this:
(add-hook 'c-mode-hook
'(lambda ( )
(c-set-style "stylename")))
Unfortunately, we'll have to wait until Chapter 11 to understand exactly what this code does. For now, make sure that you insert a single quote (') before the (lambda in the second line.
Each coding style contains subtleties that make it nontrivial for Emacs to implement. Older versions of Emacs did this by defining several variables that controlled various indentation levels; these were not easy to work with and, frankly, did not really cover 100 percent of the nuances of each style. The current version of C mode, in contrast, uses a considerably larger set of variables, too large, in fact, for anyone other than hardy Emacs Lisp hackers to deal with.
Therefore, C mode keeps track of groups of these variables and their values under named styles. One huge variable, called c-style-alist, contains all of the styles and their associated information. You can customize this beast either by changing values of variables within existing styles or by adding a style of your own. For further details, look in the file cc-mode.el in your system's Emacs Lisp directory (see Chapter 11).
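As a rough illustration of adding a style of your own, the sketch below uses c-add-style, the cc-mode function for registering a new entry in c-style-alist. The style name my-team, the function name my-c-style-setup, and the particular offsets are invented for this example; treat it as a starting point under those assumptions rather than a recommended setup.

```elisp
;; Hypothetical custom style: "my-team" and my-c-style-setup are
;; invented names, and the offsets are illustrative only.
(require 'cc-mode)

(c-add-style "my-team"
             '("k&r"                     ; inherit all other settings from k&r
               (c-basic-offset . 4)      ; basic indentation step of 4 columns
               (c-offsets-alist
                (case-label . +))))      ; indent case labels one step

(defun my-c-style-setup ()
  "Select the hypothetical my-team style in the current buffer."
  (c-set-style "my-team"))

;; Run the setup function each time C mode starts.
(add-hook 'c-mode-hook 'my-c-style-setup)
```

Naming a base style as the first element of the description means you only have to list the settings that differ from it.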
C mode contains a number of other useful features, ranging from the generally useful to the arcanely obscure. Perhaps the most interesting of these are two ways of adding additional electric functionality to certain keystrokes, called auto-newline and hungry-delete-key.[64]
When auto-newline is enabled, it causes Emacs to add a newline character and indent the new line properly whenever you type a semicolon (;), curly brace ({ or }), or, at certain times, comma (,) or colon (:). These features can save you some time and help you format your code in a consistent style.
Auto-newline is off by default. To turn it on, type C-c C-a for c-toggle-auto-state. (Repeat the same command to turn it off again.) You will see the (C) in the mode line change to (C/a) as an indication. As an example of how it works, try typing in the code for our times( ) function. Type the first two lines up to the y on the second line:
int times (x, y)
int x, y_
Now press the semicolon; notice that Emacs inserts a newline and brings you down to the next line:
int times (x, y)
int x, y;
_
Type the opening curly brace, and it happens again:
int times (x, y)
int x, y;
{
_
Of course, the number of spaces Emacs indents after you type the { depends on the indentation style you are using.
The other optional electric feature, hungry-delete-key, is also off by default. To toggle it on, type C-c C-d (for c-toggle-hungry-state). You will see the (C) on the mode line change to (C/h), or if you have auto-newline turned on, from (C/a) to (C/ah).
Turning on hungry-delete-key empowers the Del key to delete all whitespace to the left of the point. To go back to the previous example, assume you just typed the open curly brace. Then, if you press Del, Emacs deletes everything back to the curly brace:
int times (x, y)
int x, y;
{_
You can toggle the states of both auto-newline and hungry-delete-key with the command C-c C-t (for c-toggle-auto-hungry-state).
If you want either of these features on by default when you invoke Emacs, you can put lines like the following in your .emacs file:
(add-hook 'c-mode-hook
'(lambda ( )
(c-toggle-auto-state)))
If you want to combine this customization with another C mode customization, such as the indentation style in the previous example, you need to combine the lines of Emacs Lisp code as follows:
(add-hook 'c-mode-hook
'(lambda ( )
(c-set-style "stylename")
(c-toggle-auto-state)))
Again, we will see what this hook construct means in "Customizing Existing Modes" in Chapter 11.
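Called with no argument, the toggle commands flip whatever the current state is. Both also accept an optional numeric argument, and a positive value turns the feature on regardless of its previous state, which is a little more predictable in a startup file. The sketch below assumes that behavior; the function name my-c-electric-setup is invented for illustration. (In newer cc-mode releases the auto-newline command is, as far as we know, named c-toggle-auto-newline, with the older name kept as an alias.)

```elisp
;; Hypothetical setup function; only the two c-toggle-... commands
;; come from cc-mode.
(defun my-c-electric-setup ()
  "Turn on auto-newline and hungry-delete-key in the current buffer."
  (c-toggle-auto-state 1)      ; positive argument: switch auto-newline on
  (c-toggle-hungry-state 1))   ; positive argument: switch hungry delete on

(add-hook 'c-mode-hook 'my-c-electric-setup)
```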
C mode also provides support for comments; earlier in the chapter, we saw examples of this support. There is, however, another feature. You can customize M-j (for indent-new-comment-line) so that Emacs continues the same comment on the next line instead of creating a new pair of delimiters. The variable comment-multi-line controls this feature: if it is set to nil (the default), Emacs generates a new comment on the next line, as in the example from earlier in the chapter:
result += y; /* add the multiplicand */
/* */
This outcome is the result of typing M-j after multiplicand, and it shows that the cursor is positioned so that you can type the text of the second comment line. However, if you set comment-multi-line to t (or any value other than nil), you get this outcome instead:
result += y; /* add the multiplicand
*/
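To get the second behavior in every C buffer, one option is to set the variable from the C mode hook. This is only a sketch: the function name my-c-comment-setup is made up, and some cc-mode versions may already set comment-multi-line themselves.

```elisp
;; Hypothetical setup function; comment-multi-line is the variable
;; described above.
(defun my-c-comment-setup ()
  "Make M-j continue the current comment instead of starting a new one."
  (setq comment-multi-line t))

(add-hook 'c-mode-hook 'my-c-comment-setup)
```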
The final feature we'll cover is C-c C-e (for c-macro-expand). Like the conditional compilation motion commands (e.g., C-c C-u for c-up-conditional), c-macro-expand helps you answer the often difficult question, "What code actually gets compiled?" when your source code contains a morass of preprocessor directives.
To use c-macro-expand, you must first define a region. Then, when you type C-c C-e, it takes the code within the region, passes it through the actual C preprocessor, and places the output in a window called *Macroexpansion*.
To see how this procedure works, let's go back to the code example from earlier in this chapter that contains C preprocessor directives:
#define LUCYX
#define BADEXIT -1
#ifdef LUCYX
*ptyv = open ("/dev/ptc", O_RDWR | O_NDELAY, 0);
if (fd < 0)
return BADEXIT;
#else
fprintf (stderr, "You can't do that on this system!");
#endif
If you define a region around this chunk of code and type C-c C-e, you see the following message:
Invoking /lib/cpp -C on region...
followed by this:
done
Then you see a *Macroexpansion* window that contains this result:
*ptyv = open ("/dev/ptc", O_RDWR | O_NDELAY, 0);
if (fd < 0)
return -1;
If you want to use c-macro-expand with a different C preprocessor command, instead of the default /lib/cpp -C (the -C option means "preserve comments in the output"), you can set the variable c-macro-preprocessor. For example, if you want to use an experimental preprocessor whose filename is /usr/local/lib/cpp, put the following line in your .emacs file:
(setq c-macro-preprocessor "/usr/local/lib/cpp -C")
We highly recommend keeping the -C option so that the preprocessor does not strip the comments from your code.
As we mentioned before, C++ mode uses the same Emacs Lisp package as C mode. When you're in C++ mode, Emacs understands C++ syntax, as opposed to C (or Objective-C) syntax. That results in differences in how some of the commands discussed here behave, but in ways that are not noticeable to the user.
There are few apparent differences between C++ and C mode. The most important is the Emacs Lisp code you need to put in your .emacs file to customize C++ mode: instead of c-mode-hook, you use c++-mode-hook. For example, if you want C++ mode's indentation style set to Stroustrup with automatic newlines instead of the default style, put the following in your .emacs file:
(add-hook 'c++-mode-hook
'(lambda ( )
(c-set-style "Stroustrup")
(c-toggle-auto-state)))
Notice that you can set hooks for C mode and C++ mode separately this way, so that if you program in both languages, you can set up separate indentation styles for each.
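Depending on your cc-mode version, the variable c-default-style may offer a more compact alternative to separate hooks: it can hold a list that pairs major modes with style names. The sketch below assumes your cc-mode provides this variable; the particular style choices are arbitrary examples.

```elisp
;; Per-mode default styles in one place. The styles chosen here are
;; only examples; "other" covers any cc-mode language not listed.
(setq c-default-style
      '((c-mode    . "k&r")
        (c++-mode  . "stroustrup")
        (java-mode . "java")
        (other     . "gnu")))
```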
C++ mode provides an additional command: C-c : (for c-scope-operator). This command inserts the C++ double colon (::) scope operator. It's necessary because the colon (:) is normally bound to electric functionality that can reindent the line when you don't want that done. The scope operator can appear virtually anywhere in C++ code whereas the single colon usually denotes a case label, which requires special indentation. The C-c : command may seem somewhat clumsy, but it's a necessary workaround to a syntactic clash in the C++ language.
Finally, both C and C++ mode contain the commands c-forward-into-nomenclature and c-backward-into-nomenclature, which aren't bound to any keystrokes by default. These are like forward-word and backward-word, respectively, but they treat capital letters in the middle of words as if they were starting new words. For example, they treat ThisVariableName as if it were three separate words while the standard forward-word and backward-word commands treat it as one word. ThisTypeOfVariableName is a style used by C++ programmers, as opposed to this_type_of_variable_name, which is somehow more endemic to old-school C code.
C++ programmers may want to bind c-forward-into-nomenclature and c-backward-into-nomenclature to the keystrokes normally bound to the standard word motion commands. We show you how to do this in "Customizing Existing Modes" in Chapter 11.
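For the impatient, here is a minimal sketch of such a binding, assuming the two nomenclature commands are available in your cc-mode (some newer Emacs versions mark them obsolete in favor of a separate subword facility). The function name my-c++-word-motion is invented for this example.

```elisp
;; Hypothetical setup function that rebinds the word-motion keys
;; in C++ buffers only.
(defun my-c++-word-motion ()
  "Make M-f and M-b stop at capital letters inside identifiers."
  (local-set-key (kbd "M-f") 'c-forward-into-nomenclature)
  (local-set-key (kbd "M-b") 'c-backward-into-nomenclature))

(add-hook 'c++-mode-hook 'my-c++-word-motion)
```

Because the bindings are made with local-set-key from the mode hook, the standard word commands keep their usual keys in every other mode.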
We've covered the main features of C and C++ modes, but actually these modes include many more features, most of them quite obscure or intended only for hardcore Emacs Lisp-adept customizers. Look in the Emacs Lisp package cc-mode.el—and the ever-expanding list of cc- helper packages—for more details.
As we mentioned earlier, recent versions of Emacs come with support for Java built-in (Java mode is based on cc-mode). We'll explore Java mode briefly and then take a more in-depth look at the Java Development Environment for Emacs (JDEE).
Java mode shares all of the formatting and font features mentioned above, but understands the Java language specifically. You get thrown into Java mode when opening any .java file.
When working in Java mode, you have exactly the same features available as you do in C mode. Syntax highlighting handles Java keywords and syntax when font-lock mode is turned on. You can navigate Java statements using M-a and M-e. When you comment out a region, Emacs uses the C++ style // comments.
You'll notice a small augmentation in the indent alignment commands if you choose to spread your throws or extends clauses over multiple lines. For example, consider the following method declaration:
public Object getNetResource(String host, int port, String resName)
throws IllegalArgumentException,
IOException,
SQLException,
FileNotFoundException
{
If you mark the region and run C-M-\ to indent the region, Emacs uses a special alignment for the exception list:
public Object getNetResource(String host, int port, String resName)
    throws IllegalArgumentException,
           IOException,
           SQLException,
           FileNotFoundException
{
It all works like it is supposed to—just with Java as the language at the core of the action. However, for more than casual Java editing, you should read the next section on the JDEE.
While you can certainly get started right away with the built-in Java mode, if you do more than occasional Java programming, you might want to venture into the world of Paul Kinnucan's Java Development Environment for Emacs (JDEE). It takes Emacs into the realm of Java IDE. You won't find a GUI builder, but everything else is in place and ready to roll.
You can pick up the latest version of the JDEE online from http://jdee.sunsite.dk/.[65] This site is essential to getting the JDEE up and running. You'll find all sorts of tips and tricks there, along with full user documentation on all of the bells and whistles.
Before you can install the JDEE, you'll need the following components:
Collection of Emacs Development Environment Tools (CEDET)
Available on SourceForge (http://cedet.sourceforge.net/) or by following the links from the JDEE home page. This collection is quite popular as a foundation for more interesting programmer tools. You may already have a sufficient version installed, but it's best to get the latest release.
The JDEE Emacs Lisp library package
Available as a separate download from the JDEE site.
One or more JDKs
While technically not required for editing files in Emacs, a JDK is required to take advantage of any of the compilation or debugging features of the JDEE. You'll also have to register each JDK you plan to use, but more on that later.
Installing CEDET is fairly straightforward if you have a make command available. (Windows users will want to have the Cygwin Unix distribution installed. It gives you access to a large subset of Unix tools, which will come in handy far beyond the installation of the JDEE.)
After you download the CEDET distribution from SourceForge, unpack it wherever you want it to reside. Open a terminal window (or start a Cygwin bash terminal on Windows) and change to the directory where you unpacked the distribution. From there you should be able to run the following command:
shell$ make EMACS=/path/to/emacs
That process will probably take a few minutes to complete. The Lisp files will be compiled for you.
When the make command completes, you should be in good shape. The last step for CEDET is to update your .emacs file:
;; Turn on CEDET's fun parts
(setq semantic-load-turn-useful-things-on t)
;; Load CEDET
(load-file "/path-to-cedet/common/cedet.el")
Installing the ELisp library package from the JDEE site is also straightforward. Unpack the downloaded file wherever you like, but before you run the make command, you'll need to edit the Makefile and configure the entries outlined in Table 9-5 to match your system.
Table 9-5. JDEE Makefile entries
Makefile entry | Example | Description |
---|---|---|
| | The top-level directory for any shared or info directories. |
| | The directory where your main Emacs directory is located. |
| | The directory where any local Lisp files should be installed. |
| | The directory where the elib Lisp files will go. |
| | The command to start Emacs. This can be a fully qualified path or simply "emacs" to reach the default version found on your system. |
Run the make command with the install option to get everything set up:
shell$ make install
The last step for the ELisp library is to make sure the Emacs defaults acknowledge the new package. You simply need to add the new directory to your load-path variable, as described next.
The ELisp library actually provides a simple template file that matches where you installed the package. After the make process completes, you should have an elib_startup.el file in the directory where you ran the make command. That file contains the line you'll need to add to your .emacs file or you can merge it with the system default.el file for everyone to use. (The default.el file is often found in your site-lisp directory. Chapter 11 has more details.)
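The line in question is typically a load-path addition along the following lines. This is a sketch only: the directory shown is a placeholder, and the contents of your generated elib_startup.el may differ, so prefer that file if you have it.

```elisp
;; Placeholder directory: substitute the elib directory that
;; "make install" actually used on your system.
(add-to-list 'load-path "/usr/local/share/emacs/site-lisp/elib")
```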
Five basic steps are required to install the JDEE on your system:
1. Get the necessary prerequisites downloaded and installed.
2. Update the load path (.emacs).
3. Set the JDEE to load at startup (.emacs).
4. Compile JDEE .el files (optional).
5. Register your JDKs (optional).
The previous section covered the first step. Make sure you take care of those prerequisites before continuing. The next steps can be handled in your .emacs file. The JDEE site proposes the following entries as a minimal setup; we excerpt them here (with one or two small tweaks) for easy reference.
;; This .emacs file illustrates the minimal setup
;; required to run the JDEE.
;; Set the debug option to enable a backtrace when a
;; problem occurs.
(setq debug-on-error t)
;; Update the Emacs load-path to include the path to
;; the JDEE and its required packages. This code assumes
;; that you have installed the packages in the
;; /usr/local/emacs/site-lisp directory. Adjust appropriately.
(add-to-list 'load-path
(expand-file-name "/usr/local/emacs/site-lisp/jde/lisp"))
(add-to-list 'load-path
(expand-file-name "/usr/local/emacs/site-lisp/semantic"))
(add-to-list 'load-path
(expand-file-name "/usr/local/emacs/site-lisp/speedbar"))
(add-to-list 'load-path
(expand-file-name "/usr/local/emacs/site-lisp/eieio"))
(add-to-list 'load-path
(expand-file-name "/usr/local/emacs/site-lisp/elib"))
;; If you want Emacs to defer loading the JDEE until you open a
;; Java file, edit the following line
(setq defer-loading-jde nil)
;; to read:
;;
;; (setq defer-loading-jde t)
;;
(if defer-loading-jde
(progn
(autoload 'jde-mode "jde" "JDE mode." t)
(setq auto-mode-alist
(append
'(("\\.java\\'" . jde-mode))
auto-mode-alist)))
(require 'jde))
;; Set the basic indentation for Java source files
;; to two spaces.
(add-hook 'jde-mode-hook
'(lambda ( )
(setq c-basic-offset 2)))
;; Include the following only if you want to run
;; bash as your shell.
;; Set up Emacs to run bash as its primary shell.
(setq shell-file-name "bash")
(setq shell-command-switch "-c")
(setq explicit-shell-file-name shell-file-name)
(setenv "SHELL" shell-file-name)
(setq explicit-sh-args '("-login" "-i"))
(if (boundp 'w32-quote-process-args)
(setq w32-quote-process-args ?\")) ;; Include only for MS Windows.
Of course, you'll need to make sure the paths in the add-to-list 'load-path lines match the actual directories you're using.
Compiling the JDEE Lisp files is not required, but as noted in "Byte-Compiling Lisp Files" in Chapter 11, it's a good idea and speeds up several operations including general startup times. The JDEE makes this step simple. After you have it installed, start Emacs and run M-x jde-compile-jde. You run this command only once, so it is definitely worthwhile.
The last step we need to cover is registering your Java development kits. This is not strictly necessary, but you don't want to skip this step. It is especially handy if you work in an environment where you have to test multiple versions of the JDK. With all of your kits registered in the JDEE, you can switch between versions with a simple variable change.
To register a JDK, use the M-x customize-variable command. The variable you need to customize is jde-jdk-registry. That will land you in the interactive customization screen. You can select the INS (insert) button to add the version number and path of your JDK. You can repeat that process for as many JDKs as you want to register. See Figure 9-2 for a list of such entries on a Mac OS X system.
Figure 9-2. Inserting JDK entries in a Custom list
Be sure to hit the State button and save this state for future sessions. You can click the Finish button when you're done or just close the buffer.
After you have your JDKs registered, you can switch to the active version using that same M-x customize-variable command. This time, edit the jde-jdk variable. You'll be prompted to choose one of the registered versions. You may or may not want to save this decision for future sessions. In any case, this variable can be edited at any time.
The compilation feature requires access to the tools.jar file (or the equivalents built into some JDKs). If the JDEE compile command fails with an error message about not being able to find the tools.jar file, your best bet is to customize the JDEE variable jde-global-classpath. Make sure that variable includes the tools.jar file.
For some systems that do not have a tools.jar file[66], you can steal that file from another machine, but usually you just need to get your classpath and registry entries set up correctly. Customizing the variables in Table 9-6 should get you compiling and running without too much effort.
Table 9-6. JDEE variables to customize
JDEE variable | Sample values |
---|---|
jde-global-classpath | |
jde-jdk-registry | |
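For readers who like to see the shape of these values, the sketch below shows roughly what the settings look like as Lisp data: the registry as a list of version and path pairs, the active JDK as a one-element list naming a registered version, and the classpath as a list of strings. Every path and version label here is a placeholder, and going through Custom remains the safer route, since it validates the values for you.

```elisp
;; All paths and version labels below are placeholders, not real
;; installation locations.
(setq jde-jdk-registry
      '(("1.4.2" . "/path/to/jdk1.4.2")
        ("1.5.0" . "/path/to/jdk1.5.0")))

;; Pick one of the registered versions as the active JDK.
(setq jde-jdk '("1.5.0"))

;; Search the current directory and the JDK's tools.jar.
(setq jde-global-classpath
      '("." "/path/to/jdk1.5.0/lib/tools.jar"))
```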
Whew! That was a lot of work. But the good news is that once you've made it through the installation process, you have all the spiffy features of the JDEE forever at your command. So let's get on with the features!
First off, you're still in Emacs, so the usual motion commands described for Java mode (and C mode) still apply. But the JDEE adds two really great features to your editing cycle: command completion and class browsing.
The idea behind command completion is that the JDEE can (usually) predict which methods and variables are valid choices to make at certain points in your Java program. For example, if you start typing System. in your program, there are a finite number of choices for what follows that period. JDEE can display a list of those choices.
The command to show your list of completions is C-c C-v C-. (for jde-complete), which defaults to showing you a menu of completions. (You can change that behavior by customizing the jde-complete-function variable.) The completions are generated by looking at all of the classes listed in the jde-global-classpath variable (or the CLASSPATH environment variable if no global classpath was defined).
The class browser can be accessed quickly from the JDE menu and launches a BeanShell browser for the class your cursor was on. It's like a context-sensitive documentation tool, but a bit more powerful. Figure 9-3 shows what you get when starting the browser while your cursor is on the word System.
Figure 9-3. The BeanShell class browser launched from the JDEE
You can also launch the class browser with the M-x jde-browse-class-at-point command.
One other edit-time feature worth pointing out is the Code Generation item in the JDE menu. It has some great timesavers built-in, as shown in Table 9-7.
Table 9-7. Code Generation menu options
Keystrokes | Menu option (M-x command) | Action |
---|---|---|
C-c C-v C-l (lowercase L) | Println Wizard (jde-gen-println) | Prompts for the contents to print and inserts a complete method for you. |
C-c C-v C-z | Import Class (jde-import-find-and-import) | Prompts for the (simple) class name to import and automatically adds the proper import line to the top of your file. |
C-c C-v i | Implement Interface (jde-wiz-implement-interface) | Prompts you for the name of the interface to implement. Adds any missing import statements (including dependent imports, such as imports required for method arguments). Provides commented skeletons for each of the methods in the interface. |
Other helpers are available from the JDE menu. Generate Get/Set Pairs in particular is great for working with JavaBeans design patterns. Just create your list of attributes and then run the wizard. It even checks to see if you already have an existing get/set pair. If you do, it notes that get/set pair as "existing" and keeps on trucking so you can use the wizard to update existing classes.
Compiling the current buffer can be done quickly with the C-c C-v C-c command. Any errors show up in the compilation buffer. That compilation buffer also allows you to navigate quickly to any errors that the compiler finds. Simply move your cursor to the error in question (using the normal motion commands) and hit Enter. You'll find yourself in the right file on the right line number. Very handy indeed.
Note that you can also run ant builds with M-x jde-ant-build. Check out the JDEE documentation or the help for various jde-ant variables for more information.
Running a simple program that has its own main( ) method is easy: just press C-c C-v C-r. That command executes the current buffer (by opening an execution buffer named *fully.qualified.ClassName*). Any output from the program shows in the buffer. You can move around in the buffer just as you would in a normal text buffer.
Of course, if you are working on anything other than a simple test class, you'll probably be in a package. Java's use of the classpaths rarely leaves room for being at the "bottom" of a package hierarchy. For example, in the package com.oreilly.demo, you want to start execution from the same directory that contains the com directory, not from the demo directory that contains the actual Java files. Regrettably, the demo directory is the default.
You can edit the following variables to make executing in larger projects a bit more convenient:
jde-run-working-directory
The directory in which execution starts
jde-run-application-class
The fully qualified name of the class that contains the main( ) method to execute
With those values set, you should be able to run your application from any buffer, regardless of what directory the file you're editing happens to be in.
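As a sketch, the two variables might be set like this for the com.oreilly.demo package mentioned above. The directory and the class name Main are invented for illustration; you can also set both variables through Custom instead of editing .emacs by hand.

```elisp
;; Hypothetical project layout: the directory below is assumed to
;; contain the top-level "com" directory of the package hierarchy.
(setq jde-run-working-directory "/path/to/project/classes")

;; Hypothetical class holding the main( ) method.
(setq jde-run-application-class "com.oreilly.demo.Main")
```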
Another fun note about running your application through the JDEE: if any stack traces appear because of exceptions, you can navigate those traces by using the C-c C-v C-[ and C-c C-v C-] commands (up and down, respectively). Again, Emacs makes it possible to manage quite a large portion of a development project all from one interface.
A crucial element in any good IDE is its debugger. The JDEE allows you to stay in the Emacs realm while interacting with the jdb process. The JDEE also comes with its own debugger, the JDEbug application. JDEbug is more powerful but requires more setup effort.
Before we touch anything, you need to make sure that your classes are compiled with support for debugging. Otherwise, many things will appear broken when you run the debugger.
To add debug support when you compile, you run the javac command with the -g option. With the JDEE you can also use the variable jde-compile-option-debug to hold all the variations for debugging you like. If you customize this variable through Custom (see Chapter 10), just choose the "all" option for which debugging information to include. (Optionally, you can be more specific and select from the three types of debug information: Lines, Variables, and Source.)
We'll look at the jdb route just to get you started. You can start the debug session by typing M-x jde-jdb. The same variables that control the starting directory and main application class are used for debugging purposes.
After you have launched the debugger, you can control the debug process in a number of ways.
• Interact directly with the jdb process in the *debug* buffer. Here you can type any command that you would normally give when running jdb.
• Use the Jdb menu. You have all the usual debug options available: step into/over, continue, toggle breakpoint, and so on. This is a bit more limited than the first approach, but easier to manage if you're new to jdb.
• Use keyboard commands while you're in your source buffer. These commands are even more limited than the menu options, but give you really quick access to the most common tasks (namely stepping and breakpoints). Table 9-8 shows the commands that are available while you're in a source buffer.
Table 9-8. JDEE debugger controls
Keystrokes | Menu item | JDB command |
---|---|---|
C-c C-a C-s | Step Into | step |
C-c C-a C-n | Step Over | next |
C-c C-a C-c | Continue | cont |
C-c C-a C-b | Toggle Breakpoint | stop in/stop at/clear |
C-c C-a C-p | Display Expression | |
C-c C-a C-d | Display Object | dump |
Figure 9-4 shows a simple application running in debug mode. Notice the small black triangle to the left of the Java source code in the upper buffer. That's the debug cursor that lets you know where you are in the file. It tracks the commands you issue, whether by directly entering jdb commands, by menu option, or through the keyboard.
Figure 9-4. Debugging a Java application with jdb
Clearly, there is a lot more to the JDEE than we can cover here. The package you download comes with some good documentation and several user guides for the basic JDEE and various options like the debuggers. The JDEE web site, at http://jdee.sunsite.dk, is a great source of information, too. As you would expect from an Emacs package, you can customize everything. Those customizations are stored in your .emacs file so you can tweak them by hand (or at least peek at them).
The best approach is to install the JDEE and start coding with it. If you find yourself saying "There should be a way to do X," get out the documentation. Chances are there is a way to do X—usually with more options than you could hope for!
Emacs has Perl support. Indeed, much like Perl itself, there are multiple ways to get things done—in this case, multiple Perl modes: the classic Perl mode (which comes up by default) and the more popular CPerl mode.
You should have a version of CPerl mode built right in, but you can also pick up the latest release from CPAN (the Comprehensive Perl Archive Network) online at http://www.cpan.org.
You can add one of the following pairs of lines to your .emacs file to make sure CPerl mode is invoked rather than Perl mode:
;; load cperl-mode for perl files
(fset 'perl-mode 'cperl-mode)
;; or maybe use an alias
(defalias 'perl-mode 'cperl-mode)
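If you would rather leave perl-mode itself untouched, another approach is to associate Perl files with CPerl mode directly. The lines below are a minimal sketch; the file-name pattern (.pl and .pm) is our assumption, so adjust it to the suffixes you actually use.

```elisp
;; Sketch: choose cperl-mode by file name instead of redefining perl-mode.
;; The suffix pattern below is an assumption; extend it as needed.
(add-to-list 'auto-mode-alist '("\\.\\(pl\\|pm\\)\\'" . cperl-mode))
;; Also cover scripts with no suffix whose first line names the perl interpreter.
(add-to-list 'interpreter-mode-alist '("perl" . cperl-mode))
```

With this approach, M-x perl-mode still gives you the classic mode if you ever want to compare the two.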
CPerl mode is mostly like cc-mode with respect to motion and other programming language features. It also includes fun debug operations. You can start the debugger with M-x cperl-db. You'll be prompted to verify the debugger command and then be dropped into a split-screen mode. One buffer allows you to drive the normal perldb environment with all the regular commands you're accustomed to using in the Perl debugger.
The other buffer shows your script and follows along as you work through the debugger. It tracks the line you're about to execute as you issue commands in the other buffer. It's amazing how quickly you grow to depend on having such tools available while you're developing scripts. It is worth trying out if you've never done it before.
A big reason we wanted to mention Perl mode here is to highlight a few caveats. Perl is an amazingly expressive language, much more akin to the idioms found in human languages than just about any other computer language out there. That expressiveness can cause problems, especially when considering the expressiveness of regular expressions.
Perl supports all sorts of "funny" variable names like $' and $/. CPerl mode boasts the use of a syntax table to help understand most of Perl's odd and occasionally disruptive verbiage. The older Perl mode has no such trick up its sleeve and suffers under many circumstances in the font-lock and indentation realms. This is one of the main reasons to make the leap into CPerl mode.
Even with that syntax table, though, you'll probably find some combinations of variables and strings that give Emacs headaches. Sometimes restructuring your code will help, sometimes not. The important thing to remember is that it won't harm your program at all. It might make things a bit less readable, but the script itself should run just fine. And if it doesn't, you can always launch the debugger to find out why!
Here are some parting .emacs thoughts for you Perl programmers. These lines select cperl-mode as the default and make sure the syntax highlighting is turned on. These lines also turn on folding (outline-minor-mode in the snippet below). Folding allows you to "hide" chunks of your code, such as functions where the body of the function is "folded" into the name. That can make it easier to get a grip on everything that is going on in the file. Try it—it can become addictive!
;; Turn on highlighting globally
(global-font-lock-mode t)
;; automatically load cperl-mode for perl files
(fset 'perl-mode 'cperl-mode)
;; show only the toplevel nodes when loading a file
(add-hook 'cperl-mode-hook 'hide-body)
;; outline minor mode with cperl
(add-hook 'cperl-mode-hook 'outline-minor-mode)
;; Change the prefix for outline commands from C-c @ to C-c C-o
(setq outline-minor-mode-prefix "\C-c\C-o")
(load-file "cperl-mode.el")
For you database folks out there, you can even run interactive SQL sessions through Emacs. You can navigate through your SQL command history using normal motion commands and even create complex SQL statements in any buffer and then shuttle them off to the interactive area for debugging.
Before we get started with SQL queries, you do need to have a few things in place. Most of the SQL interaction modes require an actual client application for their particular database. For example, we use the MySQL server. We have to install the MySQL client programs (mysql, at a minimum) on any system where we want to use SQL mode. Even though the MySQL version of SQL mode is built-in, we still need access to a real client. This is true for every type of database you expect to access.
And speaking of communicating with the database, you must also have the basics of communication taken care of. You need to have network access to the server in question. You also need to have a valid username and password for connecting to that server. A good rule before jumping into SQL mode in Emacs is to make sure you can connect and interact with your database server from your machine. If it works from a terminal window or other client application, you can make it work in Emacs.
One last thing to remember: the various SQL modes in Emacs are just helpers, so you can't do anything with them that you couldn't do with your normal database client. You won't magically have access to that restricted table with everyone's salaries. Sorry. Even so, it's just more convenient to stay in Emacs when possible, so let's forge ahead.
You'll find two modes of operation for dealing with SQL. The interactive mode lets you communicate directly with a database server and run commands and view their output immediately. The editing mode allows you to build up (and edit) more complex commands. If you want, you can have the editing buffer send parts of itself to the interactive session for testing and verification.
Start the interactive mode by typing M-x sql-mysql (or rather, your own variant of the interactive modes shown in Table 9-9).
Table 9-9. Commands for entering database-specific SQL modes
sql-db2 | sql-linter | sql-postgres |
sql-informix | sql-ms (Microsoft) | sql-solid |
sql-ingres | sql-mysql | sql-sqlite |
sql-interbase | sql-oracle | sql-sybase |
You'll be prompted for things like your username and password, the database or catalog to use, and the server to contact. Remember the prerequisites, though; many modes require that you have a normal command-line client available. The mode simply supplies an intelligent layer on top of those clients.
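If you connect to the same server most of the time, you can preset the answers to those prompts in your .emacs file. The following is a sketch only: the user, database, and host names are hypothetical placeholders, and the client path is an assumption about where mysql might live on your system.

```elisp
;; Hypothetical connection defaults; replace every value with your own.
(setq sql-user "dbuser"
      sql-database "inventory"
      sql-server "db.example.com")
;; Only needed if the mysql client is not on your PATH (path is an assumption).
(setq sql-mysql-program "/usr/local/bin/mysql")
```

We deliberately leave the password out of the file; you can still type it when prompted.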
After you get connected, just type normal SQL commands that your server understands. Most interactive clients have some type of "end-of-line" marker to let the system know when to send a completed command. In MySQL, for example, you can end statements with a semicolon (;) or the \g sequence.
Emacs keeps these commands in a history buffer for you so that you can revisit them. M-p and M-n allow you to navigate to previous and next commands, respectively. (C-p and C-n simply allow you to move around in the buffer as you would expect.)
You can also put a buffer directly into SQL mode with M-x sql-mode. This provides some assistance for motion and composition of SQL statements, but mostly it's there to let you build complex statements and then ship them to the interactive buffer for execution. Table 9-10 shows how to send various segments of the buffer to the database.
Table 9-10. SQL mode send commands
Keystroke | Command name | Action |
---|---|---|
C-c C-c | sql-send-paragraph | Send the paragraph the cursor is on. A paragraph is defined by the particular database client. For the sql-mysql process, for example, a paragraph begins with a statement like select or update and ends with a semicolon. Any number of lines can intervene. |
C-c C-r | sql-send-region | Send the marked region. |
C-c C-b | sql-send-buffer | Send the entire buffer. |
The output of all of these send commands shows up in your interactive buffer. Nothing changes in the editing buffer so you should feel free to experiment. That's what these modes are here for!
Emacs has three Lisp modes, listed here by their command names:
emacs-lisp-mode
Used for editing Emacs Lisp code, as covered in Chapter 11 (filename .emacs or suffix .el).
lisp-mode
Used for editing Lisp code intended for another Lisp system (suffix .l or .lisp).
lisp-interaction-mode
Used for editing and running Emacs Lisp code.
All three modes have the same basic functionality; they differ only in the support they give to running Lisp code.
All three Lisp modes understand the basic syntax elements common to all language modes. In addition, they have various commands that apply to the more advanced syntactic concepts of S-expressions, lists, and defuns. An S-expression (or syntactic expression) is any syntactically correct Lisp expression, be it an atom (number, symbol, variable, etc.), or parenthesized list. Lists are special cases of S-expressions, and defuns (function definitions) are special cases of lists. Several commands deal with these syntactic concepts; you will most likely become comfortable with a subset of them.
Table 9-11 shows the commands that handle S-expressions.
Table 9-11. S-expression commands
Keystrokes | Command name | Action |
---|---|---|
C-M-b | backward-sexp | Move backward by one S-expression. |
C-M-f | forward-sexp | Move forward by one S-expression. |
C-M-t | transpose-sexps | Transpose the two S-expressions around the cursor. |
C-M-@ | mark-sexp | Set mark to the end of the current S-expression; set the cursor to the beginning. |
C-M-k | kill-sexp | Delete the S-expression following the cursor. |
(none) | backward-kill-sexp | Delete the S-expression preceding the cursor. |
Since an S-expression can be a wide variety of things, the actions of commands that handle S-expressions are determined by where your cursor is when you invoke them. If your cursor is on a ( or on a space preceding one, the S-expression in question is taken to be the list that starts with that (. If your cursor is on some other character such as a letter or number (or preceding whitespace), the S-expression is taken to be an atom (symbol, variable, or constant).
For example, suppose your cursor is on the opening parenthesis of (dave (pete)), marked here by ^:
(mary bob (dave (pete)) ed)
          ^
If you type C-M-f, the cursor moves like this:
(mary bob (dave (pete)) ed)
                       ^
That is, the cursor moves forward past the S-expression (dave (pete)), which is a list. However, say your cursor is positioned on bob, like this:
(mary bob (dave (pete)) ed)
      ^
When you type C-M-f, it moves here:
(mary bob (dave (pete)) ed)
         ^
In this case, the S-expression is the atom bob.
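You can confirm this behavior from Lisp itself. The sketch below is ours, not part of Emacs: my-sexp-after is a hypothetical helper that puts the sample text in a temporary buffer, places point just after a given prefix, calls forward-sexp (the command behind C-M-f), and returns whatever text was crossed.

```elisp
(defun my-sexp-after (text prefix)
  "Return the S-expression that `forward-sexp' crosses in TEXT after PREFIX.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert text)
    (goto-char (point-min))
    (search-forward prefix)             ; point is now just after PREFIX
    (let ((start (point)))
      (forward-sexp)                    ; the command C-M-f runs
      (buffer-substring start (point)))))

;; Point before the list: the whole list is crossed.
(my-sexp-after "(mary bob (dave (pete)) ed)" "bob ")   ; => "(dave (pete))"
;; Point before an atom: only the atom is crossed.
(my-sexp-after "(mary bob (dave (pete)) ed)" "mary ")  ; => "bob"
```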
The commands moving in lists are shown in Table 9-12.
Table 9-12. Commands for moving in lists
Keystrokes | Command name | Action |
---|---|---|
C-M-n | forward-list | Move forward by one list. |
C-M-p | backward-list | Move backward by one list. |
C-M-d | down-list | Move forward and down one parenthesis level. |
(none) | up-list | Move forward out of one parenthesis level. |
C-M-u | backward-up-list | Move backward out of one parenthesis level. |
As a mnemonic device, you can think of lists as analogous to lines and S-expressions as analogous to characters; thus, C-n and C-p appear in list motion commands, whereas C-f and C-b appear in S-expression motion commands. C-M-n and C-M-p work similarly to C-M-f and C-M-b, respectively, except that you must position the cursor so that there is a list in front or back of it to move across; that is, there must be an opening or closing parenthesis on, after, or before the cursor. If there is no parenthesis, Emacs signals an error. For example, if your cursor is on ed, after the last nested list (marked here by ^):
(fred bob (dave (pete)) ed)
                        ^
and you type C-M-n, Emacs complains with the message:
Containing expression ends prematurely
However, if your cursor is on the space before bob:
(fred bob (dave (pete)) ed)
     ^
the "next list" is actually (dave (pete)), and the cursor ends up like this if you type C-M-n:
(fred bob (dave (pete)) ed)
                       ^
The commands for moving up or down lists enable you to get inside or outside them. For example, say your cursor is on the first parenthesis (marked by ^):
(fred bob (dave (pete)) ed)
^
typing C-M-d moves the cursor here:
(fred bob (dave (pete)) ed)
 ^
This is the result because fred is the next level down after its enclosing list. Typing C-M-d again has this result:
(fred bob (dave (pete)) ed)
           ^
You are now inside the list (dave (pete)). At this point, typing C-M-u does the opposite of what C-M-d does: it moves the cursor back and outside of the two lists. But if you type M-x up-list Enter, you will move forward as well as out, resulting in this:
(fred bob (dave (pete)) ed)
                       ^
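The same sequence can be replayed from Lisp as a quick check. This is a sketch under the assumption that you evaluate it in Emacs (for example, with M-: or in a Lisp interaction buffer); it uses a temporary buffer so nothing in your own files is touched.

```elisp
(with-temp-buffer
  (set-syntax-table emacs-lisp-mode-syntax-table)
  (insert "(fred bob (dave (pete)) ed)")
  (goto-char (point-min))   ; point is before the first parenthesis
  (down-list)               ; C-M-d: point is now just before "fred"
  (down-list)               ; C-M-d again: point is now just before "dave"
  (up-list)                 ; M-x up-list: forward and out of (dave (pete))
  ;; Return the text that remains after point.
  (buffer-substring (point) (point-max)))   ; => " ed)"
```

The returned string shows that point ends up immediately after the closing parenthesis of (dave (pete)).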
The commands for defuns listed in Table 9-13 are more straightforward.
Table 9-13. Commands for working with functions
Keystrokes | Command name | Action |
---|---|---|
C-M-a | beginning-of-defun | Move to the beginning of the current function. |
C-M-e | end-of-defun | Move to the end of the current function. |
C-M-h | mark-defun | Put the cursor at the beginning of the function, put the mark at the end. |
These commands work properly only when the (defun that starts the current function is at the beginning of a line.
The Lisp modes provide "flashing" of matching left parentheses; if the matching parenthesis is outside of the current window, the line it is on appears in the minibuffer. The Lisp modes also provide indentation via the Tab key and C-j for newline-and-indent (except in Lisp interaction mode, described later in this chapter). The indentation style supported by the Lisp modes "knows" a lot about Lisp keywords and list syntax; unfortunately, it is not easily customized.[67]
Here is an example, a Lisp equivalent of the "times" C function shown earlier in the chapter, that illustrates the indentation style:
(defun times (x y)
  (let ((i 0)
        (result 0))
    (while (< i x)
      (setq result (+ result y)
            i (1+ i)))
    result))
The basic indentation value is 2; this value is used whenever code on the next line goes down a level in nesting. For example, the body of the function, after the line containing defun, is indented by 2. The (while... and result)) lines are indented by 2 with respect to the let because they are the body of the block let introduces.
Things like defun, let, and while are function calls, even though they act like keywords. The indentation convention for function calls is that if there are arguments on lines after the line where the function name and first argument appear, the additional arguments line up with the first one. In other words, this has the form:
(function-name arg1
               arg2
               arg3
               ...)
The multiple arguments to setq in the preceding function provide another example of this.
However, the indentation of the line (result 0) shows that something a bit different happens with lists that are not function calls. The list in question is actually ((i 0) (result 0)), which is a list with two elements (both of which are also lists). The indentation style supported by the Lisp modes lines up these two elements.
Even though keyword-like terms such as let and while are actually function calls, the Lisp modes "understand" these functions to the extent that special indentation conventions are set up for them. For example, if we were to put the condition for the while-loop on a separate line and press Tab to indent it properly, the result would be:
(while
    (< i x)
  (setq result (+ result y)
        i (1+ i)))
Similar things happen with if and cond control structures; Chapter 11 contains properly indented examples.
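These special conventions are recorded on the function's symbol, and you can give your own forms the same treatment. The sketch below assumes the lisp-indent-function symbol property, which, as far as we know, is the mechanism the Lisp modes consult; my-repeat is a hypothetical macro invented for this example.

```elisp
;; `my-repeat' is a hypothetical macro used only for illustration.
(defmacro my-repeat (count &rest body)
  "Evaluate BODY COUNT times."
  `(dotimes (_ ,count) ,@body))

;; Treat the first argument specially and indent the rest as a body,
;; the same convention described above for `while'.
(put 'my-repeat 'lisp-indent-function 1)

;; With the property set, Tab indents the body by 2 instead of
;; lining it up under the first argument:
(my-repeat 3
  (message "hello"))
```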
Another remark about indentation conventions: the Lisp modes are geared toward a style in which multiple right parentheses are put on the same line immediately following each other, instead of on separate lines. For example, the line i (1+ i))) contains right parentheses that close off the 1+ function, the setq, and the while respectively. If you prefer, you can put your closing parentheses on separate lines, but if you press Tab to indent them, they won't line up properly with their matching open parentheses; you have to indent them manually.
In addition to the Tab and C-j commands for indentation, the Lisp modes support the command C-M-q (for indent-sexp), which indents every line in the S-expression just following the cursor. You can use this command, for example, to indent an entire function definition: just put the cursor right before the defun and type C-M-q.
Comments in the Lisp modes are handled by the universal comment command M-;, which indents out to comment-column (or, if there is text at that column, one space past the last character), inserts a semicolon, and puts the cursor just past it. If you want a comment to occupy an entire line (or to start anywhere other than at comment-column), you must move to where you want the comment to start and type the semicolon yourself. Note that if you press Tab on any line that contains only a comment, the comment moves out to comment-column. To get around this, use two or more semicolons; doing so causes Tab to leave the comments where they are. The Lisp modes also support the other comment commands discussed earlier in the chapter, including M-j to extend a comment to another line and M-x kill-comment Enter to get rid of a single-line comment. These features are common to all three Lisp modes; next, we discuss the features unique to each.
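As a small illustration of these conventions, here is a sketch of a .emacs fragment. The value 40 is an arbitrary choice of ours, not a recommendation.

```elisp
;; Two semicolons: Tab leaves this comment aligned with the code around it.
(setq-default comment-column 40)        ; one semicolon: M-; put this comment here
```

The first comment stays put when you press Tab on its line, whereas a whole-line comment written with a single semicolon would be moved out to comment-column.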
Emacs Lisp mode was designed to be used with code meant to run within Emacs itself, so it facilitates running the code you type. Lisp is an interpreted (as opposed to purely compiled) language, so it is possible to blur the line between the write and run/debug phases of Lisp programming; Emacs Lisp mode takes some advantage of this opportunity, whereas Lisp interaction mode goes even further, as we'll see later. In Emacs Lisp mode, the command C-M-x (eval-defun) picks up the function definition around or after the cursor and evaluates it, meaning that it parses the function and stores it so that Emacs "knows" about the function when you invoke it.
Emacs Lisp mode also includes the command M-Tab (for lisp-complete-symbol),[68] which performs completion on the symbol (variable, function name, etc.) preceding the cursor, as described in Chapter 14. Thus, you can type the shortest unambiguous prefix for the symbol, followed by M-Tab, and Emacs tries to complete the symbol's name for you as far as it can. If it completes the symbol name, you can go on with whatever you are doing. If it doesn't, you haven't provided an unambiguous prefix. You can type more characters (to disambiguate further), or you can type M-Tab again, and a help window showing the choices pops up. Then you can type more characters and complete the symbol yourself, or you can try for completion again.
Lisp mode (as opposed to Emacs Lisp mode) is meant for use with Lisp processors other than the Emacs Lisp interpreter. Therefore it includes a couple of commands for interfacing to an external Lisp interpreter. The Lisp mode command C-c C-z (run-lisp) starts up your system's Lisp interpreter as a subprocess and creates the *lisp* buffer (with an associated window) for input and output.[69] If a Lisp subprocess already exists, C-c C-z uses it rather than creating a second one. You can send function definitions to the Lisp subprocess by putting the cursor anywhere within a function's definition and using C-M-x, which in this case stands for lisp-send-defun. This procedure causes the functions you define to become known to the Lisp interpreter so that you can invoke them later.
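Which program run-lisp starts is controlled by the variable inferior-lisp-program. The line below is a sketch that assumes you have SBCL installed and on your PATH; substitute the command for whatever Lisp system you actually use.

```elisp
;; Assumption: SBCL is installed and on your PATH; use your own Lisp here.
(setq inferior-lisp-program "sbcl")
```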
Emacs Lisp mode is probably the best thing to use if you are editing entire files of Emacs Lisp code, for example, if you are programming your own mode (as described in Chapter 11) or modifying an existing one. However, if you are editing "little" pieces of Lisp code (for example, making additions or modifications to your .emacs file), Emacs has more powerful features you can use that further blur the line between writing and running code.
The first of these is the command M-: (for eval-expression). This command enables you to type a one-line Lisp expression of any kind in the minibuffer; the expression is evaluated, and the result is printed in the minibuffer. This is an excellent, quick way to check the values of Emacs variables and to experiment with "internal" Emacs functions that aren't bound to keys or that require arguments. You can use the symbol completion command M-Tab while you are using eval-expression.
Unfortunately (or fortunately, depending on your point of view), Emacs doesn't normally let you use eval-expression. If you try pressing M-:, you will see the message
loading novice ...
in the minibuffer. Then a window pops up with a message on the order of, "You didn't really mean to type that, did you?" You get three options: press Space to try the command only once, y to try it and enable it for future use with no questions asked, or n to do nothing.
If you want to use eval-expression, type y. This command actually results in the following line being put in your .emacs file:
(put 'eval-expression 'disabled nil)
If you are a knowledgeable Lisp programmer, you will understand that this addition sets the property disabled of the symbol eval-expression to nil. In other words, Emacs considers certain commands to be verboten to novice users and thus allows commands to be disabled. If you want to skip this entire procedure and just use eval-expression, simply put the above line in your .emacs file yourself (make sure you include the single quotes).
Another feature that helps you exercise Emacs Lisp code is C-x C-e (for eval-last-sexp). This command evaluates the S-expression immediately before the cursor and prints its value in the minibuffer. C-x C-e is handy for testing single lines of code in an Emacs Lisp file: put the cursor at the end of the line and type C-x C-e.
An even more powerful feature is Lisp interaction mode. This is the mode the default buffer *scratch* is in. Filenames with no suffixes normally cause Emacs to go into Lisp interaction mode, though you can change this using the variable auto-mode-alist, described earlier in this chapter and in more detail in Chapter 10. You can also put any buffer in Lisp interaction mode by typing M-x lisp-interaction-mode Enter; to create an extra Lisp interaction buffer, just type C-x b (for switch-to-buffer), supply a buffer name, and put it in Lisp interaction mode.
Lisp interaction mode is identical to Emacs Lisp mode except for one important feature: C-j is bound to the command eval-print-last-sexp. This command takes the S-expression just before point, evaluates it, and prints the result in the buffer. To get the usual newline-and-indent functionality attached to C-j in other modes, you must press Enter, followed by Tab.
Remember that an S-expression is any syntactically valid expression in Lisp. Therefore, you can use C-j in Lisp interaction mode to check the values of variables, enter function definitions, run functions, and so on. For example, if you type auto-save-interval and press C-j, the value of that variable (300 by default) appears. If you type a defun and press C-j after the last right parenthesis, Emacs stores the function defined (for future invocation) and prints its name; in this case, C-j is similar to C-M-x (for eval-defun) except that the cursor must be after (as opposed to before or in the middle of) the function being defined. If you invoke a function, Emacs evaluates (runs) the expression and responds with whatever value the function returns.
C-j in Lisp interaction mode gives you an excellent way to play with, incrementally develop, and debug Emacs Lisp code, and since Emacs Lisp is "true" Lisp, it is even useful for developing some bits of code for other Lisp systems.
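To make this concrete, here is a sketch of a short session you might try in the *scratch* buffer. It reuses the times function from earlier in the chapter; the comments describe what we expect C-j to insert, and each C-j must be typed with the cursor immediately after the expression it refers to.

```elisp
;; Type the variable name, then press C-j to see its value.
auto-save-interval

;; Press C-j after the final parenthesis; Emacs stores the function
;; and prints its name, times.
(defun times (x y)
  "Multiply X by Y by repeated addition."
  (let ((i 0)
        (result 0))
    (while (< i x)
      (setq result (+ result y)
            i (1+ i)))
    result))

;; Press C-j after this call; it evaluates to 12.
(times 3 4)
```

If you change the body of times, press C-j after the definition again before calling it, so that Emacs picks up the new version.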