fmt – A Simple Text Formatter

The fmt program also folds text, plus a lot more. It accepts either files or standard input and performs paragraph formatting on the text stream. Basically, it fills and joins lines in text while preserving blank lines and indentation.
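As a small illustration of filling and joining, the sketch below pipes a few short lines into fmt on standard input. The sample text is invented for this example, and the behavior shown assumes the GNU coreutils version of fmt with its default width.

```shell
# Made-up sample: one paragraph broken into short lines, a blank line,
# then a second paragraph.
sample='The quick brown
fox jumps over
the lazy dog.

A second paragraph.'

# fmt joins the short lines of the first paragraph into one line and
# passes the blank line through unchanged.
formatted=$(printf '%s\n' "$sample" | fmt)
printf '%s\n' "$formatted"
# Output:
#   The quick brown fox jumps over the lazy dog.
#
#   A second paragraph.
```

Because no file argument is given, fmt reads the text from the pipe, which makes it easy to try options without creating a file first.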

To demonstrate, we’ll need some text. Let’s lift some from the fmt info page:


   `fmt' reads from the specified FILE arguments (or standard input if
none are given), and writes to standard output.

   By default, blank lines, spaces between words, and indentation are
preserved in the output; successive input lines with different
indentation are not joined; tabs are expanded on input and introduced on
output.

   `fmt' prefers breaking lines at the end of a sentence, and tries to
avoid line breaks after the first word of a sentence or before the last
word of a sentence. A "sentence break" is defined as either the end of
a paragraph or a word ending in any of `.?!', followed by two spaces or
end of line, ignoring any intervening parentheses or quotes. Like TeX,
`fmt' reads entire "paragraphs" before choosing line breaks; the
algorithm is a variant of that given by Donald E. Knuth and Michael F.
Plass in "Breaking Paragraphs Into Lines", `Software--Practice &
Experience' 11, 11 (November 1981), 1119-1184.


We’ll copy this text into our text editor and save the file as fmt-info.txt. Now, let’s say we wanted to reformat this text to fit a fifty-character-wide column. We could do this by processing the file with fmt and the -w option:


[me@linuxbox ~]$ fmt -w 50 fmt-info.txt | head
   `fmt' reads from the specified FILE arguments
   (or standard input if
none are given), and writes to standard output.

   By default, blank lines, spaces between words,
   and indentation are
preserved in the output; successive input lines
with different indentation are not joined; tabs
are expanded on input and introduced on output.


Well, that’s an awkward result. Perhaps we should actually read this text, since it explains what’s going on:

By default, blank lines, spaces between words, and indentation are preserved in the output; successive input lines with different indentation are not joined; tabs are expanded on input and introduced on output.

So, fmt is preserving the indentation of the first line, and because the second line is indented differently, the two are not joined. Fortunately, fmt provides an option to correct this:



[me@linuxbox ~]$ fmt -cw 50 fmt-info.txt
   `fmt' reads from the specified FILE arguments
(or standard input if none are given), and writes
to standard output.

   By default, blank lines, spaces between words,
and indentation are preserved in the output;
successive input lines with different indentation
are not joined; tabs are expanded on input and
introduced on output.

   `fmt' prefers breaking lines at the end of a
sentence, and tries to avoid line breaks after
the first word of a sentence or before the
last word of a sentence. A "sentence break"
is defined as either the end of a paragraph
or a word ending in any of `.?!', followed
by two spaces or end of line, ignoring any
intervening parentheses or quotes. Like TeX,
`fmt' reads entire "paragraphs" before choosing
line breaks; the algorithm is a variant of
that given by Donald E. Knuth and Michael F.
Plass in "Breaking Paragraphs Into Lines",
`Software--Practice & Experience' 11, 11
(November 1981), 1119-1184.


Much better. By adding the -c option, we now have the desired result.
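The difference the -c option makes can also be seen on a much smaller input. The sketch below uses an invented paragraph whose first line is indented by four spaces; it assumes GNU fmt, and the variable names are only for illustration.

```shell
# Made-up paragraph: the first line is indented, the rest are not.
para='    The first line of this paragraph is indented by four spaces,
but the lines that follow start at the left margin, which is a
common layout in plain-text documentation.'

# Without -c, the indented first line is treated as a paragraph of its
# own, so its wrapped continuation keeps the four-space indent.
plain=$(printf '%s\n' "$para" | fmt -w 40)

# With -c (crown margin mode), the first line keeps its indent and all
# remaining lines are aligned with the second line (the left margin).
crown=$(printf '%s\n' "$para" | fmt -c -w 40)

printf '%s\n' '--- without -c ---' "$plain" '--- with -c ---' "$crown"
```

In the crown margin output only the first line should begin with spaces, which is the layout we wanted from fmt-info.txt.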

fmt has some interesting options:


Table 21-3: fmt Options


Option      Description
-c          Operate in crown margin mode. This preserves the indentation of the first two lines of a paragraph. Subsequent lines are aligned with the indentation of the second line.
-p string   Only format those lines beginning with the prefix string. After formatting, the contents of string are prefixed to each reformatted line. This option can be used to format text in source code comments. For example, any programming language or configuration file that uses a “#” character to delineate a comment could be formatted by specifying -p '# ' so that only the comments will be formatted. See the example below.
-s          Split-only mode. In this mode, lines will only be split to fit the specified column width. Short lines will not be joined to fill lines. This mode is useful when formatting text such as code where joining is not desired.
-u          Perform uniform spacing. This will apply traditional “typewriter-style” formatting to the text. This means a single space between words and two spaces between sentences. This mode is useful for removing “justification,” that is, text that has been padded with spaces to force alignment on both the left and right margins.
-w width    Format text to fit within a column width characters wide. The default is 75 characters. Note: fmt actually formats lines slightly shorter than the specified width to allow for line balancing.
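To make the -s and -u descriptions concrete, here is a short sketch that feeds invented text to fmt on standard input. It assumes GNU fmt; the sample strings and variable names are hypothetical.

```shell
# -s: split lines that are too long, but never join short ones.
mixed='short line
another short line
this line is deliberately long enough that it cannot fit inside thirty columns'
split_only=$(printf '%s\n' "$mixed" | fmt -s -w 30)
printf '%s\n' "$split_only"
# The two short lines stay on their own lines; only the long one is split.

# -u: collapse runs of spaces to one space between words and two
# spaces after the end of a sentence.
uniform=$(printf 'Padded    text   here.   Next    sentence.\n' | fmt -u)
printf '%s\n' "$uniform"
# Output:
#   Padded text here.  Next sentence.
```

Without -s, the two short lines would be joined with the text that follows them, which is usually not what we want for code-like input.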


The -p option is particularly interesting. With it, we can format selected portions of a file, provided that the lines to be formatted all begin with the same sequence of characters. Many programming languages use the pound sign (#) to indicate the beginning of a comment and thus can be formatted using this option. Let’s create a file that simulates a program that uses comments:



[me@linuxbox ~]$ cat > fmt-code.txt
# This file contains code with comments.

# This line is a comment.
# Followed by another comment line.
# And another.

This, on the other hand, is a line of code.
And another line of code.
And another.


Our sample file contains comments that begin with the string “# ” (a # followed by a space) and lines of “code” that do not. Now, using fmt, we can format the comments and leave the code untouched:



[me@linuxbox ~]$ fmt -w 50 -p '# ' fmt-code.txt
# This file contains code with comments.

# This line is a comment. Followed by another
# comment line. And another.

This, on the other hand, is a line of code.
And another line of code.
And another.


Notice that the adjoining comment lines are joined, while the blank lines and the lines that do not begin with the specified prefix are preserved.
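The prefix does not have to be “# ”. As a sketch under the assumption of GNU fmt, the following applies the same idea to an invented snippet that uses “// ” comments; the sample text and the variable names are hypothetical.

```shell
# Made-up sample: three short comment lines followed by a line of code.
src='// This comment is split
// across several short
// lines for no good reason.
int answer = 42;'

# Reflow only the lines that begin with the "// " prefix; the line of
# code does not start with the prefix and is passed through as-is.
reflowed=$(printf '%s\n' "$src" | fmt -w 60 -p '// ')
printf '%s\n' "$reflowed"
```

Each reformatted comment line should still begin with the prefix, and the code line should be unchanged.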

