
By Walter Alan Zintz
By popular demand I'm trying something new in the tutorial,
starting with this installment. The e-mail I receive from
tutorial readers most often asks me how to do some specific
type of editing job, using whatever editor tools are needed.
So, I'm now mixing my general-principle explanations with
in-depth coverage of particular work areas.
The first application area I'm covering is the one readers
ask about most often, by far: editing files where columns
are a major factor. Future areas are up to you readers. If
you have an application area you'd like to see explained in
some depth, e-mail me your suggestion.
Screen-Mode Addresses
You use them all the time. They're the address targets that
tell screen-mode commands like c, d, and y
which stretch of your file to act on. And even more often you
use such addresses without commands, to move around in the file.
For starters, I'll tell you some basics of screen-mode
addressing that aren't particularly clear to most editor users.
Then it's on to a few powerful but obscure addresses that most
of us rarely or never use.
A FEW ADDRESS PRINCIPLES
The first fact of screen-mode range addresses is simple enough:
one end of the range to be affected by the command is always marked
by the cursor itself. The address you give the command (always a
single address) indicates where the other end of the affected range
is to be. The address target can be either forward or backward
from the cursor position, in most cases. But exactly how the cursor
and the target terminate the two ends of the range is variable.
At the start we have to distinguish between line addresses and
character addresses. Line addresses are very straightforward: the
command affects the entire line the cursor is on, the entire line
where the address point is located, and all the lines in between.
If you are using an address without a command, in order to move the
cursor, a line address generally puts the cursor on the first
non-whitespace character in the line addressed.
But line versus character addresses affect a lot more than exactly
what's included in the range. As one example, if you yank or delete
text using a line address and then place that text somewhere with a
p or P command, that text will appear on a new line or lines, below
or above the line you are on, respectively.
But if you yanked or deleted with a character address, when you put
the text back in, it will appear within the line you are on,
just ahead of or behind the cursor. And to dispose of one editor
fallacy here and now, it does not make a bit of difference that the
range of text you yanked or deleted with a character address amounts
to exactly one or more lines -- it will still behave as any other
text yanked or deleted with a character address.
So which addresses are line addresses? That depends on what your
command is. Besides the three commands I cited as examples above,
there are four other, less-used commands -- ! < > = -- that also
take addresses. The only thing
you have to know right now about these four commands is that they
can act only on entire lines; that's inherent in what they do.
So with these four commands, every address is a line address.
(Except a handful of addresses, such as ``f'', that cannot be
used with these commands at all.)
With the three more-used commands c d y, or with an address used by
itself to move the
cursor, an individual address is either always a line address or
always a character address -- usually. There are exceptions to
this rule also, such as the address ``j'', which is a
character address when you are just moving the cursor, but a line
address to any command.
So just where does a character address take you? When you are
just moving around in the file, the cursor lands on the character
that is the target you sought. Or if the target was a string of
characters, the character address puts the cursor on the first of
these.
When you are using a character address with a command, the
situation is more complex. The one firm rule is that if the
character address is farther down in the file than the cursor
position, the cursor position is included in the range the command
affects; while if the address target is earlier in the file than
the cursor, the cursor position is not included in the range.
The question of whether the address target is included in the
command's range, like all the other open questions raised in the
last few paragraphs, will have to be answered separately for
each address. (But the usual rule is that if the address target is
forward of the cursor, the target is not included; if the target lies
backward from the cursor, the target is included.)
Note also that a count given with any of these
seven commands is passed to the address. You may give the count before
or after the command character itself, but always before the address.
What the address does with the count, if anything, is also a
case-by-case question.
USEFUL ADDRESSES
There are four addresses that together resemble a miniaturized,
localized version of the / and ? search patterns. In each case, the
search takes place only in the current line, and only for a single
character. To use any of them, you type one of the four letters
designating the kind of inline search, immediately followed by the
character to be searched for. (There are no metacharacters used with
these addresses.)
The letter ``f'' means that the search will go forward in the
current line and stop on the character typed next. ``F'' makes
the search run backward within the current line, otherwise the
same as ``f''. A ``t'' search is the same as an ``f'' search
except that the search stops with the character just short of
the one you type after the ``t'', and a ``T'' search is like a
``t'' search but running backward within the current line. Any
of these addresses can take a preceding count, which tells the
search not to stop at the first instance of the character sought,
but to go on to the nth, where n is the count.
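The landing rules for these four addresses can be modelled in a few lines of Python. This is only a sketch of the behavior described above, not the editor's own code; the function name inline_search and the 0-based indexing are assumptions made for the illustration.

```python
def inline_search(line, cursor, kind, ch, count=1):
    """Return the 0-based index where the cursor would land, or None.

    Hypothetical helper: kind is one of 'f', 'F', 't', 'T', cursor is
    the 0-based cursor index, ch is the single character sought.
    """
    step = 1 if kind in "ft" else -1      # f and t go forward, F and T backward
    pos = cursor
    found = 0
    while True:
        pos += step
        if pos < 0 or pos >= len(line):
            return None                   # too few matches on this line
        if line[pos] == ch:
            found += 1
            if found == count:            # stop at the nth instance
                break
    if kind == "t":
        return pos - 1                    # one short of the target, going forward
    if kind == "T":
        return pos + 1                    # one short of the target, going backward
    return pos


line = "a-b-c-d e f"
print(inline_search(line, 0, "f", "-", 3))    # like 3f- from the start of the line
print(inline_search(line, 10, "T", " ", 2))   # like 2T<space> from the end of the line
```

The search never leaves the line it is given, which mirrors the restriction to the current line.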
All of these search addresses, including the repeat-search
characters mentioned below, are character addresses and can be used
as an address for any of the three range commands that do not
require a line address. In every case, the character on
which the cursor would have landed had there been no command is
the furthest character included in the range the command will affect.
A few examples. ``Fp'' would cause a search that went backward
and landed on the closest prior letter ``p''. ``3f-'' would make
the search run forward within the current line and stop on the third
instance of a hyphen. ``2T '' would cause a backward search that
ended one character short of the second closest space character.
This search system has its own repeat-search characters, which use
storage buffers completely independent of those used for storing
previous / and ? search strings. A
semicolon ``;'' repeats the last inline search, in the same direction.
A comma ``,'' repeats the last search but reverses the direction.
Any count to the original search is not included in the repeat,
but you can give a count to either repeat character which will be
passed to the search command that is repeated. While a search is
limited to the current line, you can run a search, move to another
line, then use a semicolon or comma to repeat the original search
on the new line.
Another very useful address that operates within a single line is
the vertical bar ``|''. When preceded by a count, this address takes
the cursor to the nth character on the current line, where n is the
count, regardless of where the cursor was when the address was given.
(In this address, n is absolute, not relative, starting from
character one at the left edge of the text.)
This address can also be used with a command. If the target
character position is forward from the cursor position, the furthest
character affected will be the last one before the target character.
If the target is backward from the cursor, the target character as well
as all those between it and the cursor will be affected by the
command.
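To make the two cases concrete, here is a small Python sketch of what a delete command followed by a ``|'' address removes. The function name delete_to_column is invented for this illustration, columns are 1-based as in the editor, and the sketch assumes the target column actually exists on the line.

```python
def delete_to_column(line, cursor_col, target_col):
    """Model a delete with an n| address (hypothetical helper, 1-based columns)."""
    if not (1 <= cursor_col <= len(line) and 1 <= target_col <= len(line)):
        raise ValueError("both columns must exist on the line")
    lo, hi = sorted((cursor_col, target_col))
    # Forward target: the cursor column through the column before the target.
    # Backward target: the target column through the column before the cursor.
    # Either way the affected span is columns lo through hi - 1.
    return line[:lo - 1] + line[hi - 1:]


print(delete_to_column("ABCDEFGH", 3, 6))   # cursor on C, target column 6
print(delete_to_column("ABCDEFGH", 6, 3))   # cursor on F, target column 3
```

Both calls remove the same three characters, C, D and E, because the forward case stops short of the target and the backward case leaves the cursor position alone.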
Editing in Columns
Although the Vi/Ex editor was not specifically designed to deal
with columnar material, there are ways to use it effectively for
this kind of work. Your choice of techniques will depend on whether
you are dealing with single-character columns, wherein each
character in a line is in a separate column, or multi-character
columns, where the columns are set apart from each other by a
separator character.
SINGLE-CHARACTER COLUMNS
Here I'm using ``columns'' the way most programmers do. A column
in this sense is simply the characters in a vertical section of
a file, one character wide. That is, the first character on each
line of the file is in the first column, the second character of
each line is in the second column, and so on. You'll find this
usage in systems that use punch-card images, such as early Fortran
programs; in the blocked records in certain databases, such as
the ones used for very large mailing lists; etcetera.
The essential point is that the systems that use these records
absolutely depend on each piece of information being entirely
within a certain column or range of columns, and nothing else
being within those columns except padding characters to fill up
any column positions not needed for the information in a
particular record.
For example, a mailing list may require that a suite or
apartment number be in columns 122 through 125 in each record
(line), with any padding following the actual number, so that an
address printing program that finds ``316 '' in those
columns will print ``, #316'' at the end of the street
address line. If it finds ``3A  '' it will then print
``, #3A'', etcetera. Should the suite number be even
partially shifted out of the designated columns, the system will
either print garbage as the suite number or issue an error
message and skip that address altogether. The principle is the
same, and even more important, with computer programs in punch-
card image form.
When you are making changes in existing records, and editing
visually, the first important point is to be sure you are at the
start of the particular field you need to modify. The ``|'' address
I've explained above takes care of that -- wherever you are in a
line, typing 122| brings the cursor to the 122nd
column. Unless there are not 122 columns in that line: then the
cursor will be placed in the last column that does exist, without
any warning or error message. But files of this sort have
generally been checked for exact block sizing, and if yours have
not been, it's easy to check visually.
To check visually that all the lines in the file are of the
proper length, start by running a :se list command, which will
display a dollar sign at the end of each file line. Then scan through
the file to check that all those dollar signs are aligned vertically.
If so, then check that the uniform line length is the correct one --
if your line length should be 66 characters (not counting the
nonvisible newline), then run a 65| command on any line, and make
sure that the cursor lands one column away from the end of the line.
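If the file is too long to scan by eye, the same check can be done outside the editor. The following Python sketch is an alternative to the visual method, not something the editor provides; the function name check_record_lengths and the 66-character record size are assumptions taken from the example above.

```python
def check_record_lengths(lines, expected):
    """Return (line_number, length) for each record that is not `expected` long.

    Hypothetical helper: `lines` is any iterable of text lines, such as an
    open file. The trailing newline is not counted as part of the record.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        record = line.rstrip("\n")
        if len(record) != expected:
            bad.append((number, len(record)))
    return bad


sample = ["a" * 66 + "\n", "b" * 65 + "\n", "c" * 66 + "\n"]
print(check_record_lengths(sample, 66))   # reports only the short second record
```

An empty list means every record has the expected length.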
When you are at the start of the field to be changed, you have a
choice of ways to change it. If the change area is 12 characters
long, then typing 12cl followed by the 12 new characters and then
the escape key will do it. But if you miss the count by even one
character, so that the actual number of characters you type in is
11 or 13, then all the subsequent fields on that line will be
shifted one character out of place, which is probably a recipe for
disaster.
To avoid this hazard, make use of the little-known R command. It
starts like the familiar r command, in that when you type the letter
``R'' in visual command mode the system waits to see what character
you type next, and whatever that next character is, it replaces the
character that was under the cursor. But instead of then returning
you to command mode, the R command then moves the cursor one
character to the right and again waits to see what character you
type next -- the character you now type replaces the character
that is now under the cursor. This process continues until you
stop it by hitting the escape key. So if your cursor is on the
capital P in the following line:
but the greatest ancient Greek was Plato, who
and you type in ``RHomer'' followed by the escape key,
your line will now read:
but the greatest ancient Greek was Homer, who
and the cursor will be on the letter r at the end of ``Homer''.
This character-at-a-time replacement is the way to make sure you
don't inadvertently shift any fields. Just be certain that you
don't keep typing replacement characters beyond the existing end
of the line; you would extend the line length that way. You can
give a count to the R command, but you don't want to in this use
because the count will multiply the number of times the new
character string is inserted. That is, in that example above about
replacing ``Plato'' with ``Homer'', if you had typed 3R instead of
R your revised line would read:
but the greatest ancient Greek was HomerHomerHomer, who
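The reason overwriting is safe for fixed-column records is that the line length never changes. Here is a minimal Python sketch of that property; the function name overwrite is made up for the illustration, and unlike the R command it refuses to run past the end of the line instead of extending it.

```python
def overwrite(line, col, text):
    """Overwrite `line` with `text` starting at 1-based column `col`.

    Hypothetical helper modelling a plain R with no count. It raises an
    error where the editor would silently lengthen the line.
    """
    start = col - 1
    if start < 0 or start + len(text) > len(line):
        raise ValueError("replacement would run past the end of the line")
    return line[:start] + text + line[start + len(text):]


old = "but the greatest ancient Greek was Plato, who"
new = overwrite(old, old.index("Plato") + 1, "Homer")
print(new)
print(len(old) == len(new))   # the fields to the right have not moved
```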
Entering completely new lines of information is another matter.
You should just type them straight across, as you would with any
text entry, but if the existing lines are cryptic to human eyes
you may not be able to tell by looking just where one field ends
and another begins. You can try to keep count of the characters,
of course, but a single mistake will throw all the subsequent fields
in that line out of position.
What you need here is an on-screen template to show you what
goes where. You can make one on the spot, just by typing a
template line into your file, entering each data line just above
it, and deleting that template line when you are finished adding
lines. For example, suppose you are adding to a name file where
each record (line) starts with a month, day and year, continues
with a source code (each of the preceding as a two-digit number,
with a leading zero to pad it if necessary), and then has fields
for a last name, first name, and middle initial. It would not
be practical to judge where fields break just by looking at the
existing data lines, which might look like this:
07215854von TarekenstuttLeopold  J
12077338Henderson-Blyth La Toya  P
10108972Thistlethwaites Geraldine
But a simple template line can clear it all up. Here is one
for the job above:
m|d|y|s|LLLLLLLLLLLLLLL|FFFFFFFF|M
It has mnemonic characters to remind you of what goes in each
field, and the ``|'' to indicate the last position of each field
more noticeably. I've even used a lower-case letter for each field
that takes numeric characters right justified and zero padded,
and a capital letter for each field that takes alpha characters
left justified and space padded.
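The conventions of that template line are regular enough to be read by a program. The Python sketch below derives field widths and padding rules from the template and builds one record from them. The names parse_template and build_record are invented for this illustration, and the sketch assumes that every field ends at a ``|'' or at the end of the template.

```python
# The template from the text, built with repeat counts to avoid miscounting.
TEMPLATE = "m|d|y|s|" + "L" * 15 + "|" + "F" * 8 + "|M"


def parse_template(template):
    """Return a list of (width, is_numeric) pairs, one per field."""
    fields = []
    start = 0
    for i, ch in enumerate(template):
        if ch == "|" or i == len(template) - 1:
            width = i - start + 1                  # the '|' is the field's last column
            fields.append((width, template[start].islower()))
            start = i + 1
    return fields


def build_record(template, values):
    """Pad each value to its field width and join them into one record."""
    fields = parse_template(template)
    if len(values) != len(fields):
        raise ValueError("wrong number of fields")
    parts = []
    for (width, numeric), value in zip(fields, values):
        text = str(value)
        if len(text) > width:
            raise ValueError("value %r is too wide for its field" % text)
        # Lower-case fields: right justified, zero padded.
        # Capital fields: left justified, space padded.
        parts.append(text.rjust(width, "0") if numeric else text.ljust(width))
    return "".join(parts)


print(build_record(TEMPLATE, [7, 21, 58, 54, "von Tarekenstutt", "Leopold", "J"]))
```

Every record built this way has the same length as the template line, so it can be checked against the template by eye as well.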
The way to use this template is to start entering data lines
immediately above the template line. That way, as you hit return
to start a new line, that new line replaces the one you've just
finished in the position right above the template line. Yes,
eventually the template line will be driven down off the bottom of
the screen, but returning to command mode and typing the
lower-case letter ``z'' followed by the return key will move the
template line and the lines around it to the top of the screen.
But there will be times when you don't want to spend time
making individual changes that you should be able to handle
globally. Suppose an obsolescent operations code has been
replaced, and you now need to change every ``B27'' to ``K53''
throughout your file, but only when the ``B27'' appears in the
operations code columns, which are columns 9 through 11. This
odd-looking command will do it:
:%s/^\(........\)B27/\1K53
Those eight consecutive dots in the search pattern guarantee
that a match will occur only when there are exactly eight
characters between the beginning of the line and the ``B27''.
So of necessity, the ``B'' must occur in column 9, and so on.
The ``\1'' puts those eight characters right back in again, so
only the ``B27'' is actually replaced.
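The same anchoring idea carries over to any regular-expression tool. As a sketch only, here is the equivalent in Python's re module; the function name replace_in_columns is invented for the illustration, and Python writes the grouping parentheses without backslashes.

```python
import re


def replace_in_columns(line, old, new, first_col):
    """Replace `old` with `new` only where `old` starts at 1-based `first_col`.

    Hypothetical helper. The pattern ^(.{n}) plays the role of the run of
    dots: exactly n characters between the line start and `old`.
    """
    pattern = re.compile("^(.{%d})%s" % (first_col - 1, re.escape(old)))
    # Put the leading characters back unchanged, as \1 does in the editor.
    return pattern.sub(lambda m: m.group(1) + new, line, count=1)


print(replace_in_columns("12345678B27xxB27", "B27", "K53", 9))   # only columns 9-11 change
print(replace_in_columns("1234567B27B27xxx", "B27", "K53", 9))   # no B27 starts in column 9
```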
If your columnar file has all lines of equal length, as most do,
you can use this technique from the right side, too. If all lines
in the file have 66 characters, then typing that last command as:
:%s/B27\(...\)$/K53\1
will accomplish the changes in a case where the operations code
columns are 61 through 63, without the need to type (and carefully
count) sixty consecutive dots.
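Counting from the right can be sketched the same way. In this Python illustration the function name replace_from_right is an assumption, and the lines are assumed to carry no trailing newline, so that the end-of-line anchor sits immediately after the last data column.

```python
import re


def replace_from_right(line, old, new, trailing):
    """Replace `old` only where exactly `trailing` characters follow it."""
    pattern = re.compile(re.escape(old) + "(.{%d})$" % trailing)
    # Keep the trailing characters, as \1 does in the editor command.
    return pattern.sub(lambda m: new + m.group(1), line, count=1)


record = "x" * 60 + "B27" + "abc"          # 66 columns, code in columns 61-63
changed = replace_from_right(record, "B27", "K53", 3)
print(changed[60:])                         # the last six columns after the change
```

This only works because every record has the same length; on a short line the code columns would be found in the wrong place.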
But there will be times when the columns to be changed are in
the middle of horrendously long record lines. There are still a
couple of tricks you may be able to use. One is to find a
landmark somewhere in mid-line. Does column 158 always contain
either a ``*'' or a ``|'' character, neither of which can appear
anywhere else in the lines? Then you can make the above change
in columns 163 through 165 by typing:
:%s/\([*|]....\)B27/\1K53
Failing a landmark, let the editor count out a long string of
dots for you. To use this technique, you must first create your
substitution command as a text line within the file you are
editing, next write that line as a separate file (and then
delete the command line from your original file), and finally
use the :so command to pull in that one-line file and run it as a
line-mode command. If you need a string of 92 consecutive dots in
your command, create a blank line at the end of your file, next type:
:1,92g/^/$s/^/.
to put those 92 dots there (the file must have at least 92 lines for
this to work), and finally put the rest of the command around that
dot string.
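If you have a scripting language at hand, it can count the dots instead, and its output can be saved as the one-line file for the :so command. The Python sketch below is an alternative to the in-editor method, not part of it. The function name column_substitute_command is made up, and it assumes the old and new strings contain no characters that are special in a substitution command.

```python
def column_substitute_command(first_col, old, new):
    """Build a substitution command that changes `old` to `new` only when
    `old` starts at 1-based column `first_col` (hypothetical helper)."""
    dots = "." * (first_col - 1)            # one dot per column to skip
    return ":%s/^\\(" + dots + "\\)" + old + "/\\1" + new


print(column_substitute_command(9, "B27", "K53"))
print(len(column_substitute_command(93, "B27", "K53")))   # includes 92 dots
```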
MULTI-CHARACTER COLUMNS
The other meaning of ``editing in columns'' has to do with text
rather than data files. It refers to tables of data such as you
might find accompanying a technical article, columns of text
and/or illustrations running in parallel as you'd find on a
newspaper page, and the like.
Yes, Unix formatting utilities and some word processing programs
will format your final output into columns. But you may not have
all these utilities, you may not want to spend time trying to get
the results you want from those benighted programs, or you may
plan to direct your output where formatters won't work.
Visually editing the columns of data in a table requires little
explanation. The one thing to remember: use the R command as far as
possible, to avoid shifting subsequent columns out of alignment
inadvertently. This holds for creating tables, too; start by setting
up a rectangular block of space characters, then replace spaces with
the column entries you want, to keep your next entry from
misaligning previous ones. This is also the best way to create
pictures, diagrams, graphs and maps using ASCII characters.
Things become problematic when you want to shift whole
columns around -- there are no built-in Vi facilities for doing
this. Here is what it is practical to do in the editor. As
a real life example, consider the piece below, which I use as
the tail end of Usenet (Net news) posts that announce Indonesian
classical music and dance performances at a local restaurant:
It's at the Dutch East Indies Restaurant ;,,,,;,,,,;,,,,;,,,,;
on Oakland's downtown waterfront. The food /%%%%%%%%%%%%%%%%%%%%%\
there is very good Indonesian cuisine at /%%%%%%%%%%%%%%%%%%%%%%%\
reasonable prices - dinners $8.95 to $17.50. "|""|"""|"""""|"""|""|"
Views are spectacular from the second floor _|__|___|_ _|___|__|_
picture windows, out over the water to Jack =|==|===|=====|===|==|=
London Square, Alameda and San Francisco. ~~~~~~~~~~~( (~~~~~~~~~~~
Formality is medium - cloth napkins and oil ) )
candles at the tables, but no supercilious
waiters, and the wall decorations are mostly Indonesian handicrafts.
The phone number for information and reservations is 510/444-6555.
( ( ( | Broadway ||I The Dutch East Indies Restaurant is
) ) )Jack London |==========||== in Jack London Village, a boutiques &
( ( ( Square |E ||8 bistros cluster that is just down the
) ) ) |m ||8 estuary from Jack London Square. Jack
( ( ( JACK LONDON |b ||0 London Village is rustic, picturesque,
) ) )VILLAGE |a || quiet and safe. To get there from the
( ( ( Alice|r Amtrak ||f Interstate 880 freeway heading north,
) ) ) -----------|c station ||r take the Oak Street exit and turn left;
( ( ( Street|a ||e five blocks will bring you to Embarca-
) ) ) |d Jackson||e dero on your right, just before Oak
( ( ( parking lot |e ------||-- curves away to the left. (Going south
) ) ) |r Street||w on I-880, take the Jackson Street exit
( ( ( |o ||a and go two blocks straight ahead before
) ) ) | ||y you turn right on Oak Street.) Turn
( ( ( -------------||-- right onto Embarcadero and go three
) ) ) Oak Street|| blocks, until you go under an overpass
of Victorian ironwork. Immediately
turn left onto Alice Street, where you will see Jack London Village on
your right, and a large lot that offers validated parking on the left.
Walk into the Village's central courtyard, and you'll see the Dutch East
Indies on the estuary side, toward the right, and upstairs.
To create this, I started by drawing the stylized building and then
the map. In each case I created a large rectangular block of space
characters, then began trying ideas with the R command until I had
something that satisfied me. (The pavilion sketch eventually became
wider than I had planned, so I had to run a
:%s/.*/ & /
command to give me more working space.) Next I put additional blocks
of space characters
on the left of the drawing and the right of the map, to make a place for the
text I wanted to include. Then I started replacing spaces with text,
rewriting the text as I went along to fit it in nicely. When the text
reached the bottom of the figure I was fitting it to, I went to
full-width text lines, entering them the usual way. A tedious labor,
but pretty straightforward.
Now suppose I decided to redo this piece, by moving the picture
to where the map is now, and vice versa. A few well chosen substitution
and deletion commands would make copies of the two figures minus the text,
and I could just as easily copy the text without the two figures. But
how would I recombine them?
Short of typing the text in again from scratch, the best I could do
is to yank the lines of each figure, one at a time, and put them after
(or before) the appropriate text lines, one at a time. Not that I
would have to move back and forth between files with each yank and put;
I could yank up to 26 lines into the named buffers, then move to the
other file and put all 26 in their proper places. But there is no
Vi command to yank a rectangular block of characters.
Also take note that I should yank using addresses that are not
line addresses, even though I will be yanking whole lines. If I
should yank with line addresses, putting the pieces into the other
file must make those pieces separate lines -- then I would have
to join each pair of lines to create the columns I want.
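For comparison, the rectangular cut and the side-by-side recombination that the editor lacks take only a few lines in a scripting language. This Python sketch is an alternative outside the editor, not a Vi technique; the names cut_columns and join_columns and the fixed left_width are assumptions made for the illustration.

```python
from itertools import zip_longest


def cut_columns(lines, first_col, last_col):
    """Return the rectangular block in 1-based columns first_col..last_col."""
    return [line[first_col - 1:last_col] for line in lines]


def join_columns(left_lines, right_lines, left_width):
    """Place two blocks side by side, padding the left block to left_width."""
    joined = []
    for left, right in zip_longest(left_lines, right_lines, fillvalue=""):
        # Pad the left piece so the right block always starts in the same
        # column, then drop any trailing spaces the padding left behind.
        joined.append((left.ljust(left_width) + right).rstrip())
    return joined


figure = cut_columns(["text  /%%\\", "more  |__|"], 7, 10)
for row in join_columns(figure, ["text", "more", "last"], 6):
    print(row)
```

Note that join_columns pads but never truncates, so a left block wider than left_width will push the right block out of alignment on that line.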
Next Time Around
In the next part of this tutorial, I will go over a host of
complications and opportunities that come from allowing the
replacement commands I've discussed to use metacharacters. Then
I'll answer a couple of questions from readers that should be of
use to quite a few of you from time to time.