Plusses: Viewed in its best light, Perl is a language that encapsulates the best features of the shell, sed, grep, awk, tr, C and Cobol. If you are familiar with these tools, you can write very powerful Perl programs rather quickly and easily. In particular, you can write programs that duplicate the functionality of shell scripts but are all in one file (indeed, they are all in one language) and thus are much more efficient.
Minuses: Perl is a jumble. It contains many, many features from many languages and programs. It contains many differing constructs that implement the same functionality. For example, there are at least 5 ways to perform a one-line if statement. While this is good for the programmer, it is extremely bad for everyone but the programmer (and bad for the programmer who tries to read his own program in 6 months), and has led to Perl being called a ``write-only'' language. There are other minuses as well, but I won't go into them further. You can discover them for yourself. A colleague of mine (Norman Ramsey, at Virginia) responded to an email I sent him about perl, and his response is worth quoting in its entirety:
So this is the mistake. The brilliance is in including so many things people want to do, and in a form that is almost familiar, so they can pattern-match more easily. In fact, the familiarity is really an illusion, and if you're going to program in perl, you can hack without understanding, or you can restrict yourself to a subset you can understand. But if you're going to do the latter, why not program in sh, awk, and sed to begin with?
I've never used apl, so I can't compare.
I have---the thing with apl is that although it is *all* weird, it is weird in a very consistent way. You don't have the illusion of familiarity. You do have a huge set of unreadable glyphs, but they come with a very small set of simple rules for decrypting them, the most important of which are the right-to-left scan rule and the fact that user-defined functions take at most two arguments.
I just got to a part in my manual where they advocate using && as an if statement.
I got past that. I gave up on perl the day I learned I couldn't write a function to return an open file handle (e.g., an open socket). Now I use it only when forced.
The debate as to which is better: Perl, Python or Icon (which I'm not teaching in this class) is a heated one. Python and Icon have better language design. Perl has the most familiar regular expression syntax. I won't get into it, but if you look, you can find all sorts of opinions. Of course, it's best to formulate your own opinions by learning all three....
There are two recommended books in case you want more. The first is ``Learning Perl'' by Schwartz, and the second is ``Programming Perl'' by Wall and Schwartz. Both are published by O'Reilly & Associates.
UNIX> cat hw.perl
print "hello world\n";
UNIX> perl hw.perl
hello world
UNIX>
So, like awk, there is a print statement. Unlike awk, you have to provide your own newline. You can give print a comma-separated list of values, but unlike awk it prints them with nothing in between. There are two scalar types in perl: strings and numbers. All numbers are floating point. Like awk, you can cast at will, and perl will understand.
You can concatenate strings with the dot operator.
Like C, perl programs are tokens separated by whitespace. I.e. commands can span lines. You must end all commands with semi-colons.
UNIX> cat simp.perl
print "Jim\n";
print 1.55 . "\n";
print "Jim" . " " . "Plank" . "\n";
print (("5" + 6) . "\n");
UNIX> perl simp.perl
Jim
1.55
Jim Plank
11
UNIX>
UNIX> cat scalar.perl
$i = 1;
$j = "2";
print "$i\n";
print "$j\n";
$k = $i + $j;
print "$k\n";
print $i . $j . "\n";
print '$k\n'. "\n";
UNIX> perl scalar.perl
1
2
3
12
$k\n
UNIX>
Boolean expressions in perl are kind of odd. Undef is false, as is the null string, and anything that casts to a string containing a single zero. Everything else is true. Therefore, all numbers but zero are true, as are all strings but "" and "0".
You compare numbers with the C comparison operators (==, !=, <, >, <=, >=). You can use eq, ne, lt, gt, le, and ge to compare strings lexicographically.
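A minimal sketch of the truth rules and the two families of comparison operators described above. The variable names are mine, chosen only for illustration.

```perl
use strict;
use warnings;

# "", "0" and the number 0 are false; every other string or number is true.
my @candidates = (0, "0", "", "0.0", "00", "a", 1);
my @truthy = grep { $_ } @candidates;       # keeps "0.0", "00", "a", 1
print join(",", @truthy), "\n";             # 0.0,00,a,1

# Numbers use the C operators; strings use eq, ne, lt, gt, le, ge.
my $numeric = (10 > 9)      ? "yes" : "no"; # 10 is the larger number
my $lexical = ("10" gt "9") ? "yes" : "no"; # but "1" sorts before "9"
print "numeric: $numeric, lexical: $lexical\n";
```

Note that the string "0.0" is true even though the number 0.0 is false, because only the exact string "0" counts as zero.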
There are many ways of doing if statements, but some of them are so odious that I won't divulge them. Read a perl manual.
Instead of doing "else if" as in C, you should do "elsif". This is like elif in the Bourne shell.
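Here is a small sketch of the if/elsif/else form, which is the one closest to C. The loop and variable names are hypothetical. Unlike C, the curly braces are required even when the body is a single statement.

```perl
use strict;
use warnings;

my @kinds;
foreach my $n (-3, 0, 7) {
    my $kind;
    if ($n < 0) {
        $kind = "negative";
    } elsif ($n == 0) {          # "elsif", not "else if"
        $kind = "zero";
    } else {
        $kind = "positive";
    }
    push(@kinds, $kind);
}
print "@kinds\n";                # negative zero positive
```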
UNIX> cat input
D
F
C
B
E
A
UNIX> perl stdin.perl < input
D
F
C
B
E
A
UNIX>
Note that you get the newline with the string. To get rid of the newline, use the chop() procedure, which modifies its argument to get rid of the last character.
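One thing to watch: chop() removes the last character whatever it is. The sketch below shows that, and contrasts it with chomp(), which is not covered in these notes but is available in Perl 5 and removes only a trailing newline. The variable names are mine.

```perl
use strict;
use warnings;

my $line = "D\n";
chop($line);              # removes the newline; $line is now "D"

my $again = $line;
chop($again);             # chop is unconditional: this removes the "D" too

my $safe = "E\n";
chomp($safe);             # removes the trailing newline
chomp($safe);             # nothing left to remove, so "E" is untouched

print "[$line] [$again] [$safe]\n";    # [D] [] [E]
```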
You can open a file for input and then use it like STDIN, above. Moreover, you can open a file for output and print to it. For example, catinput.perl copies the file input to the file output. It also shows use of chop():
UNIX> perl catinput.perl
UNIX> cat output
D
F
C
B
E
A
UNIX>
You can also open a file for append, print to pipes, read input from pipes, etc.
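The listing of catinput.perl is not reproduced here, so the following is only a sketch of what such a copy program might look like, not the original. It uses scratch file names of my own choosing (input.tmp and output.tmp) and creates its own input so that it is self-contained.

```perl
use strict;
use warnings;

# Hypothetical scratch files; the program in the text uses "input" and "output".
my ($in_name, $out_name) = ("input.tmp", "output.tmp");

# Create a small input file so the sketch can run on its own.
open(my $seed, '>', $in_name) or die "can't write $in_name: $!";
print $seed "D\nF\nC\n";
close($seed);

open(my $in,  '<', $in_name)  or die "can't read $in_name: $!";
open(my $out, '>', $out_name) or die "can't write $out_name: $!";
while (my $line = <$in>) {    # one line per iteration, newline included
    chop($line);              # strip the newline...
    print $out "$line\n";     # ...and put it back when printing
}
close($in);
close($out);
```

Checking the result of open() with "or die" is a design choice of the sketch: without it, a missing file only shows up later as silently empty input.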
For example, sort1.perl sorts standard input by reading it into an array and printing the sorted array.
UNIX> cat input
D
F
C
B
E
A
UNIX> perl sort1.perl < input
A
B
C
D
E
F
UNIX>
You can make this simpler: The STDIN token may be treated as an array, so you can simply print the sorted array. This is in sort2.perl:
UNIX> perl sort2.perl < input
A
B
C
D
E
F
UNIX>
Other useful things you can do are to split a string into an array of its words (much like awk), and to use the subroutines push() and pop() to add and remove elements from the end of an array.
You can get at the size of an array by using the array in a place where an integer is expected.
The program reverse.perl uses push() and pop() to reverse a file, and the program revline.perl uses split() in a typical way, and the array size to reverse each line of a file.
(The syntax of split() is split(pattern,string), where
the pattern specifies how the space between words is delimited.
split(/\s+/,string) means to use contiguous blocks of whitespace
as the word delimiter).
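The pieces above (split(), the array size, push() and pop()) can be sketched together as follows. This is an illustration with made-up variable names, not the text of reverse.perl or revline.perl.

```perl
use strict;
use warnings;

my $line  = "That Sam  I am";
my @words = split(/\s+/, $line);    # ("That", "Sam", "I", "am")

my $count = @words;                 # array where a number is expected: 4

# Reverse the words by popping from one array and pushing onto another.
my @reversed;
while (@words) {                    # true while the array is non-empty
    push(@reversed, pop(@words));
}
print "$count words: @reversed\n";  # 4 words: am I Sam That
```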
Of course, there is also a reverse operator which returns the
reverse of an array, and this can be used to make the above
programs simpler. See
reverse2.perl and
revline2.perl. The latter makes use
of the foreach construct to iterate over all elements in an array.
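A rough sketch of the reverse operator combined with foreach, reversing the words of a single line. It is written in the spirit of revline2.perl but is not that program, and the variable names are mine.

```perl
use strict;
use warnings;

my @words = split(/\s+/, "I am Sam");
my $line  = "";
foreach my $w (reverse(@words)) {   # visits "Sam", "am", "I" in turn
    $line .= "$w ";                 # append the word and a space
}
print "$line\n";                    # Sam am I (with a trailing space)
```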
Does revline2.perl feel like it's approaching unreadability?
I agree.
Perl provides regular expression matching and substitution in a
form very familiar to sed/awk. The matching operator is =~
and is a boolean operator. Regular expressions are enclosed in slashes,
and work pretty much like sed/awk. There are a few differences:
You'll note in the last substitution of sub1.perl, I used
the parentheses as memory. This is analogous to
\( and \) in sed, except you can use the memory even in
the first pattern. For example, see sub2.perl:
Command line arguments are in the @ARGV array.
You can write to standard error by writing to STDERR.
You can exit from a program with exit.
UNIX> cat input2
I am Sam
I am Sam
Sam I am
That Sam I am, that Sam I am, I do not like that Sam I am!
UNIX> perl reverse.perl < input2
That Sam I am, that Sam I am, I do not like that Sam I am!
Sam I am
I am Sam
I am Sam
UNIX> perl reverse2.perl < input2
That Sam I am, that Sam I am, I do not like that Sam I am!
Sam I am
I am Sam
I am Sam
UNIX> perl revline.perl < input2
Sam am I
Sam am I
am I Sam
am! I Sam that like not do I am, I Sam that am, I Sam That
UNIX> perl revline2.perl < input2
Sam am I
Sam am I
am I Sam
am! I Sam that like not do I am, I Sam that am, I Sam That
UNIX>
Associative Arrays
Like awk and python, perl has associative arrays.
Again, set them by using them. When accessing a value, you precede it
with a dollar sign and enclose the key in curly braces.
When accessing the whole array, you precede it with
a percent sign.
The keys() function returns an
array of the keys of the associative array. The values() function
returns the values. Both return their results in no particular
order. So, for example, suppose you have a list of first names,
last names, and phone numbers, and you want to print it sorted in the
format: last name, first, phone number. Then you can do something like
phone.perl. Note that perl does
support printf.
UNIX> cat input3
Peyton Manning 423-vol-qb4u
Phil Fulmer 423-vol-head
Pat Summitt 423-lvl-head
Joe Johnson 423-vol-prez
Jim Plank 423-vol-peon
UNIX> perl phone.perl < input3
Fulmer, Phil, 423-vol-head
Johnson, Joe, 423-vol-prez
Manning, Peyton, 423-vol-qb4u
Plank, Jim, 423-vol-peon
Summitt, Pat, 423-lvl-head
UNIX>
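Since phone.perl itself is not listed, here is a sketch of one way such a program could be written with two associative arrays keyed by last name. It is an assumption about the approach, not the original code, and it reads three of the input3 records from an in-program list rather than from standard input.

```perl
use strict;
use warnings;

# Records in the same "first last phone" layout as input3.
my @lines = ("Peyton Manning 423-vol-qb4u",
             "Phil Fulmer 423-vol-head",
             "Jim Plank 423-vol-peon");

my (%first, %phone);                   # both keyed by last name
foreach my $line (@lines) {
    my ($f, $l, $p) = split(/\s+/, $line);
    $first{$l} = $f;                   # entries are created by assignment
    $phone{$l} = $p;
}

my @report;
foreach my $l (sort(keys(%first))) {   # keys() has no useful order, so sort
    push(@report, sprintf("%s, %s, %s", $l, $first{$l}, $phone{$l}));
}
print join("\n", @report), "\n";
```

The sketch uses sprintf() so the formatted lines can be collected in an array; printf() takes the same format string and prints directly.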
Listing files
Perl lets you do directory listings with shell-style pattern
matching. A simple example is ls.perl
which lists the files
in the current directory with the .perl extension:
UNIX> perl ls.perl
catinput.perl
hw.perl
ls.perl
match.perl
other.perl
phone.perl
reverse.perl
reverse2.perl
revline.perl
revline2.perl
scalar.perl
simp.perl
sort1.perl
sort2.perl
stdin.perl
sub1.perl
sub2.perl
UNIX>
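A listing like the one above can be produced with glob(), which expands a shell-style pattern into a list of file names. The sketch below is not ls.perl itself: it creates two scratch files with hypothetical names so that it has something predictable to list, and removes them afterwards.

```perl
use strict;
use warnings;

# Create two scratch files so there is something to list.
foreach my $name ("sketch_a.perl", "sketch_b.perl") {
    open(my $fh, '>', $name) or die "can't create $name: $!";
    close($fh);
}

# glob() expands the pattern; <sketch_*.perl> is an older spelling of the same thing.
my @files = sort(glob("sketch_*.perl"));
foreach my $f (@files) {
    print "$f\n";
}

unlink(@files);                        # clean up the scratch files
```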
Fancy string stuff
There are many fancy things that you can do inside double quotes for
string construction. I won't go into them here.
Some examples (these are in match.perl).
$i = "Jim";
$j = "JjJjJjJj";
$k = "Boom Boom, out go the lights!";
$i =~ /Jim/; True
$i =~ /J/; True
$i =~ /j/; False
$i =~ /j/i; True
$i =~ /\w/; True
$i =~ /\W/; False
$j =~ /j*/; True -- matches anything
$j =~ /j+/; True -- matches the first 'j'
$j =~ /j?/; True -- matches the empty string at the start
$j =~ /j{2}/; False
$j =~ /j{2}/i; True -- ignores case
$j =~ /(Jj){3}/; True -- matches the first six characters
$k =~ /Jim|Boom/; True -- matches Boom
$k =~ /(Boom){2}/; False -- there's a space between Booms
$k =~ /(Boom ){2}/; False -- the second Boom ends with a comma
$k =~ /(Boom\W){2}/; True
$k =~ /\bBoom\b/; True -- shows word delimiters
$k =~ /\bBoom.*the\b/; True
$k =~ /\Bgo\B/; False -- false, because "go" is a word
$k =~ /\Bgh\B/; True -- the "gh" is in the middle of "lights"
Note that when you run match.perl, the falses are printed as null
strings, not zeros.
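To see what "printed as null strings" means, this short sketch stores the result of two matches and prints them inside brackets. The variable names are mine.

```perl
use strict;
use warnings;

my $i    = "Jim";
my $hit  = ($i =~ /J/);     # true: the value is 1
my $miss = ($i =~ /j/);     # false: the value is "", not 0
my $fold = ($i =~ /j/i);    # true again once case is ignored

print "hit=[$hit] miss=[$miss] fold=[$fold]\n";   # hit=[1] miss=[] fold=[1]
```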
Regular Expression Substitution
You can modify a string variable by applying a sed-like substitution.
The operator is again =~, and the substitution is specified as
s/pattern1/pattern2/
So, for example, see sub1.perl:
$j = "Jim Plank";
$j =~ s/ .*/i Hendrix/; Makes 'Jimi Hendrix'
$j =~ s/i/I/g; Makes 'JImI HendrIx'
$j =~ s/\b\w*\b/Dikembe/; Makes 'Dikembe HendrIx'
$j =~ s/(\b\w*\b)/Jimi "\1"/; Makes 'Jimi "Dikembe" HendrIx'
Unfortunately, you can't use =~ (or ~, for that matter)
to apply a substitution the way you would apply, for example, addition, like:
$k = $j ~ s/Jim/Jimi/;
Which is a pity.
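There are workarounds. The usual one is to copy the string first and substitute in the copy. Perl 5.14 and later also accept an r flag on the substitution, which returns the new string and leaves the original alone. A sketch of both, with variable names of my choosing:

```perl
use strict;
use warnings;

my $j = "Jim Plank";

# Copy, then substitute in the copy. The parentheses matter:
# they make =~ apply to $k rather than to $j.
(my $k = $j) =~ s/Jim/Jimi/;

# Perl 5.14 or later: the r flag returns the result instead of modifying $j.
my $m = $j =~ s/Plank/Hendrix/r;

print "$j\n$k\n$m\n";
```

This prints "Jim Plank", "Jimi Plank" and "Jim Hendrix" on separate lines; $j is unchanged in both cases.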
$j = "Jim Plank";
$j =~ s/(\w*) (\w*)/\1 \1 \2/; Makes 'Jim Jim Plank'
$i = "I am the the man";
$i =~ s/(\b\w+\b) \1/\1/; Makes 'I am the man' -- figure it out!
Other constructs that you should know about
Look at other.perl. This contains
code for opening a file for append, writing to a pipe, reading
from a pipe and sorting numerically. Try it out.
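The contents of other.perl are not shown here, so the following is only a sketch of three of the constructs it is said to contain: opening for append, reading from a pipe, and sorting numerically. The scratch file name is hypothetical, and the pipe example assumes a UNIX system with the cat command.

```perl
use strict;
use warnings;

my $log = "sketch_append.tmp";          # hypothetical scratch file
unlink($log);

# ">>" opens for append: each open adds to the end of the file.
foreach my $word ("first", "second") {
    open(my $fh, '>>', $log) or die "can't append to $log: $!";
    print $fh "$word\n";
    close($fh);
}

# Reading from a pipe: run a command and read its output like a file.
open(my $pipe, '-|', 'cat', $log) or die "can't run cat: $!";
my @lines = <$pipe>;                    # all lines at once
close($pipe);
unlink($log);

# sort() compares as strings by default; <=> compares numerically.
my @as_strings = sort(10, 9, 100);                  # (10, 100, 9)
my @as_numbers = sort { $a <=> $b } (10, 9, 100);   # (9, 10, 100)

print @lines;
print "@as_strings\n@as_numbers\n";
```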
Reading perl programs
Perl lets you do lots more than what I've detailed. If you start
reading
random perl programs, you'll notice the use of defaults
(e.g. $_) in procedures,
substitutions, foreach clauses, etc. The best thing I can say
is to read the manual before trying to read programs. I'm not a huge
fan of these shortcuts, but perhaps I'm not the prototypical perl
hacker.
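As a small taste of what those defaults look like, here is a sketch in which $_ is never written out. Each comment gives the explicit form. The data and variable names are mine.

```perl
use strict;
use warnings;

my @words   = ("I", "am", "Sam");
my $matches = 0;
foreach (@words) {          # same as: foreach $_ (@words)
    if (/am/) {             # same as: if ($_ =~ /am/)
        $matches++;
    }
    s/am/AM/;               # same as: $_ =~ s/am/AM/;
}
print "$matches: @words\n"; # 2: I AM SAM
```

Note that inside foreach, $_ stands for the array element itself, so the substitution changes @words in place.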
More, more, more
There is much more that you can do with perl. I have
omitted procedure calls, but obviously they exist in the language.
There is also support for networking.
The best way to
learn is to explore. Enjoy.