Boolean expressions in perl are kind of odd. Undef is false, as is the null string, and anything that casts to a string containing a single zero. Everything else is true. Therefore, all numbers but zero are true, as are all strings but "" and "0".
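Here is a small sketch of those rules. The helper truth() is a name made up for this illustration, and the values are just ones I picked to poke at the edge cases.

```perl
use strict;
use warnings;

# truth() is a hypothetical helper: it reports how perl treats a value
# when the value is used as a boolean.
sub truth {
    my ($value) = @_;
    return $value ? "true" : "false";
}

print truth(undef) . "\n";   # false: undef
print truth("")    . "\n";   # false: the null string
print truth(0)     . "\n";   # false: the number zero
print truth("0")   . "\n";   # false: the string "0"
print truth(0.0)   . "\n";   # false: as a string it is "0"
print truth("0.0") . "\n";   # true: a string that is neither "" nor "0"
print truth("00")  . "\n";   # true
print truth(" ")   . "\n";   # true: a space is not the null string
print truth(-1)    . "\n";   # true
```

The surprising cases are "0.0" and "00": they are numerically zero, but as strings they are not "0", so they count as true.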
UNIX> cat hw.perl
print "hello world\n";
UNIX> perl hw.perl
hello world
UNIX>

So, like awk, there is a print statement. Unlike awk, you have to provide your own newline. You can give print a bunch of comma-separated values, but unlike awk it prints them back to back, with no spaces between them.
Like C, perl programs are tokens separated by whitespace (i.e. commands can span lines). You must end all commands with semicolons.
UNIX> cat simp.perl
print "Jim\n";
print 1.55 . "\n";
print "Jim" . " " . "Plank" . "\n";
print (("5" + 6) . "\n");
UNIX> perl simp.perl
Jim
1.55
Jim Plank
11
UNIX>

UNIX> cat scalar.perl
$i = 1;
$j = "2";
print "$i\n";
print "$j\n";
$k = $i + $j;
print "$k\n";
print $i . $j . "\n";
print '$k\n'. "\n";
UNIX> perl scalar.perl
1
2
3
12
$k\n
UNIX>
Construct | Meaning |
\cC | Any "control" character (here, CNTL-C) |
\\ | Backslash |
\" | Double quote |
\l | Lowercase next letter |
\L | Lowercase all following letters until \E |
\u | Uppercase next letter |
\U | Uppercase all following letters until \E |
\Q | Backslash-quote all non-alphanumerics until \E |
\E | Terminate \L, \U, or \Q |
UNIX> perl5 strings.perl
This is a demonstration of "double quotes" and \ (backslashes)
this is about changing case OF TEXT inside Double quotes
And\ this\ is\ the\ \"backslash\-quote\"\ option\;\ which\ is\ weird\.\
UNIX>

There are several others that don't usually seem necessary to me, but if you need one, it is probably available. Read the man pages or a book for more info. Also remember that double-quoted strings are what one book calls variable interpolated, meaning that variables are replaced by their current values inside the strings, just like in the Bourne shell.
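To see the case-changing and quoting constructs from the table next to variable interpolation, here is a minimal sketch. The variable names and strings are invented for the illustration; this is not strings.perl.

```perl
use strict;
use warnings;

my $name = "jim plank";

my $first_up = "\u$name";          # \u: uppercase the next letter only
my $all_up   = "\U$name\E rules";  # \U: uppercase everything until \E
my $all_low  = "\LLOUD\E Voice";   # \L: lowercase everything until \E
my $quoted   = "\Qa.b*c\E";        # \Q: backslash the non-alphanumerics

print "$first_up\n";   # Jim plank
print "$all_up\n";     # JIM PLANK rules
print "$all_low\n";    # loud Voice
print "$quoted\n";     # a\.b\*c
```

Note that the constructs act on the interpolated value of $name, not on the characters of the variable's name.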
Perl provides regular expression matching and substitution in a form that will be very familiar to sed/awk users. The matching operator is =~ and is a boolean operator. Regular expressions are enclosed in slashes, and work pretty much like sed/awk. There are a few differences:
$i = "Jim";
$j = "JjJjJjJj";
$k = "Boom Boom, out go the lights!";

$i =~ /Jim/;              True
$i =~ /J/;                True
$i =~ /j/;                False
$i =~ /j/i;               True
$i =~ /\w/;               True
$i =~ /\W/;               False
$j =~ /j*/;               True -- matches anything
$j =~ /j+/;               True -- matches the first 'j'
$j =~ /j?/;               True -- matches the empty string at the start
$j =~ /j{2}/;             False
$j =~ /j{2}/i;            True -- ignores case
$j =~ /(Jj){3}/;          True -- matches the first six characters
$k =~ /Jim|Boom/;         True -- matches Boom
$k =~ /(Boom){2}/;        False -- there's a space between Booms
$k =~ /(Boom ){2}/;       False -- the second Boom is followed by a comma
$k =~ /(Boom\W){2}/;      True
$k =~ /\bBoom\b/;         True -- shows word delimiters
$k =~ /\bBoom.*the\b/;    True
$k =~ /\Bgo\B/;           False -- false, because "go" is a word
$k =~ /\Bgh\B/;           True -- the "gh" is in the middle of "lights"

Note that when you run match.perl, the false results are printed as null strings, not zeros.
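The point about false results printing as null strings can be checked with a short sketch. This is not match.perl; the variable names are my own, and the square brackets are only there to make the empty string visible.

```perl
use strict;
use warnings;

my $i = "Jim";

my $hit  = ($i =~ /j/i);   # case-insensitive match succeeds
my $miss = ($i =~ /j/);    # case-sensitive match fails

print "hit:  [$hit]\n";    # hit:  [1]
print "miss: [$miss]\n";   # miss: []

# Most of the time the result is used directly as a condition.
my $verdict = ($i =~ /^J\w+$/) ? "looks like a name" : "no match";
print "$verdict\n";        # looks like a name
```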
s/pattern1/pattern2/

So, for example, see sub1.perl:
$j = "Jim Plank";
$j =~ s/ .*/i Hendrix/;             Makes 'Jimi Hendrix'
$j =~ s/i/I/g;                      Makes 'JImI HendrIx'
$j =~ s/\b\w*\b/Dikembe/;           Makes 'Dikembe HendrIx'
$j =~ s/(\b\w*\b)/Jimi "\1"/;       Makes 'Jimi "Dikembe" HendrIx'

Unfortunately, you can't use =~ (or !~, for that matter) to apply substitution as you would, for example, addition, like:

$k = $j =~ s/Jim/Jimi/;

This sets $k to the number of substitutions made rather than to the new string, which is a pity.
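One common way around this is to copy the string first and substitute on the copy. The sketch below shows that idiom along with what a substitution actually evaluates to; it is my own illustration rather than one of the lecture programs.

```perl
use strict;
use warnings;

my $j = "Jim Plank";

# Copy first, then substitute on the copy, so $j is left alone.
(my $k = $j) =~ s/Jim/Jimi/;

# The value of a substitution is the number of replacements it made.
my $count = ($j =~ s/i/I/g);

print "$k\n";       # Jimi Plank
print "$count\n";   # 1
print "$j\n";       # JIm Plank
```

Perl 5.14 and later also accept an r flag, as in s/Jim/Jimi/r, which returns the new string and leaves the original untouched. I left it out of the code so that the sketch also runs on older perls.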
You'll note in the last substitution of sub1.perl, I used the parentheses as memory. This is analogous to \( and \) in sed, except you can use the memory even in the first pattern. For example, see sub2.perl:
$j = "Jim Plank";
$j =~ s/(\w*) (\w*)/\1 \1 \2/;      Makes 'Jim Jim Plank'
$i = "I am the the man";
$i =~ s/(\b\w+\b) \1/\1/;           Makes 'I am the man' -- figure it out!
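In the replacement half, perl also lets you write the memory as $1, $2 and so on, and perl run with warnings on will suggest that spelling over \1. Here is a sketch along the lines of sub2.perl using that form. The patterns are slightly different from the ones above, so treat it as a variation, not a copy.

```perl
use strict;
use warnings;

my $j = "Jim Plank";
# $1 and $2 in the replacement play the role of \1 and \2.
$j =~ s/(\w+) (\w+)/$2, $1/;
print "$j\n";                  # Plank, Jim

my $i = "I am the the man";
# Inside the pattern itself the memory is still written \1.
$i =~ s/\b(\w+) \1\b/$1/;
print "$i\n";                  # I am the man
```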
There are many ways of doing if statements, but some of them are so odious that I won't divulge them. Read a perl manual.
Instead of doing "else if" as in C, you should do "elsif". This is like elif in the Bourne shell.
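A minimal sketch of the plain if / elsif / else form follows. The function grade() and its cutoffs are invented for the illustration. Note that, unlike in C, the curly braces are required even around a single statement.

```perl
use strict;
use warnings;

# grade() is a made-up example function, not one of the lecture programs.
sub grade {
    my ($score) = @_;
    if ($score >= 90) {
        return "A";
    } elsif ($score >= 80) {
        return "B";
    } else {
        return "C or below";
    }
}

print grade(95) . "\n";   # A
print grade(85) . "\n";   # B
print grade(40) . "\n";   # C or below
```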
UNIX> cat input
D
F
C
B
E
A
UNIX> perl stdin.perl < input
D
F
C
B
E
A
UNIX>

Note that you get the newline with the string. To get rid of the newline, use the chop() procedure, which modifies its argument to get rid of the last character.
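Because chop() removes the last character whatever it is, it can bite when the final line of the input has no newline. The sketch below compares it with chomp(), a standard perl built-in that is not mentioned above and that removes only a trailing newline. To keep the example self-contained it reads from a string through an in-memory filehandle instead of from STDIN; that substitution is my assumption, not how stdin.perl works.

```perl
use strict;
use warnings;

# Stand-in for standard input so the sketch is self-contained.
my $text = "D\nF\nC";          # note: no newline after the last line
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

my (@chopped, @chomped);
while (my $line = <$in>) {
    my $copy = $line;
    chop($copy);               # always removes the last character
    push(@chopped, $copy);
    chomp($line);              # removes a trailing newline only
    push(@chomped, $line);
}
close($in);

print join(",", @chopped) . "\n";   # D,F,
print join(",", @chomped) . "\n";   # D,F,C
```

The last line loses its only character under chop() but survives chomp().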
You can open a file for input and then use it like STDIN, above. Moreover, you can open a file for output and print to it. For example, catinput.perl copies the file input to the file output. It also shows use of chop():
UNIX> perl catinput.perl
UNIX> cat output
D
F
C
B
E
A
UNIX>

Note that when the file output was opened, a > was included in the string. This tells perl to open the file as output. Had we wanted to append to the file we would have used >> instead. You can also print to pipes, read input from pipes, etc.
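The three modes can be sketched as below. This uses the three-argument form of open(), which keeps the > separate from the file name; that form is my choice here and not necessarily what catinput.perl does. The scratch directory comes from the standard File::Temp module so that the example does not clobber a real file called output.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a scratch directory; the file name "output" mirrors the text.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/output";

open(my $out, '>', $path) or die "cannot write $path: $!";    # create or truncate
print $out "D\n";
close($out);

open($out, '>>', $path) or die "cannot append to $path: $!";  # append
print $out "F\n";
close($out);

open(my $in, '<', $path) or die "cannot read $path: $!";      # read
my @lines = <$in>;
close($in);

print @lines;   # D, then F, each on its own line
```

Checking the return value of open() with "or die" is worth the extra typing: a failed open otherwise goes unnoticed until the reads or prints quietly do nothing.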
For example, sort1.perl sorts standard input by reading it into an array and printing the sorted array.
UNIX> cat input
D
F
C
B
E
A
UNIX> perl sort1.perl < input
A
B
C
D
E
F
UNIX>

You can make this simpler: The STDIN token may be treated as an array (when an array is expected, it returns all of the remaining lines at once), so you can simply print the sorted array. This is in sort2.perl:

UNIX> perl sort2.perl < input
A
B
C
D
E
F
UNIX>

Be careful with this type of usage: if the input is huge, there could be memory problems.
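Here is a sketch of both styles of reading, again using an in-memory filehandle as a stand-in for the file input (an assumption made so the example runs on its own). It is not the text of sort1.perl or sort2.perl.

```perl
use strict;
use warnings;

# In-memory stand-in for the file "input".
my $text = "D\nF\nC\nB\nE\nA\n";
open(my $in, '<', \$text) or die "cannot open in-memory file: $!";

# Assigned to an array, <$in> returns every remaining line at once.
my @lines  = <$in>;
my @sorted = sort @lines;
close($in);
print @sorted;            # A through F, one per line

# For big inputs that do not need sorting, read one line at a time.
open($in, '<', \$text) or die "cannot reopen in-memory file: $!";
my $count = 0;
while (my $line = <$in>) {
    $count++;             # only one line is held in memory here
}
close($in);
print "$count lines\n";   # 6 lines
```

Sorting needs all of the lines in memory no matter how they are read, so the line-at-a-time loop only helps for jobs like counting or filtering.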
Other useful things you can do are to split a string into an array of its words (much like awk), to use the subroutines push() and pop() to add and remove elements from the end of an array, and to use unshift() and shift() to add and remove elements from the front of the array.
You can get at the size of an array by using the array in a place where an integer is expected.
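A quick sketch of the four array subroutines and the array-size trick, with made-up data:

```perl
use strict;
use warnings;

my @a = ("B", "C");

push(@a, "D");            # add at the end:         B C D
unshift(@a, "A");         # add at the front:       A B C D
my $last  = pop(@a);      # remove from the end:    returns D
my $first = shift(@a);    # remove from the front:  returns A

my $size = @a;            # the array used where a number is expected
my $also = scalar(@a);    # the same thing, spelled out

print "@a\n";                         # B C
print "$first $last $size $also\n";   # A D 2 2
```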
The program reverse.perl uses push() and pop() to reverse a file, and the program revline.perl uses split() in a typical way, and the array size to reverse each line of a file.
(The syntax of split() is split(pattern,string), where the pattern specifies how the space between words is delimited. split(/\s+/,string) means to use contiguous blocks of whitespace as the word delimiter).
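As a sketch of split() together with the array size, here is one way to reverse the words of a single line. It is written from the description above and should not be taken as the contents of revline.perl; the variable names are mine.

```perl
use strict;
use warnings;

my $line = "That Sam I am,   that Sam I am";

my @words = split(/\s+/, $line);   # runs of whitespace separate the words
my $n = @words;                    # array size
print "$n\n";                      # 8

# Walk the array from the back using its size.
my $backwards = "";
for (my $i = $n - 1; $i >= 0; $i--) {
    $backwards .= $words[$i];
    $backwards .= " " if $i > 0;
}
print "$backwards\n";              # am I Sam that am, I Sam That
```

One thing to watch: if the line starts with whitespace, split(/\s+/, ...) returns an empty first word. Using split(' ', $line) instead skips leading whitespace the way awk does.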
Of course, there is also a reverse operator which returns the reverse of an array, and this can be used to make the above programs simpler. See reverse2.perl and revline2.perl. The latter makes use of the foreach construct to iterate over all elements in an array. Does revline2.perl feel like it's approaching unreadability? I agree.
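For comparison, here is a spread-out sketch of the same two ideas using reverse and foreach. It is my own version with hard-coded lines, not reverse2.perl or revline2.perl.

```perl
use strict;
use warnings;

my @lines = ("I am Sam", "Sam I am");

my @out;
foreach my $line (reverse @lines) {       # last line first
    my @words = split(/\s+/, $line);
    push(@out, join(" ", reverse @words)); # words of the line, back to front
}

foreach my $result (@out) {
    print "$result\n";
}
# prints:
#   am I Sam
#   Sam am I
```

Naming the intermediate array costs a line but makes it easier to see which reverse is reversing what.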
UNIX> cat input2
I am Sam
I am Sam
Sam I am
That Sam I am, that Sam I am, I do not like that Sam I am!
UNIX> perl reverse.perl < input2
That Sam I am, that Sam I am, I do not like that Sam I am!
Sam I am
I am Sam
I am Sam
UNIX> perl reverse2.perl < input2
That Sam I am, that Sam I am, I do not like that Sam I am!
Sam I am
I am Sam
I am Sam
UNIX> perl revline.perl < input2
Sam am I
Sam am I
am I Sam
am! I Sam that like not do I am, I Sam that am, I Sam That
UNIX> perl revline2.perl < input2
Sam am I
Sam am I
am I Sam
am! I Sam that like not do I am, I Sam that am, I Sam That
UNIX>

UNIX> cat input3
Peyton Manning 423-vol-qb4u
Phil Fulmer 423-vol-head
Pat Summitt 423-lvl-head
Joe Johnson 423-vol-prez
Jim Plank 423-vol-peon
UNIX> perl phone.perl < input3
Fulmer, Phil, 423-vol-head
Johnson, Joe, 423-vol-prez
Manning, Peyton, 423-vol-qb4u
Plank, Jim, 423-vol-peon
Summitt, Pat, 423-lvl-head
UNIX>
UNIX> perl ls.perl
catinput.perl hw.perl ls.perl match.perl other.perl phone.perl reverse.perl reverse2.perl revline.perl revline2.perl scalar.perl simp.perl sort1.perl sort2.perl stdin.perl sub1.perl sub2.perl
UNIX>
The last part needs some explanation. The <=> operator, called the spaceship operator in the perl books, compares the two values $a and $b numerically, returning 1 if $a > $b, -1 if $a < $b, and 0 otherwise. (Don't worry about the old values of $a and $b; they are protected.) This lets sort sort numerically instead of lexicographically.
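The difference between the two orderings shows up as soon as the numbers have different lengths. A small sketch with made-up data:

```perl
use strict;
use warnings;

my @nums = (10, 9, 100, 1);

my @as_strings = sort @nums;                 # default: string comparison
my @as_numbers = sort { $a <=> $b } @nums;   # numeric, ascending
my @descending = sort { $b <=> $a } @nums;   # swap $a and $b to reverse

print "@as_strings\n";   # 1 10 100 9
print "@as_numbers\n";   # 1 9 10 100
print "@descending\n";   # 100 10 9 1
```

$a and $b are special to sort, so they do not need to be declared even under "use strict".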
Command line arguments are in the @ARGV array.
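A minimal sketch of walking @ARGV. The program name args.perl and the sample arguments are invented, and the sketch fills in @ARGV itself when it is run without arguments so that it always has something to print.

```perl
use strict;
use warnings;

# Pretend the program was run as:  perl args.perl -v input output
# (args.perl and these arguments are made up for the illustration).
@ARGV = ("-v", "input", "output") unless @ARGV;

my $argc = @ARGV;                  # number of arguments
print "$argc arguments\n";

foreach my $arg (@ARGV) {
    print "arg: $arg\n";
}
```

Unlike argv in C, $ARGV[0] is the first argument, not the program name. The program name is in the variable $0.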
You can exit from a program with exit.