Lecture 4

Basic Perl


  1. Perl scripts
  2. Scalar Data
  3. Lists and arrays
  4. Input Output
  5. Regular Expressions

Perl scripts

A simple hello.pl would be


print "Hello World!\n";

The first line, the #! line, dictates which perl to use (this location is for Mac), but you could also write

#!/usr/bin/env perl
print "Hello World!\n";

if perl is in your search path.

Run it with $ perl hello.pl

Scalar data


Single Quotes

A single-quoted string literal is a sequence of characters that mean exactly what they say, except for \' (a literal single quote) and \\ (a literal backslash).

print 'Apa' . "\n" ;
print 'Apan\'s' . "\n" ;
print 'Apan\\' . "\n" ;

$ perl test.pl
Apa
Apan's
Apan\

Double Quotes

A double-quoted string literal is a bit different in that it interpolates variables and supports backslash escapes.

$q = 123;
print "A\tB\n\LBCDE\E\n$q\n";

$ perl test.pl
A       B
bcde
123

Here \t is a tab, \n is a newline and \LSTUFF\E forces STUFF to lowercase.
$q is a variable with the value 123.
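
The difference between the two quoting styles is easiest to see side by side. The following is a minimal sketch; the variable names $single and $double are made up for illustration, and the my declarations under use strict are a habit the snippets in this lecture omit.

```perl
use strict;
use warnings;

my $q = 123;

# Single quotes: $q and \n are kept as literal characters.
my $single = 'Value: $q\n';

# Double quotes: $q is interpolated and \n becomes a newline.
my $double = "Value: $q\n";

print $single, "\n";   # prints the characters as typed
print $double;         # prints Value: 123 followed by a newline
```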

Honey Badger Don't Care, nor does Perl

Perl automatically converts between numbers and strings depending on what operator is used.

print "A" . 3*2*1 . "\n";

$ perl test.pl
A6

Note that * takes precedence over the concatenation operator . so the product is computed first.
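
As a small sketch of the automatic conversion described above: it is the operator, not the operand, that decides whether a value is treated as a number or as a string. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $sum    = "3" + 4;      # + is numeric: the string "3" is used as the number 3
my $joined = "3" . 4;      # . is concatenation: the number 4 is used as the string "4"
my $mixed  = "A" . 3*2*1;  # * binds tighter than . so this is "A" . 6

print "$sum\n";     # 7
print "$joined\n";  # 34
print "$mixed\n";   # A6
```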

Scalar variables

A scalar variable is denoted by $name and holds a single scalar value. Scalar variables can be changed throughout the program.

$number = 123;
$name = 'Daniel';

They can also be modified with binary assignment operators like *=, += or .=

$sum = 0;
$sum += $number;   # $sum is now  123
$name .= ' Appelo'; # $name is now Daniel Appelo

Note that variable names are case sensitive.

While and for loops

$n = 5; $fact = 1; $i = 1;
while ($i <= $n ) {
    $fact *= $i;
    $i += 1;
}
print "Computed $n! = $fact \n";

Computed 5! = 120

for ($i = 1; $i <= 1000; $i *= 2) {
    print "$i ";
}
print "\n";

1 2 4 8 16 32 64 128 256 512

Lists and arrays

@my_array = (1, 2, 3);
print $my_array[0] . "\n" ;
print $my_array[1] . "\n" ;
print $my_array[2] . "\n" ;

Note that you access a single element of the array with $ and [ ] (indices start at 0).
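
A short sketch of zero-based indexing follows. It goes slightly beyond the lecture by also using a negative index, scalar(@array) for the number of elements and $#array for the last index; the variable names are illustrative.

```perl
use strict;
use warnings;

my @my_array = (1, 2, 3);

my $first      = $my_array[0];       # indices start at 0
my $last       = $my_array[-1];      # negative indices count from the end
my $count      = scalar(@my_array);  # number of elements
my $last_index = $#my_array;         # index of the last element, here $count - 1

print "first=$first last=$last count=$count last_index=$last_index\n";
```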


You can loop over all the elements in the array using foreach

@my_array = qw/ spam sausage ham/ ;
foreach $elem (@my_array) {
  print "We have $elem for lunch \n";
}

We have spam for lunch
We have sausage for lunch
We have ham for lunch

The default variable $_

For many control structures you may leave out the loop variable ($elem above); perl then uses the default variable $_

@my_array = qw/ spam sausage ham/ ;
foreach (@my_array) {
  print "We have ";
  print ;
  print " for lunch \n";
}

Here both foreach and print use $_.
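
To make the role of $_ concrete, this sketch builds the same sentences twice, once with the implicit $_ and once with a named loop variable. The arrays @implicit and @explicit are hypothetical names used only for the comparison.

```perl
use strict;
use warnings;

my @my_array = qw/ spam sausage ham /;
my (@implicit, @explicit);

# No loop variable given: each element is placed in $_.
foreach (@my_array) {
    push @implicit, "We have $_ for lunch";
}

# Same loop with a named variable.
foreach my $elem (@my_array) {
    push @explicit, "We have $elem for lunch";
}

# Both loops produced identical sentences.
print "$_\n" foreach @implicit;
```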


Input Output

Files are opened with open, here one for reading and one for writing (note the ">").

open FILE, "$cmdFile";
open OUTFILE, ">", "$outFile";

A slightly different version that halts the program if the file fails to open:

open(FILE,"$cmdFile") || die "cannot open file $cmdFile!" ;
open(OUTFILE,"> $outFile") || die "cannot open file!" ;
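
A variant often seen in newer Perl code is sketched below. It is not the form used in this lecture: it assumes the three-argument open with lexical filehandles ($out, $in) and includes $!, the system error message, in the die text. The temporary file from the core module File::Temp is only there to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical file for illustration: a temporary file we can write to.
my ($tmp_fh, $outFile) = tempfile();
close($tmp_fh);

# Open for writing; $! explains why the open failed, if it does.
open(my $out, '>', $outFile) or die "cannot open $outFile for writing: $!";
print {$out} "first line\n";
close($out) or die "cannot close $outFile: $!";

# Open the same file for reading and fetch one line.
open(my $in, '<', $outFile) or die "cannot open $outFile for reading: $!";
my $line = <$in>;
close($in);
unlink $outFile;   # clean up the temporary file

print $line;
```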

Reading and writing to file

This snippet copies the file $cmdFile to $outFile.

open(FILE,"$cmdFile") || die "cannot open file $cmdFile!" ;
open(OUTFILE,"> $outFile") || die "cannot open file!" ;
while( $line = <FILE> ) {  # read one line at a time until EOF
    print OUTFILE $line;   # write the line to the output file
    print $line;           # and echo it to the screen
}
close( OUTFILE );
close( FILE );

Regular Expressions

A typical thing to do is to search a file for some pattern and then do something (e.g. replace it with something else). This is what perl is really good at, mainly due to its strong support for regular expressions, or regex for short.

In short, a regex is just a pattern that a string can be matched against. Either it matches or it does not.

For example, if the default variable is set to $_ = "The grass is green" and we match against the regex /re/, then

$_ = "The grass is green";
if (/re/) {
  print "Found it!\n";
}

would print Found it! since the pattern re occurs inside "green".

The regex /re/ is not very advanced. This is not typical ;-) Usually regexes can be very hard to read due to all the different options that can be used to construct them. A couple of options are:

  • /./ - Any single character except newline.
  • /X*/ - Match the preceding item X zero or more times. Thus .* matches any old junk.
  • /X+/ - Match the preceding item X one or more times. E.g. /Q+/ matches Q, QQ etc.
  • Parentheses (grouping) can be used if you want to construct an item longer than one character, e.g. /(OMG)+/ would match OMGOMGOMG while /OMG+/ would match OMGGGGGGGGG.
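
The options in the list above can be tried out with a few short matches. This is a sketch only; it uses the special variable $&, which holds the text matched by the most recent successful match and is not covered in this lecture, and the test strings are made up.

```perl
use strict;
use warnings;

my $text = 'OMGOMGOMG!';

# With grouping, the whole item OMG is repeated.
$text =~ /(OMG)+/ or die "no match";
my $grouped = $&;      # OMGOMGOMG

# Without grouping, + applies to the G only.
$text =~ /OMG+/ or die "no match";
my $ungrouped = $&;    # OMG

my $dot_star = ('anything at all' =~ /a.*l/) ? 'yes' : 'no';  # .* soaks up any junk
my $plus     = ('cat' =~ /Q+/)               ? 'yes' : 'no';  # needs at least one Q

print "grouped:   $grouped\n";
print "ungrouped: $ungrouped\n";
print "a.*l:      $dot_star\n";
print "Q+:        $plus\n";
```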

More useful options

  • Modifiers can be placed after the regex to modify the pattern, e.g. /(OMG)+/i makes the match case-insensitive so that it also matches oMgomGomg. There are more of these; you can find info online.
  • The word-boundary anchor \b can be very useful: it ensures you "match whole word only", e.g. /\bOMG\b/ matches OMG but not ROMG.
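
A quick sketch of both options, using the strings from the bullets above; the variable names are illustrative and each result is stored as 1 (match) or 0 (no match).

```perl
use strict;
use warnings;

# The i modifier ignores case.
my $insensitive = ('oMgomGomg' =~ /(OMG)+/i) ? 1 : 0;   # 1
my $sensitive   = ('oMgomGomg' =~ /(OMG)+/)  ? 1 : 0;   # 0

# \b matches only at the edge of a word.
my $whole_word  = ('OMG indeed' =~ /\bOMG\b/) ? 1 : 0;  # 1
my $inside_word = ('ROMG'       =~ /\bOMG\b/) ? 1 : 0;  # 0, the O follows another letter

print "$insensitive $sensitive $whole_word $inside_word\n";
```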

The binding operator, =~

You don't have to match against the default variable; just use the binding operator =~ to match against the string on its left.

$str = 'A cat in a hat';
if($str =~ /cat/){
  print "Found a cat!\n";
}
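
Binding is handy when several named strings have to be tested. The sketch below collects every phrase that contains the pattern; the list @phrases and the array @with_cat are made up for illustration. Note that the match looks for the characters cat anywhere in the string, so Concatenate counts as well.

```perl
use strict;
use warnings;

my @phrases = ('A cat in a hat', 'A dog on a log', 'Concatenate');
my @with_cat;

foreach my $phrase (@phrases) {
    # =~ binds the match to $phrase instead of the default variable
    if ($phrase =~ /cat/) {
        push @with_cat, $phrase;
    }
}

print "$_\n" foreach @with_cat;
```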


Often you want to find a pattern in a string and change it. The substitution, s///, is very useful for this! You can of course do this for the default variable or by using binding.

$_ = "A red rose.\n";
s/red/RED/;
print ;

>> A RED rose.

With binding, and with the g modifier to replace every occurrence rather than just the first:

$str = "Sju skonsjungande sjukskoterskor skotte sjuttiosju " .
       "sjosjuka sjoman pa skeppet.\n";
$str =~ s/sjo/lake/g;
$str =~ s/sjuk/sick/g;
print $str;

>> Sju skonsjungande sickskoterskor skotte sjuttiosju lakesicka lakeman pa skeppet.
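
The effect of the g modifier can be checked on a shorter string. This is a minimal sketch with made-up variable names; it also relies on the fact that s/// returns the number of substitutions it made.

```perl
use strict;
use warnings;

my $once = "sjo sjo sjo";
my $all  = $once;

$once =~ s/sjo/lake/;                 # without g: only the first match is replaced
my $count = ($all =~ s/sjo/lake/g);   # with g: every match is replaced

print "$once\n";    # lake sjo sjo
print "$all\n";     # lake lake lake
print "$count\n";   # 3
```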

Homework 2