More Perl
- Arrays and hashes
- References
Arrays
- An array (often called a list) is an ordered collection of scalars
- Array variables always begin with an at sign (@)
- The "a" in "at" is a mnemonic for array
- A list literal is a parenthesized, comma-separated list of scalar values
@days = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday");
- The qw operator builds a list of strings from whitespace-separated words, so neither quotes nor commas are needed
@languages = qw(peaches apples pears);
Accessing Array Elements
- Array elements are referenced through subscripts delimited by brackets ([ and ])
- As in Java, subscripts start at 0
- Because array elements are scalars, you must use a "$" in front of the array variable name when accessing an array element
@days = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday");
$dayOfWeek = $days[1];   # "Tuesday"
Taking a Slice of an Array
- A unique feature is that you can extract more than one array element at a time
- The result is called a slice (quite naturally)
- You specify the indices of the desired elements between brackets
@array[m, n, ...]
- Here is an example:
@fruits = qw(lemons peaches oranges apples limes bananas);
@citrus = @fruits[0,2,4];     # these are equivalent ways ...
@select = (0, 2, 4);
@citrus = @fruits[@select];   # ... of specifying the slice
# either way, @citrus is ("lemons", "oranges", "limes")
List Assignment
- A list literal that contains only scalar variables can be the target of an assignment
($mon) = @days;         # $mon is "Monday"
($mon, $tue) = @days;   # $mon is "Monday", $tue is "Tuesday"
- Negative index numbers (starting with -1) access elements from the end of the array
$lastDay = $days[-1];
Replacing Elements in an Array
- To replace an element in an array, assign a new value to the element
$fruits[4] = "pear";
- You can replace several elements simultaneously by assigning an array with the new elements to a slice of the array
@fruits[0,2] = ("persimmon", "pomegranate");
Length of an Array
- Assigning an array to a scalar assigns it the length of the array
$length = @days;
- $#list returns the index of the last element in the array
- The following two expressions therefore have the same value
scalar(@days)   # the array evaluated in scalar context
$#days + 1      # index of the last element, plus one
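The relationship between the length and the last index can be checked with a short sketch. The `use strict`, `use warnings`, and `my` declarations are additions not used on the slides, and the variable names are only illustrative.

```perl
use strict;
use warnings;

my @days = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday");

my $length    = @days;           # scalar context: number of elements
my $lastIndex = $#days;          # index of the last element
my $explicit  = scalar(@days);   # scalar context forced explicitly

print "length: $length, last index: $lastIndex\n";
print "both expressions agree\n" if $explicit == $lastIndex + 1;

# For an empty array the last index is -1, so the length is still index + 1
my @empty     = ();
my $emptyLast = $#empty;
print "empty array: last index $emptyLast\n";
```

Assigning to a one-element list, as in `($first) = @days`, yields the first element rather than the length, so the parentheses matter.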
Processing All Array Elements
- The foreach loop is used to process the elements of an array
- For example, you can take the square root of all elements in the array with:
@numbers = (1, 4, 9, 16, 25);
foreach $number (@numbers) {
    $number = sqrt($number);
}
# @numbers is now (1, 2, 3, 4, 5)
- The loop variable becomes an alias for the array elements, one at a time (changing $number is in fact changing the values of the array elements)
List Operators
- Perl has four operators for adding new elements on one of the ends of an array, and removing them
- shift and unshift, which remove and add elements at the beginning of an array (pairing shift with push gives a first-in, first-out queue)
- push and pop, which can be used to implement stacks in arrays
Using Arrays as Queues
- Use shift to remove elements from the beginning of an array
@cds = ("Motherland", "Stunt", "Reckless");
$first = shift(@cds);   # "Motherland"
- Use unshift to add elements to the beginning of an array
unshift(@cds, "Zooropa");   # @cds is now ("Zooropa", "Stunt", "Reckless")
Using Arrays as Stacks
- Use push to add elements to the end of an array
@list = ("h1", "table");
push(@list, "tr");   # @list is now ("h1", "table", "tr")
- Use pop to remove elements from the end of an array
$last = pop(@list);   # "tr"
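The four list operators can be compared side by side in a minimal sketch. It assumes the usual pairing of push with shift for a first-in, first-out queue and of push with pop for a stack; the element values are made up for illustration.

```perl
use strict;
use warnings;

# Queue: add at the end with push, remove from the front with shift
my @queue = ();
push(@queue, "first");
push(@queue, "second");
push(@queue, "third");
my $served = shift(@queue);   # "first" leaves, ("second", "third") remain

# Stack: add and remove at the same end with push and pop
my @stack = ();
push(@stack, "html");
push(@stack, "body");
push(@stack, "table");
my $top = pop(@stack);        # "table" leaves, ("html", "body") remain

print "served: $served, queue: @queue\n";
print "top: $top, stack: @stack\n";
```

A stack of open HTML tags like this one is a typical use: the most recently opened tag is the first one that has to be closed.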
Concatenating Arrays
- You can concatenate arrays using the following syntax:
(@first, @second, ...)
- The result is a new array with the elements from all the other arrays
((1, 2, 3), ('a', 'b'))   # same as (1, 2, 3, 'a', 'b')
Splitting Strings into Parts
- The split operator breaks a string into parts, using a specified pattern (a regular expression) as the separator
- split is the opposite of join discussed earlier
- The resulting substrings are placed into an array
$query = "user=foo&password=bar";
@parameters = split /&/, $query;   # ('user=foo', 'password=bar')
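As a sketch of how split is typically applied twice, the query string above can be broken first into parameters and then into names and values. The third argument to split (a limit of 2) is an addition here, so that a value containing "=" stays in one piece.

```perl
use strict;
use warnings;

my $query = "user=foo&password=bar";

# First split: one element per parameter
my @parameters = split /&/, $query;

# Second split: separate each parameter into name and value
foreach my $parameter (@parameters) {
    my ($name, $value) = split /=/, $parameter, 2;
    print "$name is $value\n";
}

# join puts the parts back together with the same separator
my $rebuilt = join("&", @parameters);
print "round trip ok\n" if $rebuilt eq $query;
```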
Sorting Arrays
- Perl provides the sort operator, which returns a sorted copy of a list and leaves the original unchanged
sort @list; # ascending alphabetical order
- To sort in any other order, you need to provide a sort block (i.e. code that compares two elements, which are available in the block as $a and $b)
sort {$b cmp $a} @list;   # descending alphabetical order
sort {$a <=> $b} @list;   # ascending numerical order
sort {$b <=> $a} @list;   # descending numerical order
- Note that the cmp and <=> operators used in the sort blocks, for strings and numbers respectively, return -1 for lt (<), 0 for eq (==), and 1 for gt (>)
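The difference between the default string comparison and a numeric sort block shows up as soon as the numbers have different lengths. The following sketch uses made-up values; `use strict` and `my` are additions to the slide style.

```perl
use strict;
use warnings;

my @numbers = (10, 9, 100, 1);

my @asStrings  = sort @numbers;                 # compared as strings
my @asNumbers  = sort { $a <=> $b } @numbers;   # compared as numbers
my @descending = sort { $b <=> $a } @numbers;   # largest first

print "as strings: @asStrings\n";    # 1 10 100 9
print "as numbers: @asNumbers\n";    # 1 9 10 100
print "descending: @descending\n";   # 100 10 9 1
print "original:   @numbers\n";      # 10 9 100 1 (unchanged)
```

The string sort places 100 before 9 because it compares character by character, which is why numeric data needs the <=> block.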
Complete Example
- Here is a long example illustrating the use of arrays
$index = 0;
while ($name = <STDIN>) {
    chomp $name;
    $names[$index++] = uc($name);   # convert to uppercase
}
print "\nThe sorted list of names is:\n\n";
foreach $name (sort @names) {
    print "$name\n";
}
- Sample input:
Hilda
Betrand
Sophie
Michael
- Output:
The sorted list of names is:
BETRAND
HILDA
MICHAEL
SOPHIE
Hashes
- An associative array, or hash, is a collection of scalars stored as key-value pairs
- Names of hash variables begin with a percent sign (%)
- Hashes are initialized with list literals
- The symbol => can be used between a key and its associated element
%phone = ( Bob => '1234', Alice => '2345', Mary => '3456' );
- Lists can be assigned to hashes
- Elements at even indices (0, 2, ...) become keys, and the element following each key becomes its value
%phone = ('Bob', '1234', 'Alice', '2345', 'Mary', '3456');
Getting and Setting Hash Elements
- Individual hash elements can be accessed using the key inside braces ({ and })
- References to a hash element begin with a $ sign, as they are scalars
$query = $ENV{'QUERY_STRING'};
- It is possible to access multiple hash elements at the same time using similar syntax as used for arrays above
@numbers = @phone{'Bob', 'Alice'};   # ('1234', '2345')
- You can set hash elements by assigning a value to the reference to a hash element
$phone{'Bob'} = '8888'; # Bob's new phone number
Getting the Keys and Values
- The keys operator returns all keys of a hash
@listings = keys %phone;   # ('Bob', 'Alice', 'Mary'), in no guaranteed order
- You can now process the elements in the hash
foreach $key (keys %phone) {
    print "$key ... $phone{$key}\n";
}
- You can also get all the values with the values operator
@numbers = values %phone;   # ('8888', '2345', '3456'), in the same order as keys
Accessing Keys and Values Simultaneously
- Commonly, you may want to access keys and values of a hash at the same time:
- Use the each operator
- Each time each is invoked, it returns the next key-value pair
while (($key, $value) = each %phone) {
    print "<p>The phone number for $key is $value.\n";
}
- Output (the order of the pairs is not guaranteed):
<p>The phone number for Bob is 8888.
<p>The phone number for Alice is 2345.
<p>The phone number for Mary is 3456.
Checking if a Hash Contains a Key
- The exists operator checks if a key is part of a hash
$name = 'Joe';
if (exists $phone{$name}) {
    print "The directory contains a listing for $name.";
} else {
    print "No listing for $name.";
}
Removing Key-Value Pairs
- The delete operator removes a key-value pair from a hash
- It returns the deleted value
$value = delete $phone{'Mary'};
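The interplay of exists and delete can be sketched with the phone directory from above. The `my` declarations and the variable names `$removed`, `$stillListed`, `$remaining`, and `$missing` are illustrative additions.

```perl
use strict;
use warnings;

my %phone = (Bob => '1234', Alice => '2345', Mary => '3456');

# delete returns the value that was stored under the key
my $removed = delete $phone{'Mary'};

# After the delete, the key is no longer part of the hash
my $stillListed = (exists $phone{'Mary'}) ? "yes" : "no";

# keys in scalar context gives the number of remaining pairs
my $remaining = keys %phone;

# Deleting a key that is not in the hash returns undef
my $missing = delete $phone{'Joe'};

print "removed: $removed, still listed: $stillListed, remaining: $remaining\n";
print "nothing to delete for Joe\n" unless defined $missing;
```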
Complete Example
- Assume that we have a file containing descriptions of CDs
- For each CD we list the artist, title, record label, and year of release
- The fields of each data record are separated by colons (":")
Barenaked Ladies:Stunt:Warner:1998
Blue Rodeo:The Days in Between:2000
Great Big Sea:Up:Warner:1995
Natalie Merchant:Motherland:Elektra:2001
Rheostatics:Night of the Shooting Star:Drog:2001
The Grapes of Wrath:Now and Again:Capitol:1989
- The following script reads the records and counts the CDs released in each year, using a hash in which the keys are years
#!/usr/bin/perl
while (<>) {
    chomp;
    $year = (split /:/)[3];   # the year is the fourth field
    $years{$year}++;
}
foreach (sort keys %years) {
    print "In $_, $years{$_} CDs were released.\n";
}
- The resulting output is shown below
In 1989, 1 CDs were released.
In 1995, 1 CDs were released.
In 1998, 1 CDs were released.
In 2000, 1 CDs were released.
In 2001, 2 CDs were released.
References
- Arrays and hashes can only contain scalars
- Problem: How can we construct nested data structures?
- an array of an array (eg a matrix)
- a hash of a hash (eg a table with rows and columns)
- Solution: Create references, which are values that point to variables (or to other values)
- References themselves are scalars, and can therefore appear in an array or hash
References (Syntax)
- Prefix the variable name with "\" to create a reference to it
@fruits = qw(apple orange pear);
$refFruits = \@fruits;
- This can be visualized as a pointer to the variable
- Later, the reference can be used to access the variable by prefixing it with a symbol for the type of value it refers to ("$" for scalars, "@" for arrays, and "%" for hashes)
- Note that there are also subroutine ("&") and file ("*") references
$fruit = $$refFruits[1];   # "orange"
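The prefix rule can be tried out on both an array and a hash reference. This is a minimal sketch; the names `$refPhone`, `@copy`, and the replacement value "lemon" are illustrative, and `use strict` with `my` is an addition to the slide style.

```perl
use strict;
use warnings;

my @fruits    = qw(apple orange pear);
my $refFruits = \@fruits;

my $second = $$refFruits[1];   # one element: "$" prefix, since the element is a scalar
my @copy   = @$refFruits;      # whole array: "@" prefix, makes a copy
my $count  = @$refFruits;      # whole array in scalar context: its length

# Assigning through the reference changes the original array, not the copy
$$refFruits[0] = "lemon";
print "original: @fruits\n";   # lemon orange pear
print "copy:     @copy\n";     # apple orange pear

my %phone    = (Bob => '1234');
my $refPhone = \%phone;
my $number   = $$refPhone{'Bob'};   # one hash element through the reference
my @names    = keys %$refPhone;     # whole hash: "%" prefix
print "Bob: $number\n";
```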
Nested Data Structures
- For example, we want to create a directory of user profiles
- The directory should be realized as a hash, using the user name as its key
- Each user profile is itself represented by a hash
- Create the user profiles, then add references to them to the directory
%profile1 = (name => "Alice", email => "alice\@home.com");
%profile2 = (name => "Bob", email => "bob\@home.com");
$directory{"Alice"} = \%profile1;
$directory{"Bob"} = \%profile2;
Accessing Elements
- Later, we can access profiles like this
$refProfile = $directory{"Alice"};
%profile = %$refProfile;
$email = $profile{"email"};
- This can be expressed more succinctly using a block (as defined earlier)
$email = ${ $directory{"Alice"} }{"email"};
- The expression between the curly braces { and } can be any Perl expression that returns a reference (e.g. a function call)
Accessing Elements (Syntactic Sugar)
- Since this notation can become cumbersome, a simpler syntax is available
$email = $directory{"Alice"}->{"email"};
- Syntax for using the "->" operator:
$refArray->[0] = "Bob";        # Array element
$refHash->{"key"} = "value";   # Hash element
$refCode->(1, 2, 3);           # Subroutine call
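All three forms of the arrow operator can be exercised in one small sketch. The subroutine `add`, the arrays, and the ages are hypothetical and exist only to give the references something to point to.

```perl
use strict;
use warnings;

my @names = ("Alice", "Carol");
my %ages  = (Alice => 30);

# Hypothetical helper: returns the sum of its arguments
sub add {
    my $sum = 0;
    $sum += $_ foreach @_;
    return $sum;
}

my $refArray = \@names;
my $refHash  = \%ages;
my $refCode  = \&add;

$refArray->[1]    = "Bob";           # replaces "Carol" in @names
$refHash->{"Bob"} = 25;              # adds a new pair to %ages
my $total = $refCode->(1, 2, 3);     # calls add(1, 2, 3)

print "$names[1] $ages{Bob} $total\n";   # Bob 25 6
```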
Anonymous Arrays and Hashes
- Above we had to create intermediary variables that weren't used later on
- Instead, we can create references anonymously
- Syntax of an anonymous array:
$refArray = [ element-1, element-2, ... ];
- Syntax of an anonymous hash:
$refHash = { key-1 => "value-1", key-2 => "value-2", ... };
Anonymous Arrays and Hashes (Example)
- This is an example of an anonymous array:
$refArray = [ 1, 2, 3 ];
- And here is an example of an anonymous hash:
$refHash = {
    "Adam"  => "Eve",
    "Clyde" => "Bonnie",
};
Nested Data Structures (Again)
- Using anonymous hashes we can define the directory of user profiles succinctly
%directory = (
    "Alice" => { name => "Alice", email => "alice\@home.com" },
    "Bob"   => { name => "Bob",   email => "bob\@home.com" },
);
Nested Data Structures (Example)
- Suppose you are asked to create a matrix
- Since there is no matrix data type in Perl, you use the following solution
- Use one array to store the rows of the matrix
- Create an anonymous array for each row
@matrix = (
    [1, 2, 3],
    [0, 4, 5],
    [0, 0, 6],
);
- Access matrix elements using the "->" operator:
$matrix[0]->[0] == 1
$matrix[1]->[1] == 4
$matrix[2]->[2] == 6
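A sketch of walking over such a matrix, assuming the array-of-array-references layout described above. The names `$trace` and `$corner` are illustrative, and `use strict` with `my` is an addition to the slide style.

```perl
use strict;
use warnings;

my @matrix = (
    [1, 2, 3],
    [0, 4, 5],
    [0, 0, 6],
);

# Each element of @matrix is a reference to one row
foreach my $row (@matrix) {
    print join(" ", @$row), "\n";
}

# Sum of the diagonal elements
my $trace = 0;
foreach my $i (0 .. $#matrix) {
    $trace += $matrix[$i]->[$i];
}
print "trace: $trace\n";   # 11

# Between two subscripts the arrow may be omitted
my $corner = $matrix[0][2];
print "corner: $corner\n";   # 3
```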
Complete Example
- Given the same set of CD data records as above, we now create a data structure using references; for our reference, here is the data
Barenaked Ladies:Stunt:Warner:1998
Blue Rodeo:The Days in Between:2000
Great Big Sea:Up:Warner:1995
Natalie Merchant:Motherland:Elektra:2001
Rheostatics:Night of the Shooting Star:Drog:2001
The Grapes of Wrath:Now and Again:Capitol:1989
- The following script reads the records into an array of hashes
#!/usr/bin/perl
# parse
@attrs = qw(artist title label year);
while (<>) {
    chomp;
    my %rec;   # creates a new local variable, see section on
               # subroutines for a full discussion
    @rec{@attrs} = split /:/;   # hash slice: one field per attribute
    push @cds, \%rec;
}
# munge
foreach (@cds) {
    $artists{$_->{artist}}++;
}
# output
foreach (sort keys %artists) {
    print "$_: $artists{$_}\n";
}
- This results in a more flexible representation of the data
- All the data is stored in the data structure (not just what was needed for a particular application)
- No assumptions are made on how the data is accessed (whether by year or by artist doesn't matter)
- This is an example of separating parsing from processing or "munging" the data
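To illustrate that the same parsed structure supports a different question, the following sketch groups titles by year instead of counting artists. It is not part of the original script: the records are embedded in a hypothetical `@lines` array (three complete records from the data above) instead of being read from a file, and the hash `%titlesByYear` is an illustrative name. It relies on Perl creating the inner anonymous array automatically the first time a year is pushed to.

```perl
use strict;
use warnings;

# parse (same approach as above, but from an in-memory list of records)
my @attrs = qw(artist title label year);
my @lines = (
    "Natalie Merchant:Motherland:Elektra:2001",
    "Rheostatics:Night of the Shooting Star:Drog:2001",
    "Great Big Sea:Up:Warner:1995",
);

my @cds;
foreach my $line (@lines) {
    my %rec;
    @rec{@attrs} = split /:/, $line;
    push @cds, \%rec;
}

# munge: build a hash of arrays, keyed by year
my %titlesByYear;
foreach my $cd (@cds) {
    push @{ $titlesByYear{ $cd->{year} } }, $cd->{title};
}

# output
foreach my $year (sort keys %titlesByYear) {
    print "$year: ", join(", ", @{ $titlesByYear{$year} }), "\n";
}
```

Only the munge and output steps differ from the artist count; the parsing code is unchanged.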