Posted on Leave a comment

Demonstrating Perl with Tic-Tac-Toe, Part 4

This is the final article to the series demonstrating Perl with Tic-Tac-Toe. This article provides a module that can compute better game moves than the previously presented modules. For fun, the modules chip1.pm through chip3.pm can be incrementally moved out of the hal subdirectory in reverse order. With each chip that is removed, the game will become easier to play. The game must be restarted each time a chip is removed.

An example Perl program

Copy and paste the below code into a plain text file and use the same one-liner that was provided in the the first article of this series to strip the leading numbers. Name the version without the line numbers chip3.pm and move it into the hal subdirectory. Use the version of the game that was provided in the second article so that the below chip will automatically load when placed in the hal subdirectory. Be sure to also include both chip1.pm and chip2.pm from the second and third articles, respectively, in the hal subdirectory.

00 # artificial intelligence chip
01 02 package chip3;
03 require chip2;
04 require chip1;
05 06 use strict;
07 use warnings;
08 09 sub moverama {
10 my $game = shift;
11 my @nums = $game =~ /[1-9]/g;
12 my $rama = qr/[1973]/;
13 my %best;
14 15 for (@nums) {
16 my $ra = $_;
17 next unless $ra =~ $rama;
18 $best{$ra} = 0;
19 for (@nums) {
20 my $ma = $_;
21 next unless $ma =~ $rama;
22 if (($ra-$ma)*(10-$ra-$ma)) {
23 $best{$ra} += 1;
24 }
25 }
26 }
27 28 @nums = sort { $best{$b} <=> $best{$a} } keys %best;
29 30 return $nums[0];
31 }
32 33 sub hal_move {
34 my $game = shift;
35 my $mark = shift;
36 my @mark = @{ shift; };
37 my $move;
38 39 $move = chip2::win_move $game, $mark, \@mark;
40 41 if (not defined $move) {
42 $mark = ($mark eq $mark[0]) ? $mark[1] : $mark[0];
43 $move = chip2::win_move $game, $mark, \@mark;
44 }
45 46 if (not defined $move) {
47 $move = moverama $game;
48 }
49 50 if (not defined $move) {
51 $move = chip1::hal_move $game;
52 }
53 54 return $move;
55 }
56 57 sub complain {
58 print 'Just what do you think you\'re doing, ',
59 ((getpwnam($ENV{'USER'}))[6]||$ENV{'USER'}) =~ s! .*!!r, "?\n";
60 }
61 62 sub import {
63 no strict;
64 no warnings;
65 66 my $p = __PACKAGE__;
67 my $c = caller;
68 69 *{ $c . '::hal_move' } = \&{ $p . '::hal_move' };
70 *{ $c . '::complain' } = \&{ $p . '::complain' };
71 72 if (&::MARKS->[0] ne &::HAL9K) {
73 @{ &::MARKS } = reverse @{ &::MARKS };
74 }
75 }
76 77 1;

How it works

Rather than making a random move or making a move based on probability, this final module to the Perl Tic-Tac-Toe game uses a more deterministic algorithm to calculate the best move.

The big takeaway from this Perl module is that it is yet another example of how references can be misused or abused, and as a consequence lead to unexpected program behavior. With the addition of this chip, the computer learns to cheat. Can you figure out how it is cheating? Hints:

  1. Constants are implemented as subroutines.
  2. References allow data to be modified out of scope.

Final notes

Line 12 demonstrates that a regular expression can be pre-compiled and stored in a scalar for later use. This is useful as performance optimization when you intend to re-use the same regular expression many times over.

Line 59 demonstrates that some system library calls are available directly in Perl’s built-in core functionality. Using the built-in functions alleviates some overhead that would otherwise be required to launch an external program and setup the I/O channels to communicate with it.

Lines 72 and 73 demonstrate the use of &:: as a shorthand for &main::.

The full source code for this Perl game can be cloned from the git repository available here: https://pagure.io/tic-tac-toe.git

Posted on Leave a comment

Demonstrating Perl with Tic-Tac-Toe, Part 3

The articles in this series have mainly focused on Perl’s ability to manipulate text. Perl was designed to manipulate and analyze text. But Perl is capable of much more. More complex problems often require working with sets of data objects and indexing and comparing them in elaborate ways to compute some desired result.

For working with sets of data objects, Perl provides arrays and hashes. Hashes are also known as associative arrays or dictionaries. This article will prefer the term hash because it is shorter.

The remainder of this article builds on the previous articles in this series by demonstrating basic use of arrays and hashes in Perl.

An example Perl program

Copy and paste the below code into a plain text file and use the same one-liner that was provided in the the first article of this series to strip the leading numbers. Name the version without the line numbers chip2.pm and move it into the hal subdirectory. Use the version of the game that was provided in the second article so that the below chip will automatically load when placed in the hal subdirectory.

00 # advanced operations chip
01 02 package chip2;
03 require chip1;
04 05 use strict;
06 use warnings;
07 08 use constant SCORE=>'
09 ┌───┬───┬───┐
10 │ 3 │ 2 │ 3 │
11 ├───┼───┼───┤
12 │ 2 │ 4 │ 2 │
13 ├───┼───┼───┤
14 │ 3 │ 2 │ 3 │
15 └───┴───┴───┘
16 ';
17 18 sub get_prob {
19 my $game = shift;
20 my @nums;
21 my %odds;
22 23 while ($game =~ /[1-9]/g) {
24 $odds{$&} = substr(SCORE, $-[0], 1);
25 }
26 27 @nums = sort { $odds{$b} <=> $odds{$a} } keys %odds;
28 29 return $nums[0];
30 }
31 32 sub win_move {
33 my $game = shift;
34 my $mark = shift;
35 my $tkns = shift;
36 my @nums = $game =~ /[1-9]/g;
37 my $move;
38 39 TRY: for (@nums) {
40 my $num = $_;
41 my $try = $game =~ s/$num/$mark/r;
42 my $vic = chip1::get_victor $try, $tkns;
43 44 if (defined $vic) {
45 $move = $num;
46 last TRY;
47 }
48 }
49 50 return $move;
51 }
52 53 sub hal_move {
54 my $game = shift;
55 my $mark = shift;
56 my @mark = @{ shift; };
57 my $move;
58 59 $move = win_move $game, $mark, \@mark;
60 61 if (not defined $move) {
62 $mark = ($mark eq $mark[0]) ? $mark[1] : $mark[0];
63 $move = win_move $game, $mark, \@mark;
64 }
65 66 if (not defined $move) {
67 $move = get_prob $game;
68 }
69 70 return $move;
71 }
72 73 sub complain {
74 print "My mind is going. I can feel it.\n";
75 }
76 77 sub import {
78 no strict;
79 no warnings;
80 81 my $p = __PACKAGE__;
82 my $c = caller;
83 84 *{ $c . '::hal_move' } = \&{ $p . '::hal_move' };
85 *{ $c . '::complain' } = \&{ $p . '::complain' };
86 }
87 88 1;

How it works

In the above example Perl module, each position on the Tic-Tac-Toe board is assigned a score based on the number of winning combinations that intersect it. The center square is crossed by four winning combinations – one horizontal, one vertical, and two diagonal. The corner squares each intersect one horizontal, one vertical, and one diagonal combination. The side squares each intersect one horizontal and one vertical combination.

The get_prob subroutine creates a hash named odds (line 21) and uses it to map the numbers on the current game board to their score (line 24). The keys of the hash are then sorted by their score and the resulting list is copied to the nums array (line 27). The get_prob subroutine then returns the first element of the nums array ($nums[0]) which is the number from the original game board that has the highest score.

The algorithm described above is an example of what is called a heuristic in artificial intelligence programming. With the addition of this module, the Tic-Tac-Toe game can be considered a very rudimentary artificial intelligence program. It is really just playing the odds though and it is quite beatable. The next module (chip3.pm) will provide an algorithm that actually calculates the best possible move based on the opponent’s counter moves.

The win_move subroutine simply tries placing the provided mark in each available position and passing the resulting game board to chip1’s get_victor subroutine to see if it contains a winning combination. Notice that the r flag is being passed to the substitution operation (s/$num/$mark/r) on line 41 so that, rather than modifying the original game board, a new copy of the board containing the substitution is created and returned.

Arrays

It was mentioned in part one that arrays are variables whose names are prefixed with an at symbol (@) when they are created. In Perl, these prefixed symbols are called sigils.

Context

In Perl, many things return a different value depending on the context in which they are accessed. The two contexts to be aware of are called scalar context and list context. In the following example, $value1 and $value2 are different because @nums is accessed first in scalar context and then in list context.

$value1 = @nums;
($value2) = @nums;

In the above example, it might seem like @nums should return the same value each time it is accessed, but it doesn’t because what is accessing it (the context) is different. $value1 is a scalar, so it receives the scalar value of @nums which is its length. ($value2) is a list, so it receives the list value of @nums. In the above example, $value2 will receive the value of the first element of the nums array.

In part one, the below statement from the get_mark subroutine copied the numbers from the current Tic-Tac-Toe board into an array named nums.

@nums = $game =~ /[1-9]/g

Since the nums array in the above statement receives one copy of each board number in each of its elements, the count of the board numbers is equal to the length of the array. In Perl, the length of an array is obtained by accessing it in scalar context.

Next, the following formula was used to compute which mark should be placed on the Tic-Tac-Toe board in the next turn.

$indx = (@nums+1) % 2;

Because the plus operator requires a single value (a scalar) on its left hand side, not a list of values, the nums array evaluates to its length, not the list of its values. The parenthesis, in the above example, are just being used to set the order of operations so that the addition (+) will happen before the modulo (%).

Copying

In Perl you can create a list for immediate use by surrounding the list values with parenthesis and separating them with commas. The following example creates a three-element list and copies its values to an array.

@nums = (4, 5, 6);

As long as the elements of the list are variables and not constants, you can also copy the elements of an array to a list:

($four, $five, $six) = @nums;

If there were more elements in the array than the list in the above example, the extra elements would simply be discarded.

Different from lists in scalar context

Be aware that lists and arrays are different things in Perl. A list accessed in scalar context returns its last value, not its length. In the following example, $value3 receives 3 (the length of @nums) while $value4 receives 6 (the last element of the list).

$value3 = @nums;
$value4 = (4, 5, 6);

Indexing

To access an individual element of an array or list, suffix it with the desired index in square brackets as shown on line 29 of the above example Perl module.

Notice that the nums array on line 29 is prefixed with the dollar sigil ($) rather than the at sigil (@). This is done because the get_prob subroutine is supposed to return a single value, not a list. If @nums[0] were used instead of $nums[0], the subroutine would return a one-element list. Since a list evaluates to its last element in scalar context, this program would probably work if I had used @nums[0], but if you mean to retrieve a single element from an array, be sure to use the dollar sigil ($), not the at sigil (@).

It is possible to retrieve a subset from an array (or a list) rather than just one value in which case you would use the at sigil and you would provide a series of indexes or a range instead of a single index. This is what is known in Perl as a list slice.

Hashes

Hashes are variables whose names are prefixed with the percent sigil (%) when they are created. They are subscripted with curly brackets ({}) when accessing individual elements or subsets of elements (hash slices). Like arrays, hashes are variables that can hold multiple discrete data elements. They differ from arrays in the following ways:

  1. Hashes are indexed by strings (or anything that can be converted to a string), not numbers.
  2. Hashes are unordered. If you retrieve a list of their keys, values or key-value pairs, the order of the listing will be random.
  3. The number of elements in the hash will be equal to the number of keys that have been assigned values. If a value is assigned to index 99 of an array that has only three elements (indexes 0-2), the array will grow to a length of 100 elements (indexes 0-99). If a value is assigned to a new key in a hash that has only three elements, the hash will grow by only one element.

As with arrays, if you mean to access (or assign to) a single element of a hash, you should prefix it with the dollar sigil ($). When accessing a single element, Perl will go by the type of the subscript to determine the type of variable being accessed – curly brackets ({}) for hashes or square brackets ([]) for arrays. The get_prob subroutine in the above Perl module demonstrates assigning to and accessing individual elements of a hash.

Perl has two special built-in functions for working with hashes – keys and values. The keys function, when provided a hash, returns a list of all the hash’s keys (indexes). Similarly, the values function will return a list of all the hash’s values. Remember though that the order in which the list is returned is random. This randomness can be seen when playing the Tic-Tac-Toe game. If there is more than one move available with the highest score, the computer will chose one at random because the keys function returns the available moves from the odds hash in random order.

On line 27 of the above example Perl module, the keys function is being used to retrieve the list of keys from the odds hash. The keys of the odds hash are the numbers that were found on the current game board. The values of the odds hash are the corresponding probabilities that were retrieved from the SCORE constant on line 24.

Admittedly, this example could have used an array instead of a string to store and retrieve the scores. I chose to use a string simply because I think it presents the layout of the board a little nicer. An array would likely perform better, but with such a small data set, the difference is probably too small to measure.

Sort

On line 27, the list of keys from the odds hash is being feed to Perl’s built-in sort function. Beware that Perl’s sort function sorts lexicographically by default, not numerically. For example, provided the list (10, 9, 8, 1), Perl’s sort function will return the list (1, 10, 8, 9).

The behavior of Perl’s sort function can be modified by providing it a code block as its first parameter as demonstrated on line 27. The result of the last statement in the code block should be a number less-than, equal-to, or greater-than zero depending on whether element $a should be placed before, concurrent-with, or after element $b in the resulting list respectively. $a and $b are pairs of elements from the provided list. The code in the block is executed repeatedly with $a and $b set to different pairs of elements from the original list until all the pairs have been compared and sorted.

The <=> operator is a special Perl operator that returns -1, 0, or 1 depending on whether the left argument is numerically less-than, equal-to, or greater-than the right argument respectively. By using the <=> operator in the code block of the sort function, Perl’s sort function can be made to sort numerically rather than lexicographically.

Notice that rather than comparing $a and $b directly, they are first being passed through the odds hash. Since the values of the odds hash are the probabilities that were retrieved from the SCORE constant, what is being compared is actually the score of $a versus the score of $b. Consequently, the numbers from the original game board are being sorted by their score, not their value. Numbers with an equal score are left in the same random order that the keys function returned them.

Notice also that I have reversed the typical order of the parameters to <=> in the code block of the sort function ($b on the left and $a on the right). By switching their order in this way, I have caused the sort function to return the elements in reverse order – from greatest to least – so that the number(s) with the highest score will be first in the list.

References

References provide an indirect means of accessing a variable. They are often used when making copies of the variable is either undesirable or impractical. References are a sort of short cut that allows you to skip performing the copy and instead provide access to the original variable.

Why to use references

There is a cost in time and memory associated with making copies of variables. References are sometimes used as a means of reducing that cost. Be aware, however, that recent versions of Perl implement a technology called copy-on-write that greatly reduces the cost of copying variables. This new optimization should work transparently. You don’t have to do anything special to enable the copy-on-write optimization.

Why not to use references

References violate the action-at-a-distance principle that was mentioned in part one of this series. References are just as bad as global variables in terms of their tendency to trip up programmers by allowing data to be modified outside the local scope. You should generally try to avoid using references. But there are times when they are necessary.

How to create references

An example of passing a reference is provided on line 59 of the above Perl module. Rather than placing the mark array directly in the list of parameters to the win_move subroutine, a reference to the array is provided instead by prefixing the variable’s sigil with a backslash (\).

It is necessary to use a reference (\@mark) on line 59 because if the array were placed directly on the list, it would expand such that the first element of the mark array would become the third parameter to the win_move function, the second element of the mark array would become the fourth parameter to the win_move function, and so on for as many elements as the mark array has. Whereas an array will expand in list context, a reference will not. If the array were passed in expanded form, the receiving subroutine would need to call shift once for each element of the array. Also, the receiving function would not be able to tell how long the original array was.

Three ways to dereference references

In the receiving subroutine, the reference has to be dereferenced to get at the original values. An example of dereferencing an array reference is provided on line 56. On line 56, the shift statement has been enclosed in curly brackets and the opening bracket has been prefixed with the array sigil (@).

There is also a shorter form for dereferencing an array reference that is demonstrated on line 43 of the chip1.pm module. The short form allows you to omit the curly brackets and instead place the array sigil directly in front of the sigil of the scalar that holds the array reference. The short form only works when you have an array reference stored in a scalar. When the array reference is coming from a function, as it is on line 56 of the above Perl module, the long form must be used.

There is yet a third way of dereferencing an array reference that is demonstrated on line 29 of the game script. Line 29 shows the MARKS array reference being dereferenced with the arrow operator (->) and an index enclosed in square brackets. The MARKS array reference is missing its sigil because it is a constant. You can tell that what is being dereferenced is an array reference because the arrow operator is followed by square brackets ([]). Had the MARKS constant been a hash reference, the arrow operator would have been followed by curly brackets ({}).

There are also corresponding long and short forms for dereferencing hash references that use the hash sigil (%) instead of the array sigil. Note also that hashes, just like arrays, need to be passed by reference to subroutines unless you want them to expand into their constituent elements. The latter is sometimes done in Perl as a clever way of emulating named parameters.

A word of caution about references

It was stated earlier that references allow data to be modified outside of their declared scope and, just as with global variables, this non-local manipulation of the data can be confusing to the programmer(s) and thereby lead to unintended bugs. This is an important point to emphasize and explain.

On line 35 of the win_move subroutine, you can see that I did not dereference the provided array reference (\@mark) but rather I chose to store the reference in a scalar named tkns. I did this because I do not need to access the individual elements of the provided array in the win_move subroutine. I only need to pass the reference on to the get_victor subroutine. Not making a local copy of the array is a short cut, but it is dangerous. Because $tkns is only a copy of the reference, not a copy of the original data being referred to, if I or a later program developer were to write something like $tkns->[0] = ‘Y’ in the win_move subroutine, it would actually modify the value of the mark array in the hal_move subroutine. By passing a reference to its mark array (\@mark) to the win_move subroutine, the hal_move subroutine has granted access to modify its local copy of @mark. In this case, it would probably be better to make a local copy of the mark array in the win_move subroutine using syntax similar to what is shown on line 56 rather than preserving the reference as I have done for the purpose of demonstration on line 35.

Aliases

In addition to references, there is another way that a local variable created with the my or state keyword can leak into the scope of a called subroutine. The list of parameters that you provide to a subroutine is directly accessible in the @_ array.

To demonstrate, the following example script prints b, not a, because the inc subroutine accesses the first element of @_ directly rather than first making a local copy of the parameter.

#!/usr/bin/perl sub inc { $_[0]++;
} MAIN: { my $var = 'a'; inc $var; print "$var\n";
}

Aliases are different from references in that you don’t have to dereference them to get at their values. They really are just alternative names for the same variable. Be aware that aliases occur in a few other places as well. One such place is the list returned from the sort function – if you were to modify an element of the returned list directly, without first copying it to another variable, you would actually be modifying the element in the original list that was provided to the sort function. Other places where aliases occur include the code blocks of functions like grep and map. The grep and map functions are not covered in this series of articles. See the provided links if you want to know more about them.

Final notes

Many of Perl’s built-in functions will operate on the default scalar ($_) or default array (@_) if they are not explicitly provided a variable to read from or write to. Line 40 of the above Perl module provides an example. The numbers from the nums array are sequentially aliased to $_ by the for keyword. If you chose to use these variables, in most cases you will probably want to retrieve your data from $_ or @_ fairly quickly to prevent it being accidentally overwritten by a subsequent command.

The substitution command (s/…/…/), for example, will manipulate the data stored in $_ if it is not explicitly bound to another variable by one of the =~ or !~ operators. Likewise, the shift function operates on @_ (or @ARGV if called in the global scope) if it is not explicitly provided an array to operate on. There is no obvious rule to which functions support this shortcut. You will have to consult the documentation for the command you are interested in to see if it will operate on a default variable when not provided one explicitly.

As demonstrated on lines 55 and 56, the same name can be reused for variables of different types. Reusing variable names generally makes the code harder to follow. It is probably better for the sake of readability to avoid variable name reuse.

Beware that making copies of arrays or hashes in Perl (as demonstrated on line 56) is shallow by default. If any of the elements of the array or hash are references, the corresponding elements in the duplicated array or hash will be references to the same original data. To make deep copies of data structures, use one of the Clone or Storable Perl modules. An alternative workaround that may work in the case of multi-dimensional arrays is to emulate them with a one-dimensional hash.

Similar in form to Perl’s syntax for creating lists – (1, 2, 3) – unnamed array references and unnamed hash references can be constructed on the fly by bounding a comma-separated set of elements in square brackets ([]) or curly brackets ({}) respectively. Line 07 of the game script demonstrates an unnamed (anonymous) array reference being constructed and assigned to the MARKS constant.

Notice that the import subroutine at the end of the above Perl module (chip2.pm) is assigning to some of the same names in the calling namespace as the previous module (chip1.pm). This is intentional. The hal_move and complain aliases created by chip1’s import subroutine will simply be overridden by the identically named aliases created by chip2’s import subroutine (assuming chip2.pm is loaded after chip1.pm in the calling namespace). Only the aliases are updated/overridden. The original subroutines from chip1 will still exist and can still be called with their full names – chip1::hal_move and chip1::complain.

Posted on Leave a comment

Demonstrating PERL with Tic-Tac-Toe, Part 2

The astute observer may have noticed that PERL is misspelled. In a March 1, 1999 interview with Linux Journal, Larry Wall explained that he originally intended to include the letter “A” from the word “And” in the title “Practical Extraction And Report Language” such that the acronym would correctly spell the word PEARL. However, before he released PERL, Larry heard that another programming language had already taken that name. To resolve the name collision, he dropped the “A”. The acronym is still valid because title case and acronyms allow articles, short prepositions and conjunctions to be omitted (compare for example the acronym LASER).

Name collisions happen when distinct commands or variables with the same name are merged into a single namespace. Because Unix commands share a common namespace, two commands cannot have the same name.

The same problem exists for the names of global variables and subroutines within programs written in languages like PERL. This is an especially significant problem when programmers try to collaborate on large software projects or otherwise incorporate code written by other programmers into their own code base.

Starting with version 5, PERL supports packages. Packages allow PERL code to be modularized with unique namespaces so that the global variables and functions of the modularized code will not collide with the variables and functions of another script or module.

Shortly after its release, PERL5 software developers all over the world began writing software modules to extend PERL’s core functionality. Because many of those developers (currently about 15,000) have made their work freely available on the Comprehensive Perl Archive Network (CPAN), you can easily extend the functionality of PERL on your PC so that you can perform very advanced and complex tasks with just a few commands.

The remainder of this article builds on the previous article in this series by demonstrating how to install, use and create PERL modules on Fedora Linux.

An example PERL program

See the example program from the previous article below, with a few lines of code added to import and use some modules named chip1, chip2 and chip3. It is written in such a way that the program should work even if the chip modules cannot be found. Future articles in this series will build on the below script by adding the additional modules named chip2 and chip3.

You should be able to copy and paste the below code into a plain text file and use the same one-liner that was provided in the previous article to strip the leading numbers.

00 #!/usr/bin/perl
01 02 use strict;
03 use warnings;
04 05 use feature 'state';
06 07 use constant MARKS=>[ 'X', 'O' ];
08 use constant HAL9K=>'O';
09 use constant BOARD=>'
10 ┌───┬───┬───┐
11 │ 1 │ 2 │ 3 │
12 ├───┼───┼───┤
13 │ 4 │ 5 │ 6 │
14 ├───┼───┼───┤
15 │ 7 │ 8 │ 9 │
16 └───┴───┴───┘
17 ';
18 19 use lib 'hal';
20 use if -e 'hal/chip1.pm', 'chip1';
21 use if -e 'hal/chip2.pm', 'chip2';
22 use if -e 'hal/chip3.pm', 'chip3';
23 24 sub get_mark {
25 my $game = shift;
26 my @nums = $game =~ /[1-9]/g;
27 my $indx = (@nums+1) % 2;
28 29 return MARKS->[$indx];
30 }
31 32 sub put_mark {
33 my $game = shift;
34 my $mark = shift;
35 my $move = shift;
36 37 $game =~ s/$move/$mark/;
38 39 return $game;
40 }
41 42 sub get_move {
43 return (<> =~ /^[1-9]$/) ? $& : '0';
44 }
45 46 PROMPT: {
47 no strict;
48 no warnings;
49 50 state $game = BOARD;
51 52 my $mark;
53 my $move;
54 55 print $game;
56 57 if (defined &get_victor) {
58 my $victor = get_victor $game, MARKS;
59 if (defined $victor) {
60 print "$victor wins!\n";
61 complain if ($victor ne HAL9K);
62 last PROMPT;
63 }
64 }
65 66 last PROMPT if ($game !~ /[1-9]/);
67 68 $mark = get_mark $game;
69 print "$mark\'s move?: ";
70 71 if ($mark eq HAL9K and defined &hal_move) {
72 $move = hal_move $game, $mark, MARKS;
73 print "$move\n";
74 } else {
75 $move = get_move;
76 }
77 $game = put_mark $game, $mark, $move;
78 79 redo PROMPT;
80 }

Once you have the above code downloaded and working, create a subdirectory named hal under the same directory that you put the above program. Then copy and paste the below code into a plain text file and use the same procedure to strip the leading numbers. Name the version without the line numbers chip1.pm and move it into the hal subdirectory.

00 # basic operations chip
01 02 package chip1;
03 04 use strict;
05 use warnings;
06 07 use constant MAGIC=>'
08 ┌───┬───┬───┐
09 │ 2 │ 9 │ 4 │
10 ├───┼───┼───┤
11 │ 7 │ 5 │ 3 │
12 ├───┼───┼───┤
13 │ 6 │ 1 │ 8 │
14 └───┴───┴───┘
15 ';
16 17 use List::Util 'sum';
18 use Algorithm::Combinatorics 'combinations';
19 20 sub get_moves {
21 my $game = shift;
22 my $mark = shift;
23 my @nums;
24 25 while ($game =~ /$mark/g) {
26 push @nums, substr(MAGIC, $-[0], 1);
27 }
28 29 return @nums;
30 }
31 32 sub get_victor {
33 my $game = shift;
34 my $marks = shift;
35 my $victor;
36 37 TEST: for (@$marks) {
38 my $mark = $_;
39 my @nums = get_moves $game, $mark;
40 41 next unless @nums >= 3;
42 for (combinations(\@nums, 3)) {
43 my @comb = @$_;
44 if (sum(@comb) == 15) {
45 $victor = $mark;
46 last TEST;
47 }
48 }
49 }
50 51 return $victor;
52 }
53 54 sub hal_move {
55 my $game = shift;
56 my @nums = $game =~ /[1-9]/g;
57 my $rand = int rand @nums;
58 59 return $nums[$rand];
60 }
61 62 sub complain {
63 print "Daisy, Daisy, give me your answer do.\n";
64 }
65 66 sub import {
67 no strict;
68 no warnings;
69 70 my $p = __PACKAGE__;
71 my $c = caller;
72 73 *{ $c . '::get_victor' } = \&{ $p . '::get_victor' };
74 *{ $c . '::hal_move' } = \&{ $p . '::hal_move' };
75 *{ $c . '::complain' } = \&{ $p . '::complain' };
76 }
77 78 1;

The first thing that you will probably notice when you try to run the program with chip1.pm in place is an error message like the following (emphasis added):

$ Can't locate Algorithm/Combinatorics.pm in @INC (you may need to install the Algorithm::Combinatorics module) (@INC contains: hal /usr/local/lib64/perl5/5.30 /usr/local/share/perl5/5.30 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5) at hal/chip1.pm line 17.
BEGIN failed--compilation aborted at hal/chip1.pm line 17.
Compilation failed in require at /usr/share/perl5/if.pm line 15.
BEGIN failed--compilation aborted at game line 18.

When you see an error like the one above, just use the dnf command to search Fedora’s package repository for the name of the system package that provides the needed PERL module as shown below. Note that the module name and path from the above error message have been prefixed with */ and then surrounded with single quotes.

$ dnf provides '*/Algorithm/Combinatorics.pm'
...
perl-Algorithm-Combinatorics-0.27-17.fc31.x86_64 : Efficient generation of combinatorial sequences
Repo : fedora
Matched from:
Filename : /usr/lib64/perl5/vendor_perl/Algorithm/Combinatorics.pm

Hopefully it will find the needed package which you can then install:

$ sudo dnf install perl-Algorithm-Combinatorics

Once you have all the needed modules installed, the program should work.

How it works

This example is admittedly quite contrived. Nothing about Tic-Tac-Toe is complex enough to need a CPAN module. To demonstrate installing and using a non-standard module, the above program uses the combinations library routine from the Algorithm::Combinatorics module to generate a list of the possible combinations of three numbers from the provided set. Because the board numbers have been mapped to a 3×3 magic square, any set of three numbers that sum to 15 will be aligned on a column, row or diagonal and will therefore be a winning combination.

Modules are imported into a program with the use and require commands. The only difference between them is that the use command automatically calls the import subroutine (if one exists) in the module being imported. The require command does not automatically call any subroutines.

Modules are just files with a .pm extension that contain PERL subroutines and variables. They begin with the package command and end with 1;. But otherwise, they look like any other PERL script. The file name should match the package name. Package and file names are case sensitive.

Beware that when you are reading online documentation about PERL modules, the documentation often veers off into topics about classes. Classes are built on modules, but a simple module does not have to adhere to all the restrictions that apply to classes. When you start seeing words like method, inheritance and polymorphism, you are reading about classes, not modules.

There are two subroutine names that are reserved for special use in modules. They are import and unimport and they are called by the use and no directives respectively.

The purpose of the import and unimport subroutines is typically to alias and unalias the module’s subroutines in and out of the calling namespace respectively. For example, line 17 of chip1.pm shows the sum subroutine being imported from the List::Util module.

The constant module, as used on lines 07 of chip1.pm, is also altering the caller’s namespace (chip1), but rather than importing a predefined subroutine, it is creating a special type of variable.

All the identifiers immediately following the use keywords in the above examples are modules. On my system, many of them can be found under the /usr/share/perl5 directory.

Notice that the above error message states “@INC contains:” followed by a list of directories. INC is a special PERL variable that lists, in order, the directories from which modules should be loaded. The first file found with a matching name will be used.

As demonstrated on line 19 of the Tic-Tac-Toe game, the lib module can be used to update the list of directories in the INC variable.

The chip1 module above provides an example of a very simple import subroutine. In most cases you will want to use the import subroutine that is provided by the Exporter module rather than implementing your own. A custom import subroutine is used in the above example to demonstrate the basics of what it does. Also, the custom implementation makes it easy to override the subroutine definitions in later examples.

The import subroutine shown above reveals some of the hidden magic that makes packages work. All variables that are both globally scoped (that is, created outside of any pair of curly brackets) and dynamically scoped (that is, not prefixed with the keywords my or state) and all global subroutines are automatically prefixed with a package name. The default package name if no package command has been issued is main.

By default, the current package is assumed when an unqualified variable or subroutine is used. When get_move is called from the PROMPT block in the above example, main::get_move is assumed because the PROMPT block exists in the main package. Likewise, when get_moves is called from the get_victor subroutine, chip1::get_moves is assumed because get_victor exists in the chip1 package.

If you want to access a variable or subroutine that exists in a different package, you either have to use its fully qualified name or create a local alias that refers to the desired subroutine.

The import subroutine shown above demonstrates how to create subroutine aliases that refer to subroutines in other packages. On lines 73-75, the fully qualified names for the subroutines are being constructed and then the symbol table name for the subroutine in the calling namespace (the package in which the use statement is being executed) is being assigned the reference of the subroutine in the local package (the package in which the import subroutine is defined).

Notice that subroutines, like variables, have sigils. The sigil for subroutines is the ampersand symbol (&). In most contexts, the sigil for subroutines is optional. When working with references (as shown on lines 73-75 of the import subroutine) and when checking if a subroutine is defined (as shown on lines 57 and 71 of the PROMPT block), the sigil for subroutines is required.

The import subroutine shown above is just a bare minimum example. There is a lot that it doesn’t do. In particular, a proper import subroutine would not automatically import any subroutines or variables. Normally, the user would be expected to provide a list of the routines to be imported on the use line and that list is available to the import subroutine in the @_ array.

Final notes

Lines 25-27 of chip1.pm provide a good example of PERL’s dense notation problem. With just a couple of lines code, the board numbers on which a given mark has been placed can be determined. But does the statement within the conditional clause of the while loop perform the search from the beginning of the game variable on each iteration? Or does it continue from where it left off each time? PERL correctly guesses that I want it to provide the position ($-[0]) of the next mark, if any exits, on each iteration. But exactly what it will do can be very difficult to determine just by looking at the code.

The last things of note in the above examples are the strict and warnings directives. They enable extra compile-time and runtime debugging messages respectively. Many PERL programmers recommend always including these directives so that programming errors are more likely to be spotted. The downside of having them enabled is that some complex code will sometimes cause the debugger to erroneously generate unwanted output. Consequently, the strict and/or warnings directives may need to be disabled in some code blocks to get your program to run correctly as demonstrated on lines 67 and 68 of the example chip1 module. The strict and warnings directives have nothing to do with the program and they can be omitted. Their only purpose is to provide feedback to the program developer.

Posted on Leave a comment

Demonstrating PERL with Tic-Tac-Toe, Part 1

Larry Wall’s Practical Extraction and Reporting Language (PERL) was originally developed in 1987 as a general-purpose Unix scripting language that borrowed features from C, sh, awk, sed, BASIC, and LISP. In the late 1990s, before PHP became more popular, PERL was commonly used for CGI scripting. PERL is still the go-to tool for many sysadmins who need something more powerful than sed or awk when writing complex parsing and automation scripts. It has a somewhat high learning curve due to its dense notation. But a recent survey indicates that PERL developers earn 54 per cent more than the average developer. So it may still be a worthwhile language to learn.

PERL is far too complex to cover in any significant detail in this magazine. But this short series of articles will attempt to demonstrate a few of the most basic features of the language so that you can get a sense of what the language is like and the kind of things it can do.

An example PERL program

PERL was originally a language optimized for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information. To demonstrate how this core feature of PERL works, a very simple Tic-Tac-Toe game is provided below. The below program scans a textual representation of a Tic-Tac-Toe board, extracts and manipulates the numbers on the board, and prints the modified result to the console.

00 #!/usr/bin/perl
01 02 use feature 'state';
03 04 use constant MARKS=>[ 'X', 'O' ];
05 use constant BOARD=>'
06 ┌───┬───┬───┐
07 │ 1 │ 2 │ 3 │
08 ├───┼───┼───┤
09 │ 4 │ 5 │ 6 │
10 ├───┼───┼───┤
11 │ 7 │ 8 │ 9 │
12 └───┴───┴───┘
13 ';
14 15 sub get_mark {
16 my $game = shift;
17 my @nums = $game =~ /[1-9]/g;
18 my $indx = (@nums+1) % 2;
19 20 return MARKS->[$indx];
21 }
22 23 sub put_mark {
24 my $game = shift;
25 my $mark = shift;
26 my $move = shift;
27 28 $game =~ s/$move/$mark/;
29 30 return $game;
31 }
32 33 sub get_move {
34 return (<> =~ /^[1-9]$/) ? $& : '0';
35 }
36 37 PROMPT: {
38 state $game = BOARD;
39 40 my $mark;
41 my $move;
42 43 print $game;
44 45 last PROMPT if ($game !~ /[1-9]/);
46 47 $mark = get_mark $game;
48 print "$mark\'s move?: ";
49 50 $move = get_move;
51 $game = put_mark $game, $mark, $move;
52 53 redo PROMPT;
54 }

To try out the above program on your PC, you can copy-and-paste the above text into a plain text file and save and run it. The line numbers will have to be removed before the program will work. Of course, the command that one uses to perform that sort of textual extraction and reporting is perl.

Assuming that you have saved the above text to a file named game.txt, the following command can be used to strip the leading numbers from all the lines and write the modified version to a new file named game:

$ cat game.txt | perl -npe 's/...//' > game

The above command is a very small PERL script and it is an example of what is called a one-liner.

Now that the line numbers have been removed, the program can be run by entering the following command:

$ perl game

How it works

PERL is a procedural programming language. A program written in PERL consists of a series of commands that are executed sequentially. With few exceptions, most commands alter the state of the computer’s memory in some way.

Line 00 in the Tic-Tac-Toe program isn’t technically part of the PERL program and it can be omitted. It is called a shebang (the letter e is pronounced soft as it is in the word shell). The purpose of the shebang line is to tell the operating system what interpreter the remaining text should be processed with if one isn’t specified on the command line.

Line 02 isn’t strictly necessary for this program either. It makes available an advanced command named state. The state command creates a variable that can retain its value after it has gone out of scope. I’m using it here as a way to avoid declaring a global variable. It is considered good practice in computer programming to avoid using global variables where possible because they allow for action at a distance. If you didn’t follow all of that, don’t worry about it. It’s not important at this point.

PERL scopes, blocks and subroutines

Scope is a very important concept that one needs to be familiar with when reading and writing procedural programs. In PERL, scope is often delineated by a pair of curly brackets. Within the global scope, the above Tic-Tac-Toe program defines four sub-scopes on lines 15-21, 23-31, 33-35 and 37-54. The first three scopes are prefixed with subroutine declarations and the last scope is prefixed with the label PROMPT.

Scopes serve multiple purposes in programming languages. One purpose of a scope is to group a set of commands together as a unit so that they can be called repeatedly with a single command rather than having to repeat several lines of code each time in the program. Another purpose is to enhance the readability of the program by denoting a restricted area where the value of a variable can be updated.

Within the scope that is labeled PROMPT and defined on lines 37-54 of the above Tic-Tac-Toe program, a variable named mark is created using the my keyword (line 40). After it is created, it is assigned a value by calling the get_mark subroutine (line 47). Later, the put_mark subroutine is called (line 51) to change the value in the square that was chosen by the get_move subroutine on line 50.

Hopefully it is obvious that the mark that put_mark is setting is meant to be the same mark that get_mark retrieved earlier. As a programmer though, how do I know that the value of mark wasn’t changed when the get_move subroutine was called? This example program is small enough that every line can be examined to make that determination. But most programs are much larger than this example and having to know exactly what is going on at all points in the program’s execution can be overwhelming and error-prone. Because mark was created with the my keyword, its value can only be accessed and modified within the scope that it was created (or a sub-scope). It doesn’t matter what subroutines at parallel or higher scopes do; even if they change variables with the same name in their own scopes. This property of scopes — restricting the range of lines on which the value of a variable can be updated — improves the readability of the code by allowing the programmer to focus on a smaller section of the program without having to be concerned about what is happening elsewhere in the program.

Lines 04 and 05 define the MARKS and BOARD variables, respectively. Because they are not within any curly bracket pairing, they exist in the global scope. It is permissible to create constant variables in the global scope because they are read-only and therefore not subject to the action at a distance concern. In PERL, it is traditional to name constants in all upper case letters.

Notice that scopes can be nested such that variables defined in outer scopes can be accessed and modified from within inner scopes. This is why the MARKS and BOARD variables can be accessed within the get_mark subroutine and PROMPT block respectively — they are sub-scopes of the global scope.

The statements in the program are executed in order from top to bottom and left to right. Each statement is terminated with a semi-colon (;). The semi-colon can be omitted from the last statement in any scope and from after the last block of many statements that define the flow of the program such as sub, if and while.

In PERL nomenclature, scopes are called blocks. Scope is the more general term that is typically used in online references like Wikipedia, but the remainder of this article will use the more perlish term blocks.

The statements within the first three blocks are not immediately executed as the program is evaluated from top to bottom. Rather, they are associated with the subroutine name preceding the block. This is the function of the sub keyword — it associates a subroutine name with a block of statements so that they can be called as a unit elsewhere in the program. The three subroutines get_mark (lines 15-21), put_mark (lines 23-31), and get_move (lines 33-35) are called on lines 47, 51 and 50 respectively.

The PROMPT block is not associated with a subroutine definition or other flow-control statement, so the statements within it are immediately executed in sequence when the program is run.

PERL regular expressions

If there is one feature that is more central to PERL than any other it is regular expressions. Notice that in the example Tic-Tac-Toe program every block contains a =~ (or !~) operator followed by some text surrounded with forward slashes (/). The text within the forward slashes is called a regular expression and the operator binds the regular expression to a variable or data stream.

It is important to note that there are different regular expression syntaxes. Some editors and command-line tools (for example, grep) allow the user to select which regular expression syntax they prefer to use. PERL-Compatible Regular Expressions (PCRE) are by far the most powerful.

Regular expressions used in matching operations

The result of applying the regular expression to a variable or data stream is usually a value that, when used in a flow-control statement such as if or while, will evaluate to true or false depending on whether or not the match succeeded. There are modifiers that can be appended to the closing slash of a regular expression to change its return value.

Line 45 of the Tic-Tac-Toe program provides a typical example of how a regular expression is used in a PERL program. The regular expression [1-9] is being applied to the variable game which holds the in-memory representation of the Tic-Tac-Toe game board. The expression is a character class that matches any character in the range from 1 to 9 (inclusive). The result of the regular expression will be true only if a character from 1 to 9 is present in what is being evaluated. On line 45, the !~ operator applies the regular expression to the game variable and negates its sense such that the result will be true only if none of the characters from 1 to 9 are present. Because the regular expression is embedded within the conditional clause of the if statement modifier, the statement last PROMPT is only executed if there are no characters in the range from 1 to 9 left on the game board.

The last statement is one of a few flow-control statements in PERL that allow the program execution sequence to jump from the current line to another line somewhere else in the program. Other flow-control statements that work in a similar fashion include next, continue, redo and goto (the goto statement should be avoided whenever possible because it allows for spaghetti code).

In the example Tic-Tac-Toe program, the last PROMPT statement on line 45 causes program execution to resume just after the PROMPT block. Because there are no more statements in the program, the program will terminate.

The label PROMPT was chosen arbitrarily. Any label (or none at all) could have been used.

The redo PROMPT statement at the end of the PROMPT block causes program execution to jump back to the beginning of the PROMPT block.

Notice that the state keyword like the my keyword creates a variable that can only be accessed or modified within the block that it is created (or a nested sub-block if any exist). Unlike the my keyword, variables created with the state keyword keep their former value when the blocks they are in are called repeatedly. This is the behavior that is needed for the game variable because it is being updated incrementally each time the PROMPT block is run. The mark and move variables are meant to be different on each iteration of the PROMPT block, so they do not need to be created with the state keyword.

Regular expressions used for input validation

Another common use of regular expressions is for input validation. Line 34 of the example Tic-Tac-Toe program provides an example of a regular expression being used for input validation. The expression on line 34 is similar to the one on line 45. It is also checking for characters from 1 to 9. However, it is performing the check against the null filehandle (<>); it is using the =~ operator; and it is prefixed and suffixed with the zero-width assertions ^ and $ respectively.

The null filehandle, when accessed as it is on line 34, will cause the program to pause until one line of input is provided. The regular expression will evaluate to true only if the line contains one character in the range from 1 to 9. The assertions ^ and $ do not match any characters. Rather, they match the beginning and end positions, respectively, on the line. The regular expression effectively reads: “Begin (^) with one character in the range from 1 to 9 ([1-9]) and end ($)”.

Because it is embedded in the conditional clause of the ternary operator, line 34 will return either what was matched ($&) if the match succeeded or the character zero (0) if it failed. If the input were not validated in this way, then the user could submit their opponent’s mark rather than a number on the board.

Regular expressions used for filtering data

Line 17 demonstrates using the global modifier (g) on a regular expression. With the global modifier, the regular expression will return the number of matches instead of true or false. In list context, it returns a list of all the matched substrings.

Line 17 uses a regular expression to copy all the numbers in the range from 1 to 9 from the game variable into the array named nums. Line 18 then uses the modulo operator with the integer 2 as its second argument to determine whether the length of the nums array is even or odd. The formula on line 18 will result in 0 if the length of nums is odd and 1 if the length of nums is even. Finally, the computed index (indx) is used to access an element of the MARKS array and return it. Using this formula, the get_mark function will alternately return X or O depending on whether there are an odd or even number of positions left on the board.

Regular expressions used for substituting data

Line 28 demonstrates yet another common use of regular expressions in PERL. Rather than being used in a match operator (m), the regular expression on line 28 is being used in a substitution operator (s). If the value in the move variable is found in the game variable, it will be substituted with the value of the mark variable.

PERL sigils and data types

The last things of note that are used in the example Tic-Tac-Toe program are the sigils ($ and @) that are placed before the variable names. When creating a variable, the sigil indicates the type of variable being created. It is important to note that a different sigil can be prefixed to the variable name when it is accessed to indicate whether one or many items should be returned from the variable.

There are three built-in data types in PERL: scalars ($), arrays (@) and associative arrays (%). Scalars hold a single data item such as a number, character or string of characters. Arrays are numerically indexed sets of scalars. Associative arrays are arrays that are indexed by scalars rather than numbers.