Exercise 5 : Processing input
Laboratory Exercise for Introduction to Perl
This assumes you have already set up your initial directory with the init-perllab command (instructions).
Printing selected fields (loginpid.pl)
- Change to the Lab5 directory:
cd /scratch/mylogin/PerlLab/Lab5
The directory path may be different if you installed the files in another location.
- If this directory does not exist, run the init-perllab command again:
/g/software/bin/init-perllab
The setup program may warn that it is only installing Lab5 and skipping existing directories. Simply hit Enter to confirm. Once complete, change to the newly created directory:
cd /scratch/mylogin/PerlLab/Lab5
The directory path may be different if you installed the files in another location.
- List the directory contents:
ls -l
You should see the following files: INSTRUCTIONS, README, Solutions, countlogin.pl, loginpid.pl, and psout.
- Look at the loginpid.pl file:
cat loginpid.pl
The contents of this file are as follows:
    #! /usr/bin/perl
    # loginpid.pl - print the login name and process id (PID) from
    # each line of "psout". Print a count of processes.
    #
    # The "psout" file contains the output of the ps command. The
    # first line contains column headers. Each line after that contains
    # an entry for each process running on the machine. The first field
    # is the username (login id), the second field is the process id (PID),
    # and the remaining fields have additional information. For the purpose
    # of this program, you only need to capture the first two columns and
    # keep a total of the number of processes. Print the header row
    # (the first two titles) but don't count it.
    #
    #   USER     PID   START  TIME  COMMAND
    #   krish    396   Oct01  0:03  sshd: krish@pts/3
    #   krish    397   Oct01  0:00  -ksh
    #   bedros   1299  Oct06  0:00  /bin/bash /g/home/pbs/torq...
    #
    # This selection has 3 processes and the first two columns are:
    #
    #   USER     PID
    #   krish    396
    #   krish    397
    #   bedros   1299
    use strict;
    use warnings;
    use IO::File;
    # open the file "psout" for reading, $fh contains a filehandle
    my $count = 0;
    # read the file
    # for each line (including header), print only the first 2 fields
    # suggestion: use s/// substitution to replace full line with
    # the first two fields, using grouping ( )
    # count the number of lines (not including the header line)
    # suggestion: use a while(<$fh>) loop
    # close the file
    # print total
- Look at the psout file:
less psout
The less command lets you view the file one page at a time.
- Edit the loginpid.pl program so that it prints only the first two columns of psout and counts the total number of processes.
Note that the first line contains column headers. Print the headers for the first two columns, but do not count that line as a process. Each remaining line is an entry for a process running on the machine. The first column is the username (login id) of the owner of the process. The second column is a unique process id (PID).
See "Extracting matches" in the Perl regular expression tutorial, or search for this section in the man page available on eos:
man perlretut
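As a minimal sketch of the grouping technique suggested in the loginpid.pl comments, the following reduces each line to its first two fields with s/// and counts the non-header lines. It is an illustration under stated assumptions, not the lab solution: the helper name first_two_fields is hypothetical, and an in-memory filehandle holding the sample lines from the program comments stands in for the real psout file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the lab files): reduce one line of
# ps output to its first two whitespace-separated fields.
sub first_two_fields {
    my ($line) = @_;
    chomp $line;
    # Group 1 captures the user, group 2 the PID; ".*" swallows the rest,
    # so the whole line is replaced by "user pid".
    $line =~ s/^\s*(\S+)\s+(\S+).*$/$1 $2/;
    return $line;
}

# Sample lines copied from the comments in loginpid.pl.
my $sample = <<'END';
USER PID START TIME COMMAND
krish 396 Oct01 0:03 sshd: krish@pts/3
krish 397 Oct01 0:00 -ksh
bedros 1299 Oct06 0:00 /bin/bash /g/home/pbs/torq...
END

# Assumption: an in-memory filehandle stands in for the "psout" file.
open(my $fh, '<', \$sample) or die "cannot open sample data: $!";

my $count   = 0;    # number of processes (header excluded)
my $line_no = 0;    # lines read so far
my @rows;           # reduced lines, kept for printing

while (my $line = <$fh>) {
    $line_no++;
    push @rows, first_two_fields($line);
    $count++ if $line_no > 1;    # line 1 is the header row
}
close($fh);

print "$_\n" for @rows;
print "Total processes: $count\n";
```

On this three-process sample the sketch prints the header and three rows reduced to two columns, followed by a total of 3. A line with fewer than two fields would not match the pattern and would be printed unchanged.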
Building a table of counts (countlogin.pl)
- Look at the countlogin.pl file:
cat countlogin.pl
The contents of this file are as follows:
    #! /usr/bin/perl
    # countlogin.pl - count the number of login processes for each user name
    # in the ps output saved in the file "psout"
    #
    # The "psout" file contains the output of the ps command. The
    # first line contains column headers. Each line after that contains
    # an entry for each process running on the machine. The first field
    # is the username (login id), the second field is the process id (PID),
    # and the last field is the command. You can determine that a process
    # is a login shell because it starts with a minus sign ("-"), followed
    # by the name of the shell.
    #
    #   USER     PID   START  TIME  COMMAND
    #   krish    396   Oct01  0:03  sshd: krish@pts/3
    #   krish    397   Oct01  0:00  -ksh
    #   bedros   1299  Oct06  0:00  /bin/bash /g/home/pbs/torq...
    #
    # In this sample, only process # 397, which is owned by user "krish",
    # is a login shell. For this program, you can ignore all the other
    # processes which do not match the pattern for a login shell.
    #
    use strict;
    use warnings;
    use IO::File;
    # open the file "psout" for reading, $fh contains a filehandle
    # read the first line and ignore it (scalar prevents the read operation
    # from defaulting to reading the whole file as a list)
    # initialize the hash table you will use to count the logins
    my %count = ();
    # read the remainder of the file
    # skip lines which are not a shell (a dash "-" followed by a word at
    # the end of the line)
    # check to see if the hash entry exists using:
    # exists $count{$login}
    # if it does not exist, create it and initialize it to 0
    # increment the count for that user name
    #
    # Suggestion: while (<$fh>) loop
    # close the file
    # print a list of the usernames and the process count for each
- Edit the program so that it opens the file "psout", reads its contents, keeps a running count of login-shell processes per user in a hash table, and, once the whole file has been read, prints the final count of such processes for each user.
The output should look something like:
    amrish    1
    aya3706   1
    bedros    1
    c0s2008   1
    chao1     1
    dba1359   1
    diego07   1
    donzis    1
    eosagus   3
    etrufan   1
    kjacks    4
    krish     1
    lmkli     1
    m0m391a   1
    natesal   2
    ntp       1
    pingluo   1
    qlf1582   1
    rlb3511   2
    s0j3095   1
    tskim     1
    vtunesag  1
    xfs       1
Note: To test whether the hash table already contains a value for a given key, use the function exists(). To get a list of the contents of a hash table, use the function keys(). To sort that list, use the function sort().
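The functions named in the note can be combined roughly as follows. This is a hedged sketch rather than the lab solution: the helper name count_logins and the login-shell pattern are assumptions, an in-memory filehandle stands in for psout, and the last two sample lines are made up for illustration so that one user has more than one login shell.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the lab files): count login shells
# per user from an already opened filehandle.
sub count_logins {
    my ($fh) = @_;
    my %count = ();

    # scalar context reads exactly one line: the header, which is ignored
    my $header = scalar <$fh>;

    while (my $line = <$fh>) {
        chomp $line;
        # Assumed pattern for a login shell: the line ends with a
        # dash followed by a word, e.g. "-ksh". Group 1 is the user.
        next unless $line =~ /^(\S+)\s.*\s-\w+$/;
        my $login = $1;
        # create the entry on first sight, then increment it
        $count{$login} = 0 unless exists $count{$login};
        $count{$login}++;
    }
    return %count;
}

# The first four lines come from the comments in countlogin.pl;
# the last two are made-up rows added for illustration only.
my $sample = join "\n",
    'USER PID START TIME COMMAND',
    'krish 396 Oct01 0:03 sshd: krish@pts/3',
    'krish 397 Oct01 0:00 -ksh',
    'bedros 1299 Oct06 0:00 /bin/bash /g/home/pbs/torq...',
    'krish 401 Oct02 0:00 -ksh',        # hypothetical
    'bedros 1300 Oct06 0:00 -bash',     # hypothetical
    '';

# Assumption: an in-memory filehandle stands in for the "psout" file.
open(my $fh, '<', \$sample) or die "cannot open sample data: $!";
my %count = count_logins($fh);
close($fh);

# keys() lists the user names, sort() orders them alphabetically
foreach my $login (sort keys %count) {
    print "$login $count{$login}\n";
}
```

With this sample the sketch reports one login shell for bedros and two for krish. Perl would also create the hash entry automatically on the first increment, so the exists() test is shown mainly because the exercise asks for it.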