While Perl5 is largely backwards compatible with Perl4, certain Perl4 functions have been deprecated in favor of more generic Perl5 functionality. A prime example of this is the binding of associative arrays to DBM or NDBM databases. The new Perl5 OO syntax provides a clean mechanism for binding Perl data objects to a variety of UNIX database formats.
     # deprecated Perl4 syntax
     dbmopen(%hash, "/home/hal/mydata",0644) ||
          die "Can't open database!\n";
This would associate %hash with the contents of the named
NDBM (or DBM on older UNIXes) database. If your system did not have
support from either database format, then this call generated a fatal
error - not exactly "graceful degradation."
The Perl5 library now includes (again, assuming your system supports
these formats).pm files for NDBM and SDBM databases. It is easy enough
to add support for Berkeley DB and GNU GDBM file formats
(documentation is included in the Perl5 source tree), though you will
have to end up remaking Perl.
Here is a simple Perl5 function for interfacing with an NDBM file:
     use NDBM_File;
     $DBM_INSERT = 0;     # system dependent &
     $DBM_REPLACE = 1;    # not in NDBM_File.pm
     sub open_db {
          my($filename) = @_;
          tie(%HASH, NDBM_File, $filename, $DBM_REPLACE, 0644);
     }
The first line loads the NDBM library NDBM_File.pm (note
that the file name is not quoted and does not include
the.pm extension.) The two constants defined next should
really appear in NDBM_File.pm; as a workaround, we define
them near the top of our program for ease of maintenance.
The interesting piece, however, is this new tie()
function in the open_db() subroutine. In Perl5,
tie() is a generic function for associating a data object
with a package of methods for operating on that object. If that last
sentence is in no way meaningful to you, just take it on faith that
this is the way we deal with NDBM style files nowadays.
In a database context, the first argument to tie() is the
associative array we are going to link to the database. The next
argument should be the name of the package which contains the
functions for operating on the database - invariably this is the same
name that we put in the use statement. The rest of the arguments are
whatever is expected to be handed to the C function call which opens
your database (check the manual page if you are not sure what the
appropriate arguments are).
In the case of NDBM files, the arguments are a filename, a special
flag value ($DBM_INSERT means you can insert new records
but not overwrite existing records, $DBM_REPLACE means
you can do both), and a mode value for when the database files need to
be created. When an NDBM database is created, two files are actually
generated: one called filename.dir and one called
filename.pag.
The corresponding close_db() function is trivial:
     sub close_db {
          untie(%HASH);
     }
untie() simply breaks the link between the associative
array and the database file. As a side effect, the file is closed
after any changes have been flushed out to disk.
keys() or
values() on arrays associated with NDBM files. If the
file is large, you will get back a huge list and wait a long time. If
you need to iterate over an entire database, use each():
     open_db("/home/hal/mydata");
     while ((key, value) = each(%HASH)) {
          # do work here
     }
     close_db();
Note that you must never add elements to the database while you are in
the middle of looping with each(). If you do, you may end
up looping forever or generating a fatal error.
One of the (many) problems with NDBM style databases is that you can
only associate a single data item with each key. Suppose, however,
that we wanted to associate many pieces of information with a
particular key. For example, suppose my database contained information
about hosts at my site: for each host I would want to include its IP
address, room location, owner, type of machine, and OS. One simple
method would be to place a delimiter between each of these fields and
use join() and split() to insert and extract
information (pack() and unpack() work well
if your data is of fixed size):
     # insert
     $HASH{"myserver"} = join("::", $addr, $rm, $who, $type, $os);
     # extract @info = split(/::/, $HASH{"myserver"});
If you want to be really tricky, you can marshal your data into a
self-extracting format that you can eval() when you want
to extract it. Here is a simple case:
     # original information
     %record = ("host" => "myserver",
          "address" => "172.16.15.1",
          "room" => "130",
          "owner" => "Bob Smith",
          "type" => "Sparc 10",
          "os" => "Solaris 2.4");
     # marshal and insert $data = "\%record=(";
     foreach $key (keys(%record)) { $data .= "'$key'=>'$record{$key}',";
     }
     $data .= ");";
     $HASH{$record{"host"}} = $data;
     # extract and print
     eval("$HASH{'myserver'}");
     foreach $key (keys(%record)) {
          print "$key\t$record{$key}\n";
     }
Note that marshalling arbitrary data, e.g., hashes with non-scalar
data values, generally requires a library of recursive functions.
     use Fcntl;
     sub lock_file {
          my($file, $type) = @_;
          my($flag, $struct);
          open($file, ">> $file") || return(undef);
          $flag = ($type eq "r") ? F_RDLCK : F_WRLCK;
          $struct = pack("ssx32", $flag, 0);
          return(fcntl($file, F_SETLKW, $struct));
     }
     sub unlock_file {
          my($file) = @_;
          my($struct, $status);
          $struct = pack("ssx32", F_UNLCK, 0);
          $status = fcntl($file,F_SETLKW, $struct);
          close($file);
          return($status);
     }
The careful reader will note a dirty trick happening with indirect
file handles (for more information on indirect file handles consult
the Perl documentation or my earlier column "
Know All the Angles"). The functions themselves are
straightforward: lock_file() expects a filename and a
string which indicates the type of lock (r for a read
lock and anything else for a write lock), while
unlock_file() just takes a file name. We use fcntl()
style locking since this type of locking works across networked file
systems.
I recommend using a file other than the database files themselves for locking purposes. This way you avoid problems when first instantiating the database and when other processes pull the database out from under you while you are blocking for a lock. Note also that NDBM style databases have no concept row or table locking: you end up having to lock the entire NDBM file. This is another area in which these databases are inferior to modern relational database systems.
Reproduced from ;login: Vol. 20 No. 6, December 1995.
Back to Table of Contents
12/3/96ah