"How do I insert a line into a file?"

This is a very frequently asked question. It appears in [perlfaq5], along with related questions "How do I delete a line from a file?" and "How do I change one line in a file?" It sounds like it should be easy, but it isn't.

The problem is that although we think of files as made of lines, the operating system usually thinks of them as made of bytes. You can overwrite a byte, but not a line. If you want to replace a line, you either have to overwrite every byte exactly, or you have to copy the following part of the file up (if you're deleting) or down (if you're inserting). There isn't even an easy way to find line 57 in a file; you have to read through the file counting newline characters until you get to the place you want.
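
To make the line-counting point concrete, here is a minimal sketch of what finding line 57 involves without any help. The helper name nth_line and the scratch file are my own inventions for illustration; the only point is that every earlier line has to be read first.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return line $n (counting from 1) of the file at $path, or nothing if the
# file is shorter than that. Every line before $n has to be read first.
sub nth_line {
    my ($path, $n) = @_;
    open my $fh, '<', $path or die "Can't open $path: $!";
    while (my $line = <$fh>) {
        # $. is the number of the line most recently read from $fh
        if ($. == $n) {
            close $fh;
            return $line;
        }
    }
    close $fh;
    return;
}

# Demonstration on a scratch file of 100 numbered lines
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} map { "line $_\n" } 1 .. 100;
close $tmp or die "Can't close $path: $!";

print nth_line($path, 57);    # prints "line 57"
```

Reading line 57 is the easy half. Changing its length would still mean copying everything after it.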

The FAQ starts with a rather snotty remark about how "Perl is not a text editor." It follows with a 500-word article sketching several more-or-less difficult ways to do this. Most of them involve throwing away the original file and replacing it with a modified copy.

At last, there is a better way.

The new Tie::File module makes a file look like a Perl array. Each array element is one line of the file. If you read the array, you get a line from the file. If you modify the array, the file is modified as you requested.

It's safe. It's reliable. It's efficient.

Best of all, it's easy.

Let's take an example. Suppose you want to go through a file and replace PERL with perl everywhere. One easy way is to use Perl's -i option:

        perl -i.bak -lpe 's/PERL/perl/g' file

This is convenient, but it has the drawback that it rewrites the entire file. If you want to do this as part of a larger program, it's rather less convenient, and a lot more bizarre. The FAQ suggests:

        {
          local ($^I, @ARGV) = ('.bak', 'file');
          while (<>) { 
            s/PERL/perl/g;
            print;
          }
        }

You get rather poor error checking if you do this: the open is implicit, so there's no way to catch the error if it fails.
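
For comparison, here is a rough sketch of the same copy-and-replace idea with every step spelled out, so that each failure can be reported. This is not the FAQ's code. The name rewrite_file, the ".tmp" suffix, and the scratch file are assumptions of mine. Like -i, it still rewrites the whole file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Copy $file to "$file.tmp" (a hypothetical scratch name), passing each
# line through the $edit callback, then rename the copy over the original.
sub rewrite_file {
    my ($file, $edit) = @_;
    my $tmp = "$file.tmp";
    open my $in,  '<', $file or die "Can't read $file: $!";
    open my $out, '>', $tmp  or die "Can't write $tmp: $!";
    while (my $line = <$in>) {
        print {$out} $edit->($line) or die "Can't write to $tmp: $!";
    }
    close $in;
    close $out         or die "Can't close $tmp: $!";
    rename $tmp, $file or die "Can't rename $tmp to $file: $!";
    return;
}

# Demonstration on a scratch file
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "PERL is fun\nno change here\nPERL and PERL\n";
close $fh or die "Can't close $path: $!";

rewrite_file($path, sub { my $line = shift; $line =~ s/PERL/perl/g; return $line });
```

The error checking is better, but it takes a dozen lines to do a one-line job.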

Here's the Tie::File version:

        tie @lines, 'Tie::File', 'file' or die ...;
        for (@lines) {
          s/PERL/perl/g;
        }
        untie @lines;

Not only is this simpler (what the heck is local($^I), anyway?) but it's a lot more efficient. Unlike perl -i, which promises to modify the file "in place", and then actually creates a totally new file from scratch, Tie::File really does modify the file in place. If the file is ten megabytes long and contains PERL ten times, the -i solution writes ten megabytes; Tie::File writes just the ten records that changed.

Here's another common task; people ask about this in comp.lang.perl.misc every week: I have some text, in $text, and I want to insert it into an HTML file just after the line that says <!-- insert here -->. Again, I could use -i, which rewrites the whole file. Or I can use Tie::File:

        tie @lines, 'Tie::File', 'file' or die ...;
        for (@lines) {
          if (/<!-- insert here -->/) {
            $_ .= $text;
            last;
          }
        }
        untie @lines;

Instead of rewriting the entire file, this rewrites only what is necessary: the part of the file after the comment. If $text happens to be empty, it rewrites only one record.
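
As far as I can tell, appending to the marker record puts the new text on the same line as the comment. That is harmless in HTML. If you would rather have the new text on lines of its own, one possible variation is to splice it in as separate records. The sketch below assumes a helper name of my own, insert_after_marker, and a scratch file. The new records must not contain newlines, so the text is split first.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Insert @new as separate records right after the first record of $file
# that matches $marker. Returns 1 if the marker was found, 0 otherwise.
sub insert_after_marker {
    my ($file, $marker, @new) = @_;
    tie my @lines, 'Tie::File', $file or die "Can't tie $file: $!";
    my $found = 0;
    for my $i (0 .. $#lines) {
        if ($lines[$i] =~ $marker) {
            splice @lines, $i + 1, 0, @new;   # remove nothing, insert @new
            $found = 1;
            last;
        }
    }
    untie @lines;
    return $found;
}

# Demonstration on a scratch file
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "<html>\n<!-- insert here -->\n</html>\n";
close $fh or die "Can't close $path: $!";

my $text = "<p>one</p>\n<p>two</p>\n";
my $ok = insert_after_marker($path, qr/<!-- insert here -->/, split /\n/, $text);
```

As with the loop version, only the part of the file after the marker has to move.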

Now let's suppose you have a database with several columns, and the first column is the key. For concreteness, let's say it's the Unix password file, and the key is the username. Suppose you have a program that needs to look up data in this database.

One good way to do this is to read the database into a hash, and use the usernames as the hash keys, like this:

        open DB, "< $database" or die ...;

        while (<DB>) {
          chomp;
          my ($username) = split /:/;
          $db{$username} = $_;
        }

        sub lookup {
          my $user = shift;
          return $db{$user};
        }

The major drawback of this approach is that if the database is big, the hash may not fit in memory. (That is probably not a consideration with the password file, but many other databases are bigger.) But you can use Tie::File here to get an easy and efficient solution that works no matter how big the database is:

        tie @DB, 'Tie::File', $database or die ...;
        my $n = 0;
        for (@DB) {
          my ($username) = split /:/, $_;
          $recno{$username} = $n++;
        }

        sub lookup {
          my $user = shift;
          return undef unless exists $recno{$user};   # unknown user, not record 0
          return $DB[$recno{$user}];
        }

We're still using a hash, and the usernames are still the keys. But instead of associating the data with the usernames (which would take a lot of space) we only associate a record number with each username. If we look up $recno{'merlyn'}, we don't get the information for merlyn directly. Instead, we get a number like 1123, which tells us that merlyn's data is in record 1123 of the data file (records are numbered from zero, so that is line 1124). Then we look at $DB[1123] and Tie::File immediately recovers the data for us. We get fast access to every record without storing the entire database in memory.

Of course, with Tie::File, we're not limited to only reading the database; we can modify it also:

        sub replace_data {
          my ($user, $new_data) = @_;
          $DB[$recno{$user}] = $new_data;
        }

        sub update_password {
          my ($user, $new_password) = @_;
          my $crypted_password = crypt($new_password, random_salt());
          my @data = split /:/, lookup($user);
          $data[1] = $crypted_password;
          replace_data($user, join(':', @data));
        }

When we call replace_data, the data in the file is overwritten in place with the new data.

Tie::File arrays support all the Perl array operations, including push, pop, shift, unshift, splice, and $#a = $N.
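
Here is a short sketch of a few of those operations at work on a scratch file. The file contents and variable names are made up for the demonstration; the comments trace what the file should hold after each step.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Scratch file with three records: one, two, three
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "one\ntwo\nthree\n";
close $fh or die "Can't close $path: $!";

tie my @lines, 'Tie::File', $path or die "Can't tie $path: $!";

push    @lines, 'four';              # one two three four
unshift @lines, 'zero';              # zero one two three four
splice  @lines, 2, 1, 'TWO', '2.5';  # zero one TWO 2.5 three four
my $last = pop @lines;               # removes 'four'
$#lines = 2;                         # truncates to: zero one TWO

untie @lines;
```

Each statement changes the file itself, not just a copy in memory.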

Tie::File is available on CPAN and also from my website. It will be included with Perl 5.8, which will be released in April. It is distributed under the same terms as Perl.

You will like it.