Parsing linux fortune files into mysql

As a part of my recent project to build a web based fortune cookie, I wrote the following script to parse fortune files and insert them into a database table. Most of this code is relevant regardless of what you want to do with the data.

It makes use of database handles which I described here.

The only really new thing in this script is a state machine to handle multiple lines of quote text. I have removed the non essential elements here.

# read each line from the file
while(<FH>){
        # set the state to "in quote"
        $in_quote=1;
        # % indicates the end of a fortune
        if ($_ eq '%'){
                # run query

                # reset our state machine
                $in_quote=0;
                $author='';
                $text='';
        }
        elsif ($in_quote == 1 && $_ =~ m/^[\s|\t]+-- (.*)$/){
                $author="$1";
        }
        elsif ($in_quote == 1 && !($_ =~ m/^$/)){
                $text.="$_<BR />";
        }
}

As you may be able to see, this type of behavior can be used to parse a great many things that have a beginning and ending within a file. A more sophisticated version could be used to parse an apache config file, an xml file or something similar.

#!/usr/bin/perl

use warnings;
use strict;
use DBI;

# variable definition
my $dbuser = "redacted";
my $dbpass = "redacted";
my $db     = "redacted";
my $dbhost = 'localhost';
my $in_quote;
my $text;
my $author;
my $query;
# open the file

open(FH,"$ARGV[0]") || die "cannot open file $ARGV[0]";

# connect to the database

my $dbh = DBI->connect("DBI:mysql:$db:$dbhost", $dbuser, $dbpass) || die "Could not connect to database: $DBI::errstr";
$dbh->{PrintError} = 1; # do this, or check every call for errors

# prepare some queries we are likely to use
# note the use of q{} to protect the quotes in the statement
my $sthInsertRecord = $dbh->prepare(q{
    INSERT INTO fortunes (`text`, `author`)
    VALUE (?,?);
});

# read each line from the file
while(<FH>){
        # handle new lines
        chomp;
        # set the state to "in quote"
        $in_quote=1;
        if ($_ eq '%'){
                # i limited my table column to 255 characters to keep it fast
                if(length($text) <= 255){
                        # execute the query
                        $sthInsertRecord->execute($text, $author);
                }
                # reset our state machine
                $in_quote=0;
                $author='';
                $text='';
        }
        # if there is an author, it is indicated by a leading --
        elsif ($in_quote == 1 && $_ =~ m/^[\s|\t]+-- (.*)$/){
                $author="$1";
        }
        # if it isn't the terminator and it isn't an author then it is text
        elsif ($in_quote == 1 && !($_ =~ m/^$/)){
                $text.="$_<BR />";
        }

}
#clean up
$sthInsertRecord->finish;
$dbh->disconnect();
close(FH);

Web based fortune cookie.

For those of you who enjoy the linux fortune package I would like to present http://fortune.circulous.net. This is a project page for me to test and hone my skills as a linux sysadmin so don’t feel bad about blowing up the server with requests. At the moment, ab rates the server at ~400 requests per second.

The intended purpose of this site is to be set as a user’s home page. It automatically redirects to google after a brief time out to give the user time to read the quote.