Twelve Views of Mark Jason Dominus

For 2003 OSCON I proposed to give nine five-minute talks in a 45-minute slot. To avoid confusing this with the real lightning talks sessions, Nat decided to call my talk ``Nine Views of Mark Jason Dominus''. At first I didn't like that title, which sounded too much like self-promotion. But I warmed up to it after a while and included various pictures (views) of myself in between the nine talks.

Before I wrote the talk, I was invited to give it at YAPC::Israel in a one-hour slot. So I wrote twelve short talks, and then for OSCON I cut out talks 2, 7, and 11. This is the long version.

Table of Contents

Slide 1: Introduction

The slide the audience members saw while they were waiting for the talk to start had nothing on it except the talk title and a panel from The Uncanny X-Men issue #21, in which the villain was named Dominus.

Slide 2: Talk 1: The Coolest Perl Project I Ever Did

Around 1994 I was dating a woman who was a quilter. To make a quilt, you take a lot of small pieces of fabric and sew them into a 'patch', then you sew some patches into 'blocks', and then you sew the blocks together into a quilt top. Then you sandwich the quilt top with a quilt back (usually a single piece of fabric) and cotton batting in between, and that's a quilt.

There are a lot of traditional quilt blocks, with names like log cabin, courthouse steps, flying geese, corn and beans, broken dishes, and so on. I got to wondering if the space of possible blocks had been mostly explored or not. So I wrote some Perl programs to generate quilt blocks. I decided to investigate arrangements of what are known as ``half-square triangles'', where you have squares that are diagonally split into two triangles. I enumerated all the ways to sew together four half-square triangles. Then I took each of these and put together four copies of it into a rotationally-symmetric block. It turns out that there are exactly 72 ways to do this. My enumeration had a bug, so I came up with 73. A fun puzzle is to figure out which one is repeated.

Anyway, as I said, I did this project to impress my girlfriend, and she was indeed impressed. So this was not only the coolest Perl project I ever did, but also the most successful.

When we got married, she actually made the quilt for me and gave it to me as a wedding present---you can see a bit of it in the background on that last picture. There's a long tradition of puzzle quilts, so she left in the duplicated block. To make room (because it's very hard to make a quilt with 73 blocks, since 73 is a prime number) she left out one of the other blocks---it's fourth from the left in the fifth row.

Source code and more outputs are here.

Slides 3-7: Talk 2: On Forking

There's been a puzzling trend in the Perl world in the last few years, away from use of the system function, which invokes an external shell command. There are good reasons to avoid system in some circumstances, but people are becoming increasingly dogmatic about avoiding system at all costs.

I was particularly struck by something that happened to me a year or two ago. The program that manufactures my conference slides takes a large text file, splits it into slides, and then runs Seth Golub's txt2html program on each of the text files to convert them to HTML. txt2html is rather slow, so I don't want to run it on every slide every time; instead, the wrapper keeps a backup of each text file, and compares the new version of each text file to the backup to determine which ones need to be run through txt2html.

To do the comparison between the old and the new versions of the text file, the program uses the standard Unix cmp program, which compares two files byte-by-byte and reports whether they are the same.
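That freshness check takes only a few lines of Perl. Here's a sketch (the file names and the txt2html invocation details are invented for illustration):

```perl
use strict;

# Rebuild a slide's HTML only if its text has changed since the last
# run.  "cmp -s" exits 0 when the two files are identical, and
# system() returns that exit status.
sub needs_rebuild {
    my ($file, $backup) = @_;
    return 1 unless -e $backup;                       # no backup yet
    return system('cmp', '-s', $file, $backup) != 0;  # differs from backup
}

for my $slide (glob 'slide*.txt') {
    next unless needs_rebuild($slide, "$slide.bak");
    system('txt2html', $slide) == 0 or warn "txt2html failed on $slide\n";
    system('cp', $slide, "$slide.bak");               # refresh the backup
}
```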

I mentioned this approach on the #perl IRC channel once, and I was immediately set upon by several people who said I was using the wrong approach, that I should not be shelling out for such a simple operation. Some said I should use the Perl File::Compare module; most others said I should maintain a database of MD5 checksums of the text files, and regenerate HTML for files whose checksums did not match those in the database.

I think the greatest contribution of the Extreme Programming movement may be the saying ``Do the simplest thing that could possibly work.'' Programmers are mostly very clever people, and they love to do clever things. I think programmers need to try to be less clever, and to show more restraint. Using system("cmp -s $file1 $file2") is in fact the simplest thing that could possibly work. It was trivial to write, it's efficient, and it works. MD5 checksums are not necessary. I said as much on IRC.

People have trouble understanding the archaic language of ``sufficient unto the day is the evil thereof,'' so here's a modern rendering, from the New American Standard Bible: ``Do not worry about tomorrow; for tomorrow will care for itself. Each day has enough trouble of its own.'' (Matthew 6:34)

People on IRC then argued that calling cmp on each file was wasteful, and the MD5 approach would be more efficient. I said that I didn't care, because the typical class contains about 100 slides, and running cmp 100 times takes about four seconds. The MD5 thing might be more efficient, but it can't possibly save me more than four seconds per run. So who cares?

So far I have no objection to the discussion; it's just a difference of opinion. But at this point the discussion went awry. The IRC folks just couldn't hear me; they couldn't let go of the idea that cmp was Wrong. They started arguing with me about the time taken up by fork and exec, after I had already pointed out that there was no performance problem to be solved.

A more thoughtful examination of the performance issue shows that even if we do consider it a problem, the MD5 thing may not solve it. system("cmp") does fork and exec, but once cmp is running, it's extremely fast, since what it's doing is extremely simple. Moreover, if the files differ, it can usually quit early, without having to read past the first block of each file. In contrast, MD5 needs to read an entire file and perform a lot of complicated calculations to come up with the checksum, and then it has to read the database for the old checksum. A couple of years later I implemented the MD5 thing, just to see what would happen, and sure enough, the program became slower.
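For comparison, here is a sketch of the core of the MD5 approach (the checksum-database bookkeeping is omitted; Digest::MD5 is a standard module):

```perl
use strict;
use Digest::MD5;

# Checksum an entire file.  Note that unlike cmp, this must always
# read every byte, even when the files differ in the very first block.
sub md5_of {
    my ($file) = @_;
    open my $fh, '<', $file or die "$file: $!";
    binmode $fh;
    return Digest::MD5->new->addfile($fh)->hexdigest;
}

# The wrapper would then compare md5_of($file) against a stored
# checksum database %old and rebuild when they disagree, something like:
#     rebuild($file) if md5_of($file) ne ($old{$file} // '');
```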

One of Perl's big selling points has been that it's useful as a 'glue language', for gluing other tools together. It still excels as a glue language. Why this desire to absorb all those other tools into Perl?

One of Perl's mottoes has been that Perl might not be as fast as C, but you can write a prototype program in Perl quickly, and then you might discover that there's no need to replace the prototype because it's already fast enough. That's exactly what I did with txt2slides. When I wrote the cmp line, I thought ``Well, this might not be fast enough, but if it isn't, I'll change it later.'' Then it was fast enough, so I didn't bother to change it. That's what's supposed to happen! Why isn't this considered a tremendous success for the Perl Way?

Let's not forget the things that are good about Perl. It's good at interacting with other programs, and it's good for rapid prototyping. Let's not hassle people when they use Perl the way it was designed to be used.

Slide 8: Interlude

This is me at YAPC 19100 in Pittsburgh. My wife made this hat for me for my first appearance at a conference "guru session".

Slides 9-11: Talk 3: New Versions

I released the Text::Template module several years ago, and it was immediately very successful. It's small, simple, fast, and it does a lot of things well. At the time, there were not yet 29 templating systems available on CPAN.

Anyway, the module quickly stabilized. I would get bug reports, and they would turn out to be bugs in the module's users, not in the module; I would get feature requests, and usually it turned out that what the requester wanted was possible, or even easy, without any changes to the module. Since the module was perfect, there was no need to upload new versions of it to CPAN.

But then I started to get disturbing emails. ``Hi, I notice you have not updated Text::Template for nine months. Are you still maintaining it?'' ``Hi, I notice you seem to have stopped work on Text::Template. Have you decided to abandon this approach?'' ``Hi, I was thinking of using Text::Template, but I saw it wasn't being maintained any more, so I decided to use Junk::CrappyTemplate, because I wanted to be sure of getting support for it.''

I started wondering if maybe the thing to do was to release a new version of Text::Template every month, with no changes, but with an incremented version number. Of course, that's ridiculous. But it seems that people assume that if you don't update the module every month, it must be dead. People seem to think that all software requires new features or frequent bug fixes. Apparently, the idea of software that doesn't get updated because it's finished is inconceivable.

I blame Microsoft.

Slide 12: Interlude

Here I am at YAPC 19100 again. I had taught classes at LISA in San Diego on Tuesday, and then I flew overnight to Pittsburgh for YAPC on Wednesday. When I arrived, Kevin Lenzo told me that Joe Hall had been snowed in at Chicago, and asked if I could do some impromptu talks to fill up his slots in the schedule. I gave seven hours of talks that day.

Slides 13-15: Talk 4: Subclassing

Here's another puzzle about Text::Template. People often write to me with new feature requests. For example, people want the package to preprocess the code in the templates before evaluating it, or to infer the template syntax from the template file extension. Often these things are very application specific, so I don't want to bloat the module with them, and anyway, they can easily be accomplished by subclassing the module and overriding one or another of the methods. Usually I'll even write the subclass and send it with my reply.

However, this straightforward solution is usually rejected. People want me to put the code into Text::Template itself and release a new version.

I had originally concluded that this reluctance to subclass modules was just irrational behavior. After all, one of the major promises of object-oriented programming is the opportunity for code reuse via inheritance. This benefit is being thrown away. A related data point is that very few CPAN modules subclass other CPAN modules. Has-A relationships are common, but Is-A relationships are rare. But after a little more thought, I decided this might not be entirely irrational.

Slide 15 shows an example: We want Text::Template to run all the code fragments through a preprocessor before evaluating them when it is filling out a template. To do this is very easy. You subclass the module and add a new method, which installs a preprocessor function into the Text::Template object when it is called. Then you override the Text::Template::compile method to invoke the preprocessor if there is one, and then call the real compile method. This is all pretty easy.
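The pattern can be sketched like this. To keep the sketch self-contained I've used a made-up stand-in class, My::Template, rather than the real Text::Template, whose internals are undocumented; the SOURCE and CODE keys, and everything about the stand-in, are invented for illustration:

```perl
use strict;

# A made-up stand-in for Text::Template, just to show the shape of
# the subclass.  This is NOT the real module's API.
package My::Template;
sub new     { my ($class, %a) = @_; bless { SOURCE => $a{SOURCE} }, $class }
sub compile { my $self = shift; $self->{CODE} = "sub { $self->{SOURCE} }" }

# The subclass: install a preprocessor, then override compile to run
# the code fragment through it before calling the real compile.
package My::Template::Preprocess;
our @ISA = ('My::Template');
sub preprocessor { my ($self, $pre) = @_; $self->{PREPROCESSOR} = $pre }
sub compile {
    my $self = shift;
    if (my $pre = $self->{PREPROCESSOR}) {
        # Depends on the object being a blessed hash with a SOURCE key
        $self->{SOURCE} = $pre->($self->{SOURCE});
    }
    return $self->SUPER::compile(@_);
}

package main;
my $t = My::Template::Preprocess->new(SOURCE => '2 + 2');
$t->preprocessor(sub { my $s = shift; "do { $s }" });
print $t->compile, "\n";    # prints: sub { do { 2 + 2 } }
```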

But there are a couple of big problems here. First, the subclass depends on the fact that the Text::Template object is a blessed hash, which is not guaranteed anywhere. It also depends on the fact that the object is not using the PREPROCESSOR key for anything else, and, moreover, that it won't ever use it for anything else. It also depends on the presence of the compile method, which is undocumented and which therefore might go away in a future version of the module, breaking the subclass.

So you could do the subclass, but you would never be sure that it wouldn't break someday. It depends on too many things that I didn't promise.

I'm not sure what to do about this. Larry said that the problem was Perl's poor object model. I disagreed. A better model will help solve the hash key collision problem, but not the undocumented method problem. I suggested that perhaps one solution would be for modules to start including an explicit SUBCLASSING INTERFACE section in their documentation, spelling out just what guarantees the author would make for subclasses.

Slide 16: Interlude

Here I am at an early Usenix conference, probably around 1992.

Slide 17: Talk 5: Getting Help From Strangers

I get a lot of mail from people I've never met, asking for help. Usually, I try to help, but some of the mail is so awful that all I can do is throw it away. Here's some advice on getting strangers to give you help.

Here we see the wrong way to go about it.

  1. This person has no name, or at least none that they told me. Maybe I'm an old fuddy-duddy, but I like to know who I'm talking to. If they want their homies to call them ``Stick-on Smooth'', that's fine, but when they write to me they should have a real name.

  2. The Subject is uninformative. It says Your article. I write a lot of articles. Is it a magazine article? A Usenet article? I have no idea.

  3. Then comes the question: ``What does forkish mean?'' Apparently I must have used the word ``forkish'' in some article somewhere. I have no idea any more what I meant by it. So now I have to go use Google to figure out what I was talking about.

  4. ``This is very important.'' Not to me! (Or, as someone said to me afterwards, ``Gosh, I'll be sure to clear my schedule!'') I'm not just being a grump here. If this person would tell me why it was so important, I might agree. But they don't give me a chance to agree.

  5. Having sent a letter that contains a question, you're already implicitly asking for a response. Adding Please respond makes it sound like you're grovelling. The only thing more annoying than Please respond is Please respond ASAP.

Here's my rewrite of this letter into a form that I would have been more likely to answer.

What's different?

  1. The sender has a name. In the talk, I observed that it doesn't even need to be their real name, as long as I can't tell the difference.

  2. There's an explanation of why the sender wants the answer. It doesn't have to be earth-shaking; I make bets with my friends about trivia too, and I might be flattered that I'm being cited as an authority.

  3. There's an explanation of how I came to be involved in this. The sender cites the URL of my article, which makes it easy for me to find out what they're talking about. Better still would have been if they had quoted the relevant paragraph or sentence.

  4. One thing I left out here was 'please'. On further reflection, I don't think this is required if the rest of the note is polite and well-mannered. Contrast this with the original note, which did say 'please' but was neither polite nor well-mannered. The corrected version may not say 'please', but the 'please' is communicated nevertheless. The original said 'please' but managed to communicate the opposite.

Here's a real example of a message that I did reply to, in some detail:

        Date: Wed, 03 Oct 2001 18:29:17 +0100
        From: ``L. Thompson'' <xxxxx@xxx.xxxxx.ac.uk>
        Organization: University of Keele
        To: mjd-perl-questions-id-i8g+br7a9j0+@plover.com
        Subject: Big big stupid question-PLEASE HELP!!!
        Content-Type: text/plain; charset=us-ascii
        Content-Transfer-Encoding: 7bit

        Hey there, I came upon your web page when I was looking for the answer
        to a stupid question which I've started to become really curious about. 
        I really hope I don't offend you by asking, but surely someone with
        computer knowledge, used to stupid questions, might know something?
        My friend's at uni and she doesn't have the first clue about computer
        programming.  So when her computer science tutorial left her with the
        homework question ``How much money is there in the world?'' she was
        naturally a bit confused.  Not being too confident about my own answer
        (we don't know enough about the value of money to answer that one...)
        can you tell me what the hell this has to do with programming and
        computer science?
        Thanks in anticipation for any help you can give,
        Lo.
        xxxxx@keele.ac.uk

Here's one of the more remarkable help requests I received.

The author is trying to get a Master's degree in Armenia. Apparently there is no library in Armenia, because they are asking me for information on ``Lexico-grammatical peculiarities of the language of constitution''. Why me? I have absolutely no idea.

I didn't bother to answer this one. I feel sorry for the Armenian student, because I suppose there is not much money in Armenia, and the library might not be well-stocked. But this is not a problem I can solve, or should be asked to solve. If you're a master's student, and you find that there is a dearth of information on your chosen thesis topic, the correct response is to pick a different topic.

Slide 18: Interlude

This picture was taken around 1991 at a party in Las Vegas, New Mexico. This was the occasion on which I first met Nat Torkington, the OSCON program chair, and also the party at which Nat first met Jenine, who is now his wife.

When the picture came up on the screen at OSCON, some people whistled. But I shamefacedly admitted that I had posted this picture on ratemyface.com, and it had received a rating of only 6.

Slides 19-22: Talk 6: How to Progress

Someone asked on PerlMonks ``How to progress''. Here are my notes on how to become smarter, which is something I hope we'd all like to do.

  1. Read books other people aren't reading.

    The computer programming community tends to follow fads in which everyone reads the same books at the same time. For example, books on extreme programming or design patterns. I think this is unfortunate. If you read books that other people are not reading, then you will know different things from the things they know, and when they come to you with a problem they can't solve, you might be able to solve it with your different equipment.

    When everyone in the community reads the same books, you get an inward-looking, intellectually impoverished community that can only contemplate its own navel. When we all read different books, we all have more to learn from each other.

    In the talk, I was about to list some books I had read recently that I thought other people hadn't, but then I stopped and said I wouldn't do it because I was afraid people would then go out and read those books.

  2. Read original source materials.

    It's all very well to read what someone has said about what someone else said about some topic, but a lot of the time you get a clearer picture if you go to the original source, and find out for yourself what was said about the topic.

    One of the reasons that subjects become important and their inventors become famous is that the inventors were able to describe the subject so clearly.

    I tried reading books about design patterns a while ago, and I was underwhelmed. But a lot of people are interested in design patterns; why is that? I went back to the original Christopher Alexander books and found out why: pattern languages are a brilliant idea and Alexander's books about them are works of genius.

    Here are some other original sources that are excellent: Einstein's book on special relativity is the thing to read if you want to learn about special relativity. Galileo's book on mechanics, Discourses on Two New Sciences, may be the single best technical book I've ever read on any subject.

  3. Read actively.

    Many people have the idea that you read a technical book the way you read a trashy novel, by just sitting back and letting the ideas flow into your mind. This doesn't work well.

    You will get a lot more out of it if you read actively: Read a sentence. Then reflect for a long time. Ask questions about it. When you're done, read the next sentence.

    I think I explained this really well in my original PerlMonks article, so I'll reproduce the relevant section here:

    Do the same thing when you read the Perl manual. Read a sentence. I will pick a sentence from the Perl manual at random: Line 2000 of perlfunc, whatever that is:

          getpgrp
          Returns the current process group for the specified PID.

    Great example. What is a process group? What is it for? When would I want to use getpgrp? What is an actual application in which I would need getpgrp? What is another? What will getpgrp actually return? An integer, or what? What would be the best design here? (Then if the next sentence reveals that they didn't use the design you thought was best, try to figure out which one of you screwed up.) Can this call ever fail? When? How? How will it report the error? What might the return value be in that case? (Then read on to see if you are correct.) Are there permission or security problems here? Would it be dangerous to be able to get the process group ID for someone else's process? And so on.

    Then you go on to the next sentence. If you read the whole manual like this, you will progress. That's what you wanted to know, right? How to progress.

  4. Take notes.

    This is not like in college, where you copy down everything the professor says because it might be on the exam. As you read actively, note down the questions you have, things you find puzzling, places where you think you could have designed it better, relationships you notice with other things you've read. If you don't have questions or puzzles, then you are reading the wrong thing and you should read something that you find more engaging.

    Write down ideas you have. Save them in a file. A year later, you will be amazed at all the stuff you thought about that you have forgotten. You will probably enjoy rereading these files, because most people find their own thoughts interesting.

Slide 23: Intermission

The joke here is that the intermission for a 5-minute talk is only 15 seconds long. I thought people would laugh at this more than they did.

Slide 24-29: Talk 7: NP-Complete Problems

Some of the very biggest problems in computer science over the last 30 years have had to do with the idea of NP-completeness. Here's the short explanation. There are a lot of problems that people would like to solve, but no good algorithms are known that work in general. A typical example is the Travelling Salesman problem. In the travelling salesman problem, you have a bunch of cities, a cost to travel between each pair of cities, and a budget. The question is, is there a route that visits each city exactly once, returns to its starting point, and comes in under budget?

NP-complete problems are characterized by the property that it's easy to check proposed solutions to see if they're correct. If someone purports to have an under-budget itinerary for a particular example of the problem, it's quick and trivial to add up the distances and check if they're under budget. But how do you come up with such an itinerary? Nobody knows. The only known algorithm that always works is to examine every possible itinerary, looking for a route that is under budget.
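The ``easy to check'' half really is easy. Here's a minimal Perl checker; the city names and the %cost table are invented for illustration:

```perl
use strict;

# Verify a proposed itinerary: every city visited exactly once,
# a road for every leg (including back to the start), and a total
# cost at or under budget.
sub under_budget {
    my ($cost, $budget, @route) = @_;
    my %seen;
    return 0 if grep { $seen{$_}++ } @route;     # no city visited twice
    my $total = 0;
    for my $i (0 .. $#route) {
        my ($from, $to) = ($route[$i], $route[($i + 1) % @route]);
        my $leg = $cost->{"$from,$to"} // $cost->{"$to,$from"};
        return 0 unless defined $leg;            # no road between them
        $total += $leg;
    }
    return $total <= $budget ? 1 : 0;
}

my %cost = ('A,B' => 3, 'B,C' => 4, 'C,A' => 5);
print under_budget(\%cost, 12, qw(A B C)), "\n";   # prints 1: total cost is 12
```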

There are hundreds of these NP-complete problems known, and the amazing thing is that an efficient solution to any of them could be turned mechanically into an efficient solution to all the others. But nobody knows a solution to any of them!

Now here's what the talk is about. Sometimes someone will show up in a help forum with a problem they need to solve. Someone else will observe (correctly) that the problem is NP-complete. They will then say ``so you might as well give up.'' This is the wrong conclusion.

Here's an example. Someone was writing a chat system. When a new user logged into the chat system, the author wanted to examine their buddy list and report the largest group of their buddies who were (a) all online at the same time, and (b) all buddies of each other as well. This is called the Clique problem, and it is NP-complete. And sure enough, someone told them that they should give up on this feature.

Just knowing that a problem is NP-complete does not remove your need to solve it, and you should not be too quick to give up. This is for several reasons. First, there may be something about your particular application that renders the problem tractable. In the case of the chat system person, it turns out that it's not intractable at all. This is because you don't have to compute all the cliques at once. If you assume that you have all the large cliques computed beforehand, then when a new user logs in, you don't have to re-compute everything from scratch; you just have to compute the cliques that involve the new user---and you know that none of them will be more than one person larger than the ones you had before; similarly, you can easily compute the changes to the cliques when someone logs out. The total amount of computation over the course of the day may be large, but it is amortized into small amounts whenever someone logs in or out.

NP-completeness is a statement about coming up with optimal solutions. If sub-optimal solutions are acceptable, there may be easy ways to find them. For example, consider the Bin Packing problem. In this problem, you have a bunch of files, and you want to back them up onto floppy diskettes; you want to use as few diskettes as possible. This is NP-complete; the only way to guarantee the minimal number of diskettes is essentially to try every possible way of distributing the files. But there's an easy algorithm that works quite well: sort the files from largest to smallest. Put the largest file on the first diskette. Put the next-largest file on the first diskette if it fits, and on a new diskette if not. Put the third-largest file on the first diskette on which it fits, starting a new diskette if it doesn't fit anywhere. When you're done, you may have used more than the minimum possible number of diskettes, but it turns out you never use more than 122% of the minimum possible number, and in practice, you typically won't use more than about 105% of the minimum possible number.
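That easy algorithm is usually called first-fit decreasing, and it's only a few lines of Perl. A sketch (sizes and capacity are in arbitrary units, and every file is assumed to fit on a diskette by itself):

```perl
use strict;

# First-fit decreasing, as described above: sort the files largest
# first, then put each file on the first diskette with room for it,
# starting a new diskette only when nothing has room.
sub first_fit_decreasing {
    my ($cap, @sizes) = @_;
    my @bins;    # each element is the remaining free space on a diskette
    for my $size (sort { $b <=> $a } @sizes) {
        my $placed = 0;
        for my $free (@bins) {       # $free aliases the array element
            if ($free >= $size) { $free -= $size; $placed = 1; last }
        }
        push @bins, $cap - $size unless $placed;
    }
    return scalar @bins;
}

# Diskettes of capacity 10:
print first_fit_decreasing(10, 7, 5, 4, 3, 1), "\n";   # prints 2 (optimal here)
```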

Travelling Salesman is similar. If you want the least-cost itinerary, all you can do is enumerate every possible itinerary looking for the cheapest one. But there are quick algorithms that are guaranteed to find an itinerary that costs no more than 50% more than the cheapest one.

Lots of NP-complete problems can be solved optimally, and efficiently, in most cases; NP-completeness is a statement about the hardest instances of a problem. 3-Color is an example: You have a bunch of people and their buddy lists, and you want to know if it's possible to divide the people into three groups so that nobody is in the same group with one of their buddies. There are heuristic techniques you can use to simplify the buddy lists until they're trivial---either you end up with four people who are all each other's buddies, in which case the answer is obviously ``no'', or you end up eliminating everything, in which case the answer is obviously ``yes''. On some really complicated arrangements of buddy lists, the heuristics don't work, and they get stuck without simplifying the situation very much. But those sorts of situations are pathological and don't come up in practice very often.

In some NP-complete problems, there are efficient solutions for all the small examples, and nobody cares about the large ones. In the Partition problem, you have a list of numbers, and you want to divide the numbers into two groups so that the sum of the numbers in each group is the same. There's a straightforward algorithm that's quadratic in the size of the largest number. Unless your numbers are really enormous, this algorithm works just fine. In practical examples, the numbers typically represent costs or times, so you don't have to worry about them getting really large.
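That straightforward algorithm is a subset-sum dynamic program: its running time grows with the target sum rather than with the number of subsets, which is why it's fine when the numbers are modest. A sketch:

```perl
use strict;

# Partition by dynamic programming: can the list be split into two
# groups with equal sums?  We look for a subset summing to half the
# total; the table size, and hence the running time, grows with the
# total, not with the number of subsets.
sub can_partition {
    my @nums = @_;
    my $total = 0;
    $total += $_ for @nums;
    return 0 if $total % 2;              # odd total: clearly impossible
    my $target = $total / 2;
    my @reachable = (1);                 # $reachable[$s]: some subset sums to $s
    for my $n (@nums) {
        for (my $s = $target; $s >= $n; $s--) {
            $reachable[$s] ||= $reachable[$s - $n];
        }
    }
    return $reachable[$target] ? 1 : 0;
}

print can_partition(3, 1, 1, 2, 2, 1), "\n";   # prints 1: e.g. {3,2} vs {1,1,2,1}
```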

So next time someone tells you to give up on your problem because it's NP-complete, ignore them.

Slide 30: Interlude

Here I am getting married.

Slide 31: Talk 8: Why I Hate strict

Actually I don't hate strict; I just hate the stupid way people think and talk about it.

strict is another dogma that has become increasingly prevalent in the community over the last few years. Beginners are encouraged to use strict whether or not they know what it does. What's the point of that? Use of strict has somehow become identified as a necessary ingredient of 'good programming practice', and so beginners are encouraged to 'adopt good programming practices'. Sorry, but putting declarations into your program when you don't know what they do is the worst possible programming practice, and this is what we are encouraging beginners to do.

As with the 'glue language' discussion above, my complaint isn't so much about the position itself, whether people should use strict, as with the dogmatism and thoughtlessness with which it's promulgated. People have stopped thinking about strict, and that's what I hate. People come into the Usenet comp.lang.perl.misc group with some question, which they demonstrate with a four-line example program, and other folks there jump on them: ``Why didn't you use strict?'' Well, because it's a four-line example program I concocted as an example in my Usenet article---duh!

Here's another example that's not on the slide. Some time ago there was a review on PerlMonks of the book Perl and CGI for the World Wide Web: Visual Quickstart Guide. The reviewer was very unhappy that none of the examples used strict. I said I couldn't see any benefit that would accrue from the examples using strict. strict does three things:

  1. It enables strict 'refs', which prevents strings from being accidentally used as references. None of the examples in the book used references at all, so there was no reason to use strict 'refs'.
  2. It enables strict 'vars', which prevents global variables from being used without being declared; typically, one declares variables local with the my declaration. While this is good practice in general, the example programs were all very small---less than twenty lines each. In such small programs, there is no practical difference between a global variable and one that has been declared with my. No benefit would have accrued from requiring the use of my declarations of every variable.
  3. It enables strict 'subs', which is of very limited value even at the best of times. strict 'subs' forbids unquoted strings, because such strings ('barewords') can cause long-term maintenance problems. If you have code like

            if ($x eq carrots) {
              ...
            }

    the carrots is taken as a literal string. But if someone later adds a carrots() function to the program, the meaning of this line might change suddenly and unexpectedly, to call carrots() and compare $x with the returned value. This is not too likely, except perhaps in very large and long-lived programs, which is why strict 'subs' is of such limited value. In 20-line book examples, it is of no value whatsoever.
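To make the strict 'vars' point concrete, here's a small demonstration of what it actually buys you; the string eval is used only so the compile-time error can be caught and displayed:

```perl
use strict;

# With strict 'vars', a misspelled variable becomes a compile-time
# error instead of a silent new global.
my $code = <<'END';
    use strict 'vars';
    my $counter = 0;
    $countre = $counter + 1;    # typo for $counter
END
eval $code;
print $@ ? "caught: $@" : "no error\n";
```

Without the `use strict 'vars'` line, the typo would quietly create and set a fresh global `$countre` and the bug would go unnoticed.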

Here's the problem: the reviewer read my criticism, and instead of answering my objections, his response was that everyone should always use strict. Using strict is outside of the bounds of rational discussion for some people.

The example linked from the slides demonstrates this. A programmer posted to comp.lang.perl.misc with an example that manufactured a function name at run time, and then called the function. Many people jumped in to say this was a mistake, some even saying it was a mistake because it caused a strict 'refs' violation. This is nonsense. Violating strict 'refs' is like setting off the fire alarm. It is a warning that something dangerous may be occurring; it is not a problem in itself. The problem is the fire.

What is the fire, the real problem, in this case? I wrote a series of widely-read and often-cited articles about the problems that arise from using values as the names of variables and functions. Before responding to this thread, I read over the articles, checking each problem I mentioned to see if it could occur in the original programmer's code. None of the real problems could actually arise. I said that I was not able to see what the problem was, and asked people to explain it.

The explanations were astonishingly obtuse. Some people reiterated that one should 'always use strict'. Some suggested problems that obviously could not occur. For example, one respondent said:

What if some random string was passed in from the outside and a bad sub was called?

But the original author's code was:

    foreach my $x ("email_zip", "email_txt", "cp_zip", "cp_txt") {
        if ($res = &{"_send_$x"}($domain, $file, $date, $location)) {
            push (@results, $res);
        }
        ...
    }

The subroutine names are not 'passed in from the outside'; they are hardwired into the program. It is completely impossible for a 'bad sub' to be called here; the four subroutines that are called are completely determined at compile time.

That respondent continued:

At least check the sub name against a hash for verification.

This shows that at least this one person wasn't thinking about what he was saying. The subroutine names are derived from a compile-time list of names embedded in the code. If you don't trust this list to be accurate, why would you expect the hash to be accurate? I suggested that maybe you should first check the hash against a second hash to make sure the hash is correct, before using it for verification.

Any time you hear anyone saying ``you should be using strict,'' you can be fairly sure they're not thinking about what they're saying, because strict has these three completely unrelated effects. A more thoughtful remark is ``you should be using strict 'vars''' (or whatever). As an exercise in preventing strict zombie-ism, ask yourself ``Which of the three strict effects do I find most useful? Which do I find least useful?''

Slide 32: Interlude

Here I am as a South Park character.

Slides 33-37: Talk 9: On Fish

The #perl IRC channel has a big problem. People come in asking questions, say, ``How do I remove the first character from a string?'' And the answer they get from the regulars on the channel is something like ``perldoc perlre''.

This isn't particularly helpful, since perlre is a very large reference manual, and even I have trouble reading it. It's sort of like telling someone to read the Camel book when what they want to know is how to get the integer part of a number. Sure, the answer is in there somewhere, but it might take you a year to find it.

The channel regulars have this idiotic saying about how if you give a man a fish he can eat for one day, but if you teach him to fish, he can eat for his whole life. Apparently ``perldoc perlre'' is what passes for ``teaching a man to fish'' in this channel.

I'm more likely to just answer the question (you use $string =~ s/.//s), and someone once asked me why. I had to think about that for a while. Two easy reasons are that it's helpful and kind, and if you're not in the channel to be helpful and kind, then what's the point of answering questions at all? It's also easy to give the answer, so why not? I've seen people write long treatises on why the querent should be looking in the manual instead of asking on-channel, when it would have been a lot shorter to just answer the question. That's a puzzle all right.

The channel regulars say that answering people's questions will make them dependent on you for assistance, which I think is bullshit. Apparently they're worried that the same people will come back and ask more and more and more questions. They seem to have forgotten that if that did happen (and I don't think it does) they could stop answering; problem solved.

The channel regulars also have this fantasy that saying perldoc perlre is somehow more helpful than simply answering the question, which I also think is bullshit. Something they apparently haven't figured out is that if you really want someone to look in the manual, saying perldoc perlre is not the way to do it. A much more effective way to get them to look in the manual is to answer the question first, and then, after they thank you, say ``You could have found the answer to that in the such-and-so section of the manual.'' People are a lot more willing to take your advice once you have established that you are a helpful person. Saying perldoc perlre seems to me to be most effective as a way to get people to decide that Perl programmers are assholes and to quit Perl for some other language.

After I wrote the slides for this talk I found an old Usenet discussion in which I expressed many of the same views. One of the Usenet regulars went so far as to say that he didn't answer people's questions because he didn't want to insult their intelligence by suggesting that they would be unable to look in the documentation, and that if he came into a newsgroup with a question and received a straightforward answer to it, he would be offended. I told him that I thought if he really believed that he needed a vacation, because it was totally warped.

Slide 38: Interlude

I'm told that I was cold sober when this picture was taken. I don't remember.

Slide 39: Talk 10: Why Lisp Will Never Win

There are a lot of things wrong with the Lisp user community. comp.lang.lisp is one of the sickest newsgroups I've ever seen. This article conveniently demonstrates two serious problems at once.

At least two or three times a year, comp.lang.lisp has a long discussion about why it is that more people don't use Lisp. In one of those discussions, Peter da Silva suggested that if there were a 'lispscript' utility, analogous to AWK, which allowed people to easily do the sorts of things that AWK does, then people would begin using Lisp for casual scripting, and would go from there to longer projects. His example was:

        awk 'BEGIN {FS=":"}; $6=="/sbin/nologin" {print $1}' /etc/passwd

The brilliantly obtuse response is in two parts. First, ``You can do that already'':

I already frequently use it for casual scripting (well, Scheme, mostly). With only a couple of new utility macros & functions, your example could be expressed in Common Lisp as:

        (with-lines-from-file (line "/etc/passwd")
          (let ((fields (string-split line :fs #\:)))
            (when (string= (aref fields 5) "/sbin/nologin")
              (format t "~A~%" (aref fields 0)))))

This solution is a little over 2.5 times as long as the AWK version. But at least it only requires ``a couple of new utility macros and functions''! The only thing more amazing than the degree to which the author missed the point of the exercise here is the degree to which he seems unaware that he missed the point of the exercise.

So problem #1 is a total cluelessness about what other people consider valuable and useful.

But the respondent goes on, and here's a more serious problem:

But seriously, how many ``one-liners'' do you actually write anyway? Not many. And by the time you've done coded up something that's complex enough to be useful, Perl's tricky little code-compression notation usually expands to be about the same length as any other language, and six weeks or six months later I'd much rather be reading Lisp than decoding a Perl puzzle.

How many ``one-liners'' do I actually write? I don't know; maybe a couple dozen a day. But I guess I must be unusual, because as we all know, AWK was a complete failure and vanished into obscurity since it didn't address anyone's real needs. (That was sarcasm.) So problem #2 is that when faced with someone else's problem which Lisp doesn't solve effectively, the response is a mixture of ``that's not a real problem'' and ``you're an idiot for wanting to solve that.''

Another thing to notice is the little slam against Perl. How did Perl get involved in this? Da Silva was discussing AWK, not Perl. But the comp.lang.lisp folks can't stop talking about Perl. They are constantly talking about Perl. I looked into comp.lang.python to see if it was similar, and I found out that people in comp.lang.python hardly ever discuss Perl. I think that shows that comp.lang.lisp is sick and comp.lang.python is healthy: the Lisp folks are interested in Perl, and the Python folks are interested in Python.

Here's the real reason why Lisp won't win. The Lisp programmers don't want it to win. They're always complaining that not enough people are using Lisp, and that Lisp isn't popular. But they humiliate and insult newcomers whenever they appear in the group. (The group regulars would no doubt respond to this that the newcomers deserve this, because they're so stupid and argumentative.) If Lisp did become popular, it would be the worst nightmare of the comp.lang.lisp people.

Lisp is a really excellent language in a lot of ways, but the Lisp world has several huge social problems. I'd like to help, but I don't think I can, because they don't want to hear it from anyone, and least of all from me.

Slide 40: Interlude

I'd just come back from the ophthalmologist.

Slides 41-46: Talk 11: On 'Strongly Typed' Languages

This talk was inspired by a thread in comp.lang.perl.misc, where someone said that Perl was a 'weakly typed' language, and Randal Schwartz disagreed and said it was 'strongly typed'. That surprised me, because I would have used Perl as a perfect example of a 'weakly typed' language, since any kind of data will be silently converted to any other kind at run time. I wanted to dispute Randal, but first I needed to make sure that my understanding of 'strongly typed' was correct. So I did a little research and discovered that my understanding was not correct.

I read a lot of articles and class notes written by a lot of authorities, and discovered that there was no consensus about whether C was a 'strongly typed' or a 'weakly typed' language. Slide 43 has some examples of each.

I found a lot of web pages that contrasted C and Pascal, and asserted that Pascal was a 'strongly typed' language while C was 'weakly typed'. But the type systems of C and Pascal are almost exactly the same!

After reading a lot of articles, I discovered that different people had at least eight different notions of what 'strongly typed language' meant. Sometimes they weren't even sure themselves; I found some articles which defined 'strongly typed language' and then classified languages as strongly or weakly typed in explicit accord with some other definition that contradicted the one they had first given.

My conclusion is that 'strongly typed language' doesn't mean anything at all, and that if you hear someone say that some language is strongly typed, or some other language is weakly typed, you should assume that you don't know what they meant.

Slide 47: Talk 12: A Message for the Aliens

Two researchers put together a message for the aliens and then used the Arecibo radio telescope to shoot it out into space; I forget in what direction. When I found this, I had a great time trying to decipher the message. It's fun to pretend you're an alien, and also fun to see if you're as smart as the researchers thought the aliens would be. If you want to try this, I suggest you put away the rest of this talk and come back to it afterwards, because it contains spoilers.

The message is 23 pages long. I left page 1 up on the screen for a couple of minutes so that people could try to figure it out. You might want to try that now.

Anyway, the first page defines the symbols for the ten numerals. Then there follows a list of prime numbers, from 2 up to 89, and then 2^3021377-1, which is the largest known prime. I like to imagine the aliens' reaction to this. Either they'll be astounded (``How could they figure out that such a big number is prime?'') or unimpressed (``Pff, is that the best they can do?'') I wonder which it will be?

Page 2 explains the definitions of the four basic arithmetic operators. My favorite page is #14, which explains the basic structure of our planet. There's a picture of land, dipping into the ocean, and the chemical composition of land (mostly SiO2), sea (mostly H2O, plus Na and Cl), and air (mostly N2, O2, and Ar), with the height of the atmosphere and depth of the ocean shown. A previous diagram showed pictures of people, and smaller cartoons of people are shown here, standing on the land part, to show what part of the planet we live on. My favorite part of the picture is the parabola over on the right-hand edge. What's going on there? Well, we don't want the aliens to decode the picture sideways or upside down, and the parabola shows an object accelerating under the influence of gravity, implying a clear 'down' for the picture. Just to make it clearer, the parabola is annotated with the glyph for 'acceleration'.

One interesting thing about the message is that it contains some errors. For example, on page 7, the 3rd glyph from the right in the third row is wrong. That part of the message (mistakenly) asserts the existence of Uranium-208. The glyph is a 1 and it should be a 4. I was rather confused when I got to this part, and wondered what I had done wrong. (Someone in the audience asked ``Did you send a patch?'' I had sent the authors a message pointing out the error, but they did not seem to be very grateful.)

Another stumper appeared on page 16. This page is describing human sensory modalities. For example, the graph at the bottom of the page is explaining the way our cone cells respond to red, green, and blue light. The three peaks on the graph are labeled with the frequencies of light that are best received by the three kinds of cone cells.

I was totally mystified by the diagram above this, with the waves. Finally I had to read the paper that explained the message. It turns out that this is supposed to be a picture of sound waves, with the minimum and maximum audible sonic frequencies. I was really annoyed by this, because the picture is wrong. The picture clearly shows transverse waves (in which the medium is moving up and down, perpendicular to the direction in which the wave is travelling) but sound waves are not transverse! They are longitudinal waves, or pressure waves; the particles of air move back and forth, parallel to the direction in which the waves are moving, creating regions of high and low pressure. There's no reason why the picture couldn't have been drawn correctly. Additionally, the illustration could have been annotated with the glyph for 'pressure', which was previously defined, but it wasn't. So not only are the aliens going to think we're bad proofreaders, because of the Uranium-208 mistake, but they'll also think that we don't understand pressure waves.

My conclusion was that the kind of people who like to send messages to the aliens are a little goofy. Or, perhaps that I could have done it better. People who know me won't find it at all surprising that I came to this conclusion.

Slide 48: End

Another illustration from The Uncanny X-Men #21.

It's a fine, fine thing to be named 'Dominus', let me tell you.

Thanks for reading my talk notes.



Return to: Universe of Discourse main page | Perl Paraphernalia | Classes and Talks

mjd-perl-yak@plover.com