For 2003 OSCON I proposed to give nine five-minute talks in a 45-minute slot. To avoid confusing this with the real lightning talks sessions, Nat decided to call my talk ``Nine Views of Mark Jason Dominus''. At first I didn't like that title, which sounded too much like self-promotion. But I warmed up to it after a while and included various pictures (views) of myself in between the nine talks.
Before I wrote the talk, I was invited to give it at YAPC::Israel in a one-hour slot. So I wrote twelve short talks, and then for OSCON I cut out talks 2, 7, and 11. This is the long version.
The slide the audience members saw while they were waiting for the talk to start had nothing on it except the talk title and a panel from The Uncanny X-Men issue #21, in which the villain was named Dominus.
Around 1994 I was dating a woman who was a quilter. To make a quilt, you take a lot of small pieces of fabric and sew them into a 'patch', then you sew some patches into 'blocks', and then you sew the blocks together into a quilt top. Then you sandwich the quilt top with a quilt back (usually a single piece of fabric) and cotton batting in between, and that's a quilt.
There are a lot of traditional quilt blocks, with names like log cabin, courthouse steps, flying geese, corn and beans, broken dishes, and so on. I got to wondering if the space of possible blocks had been mostly explored or not. So I wrote some Perl programs to generate quilt blocks. I decided to investigate arrangements of what are known as ``half-square triangles'', where you have squares that are diagonally split into two triangles. I enumerated all the ways to sew together four half-square triangles. Then I took each of these and put together four copies of it into a rotationally-symmetric block. It turns out that there are exactly 72 ways to do this. My enumeration had a bug, so I came up with 73. A fun puzzle is to figure out which one is repeated.
Anyway, as I said, I did this project to impress my girlfriend, and she was indeed impressed. So this was not only the coolest Perl project I ever did, but also the most successful.
When we got married, she actually made the quilt for me and gave it to me as a wedding present---you can see a bit of it in the background on that last picture. There's a long tradition of puzzle quilts, so she left in the duplicated block. To make room (because it's very hard to make a quilt with 73 blocks, since 73 is a prime number) she left out one of the other blocks---it's fourth from the left in the fifth row.
Source code and more outputs are here.
There's been a puzzling trend in the Perl world in the last few years, away from use of the system function, which invokes an external shell command. There are good reasons to avoid system in some circumstances, but people are becoming increasingly dogmatic about avoiding system at all costs.
I was particularly struck by something that happened to me a year or two ago. The program that manufactures my conference slides takes a large text file, splits it into slides, and then runs Seth Golub's txt2html program on each of the text files to convert them to HTML. txt2html is rather slow, so I don't want to run it on every slide every time; instead, the wrapper keeps a backup of each text file, and compares the new version of each text file to the backup to determine which ones need to be run through txt2html.
To do the comparison between the old and the new versions of the text file, the program uses the standard Unix cmp program, which compares two files byte-by-byte and reports whether they are the same. I mentioned this approach on the #perl IRC channel once, and I was immediately set upon by several people who said I was using the wrong approach, that I should not be shelling out for such a simple operation. Some said I should use the Perl File::Compare module; most others said I should maintain a database of MD5 checksums of the text files, and regenerate HTML for files whose checksums did not match those in the database.
I think the greatest contribution of the Extreme Programming movement may be the saying ``Do the simplest thing that could possibly work.'' Programmers are mostly very clever people, and they love to do clever things. I think programmers need to try to be less clever, and to show more restraint. Using system("cmp -s $file1 $file2") is in fact the simplest thing that could possibly work. It was trivial to write, it's efficient, and it works. MD5 checksums are not necessary. I said as much on IRC.
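For concreteness, here is roughly what that simplest-possible-thing looks like (a sketch; the slide_changed name is mine, not from the actual wrapper program):

```perl
# Sketch of the backup-comparison step. "cmp -s" is silent and exits 0
# when the two files are byte-for-byte identical.
sub slide_changed {
    my ($file1, $file2) = @_;
    return 1 unless -e $file2;    # no backup yet, so regenerate
    return system("cmp -s $file1 $file2") != 0;
}
```

The wrapper then runs txt2html only on the files for which slide_changed is true.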
People have trouble understanding the archaic language of ``sufficient unto the day is the evil thereof,'' so here's a modern rendering, from the New American Standard Bible: ``Do not worry about tomorrow; for tomorrow will care for itself. Each day has enough trouble of its own.'' (Matthew 6:34)
People on IRC then argued that calling cmp on each file was wasteful, and the MD5 approach would be more efficient. I said that I didn't care, because the typical class contains about 100 slides, and running cmp 100 times takes about four seconds. The MD5 thing might be more efficient, but it can't possibly save me more than four seconds per run. So who cares?
So far I have no objection to the discussion; it's just a difference of opinion. But at this point the discussion went awry. The IRC folks just couldn't hear me; they couldn't let go of the idea that cmp was Wrong. They started arguing with me about the time taken up by fork and exec, after I had already pointed out that there was no performance problem to be solved.
A more thoughtful examination of the performance issue shows that even if we do consider it a problem, the MD5 thing may not solve it. system("cmp") does fork and exec, but once cmp is running, it's extremely fast, since what it's doing is extremely simple. Moreover, if the files differ, it can usually quit early, without having to read past the first block of each file. In contrast, MD5 needs to read an entire file and perform a lot of complicated calculations to come up with the checksum, and then it has to read the database for the old checksum. A couple of years later I implemented the MD5 thing, just to see what would happen, and sure enough, the program became slower.
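For contrast, here is roughly what the MD5 scheme requires (a sketch; %digest_db stands in for whatever persistent checksum database the IRC folks had in mind):

```perl
use Digest::MD5;

my %digest_db;    # old checksums, keyed by file name

# Hash the entire file; unlike cmp, this can never quit early.
sub md5_of {
    my ($file) = @_;
    open my $fh, '<', $file or die "$file: $!";
    binmode $fh;
    return Digest::MD5->new->addfile($fh)->hexdigest;
}

# True if the file's checksum differs from the recorded one;
# also records the new checksum for next time.
sub changed_by_md5 {
    my ($file) = @_;
    my $new = md5_of($file);
    my $changed = !defined $digest_db{$file} || $digest_db{$file} ne $new;
    $digest_db{$file} = $new;
    return $changed;
}
```

Note that every call reads and hashes the whole file, where cmp could have stopped at the first differing block.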
One of Perl's big selling points has been that it's useful as a 'glue language', for gluing other tools together. It still excels as a glue language. Why this desire to absorb all those other tools into Perl?
One of Perl's mottoes has been that Perl might not be as fast as C, but you can write a prototype program in Perl quickly, and then you might discover that there's no need to replace the prototype because it's already fast enough. That's exactly what I did with txt2slides.
When I wrote the cmp line, I thought ``Well, this might not be fast enough, but if it isn't, I'll change it later.'' Then it was fast enough, so I didn't bother to change it. That's what's supposed to happen! Why isn't this considered a tremendous success for the Perl Way?
Let's not forget the things that are good about Perl. It's good at interacting with other programs, and it's good for rapid prototyping. Let's not hassle people when they use Perl the way it was designed to be used.
This is me at YAPC 19100 in Pittsburgh. My wife made this hat for me for my first appearance at a conference "guru session".
I released the Text::Template module several years ago, and it was immediately very successful. It's small, simple, fast, and it does a lot of things well. At the time, there were not yet 29 templating systems available on CPAN.
Anyway, the module quickly stabilized. I would get bug reports, and they would turn out to be bugs in the module's users, not in the module; I would get feature requests, and usually it turned out that what the requester wanted was possible, or even easy, without any changes to the module. Since the module was perfect, there was no need to upload new versions of it to CPAN.
But then I started to get disturbing emails. ``Hi, I notice you have not updated Text::Template for nine months. Are you still maintaining it?'' ``Hi, I notice you seem to have stopped work on Text::Template. Have you decided to abandon this approach?'' ``Hi, I was thinking of using Text::Template, but I saw it wasn't being maintained any more, so I decided to use Junk::CrappyTemplate, because I wanted to be sure of getting support for it.''
I started wondering if maybe the thing to do was to release a new version of Text::Template every month, with no changes, but with an incremented version number. Of course, that's ridiculous. But it seems that people assume that if you don't update the module every month, it must be dead. People seem to think that all software requires new features or frequent bug fixes. Apparently, the idea of software that doesn't get updated because it's finished is inconceivable.
I blame Microsoft.
Here I am at YAPC 19100 again. I had taught classes at LISA in San Diego on Tuesday, and then I flew overnight to Pittsburgh for YAPC on Wednesday. When I arrived, Kevin Lenzo told me that Joe Hall had been snowed in at Chicago, and asked if I could do some impromptu talks to fill up his slots in the schedule. I gave seven hours of talks that day.
Here's another puzzle about Text::Template. People often write to me with new feature requests. For example, people want the package to preprocess the code in the templates before evaluating it, or to infer the template syntax from the template file extension. Often these things are very application specific, so I don't want to bloat the module with them, and anyway, they can easily be accomplished by subclassing the module and overriding one or another of the methods. Usually I'll even write the subclass and send it with my reply. However, this straightforward solution is usually rejected. People want me to put the code into Text::Template itself and release a new version.
I had originally concluded that this reluctance to subclass modules was just irrational behavior. After all, one of the major promises of object-oriented programming is the opportunity for code reuse via inheritance. This benefit is being thrown away. A related data point is that very few CPAN modules subclass other CPAN modules. Has-A relationships are common, but Is-A relationships are rare. But after a little more thought, I decided this might not be entirely irrational.
Slide 15 shows an example: We want Text::Template to run all the code fragments through a preprocessor before evaluating them when it is filling out a template. To do this is very easy. You subclass the module and add a new method, which installs a preprocessor function into the Text::Template object when it is called. Then you override the Text::Template::compile method to invoke the preprocessor if there is one, and then call the real compile method. This is all pretty easy.
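The shape of such a subclass, sketched with a tiny stand-in base class so the example is self-contained (the real Text::Template's new and compile are more elaborate than this):

```perl
# Stand-in for Text::Template: just enough to show the subclassing shape.
package My::Template;
sub new     { my ($class, %arg) = @_; bless { SOURCE => $arg{SOURCE} }, $class }
sub compile { my $self = shift; $self->{COMPILED} = $self->{SOURCE}; 1 }

package My::Template::Preprocess;
our @ISA = ('My::Template');

# The new method: install a preprocessor function into the object.
# Note it quietly assumes the object is a hash, and that the base
# class isn't using the PREPROCESSOR key for anything of its own.
sub preprocessor {
    my ($self, $code) = @_;
    $self->{PREPROCESSOR} = $code;
}

# Override compile: run the source through the preprocessor if one
# is installed, then call the real compile.
sub compile {
    my $self = shift;
    if (my $pp = $self->{PREPROCESSOR}) {
        $self->{SOURCE} = $pp->($self->{SOURCE});
    }
    $self->SUPER::compile(@_);
}

package main;
```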
But there are a couple of big problems here. First, the subclass depends on the fact that the Text::Template object is a blessed hash, which is not guaranteed anywhere. It also depends on the fact that the object is not using the PREPROCESSOR key for anything else, and, moreover, that it won't ever use it for anything else. It also depends on the presence of the compile method, which is undocumented and which therefore might go away in a future version of the module, breaking the subclass.
So you could do the subclass, but you would never be sure that it wouldn't break someday. It depends on too many things that I didn't promise.
I'm not sure what to do about this. Larry said that the problem was Perl's poor object model. I disagreed. A better model would help solve the hash key collision problem, but not the undocumented method problem. I suggested that perhaps one solution would be for modules to start including an explicit SUBCLASSING INTERFACE section in their documentation, spelling out just what guarantees the author would make for subclasses.
Here I am at an early Usenix conference, probably around 1992.
I get a lot of mail from people I've never met, asking for help. Usually, I try to help, but some of the mail is so awful that all I can do is throw it away. Here's some advice on getting strangers to give you help.
Here we see the wrong way to go about it.
The Subject is uninformative. It says Your article. I write a lot of articles. Is it a magazine article? A Usenet article? I have no idea.
Then comes the question: ``What does forkish mean?'' Apparently I must
have used the word ``forkish'' in some article somewhere. I have no
idea any more what I meant by it. So now I have to go use Google to
figure out what I was talking about.
``This is very important.'' Not to me! (Or, as someone said to me
afterwards, ``Gosh, I'll be sure to clear my schedule!'') I'm not just
being a grump here. If this person would tell me why it was so
important, I might agree. But they don't give me a chance to agree.
Having sent a letter that contains a question, you're already implicitly asking for a response. Adding Please respond makes it sound like you're grovelling. The only thing more annoying than Please respond is Please respond ASAP.
Here's my rewrite of this letter into a form that I would have been more likely to answer.
What's different?
Here's a real example of a message that I did reply to, in some detail:
Date: Wed, 03 Oct 2001 18:29:17 +0100
From: ``L. Thompson'' <xxxxx@xxx.xxxxx.ac.uk>
Organization: University of Keele
To: mjd-perl-questions-id-i8g+br7a9j0+@plover.com
Subject: Big big stupid question-PLEASE HELP!!!
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Hey there, I came upon your web page when I was looking for the answer to a stupid question which I've started to become really curious about. I really hope I don't offend you by asking, but surely someone with computer knowledge, used to stupid questions, might know something? My friend's at uni and she doesn't have the first clue about computer programming. So when her computer science tutorial left her with the homework question ``How much money is there in the world?'' she was naturally a bit confused. Not being too confident about my own answer (we don't know enough about the value of money to answer that one...) can you tell me what the hell this has to do with programming and computer science? Thanks in anticipation for any help you can give, Lo. xxxxx@keele.ac.uk
Here's one of the more remarkable help requests I received.
The author is trying to get a Master's degree in Armenia. Apparently there is no library in Armenia, because they are asking me for information on ``Lexico-grammatical peculiarities of the language of constitution''. Why me? I have absolutely no idea.
I didn't bother to answer this one. I feel sorry for the Armenian student, because I suppose there is not much money in Armenia, and the library might not be well-stocked. But this is not a problem I can solve, or should be asked to solve. If you're a master's student, and you find that there is a dearth of information on your chosen thesis topic, the correct response is to pick a different topic.
This picture was taken around 1991 at a party in Las Vegas, New Mexico. This was the occasion on which I first met Nat Torkington, the OSCON program chair, and also the party at which Nat first met Jenine, who is now his wife.
When the picture came up on the screen at OSCON, some people whistled. But I shamefacedly admitted that I had posted this picture on ratemyface.com, and it had received a rating of only 6.
Someone asked on PerlMonks ``How to progress''. Here are my notes on how to become smarter, which is something I hope we'd all like to do.
When everyone in the community reads the same books, you get an inward-looking, intellectually impoverished community that can only contemplate its own navel. When we read all different books, we all have more to learn from each other.
In the talk, I was about to list some books I had read recently that I thought other people hadn't, but then I stopped and said I wouldn't do it because I was afraid people would then go out and read those books.
One of the reasons that subjects become important and their inventors become famous is because the inventors were able to describe the subject so clearly.
I tried reading books about design patterns a while ago, and I was underwhelmed. But a lot of people are interested in design patterns; why is that? I went back to the original Christopher Alexander books and found out why: pattern languages are a brilliant idea and Alexander's books about them are works of genius.
Here are some other original sources that are excellent: Einstein's book on special relativity is the thing to read if you want to learn about special relativity. Galileo's book on mechanics, Discourses on Two New Sciences, may be the single best technical book I've ever read on any subject.
You will get a lot more out of it if you read actively: Read a sentence. Then reflect for a long time. Ask questions about it. When you're done, read the next sentence.
I think I explained this really well in my original PerlMonks article, so I'll reproduce the relevant section here:
Do the same thing when you read the Perl manual. Read a sentence. I will pick a sentence from the Perl manual at random: Line 2000 of perlfunc, whatever that is:
getpgrp: Returns the current process group for the specified PID.

Great example. What is a process group? What is it for? When would I want to use getpgrp? What is an actual application in which I would need getpgrp? What is another? What will getpgrp actually return? An integer, or what? What would be the best design here? (Then if the next sentence reveals that they didn't use the design you thought was best, try to figure out which one of you screwed up.) Can this call ever fail? When? How? How will it report the error? What might the return value be in that case? (Then read on to see if you are correct.) Are there permission or security problems here? Would it be dangerous to be able to get the process group ID for someone else's process? And so on.
Then you go on to the next sentence. If you read the whole manual like this, you will progress. That's what you wanted to know, right? How to progress.
Write down ideas you have. Save them in a file. A year later, you will be amazed at all the stuff you thought about that you have forgotten. You will probably enjoy rereading these files, because most people find their own thoughts interesting.
The joke here is that the intermission for a 5-minute talk is only 15 seconds long. I thought people would laugh at this more than they did.
Some of the very biggest problems in computer science over the last 30 years have had to do with the idea of NP-completeness. Here's the short explanation. There are a lot of problems that people would like to solve, but no good algorithms are known that work in general. A typical example is the Travelling Salesman problem. In the travelling salesman problem, you have a bunch of cities, a cost to travel between each pair of cities, and a budget. The question is, is there a route that visits each city exactly once, returns to its starting point, and comes in under budget?
NP-complete problems are characterized by the property that it's easy to check proposed solutions to see if they're correct. If someone purports to have an under-budget itinerary for a particular example of the problem, it's quick and trivial to add up the distances and check if they're under budget. But how do you come up with such an itinerary? Nobody knows. The only known algorithm that always works is to examine every possible itinerary, looking for a route that is under budget.
There are hundreds of these NP-complete problems known, and the amazing thing is that an efficient solution to any of them could be turned mechanically into an efficient solution to all the others. But nobody knows a solution to any of them!
Now here's what the talk is about. Sometimes someone will show up in a help forum with a problem they need to solve. Someone else will observe (correctly) that the problem is NP-complete. They will then say ``so you might as well give up.'' This is the wrong conclusion.
Here's an example. Someone was writing a chat system. When a new user logged into the chat system, the author wanted to examine their buddy list and report the largest group of their buddies who were (a) all online at the same time, and (b) all buddies of each other as well. This is called the Clique problem, and it is NP-complete. And sure enough, someone told them that they should give up on this feature.
Just knowing that a problem is NP-complete does not remove your need for it, and you should not be too quick to give up. This is for several reasons. First, there may be something about your particular application that renders the problem tractable. In the case of the chat system person, it turns out that it's not intractable at all. This is because you don't have to compute all the cliques at once. If you assume that you have all the large cliques computed beforehand, then when a new user logs in, you don't have to re-compute everything from scratch; you just have to compute the cliques that involve this new user---and you know that none of them will be more than one person larger than the ones you had before; similarly you can easily compute the changes to the cliques when someone logs out. The total amount of computation over the course of the day may be large, but it is amortized into small amounts whenever someone logs in or out.
NP-completeness is a statement about coming up with optimal solutions. If sub-optimal solutions are acceptable, there may be easy ways to find them. For example, consider the Bin Packing problem. In this problem, you have a bunch of files, and you want to back them up onto floppy diskettes; you want to use as few diskettes as possible. This is NP-complete; the only way to guarantee the minimal number of diskettes is essentially to try every possible way of distributing the files. But there's an easy algorithm that works quite well: sort the files from largest to smallest. Put the largest file on the first diskette. Put the next-largest file on the first diskette if it fits, and on a new diskette if not. Put the third-largest file on the first diskette on which it fits, starting a new diskette if it doesn't fit anywhere. When you're done, you may have used more than the minimum possible number of diskettes, but it turns out you never use more than 122% of the minimum possible number, and in practice, you typically won't use more than about 105% of the minimum possible number.
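That easy algorithm, usually called first-fit decreasing, fits in a few lines of Perl (a sketch; sizes and capacity are in whatever units you like):

```perl
# First-fit decreasing: sort the files largest first, put each on the
# first diskette that has room, and start a new diskette when none does.
# Returns the number of diskettes used.
sub first_fit_decreasing {
    my ($capacity, @sizes) = @_;
    my @used;    # space already used on each diskette
    for my $size (sort { $b <=> $a } @sizes) {
        my $placed = 0;
        for my $bin (@used) {    # $bin aliases the array element
            if ($bin + $size <= $capacity) {
                $bin += $size;   # file fits on this diskette
                $placed = 1;
                last;
            }
        }
        push @used, $size unless $placed;    # start a new diskette
    }
    return scalar @used;
}
```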
Travelling Salesman is similar. If you want the least-cost itinerary, all you can do is enumerate every possible itinerary looking for the cheapest one. But there are quick algorithms that are guaranteed to find an itinerary that costs no more than 50% more than the cheapest one.
Lots of NP-complete problems can be solved optimally, and efficiently, in most cases; NP-completeness is a statement about the hardest instances of a problem. 3-Color is an example: You have a bunch of people and their buddy lists, and you want to know if it's possible to divide the people into three groups so that nobody is in the same group with one of their buddies. There are heuristic techniques you can use to simplify the buddy lists until they're trivial---either you end up with four people who are all each other's buddies, in which case the answer is obviously ``no'', or you end up eliminating everything, in which case the answer is obviously ``yes''. On some really complicated arrangements of buddy lists, the heuristics don't work, and they get stuck without simplifying the situation very much. But those sorts of situations are pathological and don't come up in practice very often.
In some NP-complete problems, there are efficient solutions for all the small examples, and nobody cares about the large ones. In the Partition problem, you have a list of numbers, and you want to divide the numbers into two groups so that the sum of the numbers in each group is the same. There's a straightforward algorithm that's quadratic in the size of the largest number. Unless your numbers are really enormous, this algorithm works just fine. In practical examples, the numbers typically represent costs or times, so you don't have to worry about them getting really large.
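Here is a sketch of that straightforward algorithm, done with an array of reachable sums; the work grows with the target sum rather than exponentially with the number of items, which is why it is fine when the numbers are modest:

```perl
# Can @nums be split into two groups with equal sums?
sub can_partition {
    my @nums = @_;
    my $total = 0;
    $total += $_ for @nums;
    return 0 if $total % 2;        # odd total: obviously impossible
    my $target = $total / 2;
    my @reachable = (1);           # $reachable[$s]: some subset sums to $s
    for my $n (@nums) {
        # walk downward so each number is used at most once
        for (my $s = $target; $s >= $n; $s--) {
            $reachable[$s] ||= $reachable[$s - $n];
        }
    }
    return $reachable[$target] ? 1 : 0;
}
```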
So next time someone tells you to give up on your problem because it's NP-complete, ignore them.
Here I am getting married.
strict
Actually I don't hate strict; I just hate the stupid way people think and talk about it.
strict is another dogma that has become increasingly prevalent in the community over the last few years. Beginners are encouraged to use strict whether or not they know what it does. What's the point of that? Use of strict has somehow become identified as a necessary ingredient of 'good programming practice', and so beginners are encouraged to 'adopt good programming practices'. Sorry, but putting declarations into your program when you don't know what they do is the worst possible programming practice, and this is what we are encouraging beginners to do.
As with the 'glue language' discussion above, my complaint isn't so much about the position itself (whether people should use strict) as about the dogmatism and thoughtlessness with which it's promulgated. People have stopped thinking about strict, and that's what I hate. People come into the Usenet comp.lang.perl.misc group with some question, which they demonstrate with a four-line example program, and other folks there jump on them: ``Why didn't you use strict?'' Well, because it's a four-line example program I concocted as an example in my Usenet article---duh!
Here's another example that's not on the slide. Some time ago there was a review on PerlMonks of the book Perl and CGI for the World Wide Web: Visual Quickstart Guide. The reviewer was very unhappy that none of the examples used strict. I said I couldn't see any benefit that would accrue from the examples using strict.
strict does three things.

The first is strict 'refs', which prevents strings from being accidentally used as references. None of the examples in the book used references at all, so there was no reason to use strict 'refs'.
The second is strict 'vars', which prevents global variables from being used without being declared; typically, one declares variables local with the my declaration. While this is good practice in general, the example programs were all very small---less than twenty lines each. In such small programs, there is no practical difference between a global variable and one that has been declared with my. No benefit would have accrued from requiring my declarations for every variable.
The third is strict 'subs', which is of very limited value even at the best of times. strict 'subs' forbids unquoted strings, because such strings ('barewords') can cause long-term maintenance problems. If you have code like

    if ($x eq carrots) { ... }

the carrots is taken as a literal string. But if someone later adds a carrots() function to the program, the meaning of this line might change suddenly and unexpectedly, to call carrots() and compare $x with the returned value. This is not too likely, except perhaps in very large and long-lived programs, which is why strict 'subs' is of such limited value. In 20-line book examples, it is of no value whatsoever.
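You can watch the meaning change within a single program. In this sketch, the first comparison is compiled before any carrots() exists, so the bareword is the string "carrots"; the second is compiled (via eval) after the function has been defined, so the very same expression calls the function instead:

```perl
no strict 'subs';    # like the book examples
no warnings;         # suppress the bareword warning for the demo

my $x = "carrots";

# Compiled before carrots() is declared: the bareword means the
# literal string "carrots", so this test is true.
my $before = ($x eq carrots);

sub carrots { "parsnips" }    # someone adds this later...

# Compiled after carrots() exists: the bareword now calls the
# function, so $x is compared with "parsnips" and the test fails.
my $after = eval '$x eq carrots';
```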
Here's the problem: the reviewer of the book read my criticism, and instead of answering my objections, his response was that everyone should always use strict. Using strict is outside the bounds of rational discussion for some people.
The example linked from the slides demonstrates this. A programmer posted to comp.lang.perl.misc with an example that manufactured a function name at run time, and then called the function. Many people jumped in to say this was a mistake, some even saying it was a mistake because it caused a strict 'refs' violation. This is nonsense. Violating strict 'refs' is like setting off the fire alarm. It is a warning that something dangerous may be occurring; it is not a problem in itself. The problem is the fire.
What is the fire, the real problem, in this case? I wrote a series of widely-read and often-cited articles about the problems that arise from using values as the names of variables and functions. Before responding to this thread, I read over the articles, checking each problem I mentioned to see if it could occur in the original programmer's code. None of the real problems could actually arise. I said that I was not able to see what the problem was, and asked people to explain it.
The explanations were astonishingly obtuse. Some people reiterated that one should 'always use strict'. Some suggested problems that obviously could not occur. For example, one respondent said:
What if some random string was passed in from the outside and a bad sub was called?
But the original author's code was:
    foreach my $x ("email_zip", "email_txt", "cp_zip", "cp_txt") {
        if ($res = &{"_send_$x"}($domain, $file, $date, $location)) {
            push (@results, $res);
        }
        ...
    }
The subroutine names are not 'passed in from the outside'; they are hardwired into the program. It is completely impossible for a 'bad sub' to be called here; the four subroutines that are called are completely determined at compile time.
That respondent continued:
At least check the sub name against a hash for verification.
This shows that at least this one person wasn't thinking about what he was saying. The subroutine names are derived from a compile-time list of names embedded in the code. If you don't trust this list to be accurate, why would you expect the hash to be accurate? I suggested that maybe you should first check the hash against a second hash to make sure the hash is correct, before using it for verification.
Any time you hear anyone saying ``you should be using strict,'' you can be fairly sure they're not thinking about what they're saying, because strict has these three completely unrelated effects. A more thoughtful remark is ``you should be using strict 'vars''' (or whatever). As an exercise in preventing strict zombie-ism, ask yourself ``Which of the three strict effects do I find most useful? Which do I find least useful?''
Here I am as a South Park character.
The #perl IRC channel has a big problem. People come in asking questions, say, ``How do I remove the first character from a string?'' And the answer they get from the regulars on the channel is something like ``perldoc perlre''.
This isn't particularly helpful, since perlre is a very large reference manual, and even I have trouble reading it. It's sort of like telling someone to read the Camel book when what they want to know is how to get the integer part of a number. Sure, the answer is in there somewhere, but it might take you a year to find it.
The channel regulars have this idiotic saying about how if you give a man a fish he can eat for one day, but if you teach him to fish, he can eat for his whole life. Apparently ``perldoc perlre'' is what passes for ``teaching a man to fish'' in this channel.
I'm more likely to just answer the question (you use $string =~ s/.//s), and someone once asked me why. I had to think about that a while. Two easy reasons are that it's helpful and kind, and if you're not in the channel to be helpful and kind, then what's the point of answering questions at all? It's also easy to give the answer, so why not? I've seen people write long treatises on why the querent should be looking in the manual instead of asking on-channel, when it would have been a lot shorter to just answer the question. That's a puzzle all right.
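The answer itself, for the record. The /s matters: without it, the dot won't match a leading newline, and you'd silently delete the wrong character.

```perl
# Remove the first character from a string, whatever it is.
my $string = "\nHow do I remove the first character from a string?";
$string =~ s/.//s;    # /s lets . match even a leading newline
```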
The channel regulars say that answering people's questions will make them dependent on you for assistance, which I think is bullshit. Apparently they're worried that the same people will come back and ask more and more and more questions. They seem to have forgotten that if that did happen (and I don't think it does) they could stop answering; problem solved.
The channel regulars also have this fantasy that saying perldoc perlre is somehow more helpful than simply answering the question, which I also think is bullshit. Something they apparently haven't figured out is that if you really want someone to look in the manual, saying perldoc perlre is not the way to do it. A much more effective way to get them to look in the manual is to answer the question first, and then, after they thank you, say ``You could have found the answer to that in the such-and-so section of the manual.'' People are a lot more willing to take your advice once you have established that you are a helpful person. Saying perldoc perlre seems to me to be most effective as a way to get people to decide that Perl programmers are assholes and to quit Perl for some other language.
After I wrote the slides for this talk I found an old Usenet discussion in which I expressed many of the same views. One of the Usenet regulars went so far as to say that he didn't answer people's questions because he didn't want to insult their intelligence by suggesting that they would be unable to look in the documentation, and that if he came into a newsgroup with a question and received a straightforward answer to it, he would be offended. I told him that I thought that if he really believed that, he needed a vacation, because it was totally warped.
I'm told that I was cold sober when this picture was taken. I don't remember.
There are a lot of things wrong with the Lisp user community.
comp.lang.lisp is one of the sickest newsgroups I've ever seen.
This article conveniently demonstrates two serious problems at once.
At least two or three times a year, comp.lang.lisp has a long discussion about why it is that more people don't use Lisp. In one of those discussions, Peter da Silva suggested that if there were a 'lispscript' utility, analogous to AWK, which allowed people to easily do the sorts of things that AWK does, then people would begin using Lisp for casual scripting, and would go from there to longer projects. His example was:
awk 'BEGIN {FS=":"}; $6=="/sbin/nologin" {print $1}' /etc/passwd
The brilliantly obtuse response is in two parts. First, ``You can do that already'':
I already frequently use it for casual scripting (well, Scheme, mostly). With only a couple of new utility macros & functions, your example could be expressed in Common Lisp as:
(with-lines-from-file (line "/etc/passwd")
  (let ((fields (string-split line :fs #\:)))
    (when (string= (aref fields 5) "/sbin/nologin")
      (format t "~A~%" (aref fields 0)))))
This solution is a little over 2.5 times as long as the AWK version. But at least it only requires ``a couple of new utility macros and functions''! The only thing more amazing than the degree to which the author missed the point of the exercise here is the degree to which he seems unaware that he missed the point of the exercise.
So problem #1 is a total cluelessness about what other people consider valuable and useful.
But the respondent goes on, and here's a more serious problem:
But seriously, how many ``one-liners'' do you actually write anyway? Not many. And by the time you've done coded up something that's complex enough to be useful, Perl's tricky little code-compression notation usually expands to be about the same length as any other language, and six weeks or six months later I'd much rather be reading Lisp than decoding a Perl puzzle.
How many ``one-liners'' do I actually write? I don't know; maybe a couple dozen a day. But I guess I must be unusual, because as we all know, AWK was a complete failure and vanished into obscurity since it didn't address anyone's real needs. (That was sarcasm.) So problem #2 is that when faced with someone else's problem which Lisp doesn't solve effectively, the response is a mixture of ``that's not a real problem'' and ``you're an idiot for wanting to solve that.''
Another thing to notice is the little slam against Perl. How did Perl get involved in this? Da Silva was discussing AWK, not Perl. But the comp.lang.lisp folks can't stop talking about Perl. They are constantly talking about Perl. I looked into comp.lang.python to see if it was similar, and I found out that people in comp.lang.python hardly ever discuss Perl. I think that shows that comp.lang.lisp is sick and comp.lang.python is healthy: the Lisp folks are interested in Perl, and the Python folks are interested in Python.
Here's the real reason why Lisp won't win. The Lisp programmers don't want it to win. They're always complaining that not enough people are using Lisp, and that Lisp isn't popular. But they humiliate and insult newcomers whenever they appear in the group. (The group regulars would no doubt respond to this that the newcomers deserve this, because they're so stupid and argumentative.) If Lisp did become popular, it would be the worst nightmare of the comp.lang.lisp people.
Lisp is a really excellent language in a lot of ways, but the Lisp world has several huge social problems. I'd like to help, but I don't think I can, because they don't want to hear it from anyone, and least of all from me.
I'd just come back from the ophthalmologist.
This talk was inspired by a thread in comp.lang.perl.misc, where someone said that Perl was a 'weakly typed' language, and Randal Schwartz disagreed and said it was 'strongly typed'. That surprised me, because I would have used Perl as a perfect example of a 'weakly typed' language, since any kind of data will be silently converted to any other kind at run time. I wanted to dispute Randal, but first I needed to make sure that my understanding of 'strongly typed' was correct. So I did a little research and discovered that my understanding was not correct.
I read a lot of articles and class notes written by a lot of authorities, and discovered that there was no consensus about whether C was a 'strongly typed' or a 'weakly typed' language. Slide 43 has some examples of each.
I found a lot of web pages that contrasted C and Pascal, and asserted that Pascal was a 'strongly typed' language while C was 'weakly typed'. But the type systems of C and Pascal are almost exactly the same!
After reading a lot of articles, I discovered that different people had at least eight different notions of what 'strongly typed language' meant. Sometimes they weren't even sure themselves; I found some articles which defined 'strongly typed language' and then classified languages as strongly or weakly typed in explicit accord with some other definition that contradicted the one they had first given.
My conclusion is that 'strongly typed language' doesn't mean anything at all, and that if you hear someone say that some language is strongly typed, or some other language is weakly typed, you should assume that you don't know what they meant.
Two researchers put together a message for the aliens and then used the Arecibo radio telescope to shoot it out into space; I forget in what direction. When I found this, I had a great time trying to decipher the message. It's fun to pretend you're an alien, and also fun to see if you're as smart as the researchers thought the aliens would be. If you want to try this, I suggest you put away the rest of this talk and come back to it afterwards, because it contains spoilers.
The message is 23 pages long. I left page 1 up on the screen for a couple of minutes so that people could try to figure it out. You might want to try that now.
Anyway, the first page defines the symbols for the ten numerals. Then there follows a list of prime numbers, from 2 up to 89, and then 2^3021377-1, which was the largest known prime at the time. I like to imagine the aliens' reaction to this. Either they'll be astounded (``How could they figure out that such a big number is prime?'') or unimpressed (``Pff, is that the best they can do?'') I wonder which it will be?
Page 2 explains the definitions of the four basic arithmetic operators. My favorite page is #14, which explains the basic structure of our planet. There's a picture of land, dipping into the ocean, and the chemical composition of land (mostly SiO2), sea (mostly H2O, plus Na and Cl), and air (mostly N2, O2, and Ar), with the height of the atmosphere and depth of the ocean shown. A previous diagram showed pictures of people, and smaller cartoons of people are shown here, standing on the land part, to show what part of the planet we live on. My favorite part of the picture is the parabola over on the right-hand edge. What's going on there? Well, we don't want the aliens to decode the picture sideways or upside down, and the parabola is showing an object accelerating under the influence of gravity, implying a clear 'down' to the picture. Just to make it clearer, the parabola is annotated with the glyph for 'acceleration'.
One interesting thing about the message is that it contains some errors. For example, on page 7, the 3rd glyph from the right in the third row is wrong. That part of the message (mistakenly) asserts the existence of Uranium-208. The glyph is a 1 and it should be a 4. I was rather confused when I got to this part, and wondered what I had done wrong. (Someone in the audience asked ``Did you send a patch?'' I had sent the authors a message pointing out the error, but they did not seem to be very grateful.)
Another stumper appeared on page 16. This page is describing human sensory modalities. For example, the graph at the bottom of the page is explaining the way our cone cells respond to red, green, and blue light. The three peaks on the graph are labeled with the frequencies of light that are best received by the three kinds of cone cells.
I was totally mystified by the diagram above this, with the waves. Finally I had to read the paper that explained the message. It turns out that this is supposed to be a picture of sound waves, with the minimum and maximum audible sonic frequencies. I was really annoyed by this, because the picture is wrong. The picture clearly shows transverse waves (in which the medium is moving up and down, perpendicular to the direction in which the wave is travelling) but sound waves are not transverse! They are longitudinal waves, or pressure waves; the particles of air move back and forth, parallel to the direction in which the waves are moving, creating regions of high and low pressure. There's no reason why the picture couldn't have been drawn correctly. Additionally, the illustration could have been annotated with the glyph for 'pressure', which was previously defined, but it wasn't. So not only are the aliens going to think we're bad proofreaders, because of the Uranium-208 mistake, but they'll also think that we don't understand pressure waves.
My conclusion was that the kind of people who like to send messages to the aliens are a little goofy. Or, perhaps that I could have done it better. People who know me won't find it at all surprising that I came to this conclusion.
Another illustration from The Uncanny X-Men #21.
It's a fine, fine thing to be named 'Dominus', let me tell you.
Thanks for reading my talk notes.
mjd-perl-yak@plover.com