Perl Line of the Day

This is not the Obfuscated Perl Code Contest; that contest is a joke. If these examples are obfuscated, it is not because I was trying to be funny. (OK, it is not primarily because I was trying to be funny.) All code samples guaranteed authentic, usually from my own programs. Sometimes I determined that they were too obfuscated to live, and replaced them; other times they didn't make it into the final version for other reasons. All were intended seriously and were under serious consideration for at least a while.

1 July, 1999


  1 while s/[(][^()]*[)]//;

Here's another crack at the old problem of trying to write a regex to recognize strings that have balanced parentheses. It's something of a cheat, because it isn't really a single regex, and it destroys the test string.

It works by locating a (...) pair that has no parentheses inside it; this is therefore one of the maximally-nested sets of parentheses. It then removes this maximally-nested set and anything between them. The 1 while makes it into a loop that locates and removes the maximally-nested parenthesis pair repeatedly until no such pairs remain. If the resulting string contains any parentheses, they must not have been correctly matched, and so the original string was unbalanced. If the end product contains no parentheses at all, the original string must have been correctly balanced, because parentheses were only removed in matching pairs.

Here's an example. Suppose the original string was a(b(c)d(e(f)g)h)i(j(k)l(m)n)o. Then the successive iterations of the while loop will produce:

  a(b(c)d(e(f)g)h)i(j(k)l(m)n)o
  a(bd(e(f)g)h)i(j(k)l(m)n)o
  a(bd(eg)h)i(j(k)l(m)n)o
  a(bdh)i(j(k)l(m)n)o
  ai(j(k)l(m)n)o
  ai(jl(m)n)o
  ai(jln)o
  aio

The resulting string contains no parentheses, so it's correctly balanced. On the other hand, let's see what happens to a(b(c)d(e(f(g)h)i(j(k)l(m)n)o):

  a(b(c)d(e(f(g)h)i(j(k)l(m)n)o)
  a(bd(e(f(g)h)i(j(k)l(m)n)o)
  a(bd(e(fh)i(j(k)l(m)n)o)
  a(bd(ei(j(k)l(m)n)o)
  a(bd(ei(jl(m)n)o)
  a(bd(ei(jln)o)
  a(bd(eio)
  a(bd

Here it would have been easy enough to detect the imbalance because there are too many open parentheses and not enough close parentheses. Sometimes you see people suggest this expression:

  if ( tr/)// == tr /(// ) { # Balanced ... }

But this isn't correct because it says that )( is balanced, which it isn't. But the Line of the Day behaves correctly on this example, and also on more complicated examples:

  a(b(c)d(e(f)g)h)i(j)k)l(m(n)o
  a(bd(e(f)g)h)i(j)k)l(m(n)o
  a(bd(eg)h)i(j)k)l(m(n)o
  a(bdh)i(j)k)l(m(n)o
  ai(j)k)l(m(n)o
  aik)l(m(n)o
  aik)l(mo

Older Lines of the Day


Return to: Universe of Discourse main page | What's new page | Perl Paraphernalia

mjd-perl-lotd@plover.com