Date: Wed, 03 Oct 2001 17:24:27 -0400
From: Benjamin Goldberg
Subject: Re: RegExp /g shall substitute more often
Message-Id: <3BBB820B.B9ABE17F@earthlink.net>

Markus Dehmann wrote:
>
> The two solutions are nice, but because actually my problem is more
> complex they don't help.

Your fault for not describing the actual problem.

> Look-behind works only for a fixed length but my pattern that stands
> before B has in fact a variable length! This is my real problem that
> is more complex than just AAA:
>
> Suppose you have this string:
> $_ = "bla-R carter-R bush-R clinton-R dunnowman-R clinton-R bla-R";
>
> There is always a name "tagged" with 'R'. Now my rule says: Always tag
> clinton with a D if there a two names tagged with R before him. My
> output shall be:
>
> bla-R carter-R bush-R clinton-D dunnowman-R clinton-D bla-R

I see... so what you *want* is to transform Rafael Garcia-Suarez's
solution:

    s/(?<=A)A/B/g;

into:

    s/(?<=\w+-R \w+-R clinton-)R/D/g;

[This won't work because perl doesn't do variable length lookbehind]

> But s/(\w+-R \w+-R clinton-)R/$1D/; delivers, of course, only
> bla-R carter-R bush-R clinton-D dunnowman-R clinton-R bla-R

That doesn't work for the same reason why:

    s/(A)A/$1B/g;

doesn't work.

> The rule doesn't apply to the second clinton because he has not two
> names tagged with R before him anymore.

Erm, no, that's not the reason. It's because in yours, the search for
the next two Rs starts *after* the text that was just substituted...
yours has to match and capture those Rs, so the first clinton-R is
already used up and can't count as one of the two names before the
second clinton.

> This:
> s/(?<=\w+-R \w+-R clinton-)R/$1D/;
> produces an error: "variable length lookbehind not implemented"

Right. And even if it were implemented, it would be wrong, since you've
got that stupid $1 in the replacement. If you understood what lookaheads
and lookbehinds did, you would know that you don't need it: they match
without consuming anything, and there is no capture group in that
pattern for $1 to refer to.

> What can I do?

One solution is to change it to a lookahead, as Jeff Pinyan did.

    $_ = "bla-R carter-R bush-R clinton-R dunnowman-R clinton-R bla-R";
    $_ = reverse $_;                  # reversed, the lookbehind becomes a lookahead
    s/R(?=-notnilc R-\w+ R-\w+)/D/g;  # pattern is written backwards too
    $_ = reverse $_;

Another is to not use s///, but do it in a more 'naive' way:

    my @names = split;
    # Walk backwards, so the names before $i still carry their original
    # tags when they are tested.
    for( my $i = $#names; $i >= 2; --$i ) {
        if( $names[$i] eq "clinton-R" &&
            $names[$i-1] =~ /-R$/ &&
            $names[$i-2] =~ /-R$/ ) {
            $names[$i] = "clinton-D";
        }
    }
    print "@names\n";

Whether this is sufficiently fast or not, I don't know, but it's almost
certainly the least obfuscated version.

-- 
"I think not," said Descartes, and promptly disappeared.
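
A third possibility, not from the thread, is a rough sketch along the
same lines: since the trouble is that a capturing pattern consumes the
two -R names, the whole pattern can be put inside a forward lookahead so
that nothing is consumed, the offset of each R to change is recorded,
and the changes are applied afterwards. The sub name retag_clinton is
made up for illustration, and this assumes names are plain \w+ words
separated by single spaces, as in the example string.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): retag clinton-R as clinton-D
# wherever the two names before it are tagged -R in the ORIGINAL string.
sub retag_clinton {
    my ($str) = @_;
    my @offsets;

    # The whole pattern sits inside a lookahead, so each match is
    # zero-width and consumes nothing; overlapping contexts are all found.
    # $-[1] is the offset where capture group 1 (the R to change) starts.
    while ( $str =~ /(?=\b\w+-R \w+-R clinton-(R))/g ) {
        push @offsets, $-[1];
    }

    # Apply the changes only after scanning, so every test above saw the
    # original tags.
    substr( $str, $_, 1 ) = 'D' for @offsets;
    return $str;
}

my $in = "bla-R carter-R bush-R clinton-R dunnowman-R clinton-R bla-R";
print retag_clinton($in), "\n";
# prints: bla-R carter-R bush-R clinton-D dunnowman-R clinton-D bla-R
```

This avoids reversing the string, at the cost of a second pass to apply
the edits. The \b keeps the lookahead from also matching in the middle
of a name, which would only record the same offset again.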