Date: Sat, 08 Sep 2001 04:40:06 -0400 From: Benjamin Goldberg Subject: Re: array questions Message-Id: <3B99D966.6BC53BA@earthlink.net> Dave Tweed wrote: > > "Randal L. Schwartz" wrote: > > Well, "deeper" copy, but not technically a "deep" copy. For the > > difference, see my article on deep copies ... > > Yes, I made a copy of whatever scalars were in @child. If those were > themselves references, I didn't do anything about that. > > I figured I explained it at an appropriate level of detail for arrays > of arrays, and if anyone had more complex structures, they would pick > up on the pattern. > > In my experience, the required level of copying depth is going to be > highly dependent on the specific application. Consider: > > @a = ('scalar1', 'scalar2'); > @aa = (\@a, \@a); > @bb = &deep_copy (\@aa); > > @bb ends up with a very different structure than the one in @aa. If it gives a different data structure than the original, then, IMHO, it's wrong. Here's a better [but untested] version: sub deep_copy { my $this = shift; if (not ref $this or UNIVERSAL::isa($this, "CODE")) { return $this; }; my $addr = addr($this); my $cache = shift || {}; return $cache->{$addr} if exists $cache->{$addr}; if (UNIVERSAL::isa($this,"SCALAR") or UNIVERSAL::isa($this, "REF")) { # make a copy of the scalar and return a reference to that my $temp = $cache->{$addr} = do { my $x; \$x }; if( tied $$this ) { tie $$temp, __PACKAGE__, deep_copy(tied $$this, $cache); } else { if( ref($this) eq "Regexp" ) { $$temp = qr/$this/; } else { $$temp = deep_copy($$this, $cache); } } bless $temp, ref $this; } elsif (UNIVERSAL::isa($this, "ARRAY")) { my $temp = $cache->{$addr} = []; if( tied @$this ) { tie @$temp, __PACKAGE__, deep_copy(tied @$this, $cache); } else { foreach(@$this) { if( tied $_ ) { # make use of $#array+1 == @array tie $$temp[@$temp], __PACKAGE__, deep_copy(tied $_, $cache); } elsif( UNIVERSAL::isa(\$_, "GLOB") { require Symbol; push @$temp, *{gensym()}; deep_copy(\$_, $cache, \$$temp[-1]); } else { push @$temp, deep_copy($_, $cache); } } } bless $temp, ref $this; } elsif (UNIVERSAL::isa($this, "HASH")) { my $temp = $cache->{$addr} = {}; if( tied %$this ) { tie %$temp, __PACKAGE__, deep_copy(tied %$this, $cache); } else { foreach( keys %$this ) { if( tied $$this{$_} ) { tie $$temp{$_}, __PACKAGE__, deep_copy(tied $$this{$_}, $cache); } elsif( UNIVERSAL::isa(\$$this{$_}, "GLOB") { require Symbol; $$temp{$_} = *{gensym()}; deep_copy(\$$this{$_}, $cache, \$$temp{$_}); } else { $$temp{$_} = deep_copy($$this{$_}, $cache); } } } bless $temp, ref $this; } elsif (UNIVERSAL::isa($this, "GLOB")) { require Symbol qw(gensym); my $temp = $cache->{$addr} = shift || gensym; for(qw(SCALAR ARRAY HASH CODE IO)) { *$temp = deep_copy(*$this{$_}, $cache) if *$this{$_}; } tie *$temp, __PACKAGE__, deep_copy(tied *$this, $cache); bless $temp, ref $this; } elsif (UNIVERSAL::isa($this, "IO")) { local (*FOO, *BAR) = $this; open(BAR, "<+&FOO) or do { require Carp; Carp::carp "Couldn't dup $this: $!"; } $cache->{$addr} = bless *BAR{IO}, ref $this; } else { require Carp; $addr =~ s/\(0x[0-9a-fA-F]*\)\z//; Carp::croak "Unrecognized type: $addr"; } } It's possible that I've made typoes, logic bugs, etc. In particular, I'm not sure I have all the logic behind the use of globs right. The reason for this is that you can store a GLOB object [both a real one and a reference to one] inside a scalar. Eg: $x = *foo; print *foo, "\n", # *main::foo print $x, "\n"; # *main::foo print \$x, "\n"; # GLOB(0x80e4a5c) print \*$x, "\n"; # GLOB(0x80e4a5c) print \*foo, "\n"; # GLOB(0x80e4a50) This is an especially annoying problem when you have an array or hash whose members are typeglobs [not references to ones, actual ones]. Also, I'm not entirely sure about the handling of Regexp objects. They think they're SCALARs, but they're C structs. So you can't deref, copy, and bless, like you can with normal scalars. -- "I think not," said Descartes, and promptly disappeared.