Inside-out objects

The de-facto choice for Perl objects has been to use blessed hashes. A hash makes it easy to both add new attributes and access existing ones. Unfortunately, this ease of access is also one of the greatest problems with a hash-based structure. In this tip we'll cover an alternative Perl object structure known as an inside-out object.

Problems with blessed hashes

In a perfect world everyone would obey the rules and only use the documented interface for each class. Unfortunately, the world isn't always perfect. Maybe a developer will bypass the interface to try and squeeze some extra performance out of their code. Maybe they used Data::Dumper to inspect your object and wrote their code to extract attributes without ever reading your documentation.

When you change your object's implementation, any code that bypasses your documented interface will break. Of course, the miscreant developer who wrote the bad code was fired years ago for not conforming to coding guidelines, but it's your changes that just caused the system to break. Even if you can convince your boss that it isn't your fault, it will still be your job to make things work.

The other problem with using hashes comes down to simple typographical errors. Let's pretend that one of your attributes is address, but somewhere in your code you accidentally drop a 'd' and type adress:

        sub get_address {
                my ($this) = @_;
                return $this->{address};
        }
        sub set_address {
                my ($this, $value) = @_;
                $this->{adress} = $value;       # Oops!
        }

The above code doesn't result in a warning, even under use strict and use warnings. Perl is perfectly happy to add a new element to our hash, but since nothing else refers to that key it will never be read, resulting in a difficult-to-find bug.
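
To see how quietly this fails, here is a minimal sketch that wraps the two accessors in a hypothetical Person class (the class name, constructor and sample address are assumptions made for illustration). It runs without any warning, yet the value we stored is never returned:

```perl
use strict;
use warnings;

package Person;

# Hypothetical constructor: a plain blessed hash.
sub new {
    my ($class) = @_;
    return bless {}, $class;
}

sub get_address {
    my ($this) = @_;
    return $this->{address};
}

sub set_address {
    my ($this, $value) = @_;
    $this->{adress} = $value;    # Typo: silently creates a new key
}

package main;

my $person = Person->new;
$person->set_address("1 Example Street");

# The getter reads the correctly spelled key, which was never set.
my $address = $person->get_address;
print defined $address ? "address: $address\n" : "address is undefined\n";

# The misspelled key is sitting in the hash, unnoticed.
my @keys = sort keys %$person;
print "keys in object: @keys\n";
```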

Wouldn't it be great if we could have compile-time checking of attributes, rather than relying upon run-time checks and the correctness of developers? With inside-out objects, we can.

What is an inside-out object?

Inside-out objects are known by many names, including flyweight objects and inverted indices. Rather than storing all of our attributes inside our single object, we instead have a single hash for each attribute, and our object has an entry in each hash. The following example demonstrates the differences in structure:

  # Traditional hash-based objects.
  $person1 = { firstname => "Paul",    surname => "Fenwick"    };  # Object 1
  $person2 = { firstname => "Jacinta", surname => "Richardson" };  # Object 2
  $person3 = { firstname => "Damian",  surname => "Conway"     };  # Object 3
  # Inside-out objects.
                 # Object 1          # Object 2             # Object 3
  %firstname = ( 12345 => "Paul",    23456 => "Jacinta",    34567 => "Damian" );
  %surname   = ( 12345 => "Fenwick", 23456 => "Richardson", 34567 => "Conway" );
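
As a rough sketch of what a lookup looks like in the inside-out layout, the snippet below reuses the illustrative keys from the listing above. The keys are just sample numbers here; how a real object obtains its key is covered later.

```perl
use strict;
use warnings;

# One hash per attribute; each object is represented only by its key.
my %firstname = ( 12345 => "Paul",    23456 => "Jacinta",    34567 => "Damian" );
my %surname   = ( 12345 => "Fenwick", 23456 => "Richardson", 34567 => "Conway" );

# To read "object 2" we look up the same key in every attribute hash.
my $id   = 23456;
my $name = "$firstname{$id} $surname{$id}";

print "$name\n";    # Jacinta Richardson
```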

Error checking

Inside-out objects provide excellent error checking. Each attribute is a lexical hash, so under use strict a misspelled attribute name is an undeclared variable, and we receive an error at compile time:

        use strict;
        use Class::Std;
        my %address;
        # ...
        sub set_address {
                my ($this, $value) = @_;
                $adress{ident $this} = $value;          # Oops!
        }
        
        # Trying to compile the above code results in an error:
        # Global symbol "%adress" requires explicit package name at ...

Automatic attribute checking is a big improvement in preventing what is otherwise a very common and frustrating bug. However, the benefits don't stop there. Inside-out objects provide much better encapsulation than regular hash-based objects.

Strong encapsulation

An inside-out object contains none of its own data; instead this has been moved into a series of hashes that are stored inside the class. By ensuring these are declared lexically (using my %attribute) we can be sure that nothing outside of the class is able to access these attributes.

Strong encapsulation means that a misguided developer can't bypass our interface and access attributes directly. External code has no ordinary way to reach those attributes: they're simply not in scope.
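
The following sketch illustrates the point with a hypothetical Account class (the class and its balance attribute are assumptions for illustration). It keys the attribute hash on refaddr from Scalar::Util, which is explained under "Attribute access". Code outside the block that tries to treat the object as a hash simply fails:

```perl
use strict;
use warnings;

package Account;
use Scalar::Util qw(refaddr);

{
    my %balance_of;    # Lexical: visible only inside this block

    sub new {
        my ($class, $balance) = @_;
        my $scalar;
        my $this = bless \$scalar, $class;
        $balance_of{ refaddr $this } = $balance;
        return $this;
    }

    sub get_balance {
        my ($this) = @_;
        return $balance_of{ refaddr $this };
    }
}

package main;

my $account = Account->new(100);

# The object is a scalar reference, so hash-style access dies at run time.
my $peeked = eval { $account->{balance} };
my $error  = $@;

print "direct access failed: $error" if $error;
print "via interface: ", $account->get_balance, "\n";
```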

Attribute access

We connect our attributes to our object by using a unique key. Since we're trying to ensure object integrity, our ideal key would be fixed and unchangeable for each object. The simplest solution would be to give each object a sequential number upon generation and mark it as read-only. Unfortunately this would make it very easy for external code to guess possible key values and break encapsulation. Ideally we want our key to be hard to fake. One solution is to use a module such as Data::UUID which generates globally unique identifiers. Another is to realise that every Perl variable already comes with something unique and verifiable -- its memory address.

The Scalar::Util module provides us with the refaddr function, which returns the memory address of whatever a given reference points to. Alternatively, the Class::Std module provides exactly the same function under the name ident (since the memory address is used as an identifier for the object).
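
A short sketch of the properties we rely on: two references to the same scalar share an address, different scalars have different addresses, and blessing does not change the address. The variable names are arbitrary.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my ($scalar_a, $scalar_b);
my $first  = \$scalar_a;
my $second = \$scalar_b;
my $copy   = $first;            # A second reference to the same scalar

my $before = refaddr $first;
bless $first, 'PlayingCard';    # Blessing does not move the scalar
my $after  = refaddr $first;

print "copy shares the address\n"        if refaddr($copy) == refaddr($first);
print "distinct scalars differ\n"        if refaddr($first) != refaddr($second);
print "bless leaves address unchanged\n" if $before == $after;
```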

An example

We now have enough information to build ourselves our very own inside-out object. Imagine a playing card as an object: it would have a suit and a face value (rank).

        package PlayingCard;
        use strict;
        use warnings;
        use Scalar::Util qw/refaddr/;
        # Using an enclosing block ensures that the attributes declared
        # are *only* accessible inside the same block.  This is only really
        # necessary for files with more than one class defined in them.
        {
                my %suit_of;
                my %rank_of;
                
                sub new {
                        my ($class, $rank, $suit) = @_;
                        
                        # This strange looking line produces an
                        # anonymous blessed scalar.
                        
                        my $this = bless \do{my $anon_scalar}, $class;
                        
                        # Attributes are stored in their respective
                        # hashes.  We should also be checking that
                        # $suit and $rank contain acceptable values for
                        # our class.
                        
                        $suit_of{refaddr $this} = $suit;
                        $rank_of{refaddr $this} = $rank;
                        
                        return $this;
                }
                
                sub get_suit {
                        my ($this) = @_;
                        return $suit_of{refaddr $this};
                }
                
                sub get_rank {
                        my ($this) = @_;
                        return $rank_of{refaddr $this};
                }
        }
        1;

\do{my $anon_scalar}

One of the strangest lines in our code contains \do{my $anon_scalar}. This odd construct simply declares a lexical variable using my. The name of our scalar is irrelevant, since it immediately goes out of scope at the end of the block. Normally this would seem fruitless, but the enclosing do {} block returns the value of the last statement evaluated, in our case the freshly created scalar. By taking a reference to this scalar (using the backslash operator) our scalar avoids destruction and lives on without a name.

Note that our scalar itself is completely empty: it doesn't contain anything, and we never use its contents. It exists simply to be blessed into the appropriate class, and for our own code to use its memory address for attribute lookups.
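
To make this concrete, the sketch below wraps the construct in a hypothetical helper, new_anon_scalar (the name is an assumption), and checks that each call produces a separate, empty scalar:

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper wrapping the construct so we can call it repeatedly.
sub new_anon_scalar {
    return \do { my $anon_scalar };
}

my $ref_a = new_anon_scalar();
my $ref_b = new_anon_scalar();

print "reference type: ", ref($ref_a), "\n";
print "contents are undefined\n"          unless defined $$ref_a;
print "each call yields a fresh scalar\n" if refaddr($ref_a) != refaddr($ref_b);
```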

A problem with inside-out objects

Inside-out objects compare favourably with regular objects. They scale better in terms of memory usage, and with minor modifications can be tuned to provide even faster performance, albeit with the loss of some integrity benefits. However, you're unlikely to notice these benefits unless it's critical that your application runs very fast or uses very little memory. So what's the catch?

Think about what happens when an object is destroyed. With a regular hash-based object the only reference to the object's attributes is lost with the object itself, and Perl handles the clean-up for us. When we have an inside-out object, nothing cleans up the attributes when the object is destroyed. Instead, we have to write our own DESTROY method. We also need to worry about making sure our parent and sibling DESTROY methods are called as well. If we don't, then our objects will leak memory, and that's bad.

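For our PlayingCard we would need to add the following, or code like it, inside the block that declares %suit_of and %rank_of:

        use NEXT;
        # These subs must live in the same block as %suit_of and
        # %rank_of, otherwise the hashes are not in scope.
        sub DESTROY {
                my ($this) = @_;
                # EVERY (provided by NEXT) calls _destroy in each
                # class of the inheritance tree.
                $this->EVERY::_destroy;
        }
        sub _destroy {
                my ($this) = @_;
                delete $suit_of{refaddr $this};
                delete $rank_of{refaddr $this};
        }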

All our derived classes will need to write their own _destroy method to clean up any additional attributes that have been defined.
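
The effect of the clean-up can be observed with a small self-contained sketch. It is a simplified variant of PlayingCard: there is only one class, so it deletes the entries directly in DESTROY rather than dispatching through EVERY, and it adds a hypothetical stored_count method purely so we can watch the attribute hash shrink.

```perl
use strict;
use warnings;

package PlayingCard;
use Scalar::Util qw(refaddr);

{
    my %suit_of;
    my %rank_of;

    sub new {
        my ($class, $rank, $suit) = @_;
        my $this = bless \do { my $anon_scalar }, $class;
        $suit_of{ refaddr $this } = $suit;
        $rank_of{ refaddr $this } = $rank;
        return $this;
    }

    # Hypothetical class method, added only to observe the clean-up.
    sub stored_count {
        return scalar keys %suit_of;
    }

    sub DESTROY {
        my ($this) = @_;
        delete $suit_of{ refaddr $this };
        delete $rank_of{ refaddr $this };
        return;
    }
}

package main;

my $count_during;
{
    my $card = PlayingCard->new('Ace', 'Spades');
    $count_during = PlayingCard->stored_count;
}    # $card goes out of scope here, which triggers DESTROY

my $count_after = PlayingCard->stored_count;

print "entries while the card exists: $count_during\n";    # 1
print "entries after destruction: $count_after\n";         # 0
```

Without the DESTROY method the second count would stay at 1, and every card ever created would leave its attributes behind.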

Inheritance and attributes

An additional advantage of inside-out objects is that each class has its own private area in which to store attributes. This means that derived classes don't need to worry about clashes with parents or siblings, and vice-versa. It also makes it possible, although possibly unwise, for derived classes to have attributes of the same name, but with different values (something which is impossible for standard hash-based objects).
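
As a sketch of what this looks like, the hypothetical Animal and Pet classes below (names, attributes and constructor arguments are all assumptions for illustration) each declare their own %name_of hash. The two hashes never collide, even though they share a name and are keyed on the same object. DESTROY methods are omitted to keep the example short.

```perl
use strict;
use warnings;

package Animal;
use Scalar::Util qw(refaddr);
{
    my %name_of;    # Animal's private "name": the species

    sub new {
        my ($class, %args) = @_;
        my $this = bless \do { my $anon_scalar }, $class;
        $name_of{ refaddr $this } = $args{species};
        return $this;
    }

    sub get_species_name {
        my ($this) = @_;
        return $name_of{ refaddr $this };
    }
}

package Pet;
use Scalar::Util qw(refaddr);
our @ISA = ('Animal');
{
    my %name_of;    # Pet's own "name", independent of Animal's

    sub new {
        my ($class, %args) = @_;
        my $this = $class->SUPER::new(%args);
        $name_of{ refaddr $this } = $args{pet_name};
        return $this;
    }

    sub get_pet_name {
        my ($this) = @_;
        return $name_of{ refaddr $this };
    }
}

package main;

my $pet = Pet->new(species => 'cat', pet_name => 'Tiddles');

print "species: ", $pet->get_species_name, "\n";    # cat
print "pet:     ", $pet->get_pet_name,     "\n";    # Tiddles
```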

If we do decide to use attributes of the same name in more than one class in our inheritance tree, we need to think about how we will ensure that each class gets the correct value during construction and initialisation. The best way to do this depends on our implementation.

Inside-out objects do not avoid the problems associated with multiple methods in the inheritance tree having the same names. Fortunately, we can use NEXT in such situations, just as we do with standard hash-based inheritance.

Helper modules

The basic structure of any inside-out object is essentially the same, just as the basic structure for hash-based objects is essentially the same. As such, a number of builder modules have been created to remove the repetitive code and make it quicker for you to start writing the real code. Two particularly good modules for inside-out objects are:

This Perl tip and associated text is copyright Perl Training Australia. You may freely distribute this text so long as it is distributed in full with this copyright notice attached.

If you have any questions please don't hesitate to contact us:

Email: contact@perltraining.com.au
Phone: 03 9354 6001 (Australia)
International: +61 3 9354 6001
