Tuesday, 22 February 2011

Resuming a build or checkout

I have recently converted the GeoTools developers guide to Sphinx:
- http://docs.geotools.org/latest/developer/

I mostly focused on porting the content that was already there. There are a couple of interesting observations to be made when picking up a long-running document on the Internet.

Back when that document was written (say 2003), the Internet was a less consistently useful place. GeoTools, with its habit of using new tools (Maven!), new techniques (refactoring! testing!) and new ideas (the factory pattern), often needed the guide to serve as an initial orientation for developers in addition to documenting the running of the project.

These days we would not bother to explain; after all, Stack Overflow is a click away. (Update: apparently all the cool kids are on gis.stackexchange.com).

Nevertheless, I picked up a few build tips.

Restarting Build

Part of the joy of living in Australia is the consistently amusing weather. For hackers this can result in a tendency for long-running builds, deploys or checkouts to be interrupted by fire or flood. Little did I know that the developers guide explained how to restart a build.

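More realistically, it is nice to build with -o (for offline) in order to go a bit faster. If a module then fails, -rf (short for --resume-from) restarts the build at that module rather than from the beginning.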

> mvn install -o -Dall
  (failure due to missing jar when building modules/library/data )
> mvn install -rf modules/library/data

Restarting GIT SVN

On a related note, if you are using "git svn" and your initial clone fails halfway through, you can resume using git svn fetch:

> git svn clone http://svn.osgeo.org/geotools/trunk/
  (fail due to network connection dropping out)
> cd trunk
> git svn fetch

Wednesday, 2 February 2011

Equals vs Equals vs Equals Exact

Not all equals are created equal. At least if you are a JTS Topology Suite user.

GeoTools is looking to update to the latest JTS 1.12 - and it brings with it a much requested change. I seem to recall arguing with Martin over beer in 2004 about the confusing topic of checking if two geometries are "equal".

To start out with, in Java there are two methods that are part of what an "Object" is:
  • equals
  • hashCode
The first confusion is that these are really "one" method; if you implement one you are honour bound to implement the other. They are a matched set: equals returning true for two objects *must* imply that the same hash code is produced for both.

This is a little bit more than a matter of developer pride; if you don't implement these methods properly it has consequences, mostly when using your objects in a collection. For example, equals is used to check if an element is already in the collection, and hashCode is used when sorting the object into a safe spot for storage (and used again when you go looking for it quickly).
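To make the consequences concrete, here is a minimal sketch using two made-up point classes (BrokenPoint and GoodPoint are hypothetical names for illustration, not part of JTS or GeoTools). One overrides equals only; the other overrides both methods as a matched set.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class HashContractDemo {

    /** Hypothetical class: overrides equals but forgets hashCode. */
    static final class BrokenPoint {
        final int x, y;

        BrokenPoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof BrokenPoint)) return false;
            BrokenPoint p = (BrokenPoint) other;
            return x == p.x && y == p.y;
        }
        // hashCode is inherited from Object, so two equal points
        // will almost always land in different hash buckets.
    }

    /** Hypothetical class: equals and hashCode implemented together. */
    static final class GoodPoint {
        final int x, y;

        GoodPoint(int x, int y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object other) {
            if (!(other instanceof GoodPoint)) return false;
            GoodPoint p = (GoodPoint) other;
            return x == p.x && y == p.y;
        }

        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    /** Looks up an equal (but distinct) GoodPoint in a HashSet. */
    static boolean goodLookup() {
        Set<GoodPoint> set = new HashSet<>();
        set.add(new GoodPoint(1, 2));
        return set.contains(new GoodPoint(1, 2));
    }

    /** Same lookup with BrokenPoint; the set is very unlikely to find it. */
    static boolean brokenLookup() {
        Set<BrokenPoint> set = new HashSet<>();
        set.add(new BrokenPoint(1, 2));
        return set.contains(new BrokenPoint(1, 2));
    }

    public static void main(String[] args) {
        System.out.println("good lookup:   " + goodLookup());
        System.out.println("broken lookup: " + brokenLookup());
    }
}
```

The good lookup finds its element. The broken one will almost certainly miss it, even though equals says the two points are the same, because the set looks in the wrong bucket first.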

Out of the box "Object" provides an implementation of equals and hashCode based on object identity (think of it as the memory location). This is what JTS has done for Geometry; so two geometry objects were only equal if they were in fact the same geometry object.

So out of the box these two are the same:
object.equals( value )
object == value

But JTS provides a bit more choice and a chance for you to get things wrong:
  • Geometry.equals( Object ) - checks that the objects are identical; same as the Java "==" operator.
  • Geometry.hashCode() - based on the envelope for speed! Very important when storing Geometry in a HashSet or HashMap.
  • Geometry.equals( Geometry ) - checks that the two geometries mean the same thing in a mathematical sense (i.e. form the same shape). This one is really slow as it involves actual work; it does do a fast envelope comparison first, which is really appreciated.

    LINESTRING( 0 0, 2 2, 5 5 ) equals LINESTRING( 0 0, 5 5 )

  • Geometry.equalsExact( Geometry ) - checks that two geometry objects have the same representation of a shape

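    LINESTRING( 0 0, 5 5 ) equalsExact LINESTRING( 0 0, 5 5 )

In Java code this means you can occasionally get into trouble by calling the wrong version of equals accidentally. Java chooses between overloaded methods at compile time, based on the declared type of the argument rather than the type of the object at runtime.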

geometry.equals( value ); // what is being tested Geometry or Object?
geometry.equals( (Geometry) value ); // nice and explicit
geometry.equals( (Object) value); // nice and explicit
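The trap is easy to reproduce without JTS on the classpath. The sketch below uses a made-up Shape class (a hypothetical stand-in, not the JTS Geometry class) with the same pair of methods: an identity-based equals( Object ) and a value-based equals( Shape ) overload.

```java
class OverloadDemo {

    /** Hypothetical stand-in for a class with two equals methods. */
    static class Shape {
        final String name;

        Shape(String name) { this.name = name; }

        // Overrides Object.equals: identity only
        @Override
        public boolean equals(Object other) { return this == other; }

        @Override
        public int hashCode() { return System.identityHashCode(this); }

        // Overload: compares by value
        public boolean equals(Shape other) {
            return other != null && name.equals(other.name);
        }
    }

    /** Argument declared as Shape: the compiler picks equals(Shape). */
    static boolean viaShapeReference() {
        Shape a = new Shape("line");
        Shape b = new Shape("line");
        return a.equals(b);
    }

    /** Argument declared as Object: the compiler picks equals(Object). */
    static boolean viaObjectReference() {
        Shape a = new Shape("line");
        Object b = new Shape("line");
        return a.equals(b);
    }

    /** An explicit cast makes the intended overload obvious. */
    static boolean viaCast() {
        Shape a = new Shape("line");
        Object b = new Shape("line");
        return a.equals((Shape) b);
    }

    public static void main(String[] args) {
        System.out.println(viaShapeReference());  // prints true
        System.out.println(viaObjectReference()); // prints false
        System.out.println(viaCast());            // prints true
    }
}
```

The second call is the one that bites: the object really is a Shape, but because the variable is declared as Object the identity check is what runs.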

Here is what things look like today:

// will check object identity
System.out.println( geom1.equals( value ) ); 
System.out.println( geom1 == value );

// will check geometry shape (slow!)
System.out.println( geom1.equals( (Geometry) value )); 

// will check that the internal structure matches
System.out.println( geom1.equalsExact( (Geometry) value )); 

In JTS 1.12 things have changed a bit:

// will check object identity
System.out.println( geom1 == value ); 

// will check geometry shape (slow!)
System.out.println( geom1.equals( (Geometry) value )); // deprecated!
System.out.println( geom1.equalsTopo( (Geometry) value ));

// will check that the internal structure matches
System.out.println( geom1.equalsExact( (Geometry) value )); 
System.out.println( geom1.equals( value ) );

Hopefully JTS 1.12 will be easier for people to learn; equals is now implemented as most people expect. The side effect is that JTS Geometry will behave like a proper "data object" and will work as expected in collection classes like HashMap and HashSet.

One real benefit is that equals( Geometry ) will show up as deprecated in your IDE; so you can tell when something funny is up.

Tips for updating:

  • Before you start, find all instances of Geometry.equals( Object ) and replace them with the "==" operator. You may need to change a few HashMaps to IdentityHashMaps to maintain performance.
  • You should now be able to safely update
  • You can then change Geometry.equals( Geometry ) calls to equalsTopo
  • Personally I would leave any equalsExact calls in place, as they are nice and clear
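For the HashMap to IdentityHashMap swap mentioned in the first tip, here is a small sketch of the difference. It uses plain Strings as a stand-in for geometries (an assumption to keep the example free of dependencies): like Geometry after the update, String compares by value in equals and hashCode.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityMapDemo {

    /**
     * Puts two distinct key objects with the same value into the map
     * and reports how many entries the map ends up holding.
     */
    static int countEntries(Map<String, String> map) {
        // new String(...) forces two separate objects with equal content
        String first = new String("POINT (1 1)");
        String second = new String("POINT (1 1)");
        map.put(first, "a");
        map.put(second, "b");
        return map.size();
    }

    public static void main(String[] args) {
        // HashMap uses equals/hashCode, so the second put replaces the first
        System.out.println("HashMap: " + countEntries(new HashMap<>()));                 // prints HashMap: 1
        // IdentityHashMap uses ==, so each object gets its own entry
        System.out.println("IdentityHashMap: " + countEntries(new IdentityHashMap<>())); // prints IdentityHashMap: 2
    }
}
```

If your code relied on each geometry object being its own key, IdentityHashMap keeps that behaviour after the update, and it never has to call equals on the keys.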