PetaPoco - Mapping One-to-Many and Many-to-One Relationships

Friday, 3 June 2011

PetaPoco now supports Multi-Poco mapping where one row in a result set can be mapped onto two or more POCOs. But how do we handle one-to-many and many-to-one relationships?

This post is about PetaPoco - "a tiny ORMish thing for your POCOs". Read more on the PetaPoco Project Page.

First, a quick note - it's been a while coming, but PetaPoco v4 is now available in NuGet

As previously described PetaPoco now supports multi-poco queries. I've had a number of questions about how or whether PetaPoco maps one-to-many and many-to-one relationships in these types of queries.

The short answer is it doesn't - but you can do it yourself if you want.

That said, please make sure you really need this before going to the trouble of setting it up. If you're just doing typical join queries then getting the instance identity right for returned POCOs is often not necessary. The point of multi-poco queries is to avoid having to define new or extend existing POCO objects just to capture the result set of a join - it's not really intended to provide instance identity. By doing the mapping I'm about to describe you'll actually be taking a small performance hit.

Instance Identity and Discarded POCOs

So what exactly do I mean when I say "Instance Identity"? I mean that if a particular record is returned in two or more places from a query that the same POCO instance is returned in all cases, or that the instance of the POCO uniquely identifies that record. For example, say you're doing a join on articles to authors, if two articles have the same author then both will reference the same author object instance.

PetaPoco's multi-poco queries always create a new POCO instance for each part of the row. So in the above example, a new author object will be created for each row. To get the instance identity right, we will end up discarding the duplicates - so don't think of one-to-many and many-to-one mappings as an improvement in efficiency - only use it if the more accurate object graph is useful to you.

Relator Callbacks

After creating the individual POCOs that make up row PetaPoco calls a "relator callback" whose job it is to connect the objects for that row into an object graph.

The simplest approach for this is to simply assign the RHS object to a property on the LHS object. This is what PetaPoco's auto-mapper does. It's a simple and fast approach but it doesn't provide the object identity we're talking about.

I this post we'll be essentially looking at how to write smarter relator callbacks that do maintain object identity.

The Auto-Mapper and Simple Relationships

Before we get into writing relator callbacks, let's see what a simple auto-mapped multi-poco query looks like:

var posts = db.Fetch<post, author>(@"
        SELECT * FROM posts 
        LEFT JOIN authors ON posts.author = authors.id ORDER BY posts.id
        ");

With auto-mapping, the first generic argument is the return type. So this example will return a List<post> and so long as the post object has an property of type author PetaPoco will connect it to the created author object.

Writing the relator callback ourself looks like this:

var posts = db.Fetch<post, author, post>(
        (p,a)=> { p.author_obj = a; return p; },
        @"SELECT * FROM posts 
        LEFT JOIN authors ON posts.author = authors.id ORDER BY posts.id
        ");

Note two things about the above:

  1. There's an extra argument in the generic type list - <post, author, post>. The last argument indicates the type of returned collection element. With a custom relator you might decide to use a different class to represent the joined row.
  2. The lambda function does the wiring up of post to author.

(In all these examples, I'll be using a simple article to author relationship which is pretty intuitive to understand. If you want more detail, see these test cases)

Many-To-One Relationships

To implement many-to-one relationships all we need to do is keep a map of the RHS objects and re-use the same one each time.

var authors = new Dictionary<long, author>();
var posts = db.Fetch<post, author, post>(
    (p, a) =>
    {
        // Get existing author object
        author aExisting;
        if (authors.TryGetValue(a.id, out aExisting))
            a = aExisting;
        else
            authors.Add(a.id, a);

        // Wire up objects
        p.author_obj = a;
        return p;
    },
    "SELECT * FROM posts LEFT JOIN authors ON posts.author = authors.id ORDER BY posts.id"
    );

The implementation is pretty simple: look for a previous instance of the same author and if found use it in preference to the one supplied by PetaPoco. If we don't have a previous instance use the supplied one and store it for use later.

Of course if you need to do this from more than one place it will soon get tedious so lets wrap this up as a helper class:

class PostAuthorRelator
{
    // A dictionary of known authors
    Dictionary<long, author> authors = new Dictionary<long, author>();

    public post MapIt(post p, author a)
    {
        // Get existing author object, or if not found store this one
        author aExisting;
        if (authors.TryGetValue(a.id, out aExisting))
            a = aExisting;
        else
            authors.Add(a.id, a);

        // Wire up objects
        p.author_obj = a;
        return p;
    }
}

Now we can run our query like this:

var posts = db.Fetch<post, author, post>(
    new PostAuthorRelator().MapIt,
    "SELECT * FROM posts LEFT JOIN authors ON posts.author = authors.id ORDER BY posts.id"
    );

Much nicer. Moving on....

One-To-Many Relationships

In one-to-many relationships, we want to populate each LHS object with a collection of objects from the RHS. Flipping our above example, say we want a list of authors where each has a collection of that author's articles:

SELECT * FROM authors 
LEFT JOIN posts ON posts.author = authors.id ORDER BY posts.id

With this query we're going to get the author details repeated down the LHS of the result set and the articles on the right. The authors on the left need to be collapsed down into a single POCO, the articles on the right need to be collected into a list for each author.

The returned collection will actually have less items than the number of rows returned by the database so the relator callback needs to be able to "hold back" the current author until it detects a new one.

To support this, PetaPoco allows a relator callback to return null to indicate that it's not ready to serve up the current record. To flush out the final record PetaPoco will also call the relator one final time at the end of the result set, passing null for all parameters (but it only does this if the relator returns null at least once during the result set - this saves simpler relators from having to check for null parameters).

So let's look at a one-to-many relator:

class AuthorPostRelator
{
    public author current;
    public author MapIt(author a, post p)
    {
        // Terminating call.  Since we can return null from this function
        // we need to be ready for PetaPoco to callback later with null
        // parameters
        if (a == null)
            return current;

        // Is this the same author as the current one we're processing
        if (current != null && current.id == a.id)
        {
            // Yes, just add this post to the current author's collection of posts
            current.posts.Add(p);

            // Return null to indicate we're not done with this author yet
            return null;
        }

        // This is a different author to the current one, or this is the 
        // first time through and we don't have an author yet

        // Save the current author
        var prev = current;

        // Setup the new current author
        current = a;
        current.posts = new List<post>();
        current.posts.Add(p);

        // Return the now populated previous author (or null if first time through)
        return prev;
    }
}

I think the comments above explain pretty clearly what's happening - we're simply holding back the author until we detect a new one and adding the articles to a collection of posts on the current author object. We use it like this:

var authors = db.Fetch<author, post, author>(
    new AuthorPostRelator().MapIt,
    "SELECT * FROM authors LEFT JOIN posts ON posts.author = authors.id ORDER BY posts.id"
    );

Bi-directional Mapping and Mapping More than Two Object

In the above examples I either map an author to a post or add a post to an author's list. There's no reason the relator couldn't do both so that the references both ways are created. I didn't include this in the examples in order to demonstrate the point at hand but you get the idea.

Finally, the above examples show how to relate two objects. If you're joining more than two tables you'll need something more complex but it's really just extensions of the above.

« PetaPoco - What's new in v4.0 PetaPoco - Partial Record Updates »

4 Comments

This is all nice and dandy, but I still think that instance identity should be preserved. It would make things much cleaner and there would be no need to do these hacks. Especially the last one (one to many example) is hackish, because we have to assure that results are ordered so that first entities don't mix. They have to be ordered correctly. This may be a problem, when you have to order results in other ways than identity of the first entity. What would you do then? You'd have to return empty result and fill an external collection of entities. This would easily be avoided if PetaPoco would preserve instance identity out of the box.

The thing is that libraries (rather often) set some sort of a standard how to work with them and then we have to adopt OUR code to make it actually work. This brings in the problem of learning curve. Instance identity makes this library a tad worse.

If you ask me the best solution would be to: 1. Support instance identity out of the box (you're doing auto mapping as well) because it's not so much of an overhead so you could say that this slows down your lib. 2. provide methods that allow for more complex manipulation using custom relators

If #1 would at least support simple single property identity it would cover 95% (or even more) of all scenarios. Complex identity keys would then be covered with custom relators.

I vote for out of the box instance identity to be part of PetaPoco and not my code.

3 June 2011 08:35 AM

When I said Instance identity makes this library a tad worse. I actually meant: not supporting instance identity out of the box makes this library a tad worse. You're actually forcing majority of developers to write more code than needed. I'm simply saying that a choice would be nice. Both Fetch methods could be implemented (either as two methods or overload of the same one) that would support one or the other. And we could decide which one to use based on our needs.

3 June 2011 08:43 AM
bennie

Brilliant examples, clear explaination, easy to follow. Thanks.

29 November 2012 09:20 AM
Richard Edwards

In your simple auto mapper example, is there any way to use the Paged fetch for that? I noticed that all of the PetaPoco Page methods only include a single type.

4 December 2012 05:03 PM

Leave a comment

Name (required)
Email (required, not shown, for gravatar)
Website (optional)
Your Message
Leave these blank: