design – Error handling for repository: exceptions or wrapping return value?

The repository pattern is defined as:

REPOSITORY: A mechanism for encapsulating storage, retrieval, and search behavior which emulates a collection of objects.

– Eric Evans in DDD

Getting objects from the repository

Typically, you’d have methods to retrieve items such as getByID() returning a single item, getAll(), or some kind of find() returning a collection of items that match some criteria.

Let’s start with the collection-based methods:

  • Suppose that your repository is empty. getAll() would then be expected to return an empty set. This best emulates a collection of objects.

  • Suppose now that you query using some criteria that no record meets. Again, the most logical return value would be an empty collection. If you returned an error instead, you’d risk being inconsistent: what would you return for a dynamic query that specifies no criteria on an empty repository, without contradicting the equivalent behavior of getAll()?
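
As an illustration of the two points above, here is a minimal sketch in Python. The `InMemoryRepository` class, the `Item` type, and the `find(predicate)` signature are hypothetical names chosen for this example, not taken from any particular library:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Item:
    id: int
    name: str


class InMemoryRepository:
    """Hypothetical repository backed by a dict; for illustration only."""

    def __init__(self) -> None:
        self._items: dict[int, Item] = {}

    def add(self, item: Item) -> None:
        self._items[item.id] = item

    def get_all(self) -> list[Item]:
        # An empty repository yields an empty list, never an error.
        return list(self._items.values())

    def find(self, predicate: Optional[Callable[[Item], bool]] = None) -> list[Item]:
        # No criteria behaves exactly like get_all();
        # criteria that nothing matches yield an empty list.
        if predicate is None:
            return self.get_all()
        return [item for item in self._items.values() if predicate(item)]
```

With this shape, a caller can always iterate over the result without a special case for "nothing found".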

For the item-based getById(), the choice is not so clear-cut:

  • You could argue that returning a single item is just a convenience, to avoid working with collections made of a single element. The best option would then be to return null when nothing is found (i.e. similar to an empty set, for no items). This is particularly useful if IDs are managed externally (e.g. provided by the user, or meaningful) and if the developers are used to handling the returned result with caution.
  • You could just as well argue that getById() is generally used with known IDs that are assumed to exist. In this situation a null might be unexpected, and it creates a risk as soon as someone forgets, even once, to check the validity of the returned object. The best option would then be to throw an exception.

So the choice should be driven by the practices and expectations in your context. Personally, I have a preference for null, although it requires more discipline.
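
The two options above could be sketched as follows in Python, where `None` plays the role of null. The names `find_by_id`, `get_by_id`, and `ItemNotFoundError` are assumptions made for this example; a real repository would normally offer only one of the two variants:

```python
from typing import Optional


class ItemNotFoundError(LookupError):
    """Hypothetical exception used by the 'throw' variant."""


class Repository:
    def __init__(self, items: dict[int, str]) -> None:
        self._items = dict(items)

    def find_by_id(self, item_id: int) -> Optional[str]:
        # Variant 1: absence is a normal outcome; the caller must check for None.
        return self._items.get(item_id)

    def get_by_id(self, item_id: int) -> str:
        # Variant 2: the ID is assumed to exist; absence is treated as an error.
        try:
            return self._items[item_id]
        except KeyError:
            raise ItemNotFoundError(f"no item with id {item_id}") from None
```

The `Optional[str]` return type at least makes the "may be missing" case visible to a type checker, which reduces the discipline needed for the first variant.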

Changing the content of the repository

Typically you’d have some kind of add(), remove(), and perhaps update() that would change the repository content. When you use these functions, you generally expect them to work.

A frequent practice is to have methods with no return value (void). In this case, you have no choice but to raise an exception in case of an unexpected error. This is in fact very practical if you have to manage transactions that span several repositories, since you can easily catch any error within a transaction and roll back the whole transaction.
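
A minimal sketch of this approach, assuming toy in-memory repositories and a hypothetical `transaction` helper that snapshots the repositories and restores them on any error (a real data store would use its own transaction mechanism instead):

```python
from contextlib import contextmanager


class RepositoryError(Exception):
    """Hypothetical error raised by mutating repository methods."""


class Repository:
    def __init__(self) -> None:
        self.items: dict[int, str] = {}

    def add(self, item_id: int, value: str) -> None:
        # No return value: failure is signalled only by an exception.
        if item_id in self.items:
            raise RepositoryError(f"duplicate id {item_id}")
        self.items[item_id] = value


@contextmanager
def transaction(*repos: Repository):
    # Toy transaction: snapshot every repository, restore all of them on any error.
    snapshots = [dict(repo.items) for repo in repos]
    try:
        yield
    except Exception:
        for repo, snapshot in zip(repos, snapshots):
            repo.items = snapshot
        raise


orders, invoices = Repository(), Repository()
invoices.add(1, "existing invoice")
try:
    with transaction(orders, invoices):
        orders.add(1, "order")      # succeeds
        invoices.add(1, "invoice")  # raises, so both repositories are restored
except RepositoryError:
    pass  # a real caller would report the failure here
```

Note that a single `except` around the whole transaction covers every repository call inside it; no individual call needs its own check.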

But you’ll also find some repository implementations that provide these methods with an error code as return value instead. This is more flexible for error handling, but if most of the time you need to check the error code and raise an exception to roll back some transaction, you’ll end up with a lot of boilerplate code.

The following strong argument is in favor of the first approach:
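
To make the boilerplate concrete, here is a sketch of the error-code style, with a hypothetical `Status` enum, `TransactionAborted` exception, and `place_order` function invented for this example:

```python
from enum import Enum


class Status(Enum):
    OK = 0
    DUPLICATE = 1


class TransactionAborted(Exception):
    """Hypothetical exception used to trigger a rollback in the caller."""


class Repository:
    def __init__(self) -> None:
        self.items: dict[int, str] = {}

    def add(self, item_id: int, value: str) -> Status:
        # Failure is reported through the return value, not an exception.
        if item_id in self.items:
            return Status.DUPLICATE
        self.items[item_id] = value
        return Status.OK


def place_order(orders: Repository, invoices: Repository, order_id: int) -> None:
    # Every repository call needs its own check-and-raise block.
    status = orders.add(order_id, "order")
    if status is not Status.OK:
        raise TransactionAborted(f"orders.add failed: {status.name}")

    status = invoices.add(order_id, "invoice")
    if status is not Status.OK:
        raise TransactionAborted(f"invoices.add failed: {status.name}")
```

Each additional repository call adds another three lines of checking, and forgetting one of them silently ignores the failure.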

I personally don’t like these methods to answer Boolean results as do full-fledged collections. That’s because in some cases answering true to an add-type operation does not guarantee success. The true results may still be subject to a transaction commit on the data store. Thus, void may be the more accurate return type in the case of a Repository.

– Vaughn Vernon in Implementing DDD

Your additional question about validation

The scenario that you describe is not crystal clear, and does not seem to be related to the repository pattern itself. Unfortunately we know too little about your specific case, and I invite you to formulate a distinct question focused only on that last aspect. Nevertheless, here are some first thoughts:

Performing validations in the front-end to guide the user, while still doing a thorough validation in the back-end, is a very common situation (you cannot rely on front-end validation to ensure corporate consistency).

An error in the back-end validation does not necessarily have to cause the user’s entry to be lost. You can probably structure the interaction in such a way as to inform the user of the back-end problem and offer to edit and resubmit the data. There are trickier strategies to cope with concurrency errors.