Tuesday, December 18, 2012

Hibernate Sessions

I have been using Hibernate for a very long time; at least nine years if not more. It is perhaps the best known ORM tool in the Java/.NET world today. There are many alternatives but none have the feature set or maturity of Hibernate. Hibernate is perhaps not the easiest tool to use, partly due to its long history, and partly because the problem it tries to solve is complicated, but you can get started with some basic information quite nicely. All criticism aside, it is still a fine tool for a general purpose ORM approach, and perhaps the best one to use for bigger applications.

Sessions seem like an appropriate place to start when navigating the Hibernate waters, so let us examine that topic.

Sessions

Working with Hibernate is built around sessions. This is the Unit of Work pattern, and to use Hibernate correctly, you must understand this concept well.

The pattern you use to work with Hibernate is as follows:
  • Open a session.
  • Begin a transaction.
  • Using the session, query persistent mapped objects,
  • and/or add new objects to the session,
  • and/or delete queried objects,
  • and/or modify objects queried or added.
  • Commit the transaction.
  • Close the session.
This is your logical transaction, the Unit of Work that you perform, per action, in your application. Note that while this is often the same as a database transaction, these concepts are not equivalent. You may, if you so choose, open a session and transaction, query objects, make changes, flush those changes to a database (commit), open another transaction, make more changes, flush (another commit), and so on. One session, many transactions. This allows you to keep tracking all the changes that you are making but stage your database changes in several steps.

The same basic steps in C# code using NHibernate:

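using (var session = sessionFactory.OpenSession())
using (var tx = session.BeginTransaction())
{
 // Load an existing object; the session tracks changes made to it.
 var customer = session.Load<Customer>(customerId);
 customer.Name = "New Name";

 // Introduce a new object to the session.
 var newCustomer = new Customer(anotherCustomerId, "Another Customer");
 session.Save(newCustomer);

 // Both the update and the insert are persisted on commit.
 tx.Commit();
}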

What do you get by doing this?
  • All mapped objects that are introduced to the session are tracked for changes.
  • Once the transaction is committed, all those changes are persisted to the database.
  • All objects within the session are cached. Objects accessed by their IDs come from the cache if already there. This includes all session.Load/Get calls and objects loaded via relationships by their IDs.
  • Hibernate can batch your updates to the database. Say you made 100 changes to objects; if you had set your batch size to 100, you will likely update everything in one database round trip. Hibernate is also smart enough to flush changes to the database when it needs to, so you don't have to worry about when to flush things manually. It is enough just to commit the transactions and close the sessions as explained above.
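
The "one session, many transactions" variation mentioned earlier can be sketched roughly as follows. This is only an illustration under assumptions: the Customer class, the StagedChanges helper, and the idea that a mapping and a configured ISessionFactory already exist are all hypothetical and not taken from a real application.

```csharp
using NHibernate;

// Hypothetical mapped entity; its mapping is assumed to exist elsewhere.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public static class StagedChanges
{
    // One session (one logical unit of work), two database transactions.
    public static void RenameInTwoSteps(ISessionFactory sessionFactory, int firstId, int secondId)
    {
        using (var session = sessionFactory.OpenSession())
        {
            // First stage: its changes are written when this transaction commits.
            using (var tx = session.BeginTransaction())
            {
                var first = session.Get<Customer>(firstId);
                if (first != null) first.Name = "First Change";
                tx.Commit();
            }

            // Second stage: same session, so objects loaded in the first
            // stage are still tracked and still in the session cache.
            using (var tx = session.BeginTransaction())
            {
                var second = session.Get<Customer>(secondId);
                if (second != null) second.Name = "Second Change";
                tx.Commit();
            }
        }
    }
}
```

Whether staging changes like this is worth it depends on the action; for most actions a single transaction per session is the simpler choice.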

Problems with Sessions

If you deviate from the mentioned session usage pattern, you will run into issues, and you will complicate your life tremendously.

Objects Outside Sessions

Hibernate does not know how to deal with any objects that it is not tracking within a session. You know you have messed up session management when you run into "non unique object", "not persistent object", "non transient object", "lazy loading", or "no active session" exceptions.

What these kinds of exceptions mean is that you are trying to interact with Hibernate using objects that were never introduced to the session, that the session was already closed, or that the objects were introduced in another session (which is now obviously closed). All the objects that you want Hibernate to track should be loaded, queried, modified, and deleted in the same session. If not, you must explicitly reintroduce (merge) the objects back into the session. If you have lazy loading references or collections, they can only be accessed within the same active session. Since lazy loading is an important concept when working with Hibernate, it is also perhaps the most common scenario where the problems arise.
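
As a minimal sketch of reintroducing an object, assuming a hypothetical mapped Customer class and an already configured ISessionFactory, a detached object can be merged into a second session like this:

```csharp
using NHibernate;

// Hypothetical mapped entity; its mapping is assumed to exist elsewhere.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public static class DetachedExample
{
    public static void RenameAcrossSessions(ISessionFactory sessionFactory, int customerId)
    {
        Customer detached;

        // First unit of work: load the object, then close the session.
        using (var session = sessionFactory.OpenSession())
        using (var tx = session.BeginTransaction())
        {
            detached = session.Get<Customer>(customerId);
            tx.Commit();
        }

        if (detached == null) return; // nothing stored under that ID

        // The object is now detached; no session is tracking this change.
        detached.Name = "Changed While Detached";

        // Second unit of work: Merge copies the detached state onto an
        // instance tracked by this session, so the commit persists it.
        using (var session = sessionFactory.OpenSession())
        using (var tx = session.BeginTransaction())
        {
            session.Merge(detached);
            tx.Commit();
        }
    }
}
```

Note that Merge returns the instance the session is tracking; the object you passed in stays detached.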

Lazy Loading and UI Rendering

People often use Hibernate loaded objects while rendering some type of user interface. A web page is typically rendered by passing some Hibernate objects to the view template engine. The problem with this is that objects may have lazy loading members that are loaded only at the time of access, not when the parent object was originally loaded. When a web page is rendered from the template, it may be that the Hibernate session is already closed because control has already moved away from the code that programmers write (in a controller, for example).

For this type of approach to work, the session must be open, even during page rendering. This is often referred to as the Open Session in View pattern. However, I can't wholeheartedly recommend this pattern, even though it solves the problem. It overlooks the fact that views are often composed from more than one action in real life, and are more naturally represented by individual sessions, one per action. Also, domain/Hibernate models are often not the same thing as view models, and it might be better to actually translate domain (Hibernate) models to view models and back again to domain models. I am using the term "domain model" here quite liberally, but distinctly as a separate concept from view models.

Many of us who use MVC (Model View Controller) frameworks for our web applications often struggle with the "model" concept. Typical frameworks do not really require anything from the M in MVC, so developers are often left to come up with their own idea of what the M means. Sadly, this also leads to misuse of tools like Hibernate and, more generally, to mixing different layer concepts. The M in MVC is the mental model of the user (what the user sees on the screen), not an internal representation of the domain (the programmer's mental model). The domain model, incidentally, is what you load and manipulate with the help of Hibernate, but it is not the M in MVC. The controller (C) is where the translation between the mental models happens.
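
To make the translation concrete, here is a minimal sketch of mapping a domain object to a view model while the session is still open. All names (Customer, Order, CustomerViewModel, CustomerMapper) are hypothetical, and the domain classes are plain stand-ins for what would be Hibernate mapped classes.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain classes; with NHibernate, Orders would typically
// be a lazily loaded collection.
public class Order
{
    public virtual string Number { get; set; }
}

public class Customer
{
    public virtual string Name { get; set; }
    public virtual IList<Order> Orders { get; set; }
}

// Hypothetical view model: plain data only, nothing lazy left to load.
public class CustomerViewModel
{
    public string Name { get; set; }
    public List<string> OrderNumbers { get; set; }
}

public static class CustomerMapper
{
    // Intended to be called inside the action, while the session is open.
    // ToList() reads the collection here rather than during rendering.
    public static CustomerViewModel ToViewModel(Customer customer)
    {
        return new CustomerViewModel
        {
            Name = customer.Name,
            OrderNumbers = customer.Orders.Select(o => o.Number).ToList()
        };
    }
}
```

The view then receives only the CustomerViewModel, so rendering never touches a lazy member after the session has been closed.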

Session Infrastructure

Managing sessions means repeating a lot of boilerplate code, unless you use infrastructure. You want to hide session management in common scenarios so you do not need to worry about it a whole lot. Obviously, you must still be aware that all of that machinery is still there under the hood.

One of the best pieces of advice in this regard comes from the Ayende@Rahien blog series about odorless and frictionless code.

Note that this is done for .NET MVC, but many frameworks have similar hookup points to do infrastructure work. The action filter code in the example wraps around your action call and repeats the usage pattern boilerplate code that I explained. This is simply an "around advice" or "interceptor" in AOP terms.

It also demonstrates the point about "actions". There can be many actions in a web request, where each action executes a particular job for the page. Think of the actions as the little windows or blocks that typical designs render on a web page, each separate from the others. Sessions should last only as long as the action does, and the same goes for transactions.
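
Independent of any particular web framework, the "around" idea can be sketched as a small helper that wraps an action in the session usage pattern. This is not the code from the blog series mentioned above; UnitOfWork and Run are hypothetical names, and a configured ISessionFactory is assumed.

```csharp
using System;
using NHibernate;

public static class UnitOfWork
{
    // Wraps one action in open session / begin transaction / commit / close.
    public static void Run(ISessionFactory sessionFactory, Action<ISession> action)
    {
        using (var session = sessionFactory.OpenSession())
        using (var tx = session.BeginTransaction())
        {
            try
            {
                action(session);
                tx.Commit();
            }
            catch
            {
                // The action failed: undo its changes and let the caller see the error.
                tx.Rollback();
                throw;
            }
        }
    }
}
```

An action would then be written as UnitOfWork.Run(sessionFactory, session => session.Save(newCustomer)); without repeating the boilerplate.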



Thursday, November 15, 2012

Singleton or Simpleton?

Everyone knows that the Singleton pattern is bad, right? But is it really?

What is a Singleton?

The Design Patterns book says: "Ensure a class only has one instance, and provide a global point of access to it." The motivation for this pattern is to have one instance of a class. The book continues that the class itself should make sure that there is only one instance.

Implementation

A lot of space in the book is dedicated to the implementation. It appears that it is quite tricky to create only one instance of a class. The book's example, and probably most implementations in various languages, require the use of static variables and methods. A typical implementation would look something like this:

public sealed class Singleton
{
   private static volatile Singleton _instance;
   private static readonly object _syncLock = new object();

   public static Singleton Instance
   {
      get 
      {
         if (_instance == null) 
         {
            lock (_syncLock) 
            {
               if (_instance == null) 
                  _instance = new Singleton();
            }
         }

         return _instance;
      }
   }

   private Singleton() {
      //Prevent new Singleton().
   }
}


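  • Lazy initialization with double-checked locking. The instance is created only when it is needed.
  • Volatile instance variable to ensure that the assignment completes before other threads can access the instance.
  • Thread safe. Uses a separate lock object.
  • Private constructor prevents use of the new operator from outside.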

Getting it right is not that easy and usually requires quite a bit of knowledge.
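
On .NET 4 and later, much of this ceremony can be delegated to System.Lazy<T>. The following is an alternative sketch, not the book's implementation; LazySingleton is a hypothetical name.

```csharp
using System;

public sealed class LazySingleton
{
    // With this constructor, Lazy<T> uses thread-safe initialization by
    // default, so the factory delegate runs at most once.
    private static readonly Lazy<LazySingleton> _instance =
        new Lazy<LazySingleton>(() => new LazySingleton());

    public static LazySingleton Instance
    {
        get { return _instance.Value; }
    }

    private LazySingleton()
    {
        // Prevent new LazySingleton() from outside the class.
    }
}
```

This removes the hand-written locking, but it is still a static Singleton with all the design consequences that follow.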

Is it Evil?

Let us list some of the most commonly mentioned "evil" things:
  • The Singleton code is hard to get right and it instantly becomes boilerplate that you copy to other Singletons over and over.
  • Singletons are shared instances, and so they must be thread safe. This easily forgotten fact causes many unexpected problems.
  • Can't really subclass a Singleton - you will break Liskov and other OOD principles to do this.
  • Related to the previous point: you cannot make static instantiation functions abstract or virtual, thus negating any kind of abstract factory type behavior. You are stuck with tying the creation of the instance to the Singleton class itself. You can try returning an interface, but it still requires a base class to know about its derivatives directly or indirectly.
  • Static method access allows you to hide API dependencies. Your API does not have to declare which singletons it is using inside. The behaviors that classes using singletons demonstrate can be unexpected and surprising - from the outset it is not obvious that code executes database queries, or does some other "magic" under the hood. Note that this does NOT remove dependencies, but it makes them implicit!
  • Switching singleton implementations is hard, and so unit testing becomes hard as well. You cannot easily mock the behavior of the singleton, and you may not be able to replace the dependent behavior at all because the code under test calls a singleton directly.
  • Using Singletons (like any global) lavishly means that you are hard wiring your software leading to monolithic designs. When widely used, software starts to resemble procedural designs familiar from C and comparable languages.
  • Singletons are Singletons, until they are not. We sometimes make a wrong design choice, or a bad prediction; what used to be ONE no longer is. Sadly, you are now stuck with it, and your only reasonable option in the short term might be to break the Singleton and create many copies, usually by delegation or an embedded factory inside the instance() function.
  • The ever-lasting reference to the single instance cannot be garbage collected without active "memory management". If your Singletons are big, and hold references to other things, you might be holding onto more stuff than you originally planned.
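
The points about hidden dependencies and testing can be illustrated with a small sketch. Every name here (IClock, SystemClock, HiddenGreeter, ExplicitGreeter, FixedClock) is hypothetical and exists only for the example.

```csharp
using System;

public interface IClock
{
    DateTime Now { get; }
}

// A traditional static Singleton.
public sealed class SystemClock : IClock
{
    public static readonly SystemClock Instance = new SystemClock();
    private SystemClock() { }
    public DateTime Now { get { return DateTime.Now; } }
}

// Hidden dependency: nothing in the public API reveals the use of
// SystemClock, and a test cannot control what time it reports.
public class HiddenGreeter
{
    public string Greet()
    {
        return SystemClock.Instance.Now.Hour < 12 ? "Good morning" : "Good day";
    }
}

// Explicit dependency: the constructor declares what the class needs,
// so a test can hand in a fake clock.
public class ExplicitGreeter
{
    private readonly IClock _clock;

    public ExplicitGreeter(IClock clock) { _clock = clock; }

    public string Greet()
    {
        return _clock.Now.Hour < 12 ? "Good morning" : "Good day";
    }
}

// Fake clock used only in tests.
public class FixedClock : IClock
{
    private readonly DateTime _now;

    public FixedClock(DateTime now) { _now = now; }

    public DateTime Now { get { return _now; } }
}
```

The result of HiddenGreeter.Greet() depends on when the test happens to run, while ExplicitGreeter can be checked for any time of day by passing in a FixedClock.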

But It is Popular!

All of that does not make the pattern very appealing, but still, there are always Singletons floating around. For one, some designs call for one instance of a class. It is quite legitimate to have such a requirement but somehow I doubt that the single instance argument made it popular.

Once developers learned how to create Singletons, as the Design Patterns book showed, it became a very popular pattern to use. The appeal of easy access to one instance of a class made sure of its "success". Many design questions were now easy to answer - you just call for the Singleton whenever you need the functionality of the object, right there in the code where you need it. You can do away with properties and constructor arguments, and you no longer have to concern yourself with having to hand in the dependencies the old fashioned way. From outside, classes look quite neat without those pesky argument lists and property pollution. Looks are deceiving.

It just became another way to go back to programming with "global variables". I also call these kinds of static accessor methods "Russian doll design". You have to keep opening the Russian dolls to get to the bottom of the rather surprising functionality.

Implementation Dictates Context

What really hurts this pattern is its implementation. It is fine to have a requirement of a single instance, but it may not be fine to implement it with static variables and methods. When we talk about patterns, we must also remember what they are: a design that solves a common problem within a context. Yes, within a context.

The Singleton pattern as presented above only exists in one particular context based on its implementation. That context is very limited to the utility of "static" variables and functions in the particular programming language. Yet, people treat it as a general purpose solution, and that ultimately dooms this pattern. Its utility might be very limited, contrary to its popularity.

Consider what happens across processes, or even class loaders that exist for many languages that have Virtual Machines. You can not guarantee a single instance easily with this kind of implementation and you may be surprised to find your application misbehaving if you expected to be safe from such things.

Utility Based on Context

Rather than bashing this pattern endlessly, let us expand the horizons a little bit and concentrate on the idea of patterns in a context. The Singleton "idea" is actually one that we use all the time, widely, in bounded contexts. The idea of Singleton - one instance in a context - is very useful.

To fix our approach to Singletons, we must first separate the creation of a Singleton from its implementation. In short, this means that the application of static members or functions is not allowed, and that any class can be a Singleton, or not, without us having to change anything about that class. We will only change the creation of that class, usually by configuration.

This is a solved problem these days, and has been for quite a long time. Popular Dependency Injection (DI) frameworks support the Singleton in different contexts. For web applications we might use "Request", "Session", or "Application" context, and DI containers like Spring can manage the context for you. We can also have the more traditional Singleton within a single container.

Anyone who uses DI frameworks has used these kinds of Singletons, and used them widely. Yet, we do not really think about it a whole lot, because we do not have to. The "evil" parts have gone away. Using Singletons like this, in a context, works naturally and there is no ultimate requirement to have just ONE - just one in a context.
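
To show the idea of "one instance in a context" without tying it to any real container, here is a deliberately tiny stand-in. Scope and ShoppingCart are hypothetical names; real DI containers offer far more, and this sketch is not thread safe.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a DI container's scope handling.
public class Scope
{
    private readonly Dictionary<Type, Func<object>> _factories;
    private readonly Dictionary<Type, object> _instances = new Dictionary<Type, object>();

    public Scope(Dictionary<Type, Func<object>> factories)
    {
        _factories = factories;
    }

    // Returns the one instance of T for this scope, creating it on first use.
    public T Resolve<T>() where T : class
    {
        object instance;
        if (!_instances.TryGetValue(typeof(T), out instance))
        {
            instance = _factories[typeof(T)]();
            _instances[typeof(T)] = instance;
        }
        return (T)instance;
    }
}

// An ordinary class: no statics, no private constructor. Whether it is
// a "Singleton" is decided by how it is created, not by the class.
public class ShoppingCart
{
    public int ItemCount { get; set; }
}
```

Creating one Scope per request (or per session, or per application) gives one ShoppingCart in each of those contexts, while the class itself knows nothing about it.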

Legitimate Uses of the Traditional Singleton

Since we talked about the context, there must be a context where the traditional Singleton approach is still valid. Let's call this a single instance in a process or class loader.

Typical examples include getting access to cross cutting features such as instrumentation or infrastructure that has to have a single point of entry in that context. A typical DI context might not be enough to solve this problem, and even the DI contexts must be created and initiated somewhere.

The traditional implementation still has many downsides when it comes to instantiating different implementations, so what many approaches to "access roots" prefer is some kind of dynamic "service locator" that allows a configurable implementation:

var instance = Locator.GetInstance<IMySingleton>();

The service locator can choose the appropriate implementation, and can also choose how many instances are created. The access still clearly uses statics, which brings us many of the same woes that Singletons did, such as Russian doll design.
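
A minimal sketch of such a locator might look like the following. Only the Locator.GetInstance<IMySingleton>() call shape comes from the line above; the Register method, the DefaultSingleton class, and the Name property are assumptions made for illustration, and the sketch is not thread safe.

```csharp
using System;
using System.Collections.Generic;

public interface IMySingleton
{
    string Name { get; }
}

// Hypothetical implementation chosen by configuration.
public class DefaultSingleton : IMySingleton
{
    public string Name { get { return "default"; } }
}

public static class Locator
{
    private static readonly Dictionary<Type, Func<object>> _providers =
        new Dictionary<Type, Func<object>>();

    // The provider decides which implementation is used and how many
    // instances exist: return a shared object, or create a new one each time.
    public static void Register<T>(Func<T> provider) where T : class
    {
        _providers[typeof(T)] = () => provider();
    }

    public static T GetInstance<T>() where T : class
    {
        Func<object> provider;
        if (!_providers.TryGetValue(typeof(T), out provider))
            throw new InvalidOperationException("No provider registered for " + typeof(T).Name);
        return (T)provider();
    }
}
```

At application start you might write var shared = new DefaultSingleton(); followed by Locator.Register<IMySingleton>(() => shared); so that every later GetInstance call returns the same object.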

This may not be so bad with the named cross cutting infrastructure, such as logging or configuration. Purists would still prefer to use techniques like AOP (Aspect Oriented Programming) to provide "transparent" logging and instrumentation, but this might not give enough insight into what you should log, for example. For practical purposes, you see log instantiation code similar to the example above.

Many, if not all, DI containers also double as service locators. However, this is not their primary mode of operation and they should not be generally used to fulfill this purpose.

But what about the actual traditional Singleton? It may just be that there are no real dependable uses for it, or that they should be very rare. We are still forced to deal with such code because we deal with libraries and other dependencies that use the Singleton pattern. Their use should be limited to as few spots in your code as possible.

Static Injection

I am going to mention one special case that came up recently, which involves DI containers and Singletons.

Having to inject dependencies during object creation can have its own problems. Sometimes people talk about Spring configuration hell, where injecting dependencies becomes nightmarish in the sea of hundreds and even thousands of classes. 

To combat some of these issues, it might be acceptable to use Singletons and the static injection features of DI containers. Here you would let the DI container call the instance factory function of the Singleton to create the instance, and then inject its dependencies. It is then possible to use such Singleton classes directly instead of injecting the instances. This approach can have the same alluring appeal to be replicated everywhere as the Singleton itself, but it also has most if not all of the problems.

Yet, this approach can have limited use:

  • Make sure that the Singletons are used only in one closed and limited context. These Singletons may not be shared or reused between instances of several classes, for example.
  • Do not use this approach for your business code, infrastructure, or code that is widely used as lower layers of your software where you can likely never guarantee that Singleton is the right way to go. Using this approach is OK only on the top application layer.

Wednesday, September 12, 2012

The Allure of Easy Answers

People are attracted to easy answers for difficult problems. We often refer to these "easy answers" as fads once they gain popularity and some level of authority. Management fads are quite popular in our IT community for the simple reason that they attempt to answer the old problem of creating hyper productivity.

Many IT organizations are mired in problems creating results. If only those pesky developers would deliver what I want! Expectations turn into disappointments when the time lines are not what companies want, or when companies can't fit their dreams into the budgets that they desire.

Agile! Yes, that will rocket our organization to unparalleled productivity. It's also very simple. I can see it fits on a pamphlet and it tells you what to do. Besides, they have Scrum Master courses, where within days you can now turn your organization into a well-oiled machine. All the experts say so, and we can hire Agile coaches who will come and rescue us if we get off the message.

Ok, so you are "doing" the Agile. You follow all the rules, and have your Scrums and Retrospectives, and Planning, and you tally up all the points and doodads. Cool, there is a graph that now tells us exactly when we are going to be done! This is what management has been asking for all the time.

When the magic wears off, it does not look all that special. It sort of kind of works, but so did other things, sorta kinda. And overall it probably is an improvement if you were stuck in Big Upfront Design. But you can probably get the same just by having Medium Upfront Design, some work list, high level estimates that you keep updating, and some regular check point.

Ten years ago, I remember tracking work on a spread sheet that would calculate the delivery date based on the task estimates. This was the early days of Agile awareness for the greater public. However, any developer worth some gravy would already know the tools to be successful - hence the spread sheet. It's just that when you suddenly throw a layer of bureaucracy and a Project Management Office into the mix, things usually take a nose dive. Welcome loss of sanity.

Suddenly some publicized Agile stuff starts to sound very good because it is marketable. You can sell that to a manager in a nice package, and you can give your graphs and velocity numbers to them, and boast about great progress. Yay! That's the antidote against PMO and layers of confusion!

But I must admit that I am tired of the masquerade. I completely understand the Agile principle because that's what competent developers do on a daily basis, and have done so for a long time. It is just that when all those things that developers do to make things successful are marginalized, externalized and parcel wrapped into a "management fad", the basic ideas of building software systems get lost somewhere. It's more like a regression than an advancement when you start reciting Scrum commandments as the only gospel.

The problem is that fads provide easy and attractive answers. Unfortunately, you can only deliver successful stuff with great people. It's not the number of people, but the kind of people you have. You still have to do the hard work, put the hours in, and be good at a great many things. People are inconsistent, and they do not follow rules. Even if they follow rules, they follow them differently each time, with variable results. If something gets in the way, people stop doing it.

Because people are the way they are, you need adaptability, some lightweight process, and interaction with feedback. That, paired with good people, will give you results. That same thing with the wrong people hardly ever will. In fact, the process probably does not matter much. You can succeed with waterfall consistently if you have the right people.