As Hibernate uses JDBC in the background, all the operations we perform on entites, like persisting, deleting or modifying state, result in a set of SQL queries being executed against a database. Naturally we want to run as few queries as possible, but because object and relational worlds are very different, the way we configure entities may sometimes be sub-optimal and result in N+1 Selects or Cartesian Product Problems. This article focuses on JPA N+1 Selects Problem and the ways of preventing it.

Code Example

N+1 Selects problem is best illustrated with an example.
Lets examine the following Doctor entity. Each Doctor can be booked to many Appointments. The relation between Doctor and Appointment is a bi-directional many to one.

@Entity
public class Doctor {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;

    @OneToMany(mappedBy = "doctor", fetch = FetchType.LAZY)
    private Collection<Appointment> appointments;

The Appointment entity will look as follows

@Entity
public class Appointment {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private LocalDateTime appointmentTime;

    @JoinColumn(nullable = false)
    @ManyToOne
    private Doctor doctor;

Now examine the following test method. It persists 10 doctors, each with single Appointment. In another transaction it fetches all the available Doctors and asserts the size of Appointment collection is non-zero.

@Test
public void whenLazyLoadingCollection_andAccessingItems_thenNPlusOneSelects() {
	createDoctorsWithAppointments(10, 1);

	doctorDao.executeInTransaction(entityManager -> {
		List<Doctor> doctors = entityManager.createQuery("select d from Doctor d", Doctor.class).getResultList();
		for (Doctor doctor : doctors) {
			Assert.assertTrue(doctor.getAppointments().size() > 0);
		}
	});
}

Now take a look how many queries Hibernate generated

select doctor0_.id as id1_1_, doctor0_.name as name2_1_ from Doctor doctor0_
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?
select appointmen0_.doctor_id as doctor_i3_0_0_, appointmen0_.id as id1_0_0_, appointmen0_.id as id1_0_1_, appointmen0_.appointmentTime as appointm2_0_1_, appointmen0_.doctor_id as doctor_i3_0_1_ from Appointment appointmen0_ where appointmen0_.doctor_id=?

N+1 Selects Problem

Hibernate executes a single select statement to fetch all rows from Doctor table. Because Appointments are lazily loaded, Hibernate needs another select statement to fetch Appointments for each Doctor. A total of 11 queries is executed for 10 doctors. Accessing all the items of lazily loaded entity association leads to what we call an N+1 Selects Problem. When the number of Doctors is large, this problem becomes very serious.

Preventing N+1 selects problem

It’s worth noting that by default, @OneToMany and @ManyToMany associations are fetched lazily. In order to avoid N+1 selects problem, its best to override the LAZY Fetch Plan whenever you know you’ll need to access all the elements of associated entity collection. It’s best achieved by optimizing the JPQL select query to fetch all the associated entities at once.

select d from Doctor d join fetch d.appointments

This results in a single sql query for the whole data set.

select doctor0_.id as id1_1_0_, 
appointmen1_.id as id1_0_1_, 
doctor0_.name as name2_1_0_, 
appointmen1_.appointmentTime as appointm2_0_1_, 
appointmen1_.doctor_id as doctor_i3_0_1_, 
appointmen1_.doctor_id as doctor_i3_0_0__, 
appointmen1_.id as id1_0_0__ 
from Doctor doctor0_ 
inner join 
Appointment appointmen1_ on doctor0_.id=appointmen1_.doctor_id

If you’re willing to fetch the full set of Doctors including the ones who do not have any Appointments it’s best to use left join fetch query.

select d from Doctor d left join fetch d.appointments

As you can observe in the Cartesian Product Problem article, changing globally to EAGER fetch plan for @OneToMany and @ManyToMany associations is not really a good idea, this section advises to use a query whenever you’re sure you’ll access all the related entities. Entity configuration should remain LAZY loaded.

Pre-loading Data With @BatchSize

Please be aware that this approach uses a Hibernate annotation so it’s applicable only when you’re using Hibernate as your JPA provider.
@BatchSize annotation allows to initialize Appointment entity collections for several Doctors. Whenever Hibernate has to initialize Appointment collection for one Doctor and it notices that it has several more Doctors in its Persistence Context it will also go and initialize Appointments for few more Doctors. The number of initialized Doctors is defined within the @BatchSize annotation.

@Entity
public class Doctor {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;

    @OneToMany(mappedBy = "doctor", fetch = FetchType.LAZY)
    @org.hibernate.annotations.BatchSize(size = 5)
    private Collection<Appointment> appointments;

Whenever Hibernate initializes one Doctor – Appointments collection it will also initialize Appointments for four more Doctors. The query which Hibernate generates is as follows.

select appointmen0_.doctor_id,
	 appointmen0_.id,
	 appointmen0_.id, 
	 appointmen0_.appointmentTime, 
	 appointmen0_.doctor_id 
	from Appointment 
	 where appointmen0_.doctor_id in (?, ?, ?, ?, ?)

In our example of 10 doctors, a ceiling of N + 1 / @BatchSize, which is 10 + 1 / 5 = 3 total queries will be executed to fetch Doctors and all the Appointments. It’s not really a great optimization as still more queries are executed than we would like, but it is some improvement.
Please also bear in mind that it’s a global optimization. Once this annotation is present, referring to one of our Doctor’s appointments will result preloading Appointments for another four Doctors.

Pre-loading Full Collections with @Fetch annotation

This approach is also a Hibernate feature. When @Fetch annotation is used on Appointments collection, Hibernate will initialize Appointments for all Doctors once one Appointment collection is initialized. Here’s how to configure it in Doctor entity.

@Entity
public class Doctor {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String name;

    @OneToMany(mappedBy = "doctor", fetch = FetchType.LAZY)
    @org.hibernate.annotations.Fetch(value = FetchMode.SUBSELECT)    
    private Collection<Appointment> appointments;

Hibernate remembers the original query retrieving Doctors and uses it in a subselect to initialize all related Appointments. This way all Doctors in Persistence Context have their Appointments initialilzed with one additional SQL query visible below.

select appointmen0_.doctor_id, 
	 appointmen0_.id, 	
	 appointmen0_.appointmentTime, 
	 appointmen0_.doctor_id 
	from Appointment appointmen0_ 
	 where appointmen0_.doctor_id in (select doctor0_.id from Doctor doctor0_)

The total number of SQL selects executed by Hibernate with @Fetch annotation is two. First one fetches Doctors and the second one fetches all the Appointments related to already fetched Doctors.

Take a look at following artcles to understand more about Hibernate

2 comments add yours

  1. Thank you for presenting the N+1 Selects problem – good to understand it as a real performance issue :)
    But the described prevention is not really suitable for common use cases. The replacing lazy to eager fetch type often is not a good solution because it results in massive loading whenever we touch parent object. Normally we need lazy fetching, only in case of bulk operations eager fetching would be appropriate. As we know keeping two different mappings (lazy / eager) for the same object is not possible.
    So what do we have at our disposal? First we can redesign our application to meet Command Query Responsibility Segregation (CQRS) pattern. Then we can separate bulk operation and adjust fetching strategy only for that. This is not straightforward and when we have developed our application this separation take a big effort.
    Secondly we can use hibernate @BatchSize feature – it’s a middle way solution. We can significantly decrease number of selects but still some unnecessary loading will be performed even when we don’t need the collection in standard processing. Keep in mind that having nested collections (each element of collection has its own collection) is quite common, which makes the issue even more painful.
    Thirdly we can pre-load intentionally all child entities just before bulk processing. Then they go to the hibernate first level cache and will be available to processing without additional fetching (and N selects).
    Personally some time ago I applied the third solution with good results.
    I would appreciate if you elaborate on this blog a real way of handling the N+1 problem :)

    • Hi Marek!

      Thanks so much for your ideas and being the guardian of complete knowledge :) You’re absolutely right, the approaches you mentioned are definitely worth exploring. I focused on the options available in plain JPA only. @BatchSize and @Fetch approaches are here as well now.
      All the best !

Leave a Comment