Hibernate: Initializing a Complex Object

I have problems with fully loading a very complex object from the database in a reasonable amount of time and with a reasonable number of queries.

My object has many nested objects; each of them references other objects, which in turn reference others, and so on (the nesting level is 6).

So, I created an example to demonstrate what I want: https://github.com/gladorange/hibernate-lazy-loading

I have a user.

The user has @OneToMany collections of favorite oranges, apples, grapevines and peaches. Each grapevine has a @OneToMany collection of grapes. Each fruit is another entity with a single String field.

I create a user with 30 favorite fruits of each type, and each grapevine has 10 grapes. So I have 421 entities in the DB: 30 * 4 fruits, 10 * 30 grapes and one user.

What I want is to load them using no more than 6 SQL queries, and no query should produce a large result set (for this example, large means more than 200 rows).

My ideal solution would be the following:

  • 6 queries. The first query returns the user, and its result set size is 1.

  • The second query returns the Apples for this user, and its result set size is 30.

  • The third, fourth and fifth queries do the same as the second (result set size = 30), but for grapevines, oranges and peaches.

  • The sixth query returns the Grapes for ALL grapevines.
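As a sanity check on these numbers, the plan can be written down as plain arithmetic. This is only a sketch: the class and method names are made up for illustration, and nothing here touches Hibernate or the database.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: expected row counts of the six queries in the ideal plan.
final class IdealQueryPlan {
    static final int FRUITS_PER_TYPE = 30; // favorites of each type for the user
    static final int GRAPES_PER_VINE = 10; // grapes per grapevine

    // Maps a query label to the number of rows that query should return.
    static Map<String, Integer> expectedRowCounts() {
        Map<String, Integer> rows = new LinkedHashMap<>();
        rows.put("user", 1);
        rows.put("apples", FRUITS_PER_TYPE);
        rows.put("grapevines", FRUITS_PER_TYPE);
        rows.put("oranges", FRUITS_PER_TYPE);
        rows.put("peaches", FRUITS_PER_TYPE);
        rows.put("grapes", FRUITS_PER_TYPE * GRAPES_PER_VINE);
        return rows;
    }

    // Sum of all result set sizes across the six queries.
    static int totalRows() {
        return expectedRowCounts().values().stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        expectedRowCounts().forEach((query, rows) -> System.out.println(query + ": " + rows));
        System.out.println("total rows: " + totalRows());
    }
}
```

The six result sets add up to 421 rows, one row per entity, so nothing is fetched twice.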

It is very simple in the SQL world, but I cannot achieve this with JPA (Hibernate).

I tried the following approaches:

  • Use a fetch join, e.g. from User u join fetch u.oranges ... This is terrible. The result set is 30 * 30 * 30 * 30 rows, and the execution time is 10 seconds. Number of queries = 3. I tried this without grapes; with grapes the result set would be 10 times larger.

  • Just use lazy loading. This gives the best result in this example (with @Fetch(FetchMode.SUBSELECT) for grapes). But in this case I need to manually iterate over each collection. Also, subselect fetching is too global a setting, so I would like something that works at the query level. Result set sizes and time are near ideal: 6 queries and 43 ms.

  • Loading with an entity graph. Same as the fetch join, but it also issues a query for each grape to get its grapevine. The resulting time is better (6 seconds), but still terrible. Number of queries > 30.

  • I tried to trick JPA by manually loading the entities in separate queries. Like this:

      SELECT u FROM User u WHERE u.id = 1;
      SELECT a FROM Apple a WHERE a.user_id = 1;
    

This is slightly worse than lazy loading, because each collection requires two queries: the first one loads the entities manually (I have full control over this query, including loading associated entities), and the second one is the lazy load of the same entities by Hibernate itself (Hibernate does this automatically).

Execution time is 52 ms, number of queries = 10 (1 for the user, 1 for grapes, 4 * 2 for the fruit collections).

In fact, the "manual" solution combined with SUBSELECT fetching lets me use "simple" fetch joins to load the necessary entities in a single query (for example, @OneToOne associations). So I am going to use it. But I don't like that I need to execute two queries to load a collection.
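To see why the fetch join from the list above explodes, the row count can be estimated as the product of the sizes of the collections joined in one query. A minimal sketch with a hypothetical helper name, plain arithmetic only:

```java
// Hypothetical helper: estimates the rows returned when several collections
// of one parent are fetch-joined in a single query (a Cartesian product).
final class FetchJoinRowEstimate {
    static long rows(int... collectionSizes) {
        long product = 1;
        for (int size : collectionSizes) {
            product *= size;
        }
        return product;
    }

    public static void main(String[] args) {
        // Four collections of 30 favorites each, without grapes.
        System.out.println(rows(30, 30, 30, 30));
        // Joining grapes too: each of the 30 grapevine rows becomes 10 rows.
        System.out.println(rows(30, 30, 30, 30 * 10));
    }
}
```

For the example data this gives 810,000 rows without grapes and 8,100,000 with them, compared with 421 rows when each collection is loaded by its own query.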

Any suggestions?

+7
java optimization hibernate jpa
3 answers

I am going to suggest yet another option for lazily fetching the Grape collections in Grapevine:

 @OneToMany
 @BatchSize(size = 30)
 private List<Grape> grapes = new ArrayList<>();

Instead of doing a subselect, this will use in (?, ?, etc) to fetch many Grape collections at once. The ? placeholders are filled with Grapevine ids. This is as opposed to querying one List<Grape> collection at a time.

This is just another technique for your arsenal.
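To make the shape of such a query concrete, here is a rough sketch that splits grapevine ids into batches and renders one IN-clause statement per batch. It is an illustration only: the class name, the table name grape and the column name grapevine_id are assumptions, and the SQL that Hibernate actually generates will look different.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustration only: one IN-clause query per batch of parent ids.
// Table and column names are assumptions, not taken from the example project.
final class BatchInClauseSketch {
    static List<String> queriesFor(List<Long> grapevineIds, int batchSize) {
        List<String> queries = new ArrayList<>();
        for (int start = 0; start < grapevineIds.size(); start += batchSize) {
            int end = Math.min(start + batchSize, grapevineIds.size());
            // One "?" placeholder per grapevine id in this batch.
            String placeholders = String.join(", ", Collections.nCopies(end - start, "?"));
            queries.add("select * from grape where grapevine_id in (" + placeholders + ")");
        }
        return queries;
    }

    public static void main(String[] args) {
        List<Long> ids = new ArrayList<>();
        for (long id = 1; id <= 30; id++) {
            ids.add(id);
        }
        // With 30 ids and a batch size of 30, a single statement is produced.
        queriesFor(ids, 30).forEach(System.out::println);
    }
}
```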

+3

I usually cover 99% of such use cases with batch fetching for both entities and collections. If you process the fetched entities in the same transaction/session in which you read them, you don't need to do anything else: just access the associations needed by the processing logic, and the generated queries will be quite optimal. If you want to return the fetched entities as detached, initialize the associations manually:

 User user = entityManager.find(User.class, userId);
 Hibernate.initialize(user.getOranges());
 Hibernate.initialize(user.getApples());
 Hibernate.initialize(user.getGrapevines());
 Hibernate.initialize(user.getPeaches());
 user.getGrapevines().forEach(grapevine -> Hibernate.initialize(grapevine.getGrapes()));

Note that the last statement does not actually execute a query for each grapevine, since multiple grapes collections (up to the specified @BatchSize) are initialized when the first one is initialized. You simply iterate over all of them to make sure that all are initialized.
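The effect on the number of queries can be imitated in plain Java. The sketch below does not use Hibernate; the class names are hypothetical, and Hibernate's real strategy for choosing which collections go into a batch may differ. It only shows why iterating over every collection does not mean one query per collection.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java imitation of batch fetching; it does not use Hibernate.
final class BatchInitSimulation {
    // Stand-in for a lazy collection that knows whether it has been loaded.
    static final class LazyGrapes {
        boolean initialized;
    }

    static List<LazyGrapes> newCollections(int count) {
        List<LazyGrapes> collections = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            collections.add(new LazyGrapes());
        }
        return collections;
    }

    // Initializes every collection in order and returns how many
    // simulated queries were needed for the given batch size.
    static int initializeAll(List<LazyGrapes> collections, int batchSize) {
        int queries = 0;
        for (int i = 0; i < collections.size(); i++) {
            if (collections.get(i).initialized) {
                continue; // already loaded as part of an earlier batch
            }
            queries++; // one query loads this collection and its batch mates
            int loaded = 0;
            for (int j = i; j < collections.size() && loaded < batchSize; j++) {
                if (!collections.get(j).initialized) {
                    collections.get(j).initialized = true;
                    loaded++;
                }
            }
        }
        return queries;
    }

    public static void main(String[] args) {
        System.out.println(initializeAll(newCollections(30), 30));
        System.out.println(initializeAll(newCollections(30), 10));
    }
}
```

With 30 grapevines and a batch size of 30 the loop issues one simulated query; with a batch size of 10 it issues three.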

This technique resembles your manual approach, but it is more efficient (queries are not repeated for each collection) and, in my opinion, more readable and maintainable (you just call Hibernate.initialize instead of manually writing the same query that Hibernate generates automatically).

+5

I do not quite understand your requirements here. It seems to me that you want Hibernate to do something it was not designed to do, and when it cannot, you want a hack solution that is far from optimal. Why not relax the restrictions and get something that works? Why do you have these restrictions at all?

Some general pointers:

  • When using Hibernate/JPA you do not control the queries, and (with a few exceptions) you are not supposed to. How many queries there are, the order in which they are executed, etc. is largely beyond your control. If you want full control over your queries, just skip JPA and use JDBC instead (e.g. Spring JDBC).
  • Understanding lazy loading is key to making decisions in situations like this. Lazily loaded relationships are not fetched when the owning object is fetched; instead, Hibernate goes back to the database and gets them when they are actually used. This means lazy loading pays off if you don't use the attribute every time, but carries a penalty each time you actually use it. (A fetch join is for fetching a lazy relationship eagerly in a particular query. It is not meant for regular loading from the database.)
  • Query optimization in Hibernate should not be your first line of action. Always start with your database. Is it modeled correctly, with primary keys and foreign keys, normal forms, etc.? Do you have search indexes in the appropriate places (usually on foreign keys)?
  • Testing performance on a very limited dataset will probably give misleading results. The overhead for connections etc. will probably be larger than the time it takes to actually run the queries. There may also be random hiccups that cost a few milliseconds and skew the results.
  • A small tip from looking at your code: never provide setters for collections in entities. If one is actually called inside a transaction, Hibernate will throw an exception.
  • tryManualLoading probably does more than you think. First it fetches the user (with lazy loading), then it fetches each of the fruits, then it fetches the fruits again through lazy loading. (Unless Hibernate realizes that the queries are the same as the lazy-loading ones.)
  • You don't actually need to iterate over the entire collection to trigger lazy loading. You can call user.getOranges().size() or Hibernate.initialize(user.getOranges()). For the grapevines you will have to iterate to initialize all the grapes.
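The lazy-loading behaviour described in the list above can be imitated in plain Java: the contents are fetched only on first use, and a call such as size() is enough to trigger the fetch. This is a sketch of the idea, not Hibernate's implementation; the class name and the Supplier standing in for the database query are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Plain-Java imitation of a lazily loaded collection; not Hibernate code.
final class LazyListSketch<T> {
    private final Supplier<List<T>> loader; // stands in for the database query
    private List<T> items;                  // null until first access
    private int loadCount;                  // how many times the loader ran

    LazyListSketch(Supplier<List<T>> loader) {
        this.loader = loader;
    }

    // Any access that needs the contents triggers the load, and only once.
    int size() {
        if (items == null) {
            items = loader.get();
            loadCount++;
        }
        return items.size();
    }

    int loadCount() {
        return loadCount;
    }

    public static void main(String[] args) {
        LazyListSketch<String> oranges =
                new LazyListSketch<>(() -> Arrays.asList("navel", "blood"));
        System.out.println(oranges.loadCount()); // nothing has been loaded yet
        System.out.println(oranges.size());      // first access runs the loader
        System.out.println(oranges.loadCount()); // the loader ran exactly once
    }
}
```

The cost of the load is paid only when the collection is used, which is the trade-off the second pointer describes.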

With the right database design and lazy loading in the right places, you should not need anything other than:

 em.find(User.class, userId); 

And then, perhaps, a fetch join query if a lazy load takes a lot of time.

In my experience, the most important factor for speeding up Hibernate is having search indexes in the database.

0