The SUBQUERY strategy that Marmite refers to is related to FetchMode.SELECT, not SUBSELECT.
The console output you posted for FetchMode.SUBSELECT is curious, because that is not how it is supposed to work.
FetchMode.SUBSELECT:
uses a subselect query to load the additional collections.
Hibernate docs:
If one lazy collection or single-valued proxy has to be fetched, Hibernate will load all of them, re-running the original query in a subselect. This works in the same way as batch-fetching but without the piecemeal loading.
With FetchMode.SUBSELECT, the second query should look something like this:
SELECT <employees columns> FROM EMPLOYEE employees0_ WHERE employees0_.DEPARTMENT_ID IN (SELECT department0_.DEPARTMENT_ID FROM DEPARTMENT department0_)
You can see that this second query brings into memory all employees that belong to some department (i.e. employee.department_id is not null), regardless of whether it is one of the departments retrieved by your first query. This is potentially a serious problem if the employee table is large, because it may accidentally load the whole table into memory.
However, FetchMode.SUBSELECT significantly reduces the number of queries, since it needs only two queries compared with the N + 1 queries of FetchMode.SELECT.
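To put numbers on that difference, here is a minimal sketch that just counts statements. It does not use Hibernate at all: the class and method names are made up for illustration, and it assumes at least one parent row and that every parent's collection is actually accessed.

```java
/** Hypothetical helper (not a Hibernate API): counts the SQL statements per strategy. */
final class FetchQueryCount {

    /** FetchMode.SELECT: one query for the parents, then one more per parent collection. */
    static int select(int parents) {
        return 1 + parents;
    }

    /** FetchMode.SUBSELECT: one query for the parents, one for all collections together. */
    static int subselect(int parents) {
        return 2;
    }

    public static void main(String[] args) {
        for (int parents : new int[] {1, 10, 1000}) {
            System.out.println(parents + " parents: SELECT=" + select(parents)
                    + " queries, SUBSELECT=" + subselect(parents) + " queries");
        }
    }
}
```

The gap grows linearly with the number of parents, which is why N + 1 hurts most on large result lists.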
Perhaps you think that FetchMode.JOIN needs even fewer queries, just one, so why use SUBSELECT at all? Well, that's true, but at the cost of duplicated data and a heavier response.
If a single-valued proxy has to be fetched with JOIN, the query may return:
+---------------+---------+-----------+
| DEPARTMENT_ID | BOSS_ID | BOSS_NAME |
+---------------+---------+-----------+
| 1             | 1       | GABRIEL   |
| 2             | 1       | GABRIEL   |
| 3             | 2       | ALEJANDRO |
+---------------+---------+-----------+
The boss's employee data is duplicated if they manage more than one department, and that has a cost in bandwidth.
If a lazy collection has to be fetched with JOIN, the query may return:
+---------------+-----------------+---------------+
| DEPARTMENT_ID | DEPARTMENT_NAME | EMPLOYEE_NAME |
+---------------+-----------------+---------------+
| 1             | Sales           | GABRIEL       |
| 1             | Sales           | ALEJANDRO     |
| 2             | RRHH            | DANILO        |
+---------------+-----------------+---------------+
The department data is duplicated if the department has more than one employee (the natural case). We not only pay the bandwidth cost but also get duplicated Department objects in the result, and we must use a Set or DISTINCT_ROOT_ENTITY to de-duplicate them.
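The Set-based clean-up can be sketched in plain Java. This is only an illustration of the idea, not Hibernate code: the class and method names are hypothetical, and department names stand in for the Department entities that a joined result list would repeat once per employee row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical sketch: de-duplicating the root entities of a joined result. */
final class JoinDedupSketch {

    /** Drops repeated roots while keeping the order in which they first appeared. */
    static <T> List<T> distinctRoots(List<T> joinedRoots) {
        return new ArrayList<>(new LinkedHashSet<>(joinedRoots));
    }

    public static void main(String[] args) {
        // One root per joined row: Sales has two employees, so it shows up twice.
        List<String> roots = Arrays.asList("Sales", "Sales", "RRHH");
        System.out.println(distinctRoots(roots)); // prints [Sales, RRHH]
    }
}
```

A LinkedHashSet is used rather than a HashSet so that any ordering from the query survives the de-duplication.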
However, duplicating data in exchange for lower latency is a good trade-off in many cases, as Markus Winand says:
An SQL join is still more efficient than the nested selects approach, even though it performs the same index lookups, because it avoids a lot of network communication. It is even faster if the total amount of transferred data is bigger because of the duplication of employee attributes for each sale. That is because of the two dimensions of performance: response time and throughput; in computer networks we call them latency and bandwidth. Bandwidth has only a minor impact on the response time but latencies have a huge impact. That means that the number of database round trips is more important for the response time than the amount of data transferred.
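A rough back-of-the-envelope model shows why round trips dominate. Everything here is an assumption for illustration only: the formula (round trips times latency, plus bytes divided by bandwidth), the 0.5 ms latency, the 100,000 bytes/ms bandwidth, and the payload sizes are made-up figures, not measurements.

```java
/** Hypothetical response-time model: latency per round trip plus transfer time. */
final class ResponseTimeSketch {

    static double responseMillis(int roundTrips, double latencyMs,
                                 double payloadBytes, double bytesPerMs) {
        return roundTrips * latencyMs + payloadBytes / bytesPerMs;
    }

    public static void main(String[] args) {
        double latencyMs = 0.5;        // assumed cost of one database round trip
        double bytesPerMs = 100_000.0; // assumed network bandwidth

        // N + 1 selects for 100 parents: 101 round trips, no duplicated data.
        double nPlusOne = responseMillis(101, latencyMs, 100_000.0, bytesPerMs);
        // One join: a single round trip, but three times the data due to duplication.
        double join = responseMillis(1, latencyMs, 300_000.0, bytesPerMs);

        System.out.println("N + 1 selects: " + nPlusOne + " ms");
        System.out.println("single join:   " + join + " ms");
    }
}
```

Even with the payload tripled, the join comes out far ahead under these assumptions, because the per-query latency is paid only once.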
So the main problem with SUBSELECT is that it is hard to control and may load a whole graph of entities into memory. With batch fetching you load the associated entities in a separate query, as with SUBSELECT (so you don't suffer from duplicates), piecemeal, and most importantly you query only the related entities (so you don't risk loading a huge graph), because the IN list is filtered by the IDs retrieved by the outer query.
Hibernate: select ... from mkyong.stock stock0_
Hibernate: select ... from mkyong.stock_daily_record stockdaily0_ where stockdaily0_.STOCK_ID in ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )
(It could be an interesting test to see whether batch fetching with a very large batch size behaves like SUBSELECT, but without loading the whole table.)
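The batching idea can be sketched without Hibernate: take the parent IDs that were actually loaded, cut them into chunks of the batch size, and issue one IN query per chunk. This is a simplified model, and Hibernate's real batching may pick chunk sizes differently; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of batch fetching: one IN query per chunk of loaded parent IDs. */
final class BatchFetchSketch {

    /** Splits the loaded parent IDs into consecutive chunks of at most batchSize. */
    static List<List<Integer>> partition(List<Integer> ids, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        List<List<Integer>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    /** Builds the IN list with one placeholder per ID in the chunk. */
    static String inClause(List<Integer> batch) {
        return "STOCK_ID in (" + String.join(", ", Collections.nCopies(batch.size(), "?")) + ")";
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int id = 1; id <= 25; id++) {
            ids.add(id);
        }
        // 25 parents with a batch size of 10 need three collection queries.
        for (List<Integer> batch : partition(ids, 10)) {
            System.out.println("select ... where " + inClause(batch));
        }
    }
}
```

In this model, a batch size at least as large as the number of loaded parents yields a single collection query, so two queries in total as with SUBSELECT, but restricted to the IDs that were actually loaded.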
A few posts showing the different fetching strategies and their SQL logs (very important):
Summary:
- JOIN: avoids the N + 1 query problem, but may retrieve duplicated data.
- SUBSELECT: also avoids N + 1 and does not duplicate data, but loads all entities of the associated type into memory.
Tables were built using ascii-tables.