Summary:
I am trying to minimize the number of queries generated by a JPA-based java application in a database. I pointed out the @BatchFetch(BatchFetchType.IN) optimization hint, but I still see some additional requests that I find redundant and unnecessary.
Details:
Given a simple domain model: We have an invoice system. The invoice relates to OneToOne order. We also have a customer who has OneToMany relationship with Orders. (Customer 1-> M Order 1 <-1 Invoice) . Read more about here . Find the full source code here . Here is the entity definition that is currently worth:
Client.java (excluding getters and setter ):
@Entity(name = "CUSTOMER") public class Customer { @Id //signifies the primary key @Column(name = "CUST_ID", nullable = false) @GeneratedValue(strategy = GenerationType.AUTO) private long custId; @Column(name = "FIRST_NAME", length = 50) private String firstName; @OneToMany(mappedBy="customer",targetEntity=Order.class, fetch=FetchType.LAZY) private Collection<Order> orders; }
Order.java (excluding getters and setter ):
@Entity(name = "ORDERS") public class Order { @Id @Column(name = "ORDER_ID", nullable = false) @GeneratedValue(strategy = GenerationType.AUTO) private long orderId; @Column(name = "TOTAL_PRICE", precision = 2) private double totPrice; @OneToOne(fetch = FetchType.LAZY, optional = false, cascade = CascadeType.ALL, mappedBy = "order") private Invoice invoice; @ManyToOne(optional = false) @JoinColumn(name = "CUST_ID", referencedColumnName = "CUST_ID") private Customer customer; @ManyToMany(fetch = FetchType.LAZY) @JoinTable(name = "ORDER_DETAIL", joinColumns = @JoinColumn(name = "ORDER_ID", referencedColumnName = "ORDER_ID"), inverseJoinColumns = @JoinColumn(name = "PROD_ID", referencedColumnName = "PROD_ID")) private List<Product> productList; }
Invoice.java (excluding getters and setter ):
@Entity(name = "ORDER_INVOICE") public class Invoice { @Id // signifies the primary key @Column(name = "INVOICE_ID", nullable = false) @GeneratedValue(strategy = GenerationType.AUTO) private long invoiceId; @Column(name = "AMOUNT_DUE", precision = 2) private double amountDue; @OneToOne(optional = false, fetch = FetchType.LAZY) @JoinColumn(name = "ORDER_ID") private Order order; }
With this model, I conducted a simple test to get all customer orders.
EntityManagerFactory entityManagerFactory = Persistence.createEntityManagerFactory("testjpa"); EntityManager em = entityManagerFactory.createEntityManager(); Customer customer = em.find(Customer.class, 100L); Collection<Order> orders = customer.getOrders(); for(Order order: orders){ System.out.println(order.getInvoice().getInvoiceId()); } em.close();
Since everything was lazy, we received four requests, as shown:
1398882535950|1|1|statement|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = ?)|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = 100) 1398882535981|0|1|statement|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = ?)|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = 100) 1398882535995|1|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = ?)|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = 111) 1398882536004|0|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = ?)|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = 222) DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID =?) | SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID 1398882535950|1|1|statement|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = ?)|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = 100) 1398882535981|0|1|statement|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = ?)|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = 100) 1398882535995|1|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = ?)|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = 111) 1398882536004|0|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = ?)|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = 222)
Since I do not need N + 1 calls to receive invoices, I thought about using batch sampling and reduced the number of requests to 4 (one request to extract invoices for all customer orders). To do the same, I updated my Order object as shown below:
Updated - Order.java, the addition of a BatchFetch for an invoice. (Excluding getters and setter ):
@Entity(name = "ORDERS") public class Order { @Id @Column(name = "ORDER_ID", nullable = false) @GeneratedValue(strategy = GenerationType.AUTO) private long orderId; @Column(name = "TOTAL_PRICE", precision = 2) private double totPrice; @BatchFetch(BatchFetchType.IN) @OneToOne(fetch = FetchType.LAZY, optional = false, cascade = CascadeType.ALL, mappedBy = "order") private Invoice invoice; @ManyToOne(optional = false) @JoinColumn(name = "CUST_ID", referencedColumnName = "CUST_ID") private Customer customer; @ManyToMany(fetch = FetchType.LAZY) @JoinTable(name = "ORDER_DETAIL", joinColumns = @JoinColumn(name = "ORDER_ID", referencedColumnName = "ORDER_ID"), inverseJoinColumns = @JoinColumn(name = "PROD_ID", referencedColumnName = "PROD_ID")) private List<Product> productList; }
I repeated the same test again and assumed 3 requests would be received to receive the data. (One for the customer, One for Orders and One for batch invoice collection). However, an eclipse generates 5 queries for the same. The following are the queries:
1398883197009|1|1|statement|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = ?)|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = 100) 1398883197030|0|1|statement|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = ?)|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = 100) 1398883197037|1|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID IN (?,?))|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID IN (111,222)) 1398883197042|1|1|statement|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (ORDER_ID = ?)|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (ORDER_ID = 222) 1398883197045|0|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = ?)|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = 222) DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID IN (,)??) | SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM 1398883197009|1|1|statement|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = ?)|SELECT CUST_ID, APPT, CITY, EMAIL, FIRST_NAME, LAST_NAME, STREET, LAST_UPDATED_TIME, ZIP_CODE FROM CUSTOMER WHERE (CUST_ID = 100) 1398883197030|0|1|statement|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = ?)|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (CUST_ID = 100) 1398883197037|1|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID IN (?,?))|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID IN (111,222)) 1398883197042|1|1|statement|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (ORDER_ID = ?)|SELECT ORDER_ID, OREDER_DESC, ORDER_DATE, TOTAL_PRICE, LAST_UPDATED_TIME, CUST_ID FROM ORDERS WHERE (ORDER_ID = 222) 1398883197045|0|1|statement|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = ?)|SELECT INVOICE_ID, AMOUNT_DUE, DATE_CANCELLED, DATE_RAISED, DATE_SETTLED, LAST_UPDATED_TIME, ORDER_ID FROM ORDER_INVOICE WHERE (ORDER_ID = 222)
I did not understand why the last two queries are being generated. Any help to explain what was going on would be helpful.
Thanks!