Join, FetchMode, FetchType, Fetch Join 에 대해
JPA를 사용할때, 테이블간 join 시 fetch 를 어떻게 하느냐에 따라 DB 에 실제로 hit 하는 query 가 달라지고, 성능에도 영향을 미친다.
아래와 같은,
- 고객 주문과 (CustomerOrderEntity, customer_orders)
- 주문history (CustomerOrderHistoryEntryEntity, customer_orders_status_history_entries)
를 담당하는 테이블이 있다고 할때 이런 query가 있다고 하자.
@Query("SELECT t FROM CustomerOrderEntity t JOIN t.orderHistory s WHERE NOT EXISTS (SELECT 1 FROM t.orderHistory intern_s WHERE intern_s.sequenceNumber > s.sequenceNumber) AND ((s.status IN ('DELIVERED', 'CANCELED') AND s.timestamp >= :finishedAfter) OR s.status IN ('CHECKEDOUT', 'DISPATCHED', 'CUSTOMER_NO_ANSWER'))")
List<CustomerOrderEntity> findAllActiveOrFinishedAfter(Instant finishedAfter);
Entity는 아래처럼 생겼다고 하자.
@Entity
@Table(name = "customer_orders")
public class CustomerOrderEntity {
...
@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
@JoinColumn(name = "customer_order_id")
public List<CustomerOrderHistoryEntry> orderHistory = new ArrayList<>();
...
그리고 현재 DB 에는 4건의 주문이 있다고 하자.
1.가장 기본적인 경우이다. @OneToMany 는 기본적으로 lazy option 이다.
@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
@JoinColumn(name = "customer_order_id")
public List<CustomerOrderHistoryEntry> orderHistory = new ArrayList<>();
#query 호출 시작
Hibernate: select customero0_.id as id1_33_, customero0_.gift_type as giftt2_33_, customero0_.coupon_id as coup3_33_, customero0_.ordr_category as ordr_cate4_33_, customero0_.priority as priority5_33_, customero0_.membership_type as memb_6_33_, customero0_.version as version7_33_ from customer_orders customero0_ inner join customer_orders_status_history_entries statushist1_ on customero0_.id=statushist1_.customer_order_id where not (exists (select 1 from customer_orders_status_history_entries statushist2_ where customero0_.id=statushist2_.customer_order_id and statushist2_.sequence_number>statushist1_.sequence_number)) and ((statushist1_.status in ('DELIVERED' , 'CANCELED')) and statushist1_.timestamp>=? or statushist1_.status in ('CHECKEDOUT' , 'DISPATCHED' , 'CUSTOMER_NO_ANSWER'))
#query 호출 종료
! 나중에 history 객체에 접근할때가 되서야 Hibernate가 각각의 history 테이블에 접근한다.
#history 객체 접근 시작
Hibernate: select statushist0_.customer_order_id as custome7_34_0_, statushist0_.id as id1_34_0_, statushist0_.id as id1_34_1_, statushist0_.in_time as in_time2_34_1_, statushist0_.sequence_number as sequence3_34_1_, statushist0_.status as status4_34_1_, statushist0_.timestamp as timestam5_34_1_, statushist0_.version as version6_34_1_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id=?
#history 객체 접근 종료
#history 객체 접근 시작
Hibernate: select statushist0_.customer_order_id as custome7_34_0_, statushist0_.id as id1_34_0_, statushist0_.id as id1_34_1_, statushist0_.in_time as in_time2_34_1_, statushist0_.sequence_number as sequence3_34_1_, statushist0_.status as status4_34_1_, statushist0_.timestamp as timestam5_34_1_, statushist0_.version as version6_34_1_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id=?
#history 객체 접근 종료
#history 객체 접근 시작
Hibernate: select statushist0_.customer_order_id as custome7_34_0_, statushist0_.id as id1_34_0_, statushist0_.id as id1_34_1_, statushist0_.in_time as in_time2_34_1_, statushist0_.sequence_number as sequence3_34_1_, statushist0_.status as status4_34_1_, statushist0_.timestamp as timestam5_34_1_, statushist0_.version as version6_34_1_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id=?
#history 객체 접근 종료
#history 객체 접근 시작
Hibernate: select statushist0_.customer_order_id as custome7_34_0_, statushist0_.id as id1_34_0_, statushist0_.id as id1_34_1_, statushist0_.in_time as in_time2_34_1_, statushist0_.sequence_number as sequence3_34_1_, statushist0_.status as status4_34_1_, statushist0_.timestamp as timestam5_34_1_, statushist0_.version as version6_34_1_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id=?
#history 객체 접근 종료
2.FetcyTYpe 을 EAGER 로 설정하면 어떻게 되는지 보자.
@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true, fetch = FetchType.EAGER)
@JoinColumn(name = "customer_order_id")
public List<CustomerOrderHistoryEntry> orderHistory = new ArrayList<>();
#Hibernate will fetch children before access but it fetches for each child
#query 호출 시작
Hibernate: select customero0_.id as id1_33_, customero0_.gift_type as giftt2_33_, customero0_.coupon_id as coup3_33_, customero0_.ordr_category as ordr_cate4_33_, customero0_.priority as priority5_33_, customero0_.membership_type as memb_6_33_, customero0_.version as version7_33_ from customer_orders customero0_ inner join customer_orders_status_history_entries statushist1_ on customero0_.id=statushist1_.customer_order_id where not (exists (select 1 from customer_orders_status_history_entries statushist2_ where customero0_.id=statushist2_.customer_order_id and statushist2_.sequence_number>statushist1_.sequence_number)) and ((statushist1_.status in ('DELIVERED' , 'CANCELED')) and statushist1_.timestamp>=? or statushist1_.status in ('CHECKEDOUT' , 'DISPATCHED' , 'CUSTOMER_NO_ANSWER'))
Hibernate: select statushist0_.customer_order_id as custome7_34_0_, statushist0_.id as id1_34_0_, statushist0_.id as id1_34_1_, statushist0_.in_time as in_time2_34_1_, statushist0_.sequence_number as sequence3_34_1_, statushist0_.status as status4_34_1_, statushist0_.timestamp as timestam5_34_1_, statushist0_.version as version6_34_1_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id=?
Hibernate: select statushist0_.customer_order_id as custome7_34_0_, statushist0_.id as id1_34_0_, statushist0_.id as id1_34_1_, statushist0_.in_time as in_time2_34_1_, statushist0_.sequence_number as sequence3_34_1_, statushist0_.status as status4_34_1_, statushist0_.timestamp as timestam5_34_1_, statushist0_.version as version6_34_1_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id=?
Hibernate: select statushist0_.customer_order_id as custome7_34_0_, statushist0_.id as id1_34_0_, statushist0_.id as id1_34_1_, statushist0_.in_time as in_time2_34_1_, statushist0_.sequence_number as sequence3_34_1_, statushist0_.status as status4_34_1_, statushist0_.timestamp as timestam5_34_1_, statushist0_.version as version6_34_1_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id=?
Hibernate: select statushist0_.customer_order_id as custome7_34_0_, statushist0_.id as id1_34_0_, statushist0_.id as id1_34_1_, statushist0_.in_time as in_time2_34_1_, statushist0_.sequence_number as sequence3_34_1_, statushist0_.status as status4_34_1_, statushist0_.timestamp as timestam5_34_1_, statushist0_.version as version6_34_1_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id=?
#query 호출 종료
! 첫번째 query 이후, history 테이블도 미리 fetch 하는것을 볼 수 있다. 이 시점에서 아직 code level 에선 history 객체에 접근하기 전이다. 그리고 아래에서 접근한다. 이미 fetch 되어 있으므로 DB hit 이 필요없다.
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
3.Fetch 는 Lazy 로 하되, FetchMode 를 SUBSELECT 로 해보자.
@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true)
@JoinColumn(name = "customer_order_id")
@Fetch(FetchMode.SUBSELECT)
public List<CustomerOrderHistoryEntry> orderHistory = new ArrayList<>();
#query 호출 시작
Hibernate: select customero0_.id as id1_33_, customero0_.gift_type as giftt2_33_, customero0_.coupon_id as coup3_33_, customero0_.ordr_category as ordr_cate4_33_, customero0_.priority as priority5_33_, customero0_.membership_type as memb_6_33_, customero0_.version as version7_33_ from customer_orders customero0_ inner join customer_orders_status_history_entries statushist1_ on customero0_.id=statushist1_.customer_order_id where not (exists (select 1 from customer_orders_status_history_entries statushist2_ where customero0_.id=statushist2_.customer_order_id and statushist2_.sequence_number>statushist1_.sequence_number)) and ((statushist1_.status in ('DELIVERED' , 'CANCELED')) and statushist1_.timestamp>=? or statushist1_.status in ('CHECKEDOUT' , 'DISPATCHED' , 'CUSTOMER_NO_ANSWER'))
#query 호출 종료
# lazy 이기 때문에 바로 history 까지 가져오진 않는다. 그런데 하나의 아이템에 access 하게 되면, 한번의 query 로 모두 가져온다.
#history 객체 접근 시작
Hibernate: select statushist0_.customer_order_id as custome7_34_1_, statushist0_.id as id1_34_1_, statushist0_.id as id1_34_0_, statushist0_.in_time as in_time2_34_0_, statushist0_.sequence_number as sequence3_34_0_, statushist0_.status as status4_34_0_, statushist0_.timestamp as timestam5_34_0_, statushist0_.version as version6_34_0_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id in (select customero0_.id from customer_orders customero0_ inner join customer_orders_status_history_entries statushist1_ on customero0_.id=statushist1_.customer_order_id where not (exists (select 1 from customer_orders_status_history_entries statushist2_ where customero0_.id=statushist2_.customer_order_id and statushist2_.sequence_number>statushist1_.sequence_number)) and ((statushist1_.status in ('DELIVERED' , 'CANCELED')) and statushist1_.timestamp>=? or statushist1_.status in ('CHECKEDOUT' , 'DISPATCHED' , 'CUSTOMER_NO_ANSWER')))
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
4.Fetch 를 EAGER 로 하면서 동시에 SUBSELECT 모드를 선택해보자.
@OneToMany(cascade = CascadeType.ALL, orphanRemoval = true, fetch = FetchType.EAGER)
@JoinColumn(name = "customer_order_id")
@Fetch(FetchMode.SUBSELECT)
public List<CustomerOrderHistoryEntry> orderHistory = new ArrayList<>();
#query 호출 시작
Hibernate: select customero0_.id as id1_33_, customero0_.gift_type as giftt2_33_, customero0_.coupon_id as coup3_33_, customero0_.ordr_category as ordr_cate4_33_, customero0_.priority as priority5_33_, customero0_.membership_type as memb_6_33_, customero0_.version as version7_33_ from customer_orders customero0_ inner join customer_orders_status_history_entries statushist1_ on customero0_.id=statushist1_.customer_order_id where not (exists (select 1 from customer_orders_status_history_entries statushist2_ where customero0_.id=statushist2_.customer_order_id and statushist2_.sequence_number>statushist1_.sequence_number)) and ((statushist1_.status in ('DELIVERED' , 'CANCELED')) and statushist1_.timestamp>=? or statushist1_.status in ('CHECKEDOUT' , 'DISPATCHED' , 'CUSTOMER_NO_ANSWER'))
Hibernate: select statushist0_.customer_order_id as custome7_34_1_, statushist0_.id as id1_34_1_, statushist0_.id as id1_34_0_, statushist0_.in_time as in_time2_34_0_, statushist0_.sequence_number as sequence3_34_0_, statushist0_.status as status4_34_0_, statushist0_.timestamp as timestam5_34_0_, statushist0_.version as version6_34_0_ from customer_orders_status_history_entries statushist0_ where statushist0_.customer_order_id in (select customero0_.id from customer_orders customero0_ inner join customer_orders_status_history_entries statushist1_ on customero0_.id=statushist1_.customer_order_id where not (exists (select 1 from customer_orders_status_history_entries statushist2_ where customero0_.id=statushist2_.customer_order_id and statushist2_.sequence_number>statushist1_.sequence_number)) and ((statushist1_.status in ('DELIVERED' , 'CANCELED')) and statushist1_.timestamp>=? or statushist1_.status in ('CHECKEDOUT' , 'DISPATCHED' , 'CUSTOMER_NO_ANSWER' )))
#query 호출 종료
# 위에서 보는것처럼 EAGER 하게 history 테이블까지 가져왔고, 가져올때도 해당하는 history 테이블 전체를 한번에 가져왔다.
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
@Query("SELECT t FROM CustomerOrderEntity t JOIN FETCH t.orderHistory s WHERE NOT EXISTS (SELECT 1 FROM t.orderHistory intern_s WHERE intern_s.sequenceNumber > s.sequenceNumber ) AND ((s.status IN ('DELIVERED', 'CANCELED') AND s.timestamp >= :finishedAfter) OR s.status IN ('CHECKEDOUT', 'DISPATCHED', 'CUSTOMER_NO_ANSWER'))")
위와 같이 query 내부에 JOIN 대신 JOIN FETCH 를 할 수가 있는데, 이렇게 되면 query 한번으로 customer order 테이블과, order history 테이블을 한번의 query hit 로 가져오며, select projection 에서도 두 테이블의 컬럼을 모두 표시한다.
이때 발생할 수 있는 문제점은, 위의 경우, customer order status 가 3가지 경우가 있는데, (CHECKEDOUT, DISPATCHED, DELIVERED), 쿼리를 통해 return 된 결과값은 하나의 order status 만 포함한다.
그럴수 밖에 없는게, query 를 한번에 하되, customer order 와 order history 를 inner join 으로 가져오면, 하나의 customer order 에는 하나의 order history 만 mapping 될 수 있기 때문이다.
#query 호출 시작
Hibernate: select customero0_.id as id1_33_0_, statushist1_.id as id1_34_1_, customero0_.gift_type as giftt2_33_0_, customero0_.coupon_id as coup3_33_0_, customero0_.ordr_category as ordr_cate4_33_0_, customero0_.priority as priority5_33_0_, customero0_.membership_type as memb_6_33_0_, customero0_.version as version7_33_0_, statushist1_.in_time as in_time2_34_1_, statushist1_.sequence_number as sequence3_34_1_, statushist1_.status as status4_34_1_, statushist1_.timestamp as timestam5_34_1_, statushist1_.version as version6_34_1_, statushist1_.customer_order_id as custome7_34_0__, statushist1_.id as id1_34_0__ from customer_orders customero0_ inner join customer_orders_status_history_entries statushist1_ on customero0_.id=statushist1_.customer_order_id where not (exists (select 1 from customer_orders_status_history_entries statushist2_ where customero0_.id=statushist2_.customer_order_id and statushist2_.sequence_number>statushist1_.sequence_number)) and ((statushist1_.status in ('DELIVERED' , 'CANCELED')) and statushist1_.timestamp>=? or statushist1_.status in ('CHECKEDOUT' , 'DISPATCHED' , 'CUSTOMER_NO_ANSWER'))
#query 호출 종료
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
#history 객체 접근 시작
#history 객체 접근 종료
즉 Fetch Type 은 자식 entity 혹은, join 으로 딸려오는 entity 를 언제 DB 에서 가져오느냐를 결정했다.
@OneToOne, @ManyToOne 은 default 로 EAGER 이고,
@OneToMany, @ManyToMany 는 default 로 LAZY 이다.
Fetch.SUBSELECT 은 join 하는 테이블을 id 를 기준으로 하나씩 가져올지, 한꺼번에 가져올지를 결정한다. 내부적으로 테스트 했을때, 1000개의 customer order 가 있을때 쿼리를 실행하면,
@Fetch(FetchMode.SUBSELECT) 했을때는 약 250ms, 하지 않았을때는 1300ms 이상 걸렸다.
LAZY 의 단점은
- database 로의 roundtrip이 필요하기 때문에 성능에 영향을 끼친다.
- session 이 닫혔거나, 객체가 detach 된 뒤에 객체에 접근을 시도하게 되면 LazyInitializationException 이 발생한다.
EAGER 의 단점은
- EAGER load 로 인해 읽힌 entity 가 또 다른 EAGER load 를 불러올 수 있어, 느려질 수 있다.
- 다시 LAZY 로 바꿀수 없어, 접근할 필요가 없는데도 항상 load 된다.
- collection 의 경우, size limit 이 없다.
- 그리고 이런 성능상의 문제는 개발시엔 잘 나타나지 않고, 실 서비스에서 더 잘 나타난다. 데이터의 규모 때문에.
또한 둘다, runtime 시에 이 타입을 바꿀 수 없다.
대부분의 경우 우리는 DTO 로, 혹은 VIEW 로 보낼 attribute 들을 알고 있기 때문에 이 컬럼들만 eagerly fetch 하도록 하고싶다.
이때 FETCH JOIN 이 사용될 수 있다. 단, 필요에 따라 query 를 만들어줘야 한다.
FETCH JOIN 의 단점은
- 위에 예시처럼, error-prone 이다.
- 필요헤따라 각각의 query 에 설정해주어야 한다.
그래서 나온게 entity graph 라는것 같은데, 정의에 따르면 a reusable declarative fetch join on a set of attributes. 라고 한다.
이부분은 다음에 더 자세히 살펴볼것.
참조
https://stackoverflow.com/questions/5816417/how-to-properly-express-jpql-join-fetch-with-where-clause-as-jpa-2-criteriaq
https://thorben-janssen.com/hibernate-tips-difference-join-left-join-fetch-join/
https://www.baeldung.com/jpa-entity-graph
Comments
Post a Comment