JPA API versus Ebean - missing features in Java JPA API

JPA API Issues

There are several interesting omissions from the JPA API that need to be explored.

Batching Support
Caching
Entity Listeners
Raw JDBC - Access to java.sql.Connection
Transaction Isolation Level
Large Query Support - Callback

Batching Support

It could be argued that batching already exists in the JPA API in the form of flushing. That is, the flush mechanism in JPA is likely to make transparent use of JDBC statement batching.

However, transparent statement batching may not always meet the needs of developers. Specifically, where JDBC 3 getGeneratedKeys is not supported (e.g. Oracle9), developers may want more control, for example to forgo getGeneratedKeys because they will not be updating the bean instances later.

That is, if you are going to insert lots of beans and not update them afterwards, you may wish to explicitly choose NOT to get back the generated keys (which involves another statement/query), knowing that you are not going to update the same beans later. In summary: bulk insert without getGeneratedKeys support.

You may also wish to have explicit control over the size of the batch.
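As a rough sketch of what this kind of explicit control looks like at the plain JDBC level (this is not Ebean's API), the following inserts rows with a caller-chosen batch size and never asks for generated keys. The customer table, its name column, and the class and method names are assumptions made up for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

/** Sketch: bulk insert with an explicit batch size and no generated keys. */
final class ExplicitBatchInsert {

    private ExplicitBatchInsert() {}

    /**
     * Inserts one row per name and returns the number of executeBatch()
     * round trips. The table and column names are hypothetical.
     */
    static int insertNames(Connection conn, List<String> names, int batchSize)
            throws SQLException {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        // prepareStatement(String) does not request generated keys, so no
        // extra work is done to fetch them back after the insert.
        String sql = "insert into customer (name) values (?)";
        int roundTrips = 0;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            int pending = 0;
            for (String name : names) {
                ps.setString(1, name);
                ps.addBatch();
                if (++pending == batchSize) {
                    ps.executeBatch(); // send a full batch
                    roundTrips++;
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // send the final partial batch
                roundTrips++;
            }
        }
        return roundTrips;
    }
}
```

With five names and a batch size of 2 this sends three batches (2, 2 and 1 rows). Committing is left to the caller.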

You may also wish to turn off the cascading nature of Save and Delete for a batch execution process. Specifically, you may wish to have exact control over the objects included in the save/delete and exclude any associated parent or child objects that would normally be included by a cascade.

Explicit control over getGeneratedKeys, batch size and Cascading Save/Delete

Caching

Caching is an important aspect in that it can provide very significant performance improvements versus fetching data from the database. With JPA there is no explicit control over when to use caching, how to invalidate cached data and how this works in a clustered environment.

To be fair this is a non-trivial design issue and it is possibly unfair to ask JPA to specify this. However this does leave open some tough questions for JPA implementations.

Providing a flag to indicate explicitly that you want to use a cache is fairly easy. The tough questions revolve around how to invalidate the cache and how clustering affects the cache.

Q: If an external program/Stored Procedure runs, how do you notify the framework, and how does it know which cached objects should be invalidated?

Q: For a clustered environment, how is the cache maintained? More specifically, after each committed transaction, how much information is sent around the cluster in order to maintain the cache? For example, if every inserted, updated or deleted bean is sent across the cluster then this could be a significant amount of data and become prohibitive.

Ebean provides a simple mechanism for explicitly invalidating the cache based on table names. This has the benefit of making it very easy to invalidate the appropriate objects after an external program/Stored Procedure is executed, assuming you know the tables involved.

This has the additional benefit of requiring only modified table information to be sent around the cluster rather than the beans or even list of primary key values. This means that the cost of maintaining the cache in a clustered environment is very light.

The downside of this approach is that cache invalidation may be more aggressive, invalidating more objects out of the cache than strictly necessary. My opinion is that this downside is more than matched by the simplicity of use and light clustering cost.
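To make the trade-off concrete, here is a toy sketch of table-based invalidation. It is not Ebean's implementation: the class and method names are invented for illustration, and it assumes each cached bean belongs to exactly one table.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy bean cache that is invalidated per table rather than per bean. */
final class TableInvalidatingCache {

    // table name -> (primary key -> cached bean)
    private final Map<String, Map<Object, Object>> beansByTable =
            new ConcurrentHashMap<>();

    public void put(String table, Object id, Object bean) {
        beansByTable
                .computeIfAbsent(table, t -> new ConcurrentHashMap<>())
                .put(id, bean);
    }

    /** Returns the cached bean, or null if it is not cached. */
    public Object get(String table, Object id) {
        Map<Object, Object> beans = beansByTable.get(table);
        return beans == null ? null : beans.get(id);
    }

    /**
     * Drops every cached bean for the given tables. A remote node only
     * needs this same set of table names to do the same, which is why
     * the cluster message stays small.
     */
    public void invalidateTables(Set<String> modifiedTables) {
        for (String table : modifiedTables) {
            beansByTable.remove(table);
        }
    }
}
```

Invalidating "customer" evicts every cached customer, including rows that did not change, but leaves other tables untouched. That is the aggressiveness described above, traded for a message that contains nothing but table names.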

Explicit and simple mechanism to invalidate the cache.

Low cost Cluster aware cache maintenance.

Entity Listeners

The JPA Entity Listeners give developers the ability to enhance the Insert, Update and Delete behaviour. However, they do not give developers the ability to replace it. Developers wishing to replace the default behaviour with stored procedure calls, writing to files, etc. will have to look elsewhere.

Ebean's BeanController provides the ability to enhance or replace the Insert, Update and Delete behaviour.
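The difference between enhancing and replacing can be sketched with a hook whose return value decides whether the default insert still runs. This is an illustration only: InsertController and InsertPipeline are hypothetical names, not the BeanController signature, and the "default insert" is just a list standing in for the SQL insert.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a persistence hook that can enhance or replace an insert. */
final class InsertPipeline {

    /** Hypothetical hook: return false to skip the default insert. */
    interface InsertController {
        boolean preInsert(Object bean);
    }

    private final InsertController controller;
    // Stand-in for rows written by the default SQL insert.
    private final List<Object> defaultInserted = new ArrayList<>();

    InsertPipeline(InsertController controller) {
        this.controller = controller;
    }

    void insert(Object bean) {
        // Enhance: the hook does extra work and returns true.
        // Replace: the hook does the work itself and returns false.
        if (controller.preInsert(bean)) {
            defaultInserted.add(bean);
        }
    }

    List<Object> defaultInserted() {
        return defaultInserted;
    }
}
```

A hook that returns true behaves like a JPA entity listener: it adds behaviour around the default insert. A hook that returns false, for example after calling a stored procedure or writing to a file, takes over the insert completely.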

Ebean additionally provides a BeanListener mechanism for cluster-wide listening to bean inserted, updated and deleted events. With this mechanism the developer controls what data is sent around the cluster for the remote BeanListeners to receive. BeanListeners are used by this website to generate cached content. There is no equivalent in JPA.

BeanListener differs from BeanController in that it only receives committed events, can receive cluster-wide events and runs in a separate background thread. See com.avaje.ebean.bean.BeanListener for details.

Ability to not only Enhance but Replace Insert, Update and Delete behaviour.

Cluster-wide listening to Inserted, Updated and Deleted events with BeanListener.

Raw JDBC - Access to java.sql.Connection

With JPA there is no explicit access to the underlying java.sql.Connection. It could be argued that this is a good thing. However, Ebean makes it available because real-world requirements may call for this access. Specifically, I see it as beneficial to make direct JDBC access easy for those who need it.

For the use of Savepoints, advanced Clob and Blob manipulation, advanced stored procedures and any general requirement for direct JDBC access, Ebean makes java.sql.Connection available from the Transaction object.
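As an example of what direct access makes possible, the sketch below uses a JDBC Savepoint so that a failing optional statement can be undone without losing the rest of the transaction. It uses only standard java.sql calls; how the Connection is obtained is left out, and the class and method names are made up for illustration.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Savepoint;
import java.sql.Statement;

/** Sketch: run an optional statement inside a savepoint. */
final class SavepointExample {

    private SavepointExample() {}

    /**
     * Runs optionalSql and returns true if it succeeded. On failure only
     * the work done since the savepoint is rolled back. Assumes
     * auto-commit is off and the caller commits the outer transaction.
     */
    static boolean tryOptionalUpdate(Connection conn, String optionalSql)
            throws SQLException {
        Savepoint savepoint = conn.setSavepoint();
        try (Statement statement = conn.createStatement()) {
            statement.executeUpdate(optionalSql);
        } catch (SQLException e) {
            // Undo only the optional step; earlier work is kept.
            conn.rollback(savepoint);
            return false;
        }
        // Driver support for releasing savepoints varies.
        conn.releaseSavepoint(savepoint);
        return true;
    }
}
```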

JDBC access for Savepoints, Clob/Blob manipulation, Advanced stored procedures

Transaction Isolation Level

JPA does not have support for creating a transaction with an explicit transaction isolation level. Interesting.
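For comparison, plain JDBC exposes the isolation level directly on java.sql.Connection. The sketch below runs a unit of work at a chosen level and then restores the previous one. The helper and its SqlWork interface are hypothetical names; it assumes auto-commit is off and that no transaction is already in progress on the connection.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Sketch: run work at an explicit transaction isolation level. */
final class IsolationLevelExample {

    /** Hypothetical callback for the work done inside the transaction. */
    interface SqlWork {
        void run(Connection conn) throws SQLException;
    }

    private IsolationLevelExample() {}

    static void runAt(Connection conn, int isolationLevel, SqlWork work)
            throws SQLException {
        int previous = conn.getTransactionIsolation();
        // e.g. Connection.TRANSACTION_SERIALIZABLE
        conn.setTransactionIsolation(isolationLevel);
        try {
            work.run(conn);
            conn.commit();
        } catch (SQLException | RuntimeException e) {
            conn.rollback();
            throw e;
        } finally {
            // Put the connection back the way we found it.
            conn.setTransactionIsolation(previous);
        }
    }
}
```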

Large Query Support - Callback

You may wish to run a query that returns a lot of beans/rows and perform some process for each bean, for example writing the data to an XML/CSV file. In doing so you may not want to pull all the rows back into a List, Set or Map, but rather process each row via a callback mechanism.

The reason for this is that the memory requirements for putting all the beans into a List before processing them all can be very high if there are a very large number of beans.
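The idea can be sketched against a plain JDBC ResultSet: each row is handed to a callback as it is read, so nothing accumulates in memory. This is not FindListener itself; RowCallback and the method names are assumptions made for illustration.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

/** Sketch: process query rows one at a time via a callback. */
final class RowCallbackQuery {

    /** Hypothetical per-row callback. */
    interface RowCallback {
        void onRow(ResultSet row) throws SQLException;
    }

    private RowCallbackQuery() {}

    /** Calls the callback once per row and returns the row count. */
    static int forEachRow(ResultSet rs, RowCallback callback)
            throws SQLException {
        int rows = 0;
        while (rs.next()) {
            // Only the current row is held; no List is built.
            callback.onRow(rs);
            rows++;
        }
        return rows;
    }

    /** Example use: write the first column of each row as one line. */
    static int writeFirstColumnAsLines(ResultSet rs, StringBuilder out)
            throws SQLException {
        return forEachRow(rs, row -> out.append(row.getString(1)).append('\n'));
    }
}
```

In a real export the StringBuilder would be replaced by a Writer on the output file, so that memory use stays flat however many rows the query returns.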

Refer to FindListener.

Query Callback mechanism (FindListener)