Community.microfocus.com

Wiki Page: BES - Performance Tuning EJB-based applications

2014-08-19

- Introduction
- Tuning the application
- Tuning the Java Virtual Machine
- Tuning Borland Enterprise Server
- Conclusion
- Appendix

Introduction

This document discusses some of the issues involved in building high-performance, scalable applications using the Enterprise JavaBean (EJB) architecture, and in optimizing those applications on Borland Enterprise Server in particular. Most of the suggestions and guidelines presented in this paper are based on real-world performance tests using the industry-standard ECperf benchmark. We have spent well over a year working with ECperf to optimize the performance of Borland Enterprise Server and of the Java Virtual Machine.

Although getting a benchmark such as ECperf running fast is a useful exercise, the ultimate goal is for customers to be able to optimize their own enterprise applications. With this goal in mind, we focus on the techniques used to measure and optimize performance rather than on exact configuration parameters for any particular combination of hardware and software. For example, in running ECperf, we determined that a certain heap size gave us the best performance. In this paper, we do not discuss which heap size we ultimately selected, but rather the tradeoffs involved in selecting a particular memory configuration.

Although this paper is focused on performance optimization, we try not to lose sight of the fact that performance is not the only yardstick by which enterprise applications are measured. Just as important (or more so!) are metrics such as:

- Complexity: how hard is the system to build?
- Modularity: how much of the system can be reused in future projects?
- Extensibility: how easy is it to enhance the system?
- Longevity: what is the usable lifetime of the system?

In this paper, we keep in mind that making an enterprise application fast is only a small part of the overall challenge. It is possible to write an extremely fast application using low-level techniques, but such systems, though admittedly faster than a well-designed application, are brittle and hard to maintain and extend. Low-level techniques also sacrifice modularity and extensibility for raw performance. As such, we avoid suggestions that buy performance improvements at the cost of an inferior implementation.

The Application

There are a number of aspects to optimizing an EJB-based application. The first step is to build the application correctly. If an application is poorly designed or otherwise misuses the capabilities of the J2EE platform, there is little hope of optimizing away those problems. Only a well-designed J2EE application stands a chance of running fast. There are numerous resources on design patterns, strategies and tradeoffs in building J2EE applications, and this paper will not rehash information that is already freely available. Some resources of possible interest to the reader are:

- Enterprise Blueprints & J2EE Design Patterns from Sun Microsystems
- EJB Design Patterns by Floyd Marinescu

Running and benchmarking the application

Once the application has been built, the next step is to run it on Borland Enterprise Server (more on tuning this later) and to access it the way typical users would, with realistic data input and output. The benchmark harness should test all aspects of the application, and should test the application across all scales of intended use, right up to the maximum number of users, transactions and throughput. Note that the benchmark harness should control randomness in tests. That is, random test sequences, if used, should be reproducible so that results from different test runs can be compared objectively.

Note: Borland Enterprise Server lazy-loads all resources that an application might directly or indirectly be using. These resources include threads in the thread pools that service client requests, database connections, the pools and caches for Enterprise JavaBeans, and so on. The overhead of these initializations, though a one-time cost, can skew application benchmark times. Therefore, it is essential that the application be ramped up to a steady state (where all of the above resources are initialized) before the first benchmark measurements are recorded. See the appendix for more details on how to ensure that a steady state has been reached.

What to measure

Though it is the application that is primarily being benchmarked, a number of other factors in the overall system have an impact on application performance. It is useful to break measurements down into time spent in components in different tiers, time spent marshaling and unmarshaling data sent between tiers, and time spent on network transfers. A general checklist of possible items to measure is:

- CPU usage on the machines running the application within Borland Enterprise Server
- CPU usage on the database server
- Disk and memory usage on the database server
- Memory used by Borland Enterprise Server
- Network traffic between different tiers, and latency times
- Throughput (for example, response times and transactions per second)
- Frequency and length of garbage collection cycles in the Java Virtual Machine (JVM)

These figures should help determine where the bottlenecks lie when the application is put through its paces. Once the top bottlenecks are identified, it is generally best to tackle the most severe one first. The tools and techniques used to measure performance at this system-wide level are different from those used to drill down into any particular sub-system or component. Some useful tools here are network packet sniffers and performance monitoring tools such as PerfMon (Windows) and mpstat (Solaris).
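The two harness requirements discussed earlier in this section, reproducible random test sequences and a warm-up phase that is excluded from the measurements, can be sketched in Java roughly as follows. This is only a minimal illustration under stated assumptions: the class and method names (SeededBenchmark, requestSequence, run) are hypothetical, the lambda stands in for a real client call, and the sketch uses Java 8 language features rather than the JDK 1.3 discussed in this paper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.IntConsumer;

/** Hypothetical harness sketch; none of these names come from Borland Enterprise Server. */
final class SeededBenchmark {

    /** Builds a reproducible sequence of request ids: the same seed always yields the same list. */
    static List<Integer> requestSequence(long seed, int count, int idRange) {
        Random random = new Random(seed);
        List<Integer> ids = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            ids.add(random.nextInt(idRange));
        }
        return ids;
    }

    /**
     * Runs the warm-up requests untimed, then times only the measured requests.
     * Returns the mean time per measured request in nanoseconds.
     */
    static double run(IntConsumer request, List<Integer> warmUp, List<Integer> measured) {
        for (int id : warmUp) {
            request.accept(id); // gives pools, caches and connections a chance to initialize
        }
        long start = System.nanoTime();
        for (int id : measured) {
            request.accept(id);
        }
        long elapsed = System.nanoTime() - start;
        return measured.isEmpty() ? 0.0 : (double) elapsed / measured.size();
    }

    public static void main(String[] args) {
        List<Integer> warmUp = requestSequence(1L, 1_000, 100);
        List<Integer> measured = requestSequence(2L, 10_000, 100);
        long[] sink = new long[1];
        // Stand-in for a real client call such as a session bean method invocation.
        double meanNanos = run(id -> sink[0] += id, warmUp, measured);
        System.out.println("mean ns per request: " + meanNanos);
    }
}
```

Fixing the seeds means two runs issue exactly the same requests in the same order, so differences in the timings can be attributed to configuration changes rather than to the test data.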
Remember that performance tuning is an iterative process. Fixing a problem in one area may introduce problems or side-effects in other areas of the system. For example, if the EJB pool and cache sizes are increased to raise the likelihood of a cache hit (that is, tuning at the Borland Enterprise Server level), the larger sizes might start to affect garbage collection.

How to profile your application

Borland Enterprise Server's EJB Container can display statistics on the time spent performing various activities, both internal to the Container implementation and in your application code. The statistics can be viewed by selecting the 'Statistics' tab when the EJB Container is selected in the console, as shown in the figure below.

Fig 1: Viewing EJB Container statistics

Once the application is being accessed, the statistics change as data is gathered and displayed by the console. The statistics displayed are explained below:

Name              Description
Dispatch_POA      Time spent receiving the TCP request and sending the reply.
Dispatch_Home     Time spent in the container dispatching methods to EJBHome objects.
Dispatch_Remote   Time spent in the container dispatching methods to EJBRemote objects.
Dispatch_Bean     Time spent in the bean methods. This is broken down on a method-by-method basis if detailed timers are used.
EntityHome        Time spent in the container implementing EJBHome-related operations specific to Entity beans.
Passivate_SB      Time spent passivating Stateful Session beans.
BeginTx           Time spent beginning transactions.
CommitTx          Time spent committing transactions.
RollbackTx        Time spent rolling back transactions.
ResourceCommit    Time spent specifically in committing the work done on the resource (that is, in committing to a database such as Oracle).
Synchronization   Time spent in various Synchronization callbacks.
LoadClass         Time spent loading classes from EJB Jars, etc. (startup cost only).
CMP_Init          Time spent initializing the CMP engine (startup cost only).
CMP               Time spent in the CMP engine (excluding other CMP tasks listed explicitly).
CMP_Update        Time spent in the CMP engine doing SQL updates.
CMP_Query         Time spent in the CMP engine doing SQL queries.
CMP_PrepareStmt   Time spent in the CMP engine preparing statements (startup cost only).
CMP_GetConn       Time spent in the CMP engine getting JDBC connections (startup cost only).
ORB_Activate      Time spent in the ORB allocating objects from pools.
ORB_Deactivate    Time spent in the ORB releasing objects to pools.
Jdbc(1|2)_GetCon  Time spent in the Jdbc1 or Jdbc2 datasource getting a pooled connection.
Jdbc(1|2)_NewCon  Time spent in the Jdbc1 or Jdbc2 datasource getting a new connection (startup cost only).
Jdbc(1|2)_RegRes  Time spent in the Jdbc1 or Jdbc2 datasource registering a transaction resource with the transaction service.
Jdbc2_NewXaCon    Time spent in the Jdbc2 datasource getting a new XA-enabled connection (startup cost only).
Jdbc2_XaStart     Time spent in the Jdbc2 datasource starting a transaction branch.
Jdbc2_XaEnd       Time spent in the Jdbc2 datasource ending a transaction branch.

Basic statistics can be viewed from the console in various graph or tabular formats.

Fig 2: Basic statistics gathering from the EJB Container

Detailed statistics, including method-level timing information, can be gathered from the Container to identify hotspots and problem areas within your application code. To do this, run the EJB Container standalone (from the command prompt) with the JVM parameter -DEJBDetailTimers:

    prompt> osagent &
    prompt> vbj -Xms128m -Xmx256m -DEJBDetailTimers com.inprise.ejb.Container ejbcontainer MyModule.ear -jns -jts -jss

The statistics are gathered and displayed on screen (numerical data in tabular format) when the EJB Container is accessed by clients. The statistics can also be redirected to a log file for later analysis.
As mentioned earlier, Borland Enterprise Server's EJB Container has a bias towards lazy-loading the resources that an application might directly or indirectly be using. Since statistics gathering is cumulative, it is imperative that the benchmark be run for long enough that one-time initialization costs do not skew the benchmark timings. For example, when the EJB Container has just been restarted and a single client invokes a method on a session bean, which in turn invokes a method on a local entity bean, the cost of getting a new JDBC connection (a startup cost only) may be 70% or more of the total time spent on the server side for that method invocation. As the application is accessed more frequently (that is, as clients invoke more methods), this cost does not disappear, but its share of the total shrinks. After, say, 20,000 method invocations, the cost of getting a new JDBC connection will be only a small fraction (say 2%) of the total time spent on the server side.

Sample output from a run with the EJBDetailTimers flag turned on:

Action          Total (ms)    Count  T/C (ms)  Percent  Description
Dispatch_POA           271    20004      0.01    1.31%
Dispatch_Home         9670   160001      0.00   46.81%
Dispatch_LHome          10    20000       0.0    0.04%
Dispatch_RHome           0        1       0.0     0.0%
Dispatch_Cmpt            0   140000       0.0    0.00%
Dispatch_Bean         8870   400041      0.06   42.94%
                         0   120000       0.0     0.0%  DepartmentBean_PM= javax.ejb.EntityBean.ejbStore()
                         0    20000       0.0     0.0%  DepartmentBean_PM= Department.getPhoneNo()
                         0        1       0.0     0.0%  PerfSessionBean= PerfSessionHome.create()
                        30   120000       0.0     0.0%  DepartmentBean_PM= javax.ejb.EntityBean.ejbLoad()
                         0        1       0.0     0.0%  PerfSessionBean= javax.ejb.SessionBean.setSessionContext()
                        10    20000       0.0     0.0%  DepartmentBean_PM= Department.getMngrNo()
                         0    20000       0.0     0.0%  DepartmentBean_PM= Department.getLocation()
                         0    20000       0.0     0.0%  DepartmentBean_PM= Department.getDeptNo()
                         0       19       0.0     0.0%  DepartmentBean_PM= javax.ejb.EntityBean.ejbActivate()
                         0       20       0.0     0.0%  DepartmentBean_PM= javax.ejb.EntityBean.setEntityContext()
                      8480    20000      0.42   41.05%  DepartmentBean_PM= DepartmentHome.findByPrimaryKey()
                       390    20000      0.01    1.88%  PerfSessionBean= PerfSession.test()
                         0    20000       0.0     0.0%  DepartmentBean_PM= Department.getDepartment()
                         0    20000       0.0     0.0%  DepartmentBean_PM= Department.getHeadDept()
EntityHome              10   120000       0.0    0.04%
BeginTx                271    20000      0.01    1.31%
CommitTx                80   140000       0.0    0.38%
Synchronization         71   280006       0.0    0.34%
LoadClass               50       48      1.04    0.24%
PrepareStmt            110        6     18.33    0.53%
ORB_Activate            40       12      3.33    0.19%
Jdbc2_GetCon           361   140002       0.0    1.74%
Jdbc2_NewCon             0        1       0.0     0.0%
Jdbc2_RegRes            80   140002       0.0    0.38%
Jdbcw_NewCon           762        1     762.0    3.68%
Total                20656

The above statistics were obtained when a client invoked a method on a session bean 20,000 times. The session bean in turn invokes business methods on a local CMP 2.0 entity bean. Looking at the counts, we see that the Department entity bean's ejbLoad() and ejbStore() methods have each been invoked 120,000 times, or six times per client call. In addition, the CommitTx (committed transactions) count is 140,000. Since 20,000 method invocations by a remote client correspond to 140,000 committed transactions on the server side (seven per client call), incorrectly set transactional attributes are a likely suspect.

Note: The OptimizeIt product can be used to profile and tune Java and J2EE applications. The details of this product and its usage are beyond the scope of this write-up, but a number of online guides and papers are available that enumerate the steps involved in setting up OptimizeIt, profiling your application and analyzing the results.

The Java Virtual Machine

Apart from tuning your application itself, tuning the Java Virtual Machine can give one of the biggest boosts to the overall performance of your application. The performance of JVMs can vary between vendors, as well as between minor versions of the JVM from the same vendor. To test the relative merits of different VMs, it is a good idea to benchmark your application against each of them.
The performance of JVMs can also vary between operating systems and hardware. Therefore, benchmarks and performance tuning should be done on the same (or at least similar) hardware and software as the application is expected to be deployed on.

One of the areas where the largest performance gains are likely to be seen is the tuning of heap sizes at the JVM level. An insufficient heap size is indicated by frequent garbage collection runs, possibly followed by the virtual machine running out of memory. Since the garbage collector is not directly under your control, the only way to see what it is doing is to pass the -verbose:gc flag to the JVM. Increasing the heap size decreases the frequency of garbage collection runs, but may make each full garbage collection run take longer. For example, on gigabyte-size heaps, a full garbage collection run can take many seconds, and in some cases up to a minute.

In most current JVM implementations, garbage collection is synchronous: all other threads are blocked while the GC thread executes. The effect of the GC thread on a single-processor system might be minimal, but on a multi-processor machine, where the other processors sit idle while it runs, garbage collection can decrease overall throughput significantly. An optimal tuning strategy is to set a heap size that minimizes the frequency and duration of garbage collection runs while maximizing the overall throughput and minimizing the average response times of your application.

Fig 3: Verbose output from the garbage collector

JDK 1.3.x uses the HotSpot VM, which has a sophisticated generational garbage collector. A discussion of HotSpot and its tuning is beyond the scope of this paper, but a useful resource is Sun Microsystems' document 'Performance of the Java HotSpot VM'.
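As a complement to reading -verbose:gc output, the frequency and duration of garbage collection runs can also be sampled from inside the JVM. The sketch below assumes a Java 5 or later runtime, since the java.lang.management API it relies on does not exist in the JDK 1.3.x releases discussed here; the GcSnapshot class is a hypothetical helper written for illustration, not part of Borland Enterprise Server.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: cumulative GC run count and GC time for this JVM (Java 5+ assumed). */
final class GcSnapshot {
    final long collections; // total number of collections reported so far
    final long millis;      // approximate accumulated collection time in milliseconds

    GcSnapshot(long collections, long millis) {
        this.collections = collections;
        this.millis = millis;
    }

    /** Sums the counters of every collector the JVM exposes. */
    static GcSnapshot capture() {
        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            time += Math.max(0, gc.getCollectionTime());
        }
        return new GcSnapshot(count, time);
    }

    /** Difference between this snapshot and an earlier one. */
    GcSnapshot since(GcSnapshot earlier) {
        return new GcSnapshot(collections - earlier.collections, millis - earlier.millis);
    }

    public static void main(String[] args) {
        GcSnapshot before = capture();
        // Stand-in workload: allocate short-lived garbage, as a benchmark run would.
        for (int i = 0; i < 200_000; i++) {
            byte[] garbage = new byte[1024];
            garbage[0] = 1;
        }
        GcSnapshot delta = capture().since(before);
        System.out.println("GC runs: " + delta.collections + ", GC time (ms): " + delta.millis);
    }
}
```

Taking one snapshot before a benchmark interval and one after it gives the number of collections and the time spent in them for that interval, which can then be compared across different heap settings.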
The following is a simplified checklist of points to keep in mind while tuning the Java Virtual Machine:

- Set the -verbose:gc flag on the JVM used by Borland Enterprise Server's Partition process to get an idea of the frequency and duration of garbage collection runs during typical application usage.
- Tune the minimum and maximum heap sizes (the -Xms and -Xmx flags) of the partition(s). Note that the maximum heap size should never be more than the amount of physical memory (RAM) available in the system. If running multiple Borland Enterprise Server partitions on the same machine, ensure that the sum of the maximum heap sizes across all partitions is less than the amount of physical memory. Do not forget to take into account the physical memory used by the operating system itself and by other processes when calculating the amount of available physical memory. Typically, the maximum heap size should be no more than 90% of free physical memory. See the appendix for more details.
- A number of parameters are available to tune the HotSpot VM. The paper mentioned earlier, 'Performance of the Java HotSpot VM', also describes the behavior of the various JVM parameters (for example, -XX:NewSize, -XX:MaxNewSize, -XX:MaxTenuringThreshold and -XX:SurvivorRatio).
- In some cases, the HotSpot Client VM (selected with the -client flag) gives better performance figures than the HotSpot Server VM (selected with the -server flag) for an application under Borland Enterprise Server on hardware with fewer than four CPUs.
- On Solaris, an alternate threading library is available which provides better performance. This threading library is not used by the 1.3.1 JDK by default; it must be set explicitly by specifying it as a parameter (LD_LIBRARY_PATH=/usr/lib/lwp) in the partition's nativeservice.properties file. See the appendix for more details.
- It is recommended that the latest JDK be tried even if it differs only slightly from the shipping JDK. For example, JDK 1.3.1_004 (also bundled with Borland Enterprise Server 5.1) provides a measurable performance improvement over the bundled 1.3.1_001 version. Borland Enterprise Server 5.2 will also bundle JDK 1.4.x, and this should give a substantial performance boost over earlier JDK versions.
- Some versions of the HotSpot VM do not scale on multi-CPU machines. In this case, it is useful to run multiple JVMs on a single machine and to set processor affinity, that is, to dedicate the JVM processes to specific subsets of CPUs. Caveat: It is recommended that you measure the effect of turning on processor affinity, as this can sometimes adversely affect performance.

Borland Enterprise Server

Tuning Borland Enterprise Server can provide significant improvements in the performance of your application. Here we go through some of the areas that affect performance and how they can be tuned at the Borland Enterprise Server level.

General

Borland Enterprise Server uses a thread pool to service client requests. While creating a thread to service each client request might be a viable option in small to medium scale applications, this model is likely to be problematic when there are a large number of clients. In that scenario, the server would have to spawn a thread for every request, and once the number of threads exceeds a certain threshold (determined by the JVM, hardware and OS), the overhead of managing the large number of threads in terms of context switching, priorities and memory management causes the system to thrash and fail. This can be observed when the system uses all available hardware resources but does no useful work. The solution is to place an event queue in front of the thread pool. The queue fills up during bursts of activity and is drained during quieter periods.
While this solution might lead to individual client requests being serviced a little more slowly, the overall throughput of the system will increase. The VisiBroker property used to control the size of the thread pool is vbroker.se.iiop_tp.scm.iiop_tp.dispatcher.threadMax=N and can be set in the partition's vbroker.properties file. The optimal value of this property depends on the combination of the JVM, hardware, OS and number of partitions. A recommended starting point for each Borland Enterprise Server partition is: N = 5 * number of CPUs / number of partitions. Note that some experimentation with this value will be necessary, and that the value of this property can greatly affect the performance of your application when it is accessed concurrently by a large number of clients. The goal of tuning this property should be to find the smallest number of JVM threads that can optimally service client requests. For example, we noticed that moving from a thread pool size of 10 to 28 on a twin-CPU, 2 GHz machine, while keeping all other aspects of the server constant, resulted in a degradation of over 20% in system response time.

By default, Borland Enterprise Server's EJB Container collects and outputs bean statistics. Collection of bean statistics and other diagnostics can be turned off by setting ejb.collect.stats_gather_frequency=0 and ejb.collect.statistics=false in the partition service's ejbcontainer.properties file.

The default behavior of the EJB Container is to use Pass-by-Reference semantics for calls between beans with remote interfaces in the same EJB Container. This is a performance optimization, and the default behavior can be changed to use Pass-by-Value semantics instead. Using Pass-by-Value semantics incurs an additional performance penalty, as method arguments and return values have to be copied when the caller EJB invokes methods on the remote interface of the callee EJB. To switch to Pass-by-Value, right-click the EJB Container in the console, choose 'Configure' and then turn on the check box for 'Use Pass by Value for intra bean calls'.

Partition-level services that are not used by the application can be turned off. These services can include the web container, stateful session bean storage, the VisiConnect container, the JDataStore service and so on.

Deployed modules at the partition level that are not used (the examples bundled with Borland Enterprise Server, for instance) can be disabled to reduce the memory footprint.

Security, if not used, can be turned off.

Entity Beans

The EJB specification has evolved over time towards the use of Entity Beans for data access instead of raw JDBC. This evolution can be seen in each revision of the specification:

- in EJB 1.0, Entity Beans were optional;
- in EJB 1.1, Entity Beans were mandatory and the CMP model was enhanced;
- in EJB 2.0, the CMP model was substantially enhanced again.

As the Entity Bean specification has evolved, so have the various implementations of Container Managed Persistence. Today's best CMP engines support extremely high-performance data access, while providing a high degree of portability across J2EE-compliant application servers and across various flavors of DBMS.

There are a great number of optimizations that can be made on the CMP version of an application that cannot be made on the JDBC version. This is because the EJB Container has a great deal of control over how the CMP code is executed, but almost no control over raw JDBC. So, for example, the CMP engine can reorder data access to the database to trade off latency for throughput, and thereby scale the system further. Or the Container can trade off time for space, by using algorithms that use either more memory and less CPU time, or vice versa. And while these same optimizations are possible for direct JDBC code, they require code changes to your application. In the CMP case, it is the AppServer that is being configured and optimized, not your code.

CMP vs. BMP

In the above discussion, we compare direct JDBC access with Container Managed Persistence. What about Bean Managed Persistence, which some would argue is the happy medium between the two? Unfortunately, BMP is not a happy medium; it is an abysmal compromise. Typically, BMP code is both complex and brittle when compared to CMP code, while not having the optimization possibilities afforded by CMP. As such, it is strongly recommended that Bean Managed Persistence be avoided, unless one is using an older application server that either does not support CMP or has very limited support for it. Certainly, if a product supports EJB 2.0 (as does our AppServer 5.x), or has good support for the 1.1 version of CMP (as does our AppServer 4.x), then BMP is a poor alternative.

There are a number of reasons frequently cited for using BMP instead of CMP. Most, if not all, of these are invalid in the case of Borland's product. Below we list the commonly cited problems, along with our solutions:

Myth: BMP supports more complex queries than does CMP.
Reality: In Borland Enterprise Server, any query that can be defined in SQL can be used to implement an EJB 1.1 finder method. That is, the query language supported by Borland Enterprise Server is unconstrained. In fact, even DBMS-specific syntax can be used in our queries. Unfortunately, in EJB 2.0 the query language is much more constrained than it is in our EJB 1.1 implementation. To handle this problem, we allow the user to implement a finder method (or select method) explicitly in bean code. So, if a particular query cannot be implemented using CMP, one can implement just that one method using BMP, and allow all other aspects of persistence to be handled by the Container.
Myth: BMP must be used to implement relationships and other complex mappings.
Reality: Borland provides a much richer set of O/R mappings as part of CMP (both for version 1.1 and version 2.0) than do most other EJB products. In particular, we provide support for all types of Entity relationships in both EJB 1.1 and EJB 2.0, we support dependent objects in both models, and we support complex data types such as CLOBs and BLOBs in both models. In cases where data types beyond those natively supported by CMP are required (new SQL data types such as java.sql.Array, for example), the CMP engine can be augmented. That is, support for additional, arbitrary data types can be added by the user.

Myth: BMP is faster than CMP.
Reality: Our performance tests indicate that CMP can be significantly faster than equivalent BMP implementations. This is borne out in our testing with ECperf, which provides both BMP-based and CMP-based code. In these tests, we saw throughput increases of over 100% using the CMP version of the benchmark.

Tuning CMP

Borland Enterprise Server supports transaction commit options A, B and C for Entity Beans, as defined in the EJB specification (1.1 and 2.0). The performance of Entity Beans under a particular option can vary depending on the application being benchmarked. A detailed paper that describes how Entity Beans are handled by Borland Enterprise Server, along with guidelines on tuning related parameters (such as entity bean pool and cache sizes), is available.

Various miscellaneous but nevertheless important parameters can be set as properties in the deployment descriptor of an Entity Bean. These properties can have a measurable impact on performance. We go through some of the important properties and their effects:

ejb.checkExistenceBeforeCreate: By default, Borland Enterprise Server checks for the existence of a particular row in the database before performing an SQL INSERT. If a large number of INSERTs (that is, calls to ejbCreate() on an Entity Bean) occur in your application, then setting the value of this property to false can speed things up considerably. Caveat: A side-effect of setting this property to false is that the Container will not throw javax.ejb.DuplicateKeyException to indicate that the entity object could not be created because an entity object with the same key already exists; it will instead throw the superclass of DuplicateKeyException, javax.ejb.CreateException.

ejb.findByPrimaryKeyBehavior: The possible values for this property are Verify | Load | None.

- Verify: the standard behavior as described in the EJB specification. Here the Container verifies that the primary key exists in the database. Setting the value to Verify results in two SQL calls to move the bean instance into the transactional state where business methods can be executed. The first performs a SELECT on the primary key field(s) only, to verify existence, and the second performs a SELECT to synchronize the persistent fields of the entity bean with their values in the database.
- Load: this behavior causes the bean's state to be loaded into the container when findByPrimaryKey is invoked, if the finder call is running in an active transaction. The assumption is that found objects will typically be used, so it is optimal to go ahead and load the object's state at find time. This setting is the default.
- None: this behavior indicates that findByPrimaryKey should be a no-op. In effect, verification of the bean is deferred until the object is actually used. Since an object could always be removed between the find call and its actual use, for most programs this optimization will not cause a change in client logic.
load-state of finder queries (CMP 1.x only): Setting the value of this property to true (in the deployment descriptor) results in a single SQL statement loading all the results of a query into the appropriate entity bean instances. However, this aggressive loading might be wasteful when only a subset of the items returned by a query is accessed. In such a scenario, turning off state loading for that finder query is recommended.

ejb.invalidateFinderCollectionAtCommit (CMP 2.x only): Whether or not to optimize transaction commit by invalidating finder collections.

ejb.cmp.optimisticConcurrencyBehavior: The acceptable values for this property are UpdateAllFields | UpdateModifiedFields | VerifyModifiedFields | VerifyAllFields.

- UpdateAllFields performs an update on all of an entity's fields, regardless of whether they were modified.
- UpdateModifiedFields performs an update only on fields known to have been modified before the update is issued.
- VerifyModifiedFields verifies the entity's modified fields against the database before the update.
- VerifyAllFields verifies all of the entity's fields against the database before the update, regardless of whether they were modified.

Stateful Session beans

The EJB Container supports stateful session enterprise beans using a high-performance caching architecture based on the Java Session Service (JSS). There are two pools of objects: the ready pool and the passive pool. Enterprise beans transition from the ready pool to the passive pool after a configurable timeout. Transitioning an enterprise bean to the passive pool stores the enterprise bean's state in a database. Passivation of stateful sessions exists for two purposes:

- Maximize memory resources
- Implement failover

Borland Enterprise Server 5.1 now supports any relational database (in addition to Borland JDataStore) as the backend for storing the state of a session bean when a passivation cycle is triggered by the EJB Container.

At deployment time, the deployer uses Borland Enterprise Server's tools to set a passivation timeout for the EJB Container in a particular Partition. The container regularly polls active session beans to determine when they were last accessed. If a session bean has not been accessed during the timeout period, its state is sent to persistent storage and the bean instance is removed from memory.

Simple passivation

Passivation timeouts are set at the container level. You use the property ejb.sfsb.passivation_timeout to configure the length of time a session bean can go un-accessed before its state is persisted and its instance removed from memory. This length of time is specified in seconds. The default value is five seconds. This property can be set in the container properties file for the Partition you are configuring. This file is located at:

    <install-dir>/var/servers/<server-name>/adm/properties/partitions/<partition-name>/services/ejbcontainer.properties

This file can be edited to set the ejb.sfsb.passivation_timeout property.

Aggressive passivation

Aggressive passivation is the storage of session state regardless of its timeout. A bean that is set to use aggressive passivation will have its session state persisted every time it is polled, although its instance will not be removed from memory unless it times out. In this way, if a container instance fails in a cluster, a recently stored version of the bean is available to other containers using identical JSS instances communicating with the same back-end. As in simple passivation, if the bean times out, it will still be removed from memory. Again, aggressive passivation is set container-wide, using the boolean property ejb.sfsb.aggressive_passivation. Setting the property to true will store the session's state regardless of whether it was accessed before the last passivation attempt. Setting the property to false allows the container to use only simple passivation.
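The difference between the two modes can be summarized as a small decision function evaluated each time the container polls a bean. The Java sketch below is a simplified model written for illustration only: PassivationPolicy, Action and onPoll are hypothetical names, not the container's actual classes, and treating an idle time exactly equal to the timeout as timed out is an assumption.

```java
/** Hypothetical model of the polling decision; not Borland Enterprise Server code. */
final class PassivationPolicy {

    /** What the container does with one session bean on one poll. */
    enum Action { NONE, PERSIST, PERSIST_AND_EVICT }

    /**
     * @param idleSeconds    time since the bean was last accessed
     * @param timeoutSeconds value of ejb.sfsb.passivation_timeout
     * @param aggressive     value of ejb.sfsb.aggressive_passivation
     */
    static Action onPoll(long idleSeconds, long timeoutSeconds, boolean aggressive) {
        if (idleSeconds >= timeoutSeconds) {
            // Timed out: state goes to storage and the instance leaves memory in both modes.
            return Action.PERSIST_AND_EVICT;
        }
        // Not timed out: aggressive mode still writes the state on every poll.
        return aggressive ? Action.PERSIST : Action.NONE;
    }

    public static void main(String[] args) {
        System.out.println(onPoll(2, 5, false)); // NONE
        System.out.println(onPoll(2, 5, true));  // PERSIST
        System.out.println(onPoll(7, 5, false)); // PERSIST_AND_EVICT
        System.out.println(onPoll(7, 5, true));  // PERSIST_AND_EVICT
    }
}
```

The model makes the cost visible: with aggressive passivation, every poll of every active bean results in a write to the session store, whereas simple passivation writes only when a bean times out.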
Caveat: Bear in mind that, while aggressive passivation aids failover, it incurs a performance hit because the container must access the database more often. If you configure the JSS to use a non-native database (that is, you choose not to use JDataStore), this loss of performance can be even greater. Be aware of the trade-off between availability and performance before you elect to use aggressive passivation.

Sessions in secondary storage

Most sessions are not kept in persistent storage forever after they time out. Borland provides a mechanism for removing stored sessions from the database after a discrete period of time known as the keep-alive timeout. The keep-alive timeout specifies the minimum amount of time, in seconds, to persist a passivated session in stateful storage. The actual amount of time a session is kept in the database can vary, because constantly polling the database for disused sessions would not be prudent from a performance standpoint. A session is persisted for at least the value of the keep-alive timeout and not more than twice that value.

The Borland JSS implementation uses the property ejb.sfsb.keep_alive_timeout to specify the amount of time (in seconds) to maintain a passivated session in stateful storage. The default value is 86,400 seconds, or twenty-four hours. Like the other properties discussed above, the keep-alive timeout is set in the EJB Container properties file:

$BES/var/servers/<server-name>/adm/properties/partitions/standard/services/ejbcontainer.properties

Remember that any value you specify here can be overridden by setting a keep-alive timeout for a specific session bean. For more information, refer to the chapter titled Caching of Stateful Session Beans in the Developer's Guide.

JDBC

Borland Enterprise Server's EJB Container has a number of optimizations that prevent unnecessary round-trips to the database.
We do not discuss this topic in depth, since the Developer's Guide bundled with Borland Enterprise Server 5.1 devotes two chapters to it: Using JDBC, and Transaction Management and JDBC.

The Lite Transaction Manager (Lite TM), which supports the one-phase commit protocol, is usually recommended and will normally suffice for your application. The VisiTransact Transaction Manager (VisiTransact TM), which supports two-phase commit, is recommended if and only if (a) your application accesses multiple (possibly heterogeneous) datasources in the scope of a transaction and ACID properties need to be preserved, and/or (b) transactions cross systems/platforms and JVMs. The Lite TM is much faster than the VisiTransact TM.

The JDBC driver is an often overlooked part of the application, and the performance of JDBC drivers from different vendors can vary. It is recommended that JDBC drivers from different vendors be benchmarked against your application to ensure that this component is not the bottleneck. JDBC 2 DataSources perform as well as, or even better than, JDBC 1.x DataSources and should be tried. JDBC 2 usage is illustrated in the bank and order examples. The Developer's Guide has information on DataSources along with related properties and their effects.

maxPreparedStatementCacheSize: Borland Enterprise Server maintains a pool of connections and caches the PreparedStatements associated with each database connection. When a connection is reused from the pool on a subsequent access and the same PreparedStatement is created again, Borland Enterprise Server retrieves it from the cache. This avoids the overhead of creating and pre-compiling PreparedStatements on every database access. The default value of this property is 40; for database-intensive applications, it is recommended that this number be increased.
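To see why the cache size matters, the following is a minimal sketch of a bounded statement cache keyed by SQL text. It is not Borland's implementation: the class name, the replay workload, and the least-recently-used eviction policy are all assumptions made for illustration. It replays a workload of 50 distinct statements against a cache of 40 entries and a cache of 64 entries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, NOT Borland's implementation: models one connection's
// statement cache as a bounded map keyed by SQL text. LRU eviction is an assumption.
class StatementCacheSketch {
    private final LinkedHashMap<String, Object> cache;
    private int hits;
    private int misses;

    StatementCacheSketch(final int capacity) {
        // accessOrder=true keeps entries ordered least-recently-used first.
        this.cache = new LinkedHashMap<String, Object>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Object> eldest) {
                return size() > capacity; // evict once the cache exceeds its bound
            }
        };
    }

    // Simulates preparing a statement: a hit reuses the cached entry,
    // a miss pays the compile cost (modeled as creating a placeholder object).
    public void prepare(String sql) {
        if (cache.get(sql) != null) {
            hits++;
        } else {
            misses++;
            cache.put(sql, new Object());
        }
    }

    public int hits() { return hits; }
    public int misses() { return misses; }

    // Replays `distinct` different statements in the same order, `rounds` times.
    public static StatementCacheSketch replay(int capacity, int distinct, int rounds) {
        StatementCacheSketch c = new StatementCacheSketch(capacity);
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < distinct; i++) {
                c.prepare("SELECT * FROM t" + i + " WHERE id = ?");
            }
        }
        return c;
    }

    public static void main(String[] args) {
        StatementCacheSketch small = replay(40, 50, 10);
        StatementCacheSketch large = replay(64, 50, 10);
        System.out.println("size 40: hits=" + small.hits() + " misses=" + small.misses());
        System.out.println("size 64: hits=" + large.hits() + " misses=" + large.misses());
    }
}
```

Under these assumptions the 40-entry cache never hits (0 hits, 500 misses), because a cyclic workload larger than the cache is the worst case for LRU eviction, while the 64-entry cache misses only on the first round (450 hits, 50 misses). A real workload is rarely this regular, but the sketch shows why a cache smaller than the application's working set of statements can lose most of its benefit.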
The optimal value of maxPreparedStatementCacheSize can be judged from the count value in the output obtained by turning on the JVM parameter EJBDetailTimers or EJBTimers at the EJB Container level (or by viewing EJB Container statistics from the console).

Note: The PreparedStatement cache is used transparently by both the AppServer (in the case of CMP) and your application code (in the case of BMP or JDBC access from Session EJBs, JSPs, Servlets, etc.).

Unlike some competing products, Borland Enterprise Server tends to move much of the load from the database tier into the AppServer tier. This may hurt performance at the lower end of the scale. However, databases can only be scaled vertically: getting a database to run faster means buying a bigger, faster box, which quickly becomes expensive. AppServers, on the other hand, can be scaled horizontally by clustering them, and horizontal scaling is less expensive than vertical scaling. Borland Enterprise Server therefore does more work in the AppServer tier to avoid doing as much work in the database tier. With competing products, clustering the AppServer beyond a certain threshold does not increase performance, because the database tier is saturated and performance can only be scaled up by buying a bigger machine for the database. Conversely, if a given EJB application imposes a lighter load on the database tier, the same database can be used for more applications. When benchmarking Borland Enterprise Server against competing products, it is therefore recommended that the load on the database tier be included as an important factor in benchmark measurements.

Conclusion

There are a number of steps in optimizing your EJB-based application, and a number of levels at which tuning is possible: the application itself, the Java Virtual Machine, and Borland Enterprise Server. Some areas that have a huge impact on performance have not been discussed.
These include:
- Ensure that the database tables are properly indexed. Additional indexes on columns that are used in the WHERE clause of finders may help speed up execution of queries. A side effect is that database INSERTs can become more expensive, since the database now has to maintain the additional indexes.
- When using a single Web Container, only one connection exists between the Web Container and the EJB Container. Consequently, all communication between these two containers is multiplexed over this single connection. This might lead to scalability problems when servicing a large number of web clients, so it is recommended that the Web Container be clustered to help boost performance (see the Developer's Guide and User's Guide for information on clustering at the web tier).

Appendix

Attaining steady state

As mentioned earlier, Borland Enterprise Server's bias towards lazy loading of the (internal) resources that Enterprise JavaBeans might be using can skew benchmark times. It is therefore imperative that Borland Enterprise Server and its JVM attain a steady state before the ramp-up period ends. So the question is: how do I ensure that Borland Enterprise Server and its JVM attain a steady state before I start benchmarking my application?

Borland has found that simple CPU monitoring tools (such as the Task Manager under Windows, Perfmeter under Solaris, and top under Linux) can provide hints as to whether steady state has been attained. Under reasonable load, one sign to look for is that the CPU graph stays at 100% while the server JVM is warming up. The CPU graph eventually stabilizes at a value below 100%, with some breathing space. This usually indicates that steady state has been attained. If the CPU graph stays consistently at 100% without any breathing space, then it is highly likely that the AppServer tier is the bottleneck.
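The CPU heuristic above can be turned into a small check over sampled CPU readings. The sketch below is only an illustration: the class and method names, the window of three samples, and the 95% ceiling are assumptions, and the samples would have to be collected separately with whichever monitoring tool is in use.

```java
// Hypothetical helper, not part of Borland Enterprise Server.
class SteadyStateSketch {
    // Returns true when the most recent `window` CPU samples (percent, 0-100)
    // are all below `ceiling`, i.e. the graph has settled below saturation.
    public static boolean isSteady(double[] cpuPercent, int window, double ceiling) {
        if (window <= 0 || cpuPercent.length < window) {
            return false; // not enough data to judge
        }
        for (int i = cpuPercent.length - window; i < cpuPercent.length; i++) {
            if (cpuPercent[i] >= ceiling) {
                return false; // still saturated within the window
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] warmingUp = {100, 100, 100, 99, 100};
        double[] settled = {100, 100, 97, 82, 78, 80, 79};
        System.out.println(isSteady(warmingUp, 3, 95.0)); // prints false
        System.out.println(isSteady(settled, 3, 95.0));   // prints true
    }
}
```

A run whose samples never pass this check under the intended load matches the saturated case described above, where the AppServer tier is the likely bottleneck.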
If the AppServer tier is the bottleneck, the solution is to reduce the load on the application so that the CPUs have some breathing space or, alternatively, to start scaling up and out by clustering the AppServer tier.

Tuning JVM heap sizes of partitions

The heap size of a partition can be tuned by opening the Borland Enterprise Server console, connecting to a server, right-clicking the partition to be tuned, and selecting ‘Configure’ from the popup menu. The last tab in the configure dialog, ‘Config file’, displays the contents of the partition's partition_server.config file. This file is located in the following directory:

$BES/var/servers/<server_name>/adm/properties/partitions/<partition_name>

Find the lines in the file that set the minimum and maximum heap sizes, indicated by the virtual machine parameters (vmparam) -Xms and -Xmx. For example, to set the minimum heap size of that partition's JVM to 256 MB and the maximum heap size to 720 MB, modify the following lines in partition_server.config to read:

vmparam -Xms256m
vmparam -Xmx720m

Configuring alternate threading library in Solaris

An alternate threading library can be specified by right-clicking a partition on a server and selecting ‘Configure’ from the popup menu.
Under the ‘General’ tab in the configure dialog, under the label ‘Environment variables’, add the entry:

LD_LIBRARY_PATH=/usr/lib/lwp

This has the effect of adding the following line to the partition's nativeservice.properties file, located in the directory $BES/var/servers/<server_name>/adm/properties/partitions/<partition_name>:

nativeservice.environment="LD_LIBRARY_PATH=/usr/lib/lwp"

To verify that the above argument is indeed set and used by the partition, add the following line to the above-mentioned partition_server.config file:

nativeservice.application.arguments=-debug

OptimizeIt References

- Building performance and quality into Java applications
- Delivering better Java software, faster
- Integrating Borland Optimizeit Suite with IBM WebSphere Studio Application Developer 4.0
- Performance Tuning Essentials for J2SE and J2EE
- Java Performance Assurance Guide
- Java Memory Leaks
- Temporary Object Usage