2009-04-24

Building a highly available and scalable architecture with WebLogic Server

For once, I'd like to talk about general points. As far as I know, I still haven't seen a single process that respects all of these simple rules.

 

Determine the target

Define your capacity planning.

What kind of business are you dealing with ?

Is it fault tolerant ?

Is there money involved ?

 

Your application

First of all, make sure you know what EAI, ETL, SOA and so on refer to.

Before you build your house, you'd better know which tools exist and could help you.

 

Designing your application

Separation of concerns (one layer per function)

Use design patterns where applicable

Make your developers write the documentation while they're coding. Otherwise, you'll end up with unmaintainable code.

Be sure that each class has its unit test !

No custom database access, use a connection pool.

Think beyond the scope of your application : are there some business services which could be reused later ?

Choose the technology that fits your needs (webservices are not THE ULTIMATE solution)

For instance, don't use Hibernate if you can't use it or tune it properly. Prefer a custom DAO layer if you feel more comfortable with it.

Generally, don't use a tool / framework just because it's cool or trendy. Adopt it only if you really need it and understand what it brings you.

Dependency injection : why not if you master the framework you're using.

What kind of data is going to be exchanged between your application and third-party applications? (size, protocol, frequency)

Do you need to manipulate huge loads of data ?

Build some POCs! And compare their performance measurements!

 

Tuning your application for production use

Prefer the request scope over the session scope: it helps keep your session light.

Make all session objects serializable, and don't forget to mark fields "transient" where applicable.

Your session should not exceed 35 KB (for good replication performance).

Think about packaging. Don't forget that other people are going to manage your application! If needed, use deployment plans.

 

Defining WorkManagers

Even if you don't want to use them for now, define one for each subcomponent (webapp / EJB).

At startup, if WLS doesn't find the WorkManager, it falls back to the default one, so you'll still be able to put yours to use later.

 

Continuous Integration

Think about Maven !

Make sure your code can be compiled EVERY DAY! Use reports to detect excessive cyclomatic complexity, missing documentation, poor unit test coverage and so on as early as possible.

When integrating a new version, start from scratch: no incremental delivery.

You can even wipe your installation and redo it from the beginning: a good way to find out whether you're at ease with WLST :)

 

Your architecture

 

Which frontend ?

Choose a configuration with minimal intelligence (web server + WLS plug-in).

If you plan to use a hardware load balancer, make sure it handles passive cookies and that someone on your team knows how it works and how to configure it.

 

How many managed servers ?

Depends on the capacity planning.

They should be spread over several physical machines, to offer a better SLA and/or survive a hardware crash.

You may consider a MAN or WAN setup if needed (site disaster).
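
To make that reasoning concrete, here is a rough sketch of the arithmetic. Everything in it is an assumption for illustration: the function names are hypothetical (nothing WebLogic provides), and the figures (5000 peak sessions, 1500 sessions per managed server, one spare server) are made up. The per-server figure has to come from your own load tests.

```python
import math


def managed_servers_needed(peak_sessions, sessions_per_server, spare_servers=1):
    """Return how many managed servers to plan for.

    peak_sessions:       peak concurrent sessions from the capacity planning
    sessions_per_server: sessions one managed server sustained in load tests
                         (hypothetical figure, measure it for your own app)
    spare_servers:       extra servers so the cluster survives a crash
    """
    if peak_sessions < 0 or sessions_per_server <= 0 or spare_servers < 0:
        raise ValueError("invalid capacity planning inputs")
    base = math.ceil(peak_sessions / sessions_per_server)
    return base + spare_servers


def spread_over_machines(server_count, machine_count):
    """Distribute managed servers as evenly as possible over machines."""
    if machine_count <= 0:
        raise ValueError("machine_count must be positive")
    per_machine, extra = divmod(server_count, machine_count)
    return [per_machine + (1 if i < extra else 0) for i in range(machine_count)]


if __name__ == "__main__":
    total = managed_servers_needed(peak_sessions=5000, sessions_per_server=1500)
    print(total)                           # 5
    print(spread_over_machines(total, 2))  # [3, 2]
```

The spare server is what lets the remaining instances absorb the load when one of them, or the machine hosting it, goes down.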

 

Security configuration (SSL, Authentication, Connection Filters ...) ?

Set up connection filters to reject connections from unexpected hosts or protocols.

Use SSL if your data is sensitive and must be encrypted, but replace the demo certificates and keystores.

Remember that SSL has an impact on performance.

SSL might not be sufficient: use real authentication if needed.

 

Creating your WebLogic domain

 

Admin Server

Choose a (free) default port and decide whether you want to dedicate a port to administration traffic.

 

Template and WLST

Don't forget to automate your domain build. If you have to set up a new configuration at short notice, it will save you a lot of time.

 

Managed Servers

Use the pack / unpack commands to export the domain and import it on other machines (it's like a zip, but without the unnecessary files).

 

Setting up a cluster

Use multicast only for backward compatibility.

Otherwise, Oracle strongly recommends unicast.

Don't forget about the cluster address (even if WLS fills it in for you, it's good to know which servers are defined in your cluster).

 

Creating your own certificates

Using keytool, manage your own certificates and keystores (trust & identity).

Once you've got your CSR, have it signed by a CA or by your own CA.

 

Setting up the NodeManager (machines)

I'll simply point to the official documentation, as it's very detailed.

http://edocs.beasys.com/wls/docs103/nodemgr/nodemgr_config.html

 

Tuning your domain performance

 

Defining relevant log levels

No debug on a production server. By cutting out the unnecessary I/O, you'll gain performance and limit the risk of a "disk quota exceeded" error.

Keep the logs on a dedicated file system, so that when they grow, they don't bring down the whole system.

 

Enabling Native I/O

It sounds obvious, but a native socket muxer can give you a large performance gain.

 

Configuring session replication

Whether you need it depends on your requirements, but if you do, make sure your session isn't too big and that all the objects you want to replicate are serializable.

Choose your replication groups wisely.

 

Tuning the thread pool

The thread pool has to be given a sufficient initial size (the default is one, which is obviously not enough).

Adjust this parameter according to the results of your load tests.

 

Tuning the connection pool

Once again, a good way to size your connection pool ties back to the capacity planning. In short: configure a fairly high maximum number of connections, run a stress test, then look at the maximum number of connections that were actually created. That gives you the right number.
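
The procedure above boils down to taking the peak of what the stress test observed. A minimal sketch, assuming you have sampled the number of connections in use during the run by whatever means you prefer. The function name is hypothetical, and the optional headroom is my own addition for illustration, not something the procedure requires.

```python
def recommended_pool_size(connection_samples, headroom_percent=0):
    """Derive a connection pool size from a stress-test run.

    connection_samples: connections in use, sampled during the stress test
    headroom_percent:   optional safety margin (ASSUMPTION, defaults to none)
    """
    if not connection_samples:
        raise ValueError("need at least one sample from the stress test")
    if headroom_percent < 0:
        raise ValueError("headroom_percent must not be negative")
    peak = max(connection_samples)
    # Integer ceiling division: avoids float rounding surprises.
    return -(-peak * (100 + headroom_percent) // 100)


if __name__ == "__main__":
    samples = [12, 37, 42, 40]  # made-up figures
    print(recommended_pool_size(samples))                       # 42
    print(recommended_pool_size(samples, headroom_percent=10))  # 47
```

The pool maximum used during the test must be higher than the peak you observe, otherwise the measurement is capped by the configuration itself.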

 

Configuring Panic & Overload modes

By default, no action is taken on an OOME or on stuck threads. The available options are limited, but they're better than nothing.

If your server faces an OOME (OutOfMemoryError), you may request it to shut down. That way, if a NodeManager is up, it can start your server up again. The main drawback is that you may not notice a server failure, since the restart is fully automated.

The same goes for stuck threads: you can choose either to shut the instance down or to switch it to admin mode, that is to say, serving only administration requests, until the threads get unstuck.

 

Monitoring (optional but recommended)

 

JConsole

Since Java 1.5, Sun has shipped a very nice tool with its JVM to monitor and supervise a JVM: JConsole.

http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html

 

JRockit Mission Control

BEA, and now Oracle, has its own JVM: JRockit. And the team (hi, Swedish fellows!) has developed a very cool tool as well.

It lets you monitor what's happening in your JVM, and can also record activity for later analysis.

 

WLDF

Built-in (and therefore free) tool that replaces the "8.1 Performance Tab".

You can choose to monitor all the MBeans exposed by WebLogic Server, that is to say, almost everything !

http://e-docs.bea.com/wls/docs103/wldf_configuring/understand_wldf_config.html

And you may instrument your code as well, as explained in one of my previous posts.

 

SNMP

External tools such as Nagios, Cacti, Tivoli or Mercury BAC can be used once SNMP is enabled.

http://edocs.beasys.com/wls/docs103/snmpman/index.html

 

Oracle Enterprise Manager

http://www.oracle.com/enterprise_manager/index.html

 

Summary : Application Delivery Process

It sounds obvious, but it's rarely the case: your application must follow a lifecycle such as:

  • => Development / Unit Testing
  • => Integration (continuous) : build from scratch the application and make sure it deploys correctly and works.
  • => FuncTest : Play all the scenarios that users are going to play.
  • => Benchmark : Once the application is approved, determine the capacity planning and perform stress tests (very high load over a short period) & load tests (110% of your estimated load over a long period)
  • => PreProduction : Deployment on this environment as if it were the final production environment.
  • => GO-LIVE : Three options.
    • Either you apply the same procedure as for the PreProduction (with a nightly shutdown for instance)
    • Or you switch the roles: the PreProduction becomes the Production and vice-versa.
    • Or, the best solution, you have two clusters behind a load-balancer which performs a "graceful migration".
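
For the Benchmark step, the targets to inject can be derived from the estimated peak load. A small sketch with a hypothetical helper: the 110% figure comes from the list above, while the 200% stress factor is purely an assumption, since "very high" is not quantified here.

```python
def benchmark_targets(estimated_peak, load_percent=110, stress_percent=200):
    """Return the load to inject for each kind of test.

    estimated_peak: estimated peak load, e.g. transactions per hour
    load_percent:   110, as in the lifecycle list (long-running load test)
    stress_percent: ASSUMPTION, the list only says "very high"
    """
    if estimated_peak < 0:
        raise ValueError("estimated_peak must not be negative")

    def scaled(percent):
        # Integer ceiling division keeps the result exact (no float rounding).
        return -(-estimated_peak * percent // 100)

    return {"load_test": scaled(load_percent), "stress_test": scaled(stress_percent)}


if __name__ == "__main__":
    print(benchmark_targets(8000))
    # {'load_test': 8800, 'stress_test': 16000}
```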

 

3 comments:

Anonymous said...

Good job Max !

Unknown said...

Hi Max
The info is superb. You raised a very valid point: how to decide how many managed servers to use. I have the same doubt too. Can you please advise me? My stats are as below:

registered users (on the system hosted on weblogic) : 10000
concurrent users : 3000-5000
transactions (peak hour ) - 7500 - 8000
http concurrent sessions - as many as above.

Based on this info, can we assess how many managed servers are needed at the outset?

Ced said...

"[...] you may request it to shut down. That way, if a NodeManager is up, it can start your server up again [...]"

That makes it clear what to do when the overload conditions are met.

In short, clear and explicit! Thanks, and keep up the good work :)