2012-09-09

Why is my MDB not reconnecting ?


I have recently encountered a problem I thought fixed for ages : the problem of a JMS consumer which is not reconnecting when the distant queue recovers from a momentary service loss.

As far as I can remember, I guess the first time I noticed that problem was with WebLogic Server 7 !
Now that I’m working with a recent (but not the most recent) version of WLS, I sincerely thought this bug was history …

To sum up the context in which I got that error : we’re using WLS, as the center of our messaging system, and BPEL PM (which has been live for more than a month now, by the way), which is used to coordinate messages (just as WLI did).

And when we had to restart WLS, I could see that no more messages were received by BPEL PM … (and you could think that the main source of the problem comes from the JCA adapter, but it’s not because I bypassed it and created my own JMS Bridge, based on a simple MDB)

So, in this blogpost, I’m going to expose how the error looks like (it’s not that obvious) and what to do to get rid of it.

Context


My project is based on WLS 10.3.4 so that’s the version I’m going to use.

The use case has been voluntarily simplified : it consists in two WLS domains (Producer / Consumer), with only one admin server in each domain.

I’ll create a JMS queue on the Producer domain, and deploy a MDB (obviously connecting to that queue through a foreign JMS server) on the Consumer domain.

image

Normally, everything should go according to plan. Then I’m going to restart the “Producer” domain. The behavior I’m expecting is that I should see some error messages from the “Consumer” domain, but when the distant server is back up & running, I’d like my MDB to reconnect and the error messages to stop flooding my logs.

Problem

First of all, if you’re trying to reproduce the problem yourself, please note that you may change the console cookie name if you don’t want to end up with a logout each time you click on a link in the other console. And you could also unable the on-demand deployment for the console.

Producer domain setup

First of all, I’ve got to create the JMS structure, composed of :
  • a JMS server
  • a JMS module, with a sub-deployment aiming at the previously created JMS server
  • a connection factory (JNDI name : fr.mbutton.blog.jms.cf)
  • and finally a jms queue (JNDI name : fr.mbutton.blog.jms.queue)
image

That being done, I will check if everything is theoretically callable from the outside, by checking the JNDI tree.

If not, you can untarget / retarget your JMS server, or restart your server. Anyway, make sure you have both your objects visible in the JNDI tree.

image

Consumer domain setup

Once you first domain is set up, we have to configure the foreign JMS server, and deploy our MDB.
Note : in order to be able to better understand what’s going on behind the scene, add the two options (where you want, as long as it is taken into account) :
-Dweblogic.debug.DebugEjbMdbConnection=true
-Dweblogic.StdoutDebugEnabled=true
I’ve set up a foreign server, connecting on “t3://localhost:7001”, with a foreign connection factory and foreign jms destination.

Name
Local JNDI
Distant JNDI
ForeignQueue blog.foreign.queue fr.mbutton.blog.jms.queue
ForeignConnectionFactory blog.foreign.cf fr.mbutton.blog.jms.cf

Note : I chosed to have a different local JNDI name for you not to mix up the domains, but I generally chose the same name.

Everything normally went fine, once again, by checking the JNDI tree, you will be sure :

image

If a exclamation point is visible near your ressources, it proves that something is not right. Double check your configuration.

Then, I created a MDB, thanks to NetBeans and since I configured my IDE to work with WLS, it should automatically propose the queue I defined, but it doesn’t work with foreign destination. It doesn’t matter : you may even skip the assistant and annotate yourself your pojo !
import javax.ejb.ActivationConfigProperty;
package fr.mbutton.blog.jms;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;
/**
 * @author Maxence
 */
@MessageDriven(mappedName = "blog.foreign.queue", activationConfig = {
    @ActivationConfigProperty(propertyName = "acknowledgeMode", propertyValue = "Auto-acknowledge"),
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue")
})
public class NewMessageBean implements MessageListener {
    
    @Override
    public void onMessage(Message message) {
        System.out.println("A message has been received");
    }
}
Package your MDB so that you can deploy it on your server.

Once deployed, check that everything is normal in the “Monitoring” view :


image


Connection Status should be “Connected” and status “running”.

You should see, thanks to the two options we added before, some messages in your logs :
+++++++++++  Got JMSConnection ++++++++++
<17/04/2012#10:15:45.853#+0200> <Info> <EJB> <BEA-010060> <The Message-Driven EJB: NewMessageBean has connected/reconnected to the JMS destination: blog.foreign.queue.>
<17/04/2012#10:15:45.885#+0200> <Debug> <EjbMdbConnection> <BEA-000000> <[MDConnectionManager] MDB NotificationSender: State = CONNECTED>
<17/04/2012#10:15:45.886#+0200> <Debug> <EjbMdbConnection> <BEA-000000> <[MDConnectionManager] MDB NotificationSender: State = CONNECTED>
<17/04/2012#10:15:45.887#+0200> <Debug> <EjbMdbConnection> <BEA-000000> <[MDConnectionManager] MDB NotificationSender: State = UNDEPLOYING>
<17/04/2012#10:15:45.887#+0200> <Debug> <EjbMdbConnection> <BEA-000000> <[JMSConnectionPoller] disconnect is called>
<17/04/2012#10:15:45.887#+0200> <Debug> <EjbMdbConnection> <BEA-000000> <[MDConnectionManager] MDB NotificationSender: State = UNDEPLOYING>
<17/04/2012#10:15:45.888#+0200> <Debug> <EjbMdbConnection> <BEA-000000> <[MDConnectionManager] MDB NotificationSender: State = UNDEPLOYED>
<17/04/2012#10:15:45.986#+0200> <Debug> <EjbMdbConnection> <BEA-000000> <[MDConnectionManager] MDB TaskNotificationSender: State = CONNECTED>
<17/04/2012#10:15:45.986#+0200> <Debug> <EjbMdbConnection> <BEA-000000> <[JMSConnectionPoller]
Just for fun, let’s send a message : you can send it in many ways, but I’m a fan of Hermes JMS :)


image

Then you can check your message has been received by checking the monitoring of your MDB …

image

… or more simply, check your server logs :

image

Okay, everything works just fine (fortunately).

Now shut down the server from the “Producer” domain.

The MDB monitoring now indicates that :

image


And in the logs, you should see some error messages, such as :
####<15/06/2012#14:53:30.638#+0000> <Debug> <EjbMdbConnection> <localhost> <AdminServer> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <weblogic> <> <> <1339772010638> <BEA-000000> <[MDConnectionManager] MDB NewMessageBean: State = UNDEPLOYED>
####<15/06/2012#14:53:30.638#+0000> <Debug> <EjbMdbConnection> <localhost> <AdminServer> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <weblogic> <> <> <1339772010638> <BEA-000000> <[MDConnectionManager] MDB NewMessageBean: State = UNDEPLOYED>
####<15/06/2012#14:55:21.736#+0000> <Warning> <EJB> <localhost> <AdminServer> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1339772121736> <BEA-010061> <The Message-Driven EJB: NewMessageBean is unable to connect to the JMS destination: blog.foreign.queue. The Error was:
java.lang.SecurityException: [Security:090398]Invalid Subject: principals=[weblogic, Administrators]
        at weblogic.security.service.SecurityServiceManager.seal(SecurityServiceManager.java:833)
        at weblogic.security.service.IdentityUtility.authenticatedSubjectToIdentity(IdentityUtility.java:30)
        at weblogic.security.service.RoleManager.getRoles(RoleManager.java:183)
        at weblogic.security.service.AuthorizationManager.isAccessAllowed(AuthorizationManager.java:375)
        at weblogic.jndi.internal.ServerNamingNode.checkPermission(ServerNamingNode.java:442)
        at weblogic.jndi.internal.ServerNamingNode.checkLookup(ServerNamingNode.java:423)
        at weblogic.jndi.internal.ServerNamingNode.lookupHere(ServerNamingNode.java:180)
        at weblogic.jndi.internal.BasicNamingNode.lookup(BasicNamingNode.java:206)
        at weblogic.jndi.internal.WLEventContextImpl.lookup(WLEventContextImpl.java:254)
        at weblogic.jndi.internal.WLContextImpl.lookup(WLContextImpl.java:411)
        at javax.naming.InitialContext.lookup(InitialContext.java:392)
        at weblogic.jms.common.CDS$2.run(CDS.java:486)
        at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:363)
        at weblogic.jms.common.CrossDomainSecurityManager.runAs(CrossDomainSecurityManager.java:131)
        at weblogic.jms.common.CDS.lookupDestination(CDS.java:480)
        at weblogic.jms.common.CDS.lookupDDAndCalloutListener(CDS.java:345)
        at weblogic.jms.common.CDS.ddLookup(CDS.java:1393)
        at weblogic.jms.common.CDS.access$600(CDS.java:41)
        at weblogic.jms.common.CDS$DDLookupTimerListener.timerExpired(CDS.java:1267)
        at weblogic.timers.internal.TimerImpl.run(TimerImpl.java:273)
        at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:528)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:207)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:176)
At first, I was completely surprised by that stack ! What in the world have CDS (Cross Domain Security) and security to do with my MDB reconnection ?

Resolution


I then open a case to the Oracle support and together, we found a similar issue which had been resolved thanks to a patch (I love when there’s a patch !).

That patch is “PIEF” in case you need it. I applied it and … nothing. I triple-checked that is had been taken into account, and yes it was …

Note : To play with patches, you can take a look on my previous blopost.
<Jun 13, 2012 11:52:54 AM UTC> <Notice> <WebLogicServer> <BEA-000365> <Server state changed to STARTING>
<Jun 13, 2012 11:52:54 AM UTC> <Info> <WorkManager> <BEA-002900> <Initializing self-tuning thread pool>
<13/06/2012#11:52:54.645#+0000> <Info> <WebLogicServer> <BEA-000214> <WebLogic Server "AdminServer" version:
WebLogic Server Temporary Patch for 11733525 Thu Feb 17 12:22:04 IST 2011
WebLogic Server Temporary Patch for BUG8268373 Wed May 04 11:35:00 IST 2011
WebLogic Server 10.3.4.0.1 PSU Patch for Bug11677325 Mon Apr 11 15:38:05 IST 2011
WebLogic Server Temporary Patch for 9923849 Mon Aug 09 15:28:06 MDT 2010
WebLogic Server 10.3.4.0  Fri Dec 17 20:47:33 PST 2010 1384255  Copyright (c) 1995, 2009, Oracle and/or its affiliates. All rights reserved.>

Well, I thought and thought and couldn’t stop thinking about the stack which was dealing with security & CDS.

As I’m a bit lazy when dealing with a proper CDS configuration, I first came with the easiest solution : domain trust.

I then entered the same password on both domains in the dedicated field (Domain > Security > General > Advanced > Credential)


image


I re-tested and … VICTORY ! It solves my problem !

Now, when I restart the server from the “producer” domain, the MDB disconnects & reconnects like it should have from the beginning !


<22/06/2012#06:20:20.593#+0000> <Debug> <EjbMdbConnection> <BEA-000000> <[JMSConnectionPoller] ** JMS Failure detected on destination:blog.foreign.queue. The exception was: weblogic.jms.common.LostServerException: java.lang.Exception: weblogic.rjvm.PeerGoneException: ; nested exception is:
        weblogic.utils.net.SocketResetException
Nested exception: java.lang.Exception: weblogic.rjvm.PeerGoneException: ; nested exception is:
        weblogic.utils.net.SocketResetException
Nested exception: weblogic.rjvm.PeerGoneException: ; nested exception is:
        weblogic.utils.net.SocketResetException
Nested exception: weblogic.utils.net.SocketResetException>
<22/06/2012#06:20:20.595#+0000> <Debug> <EjbMdbConnection> <BEA-000000> <[MDConnectionManager] MDB NewMessageBean: State = UNDEPLOYED>
<22/06/2012#06:20:20.595#+0000> <Debug> <EjbMdbConnection> <BEA-000000> <[MDConnectionManager] MDB NewMessageBean: State = UNDEPLOYED>
<22/06/2012#06:22:59.778#+0000> <Warning> <EJB> <BEA-010061> <The Message-Driven EJB: NewMessageBean is unable to connect to the JMS destination: blog.foreign.queue. The Error was:
javax.naming.CommunicationException [Root exception is java.net.ConnectException: t3://127.0.0.1:7001: Destination unreachable; nested exception is:
        java.net.ConnectException: Connection refused; No available router to destination]
        at weblogic.jndi.internal.ExceptionTranslator.toNamingException(ExceptionTranslator.java:40)
        at weblogic.jndi.WLInitialContextFactoryDelegate.toNamingException(WLInitialContextFactoryDelegate.java:788)
        at weblogic.jndi.WLInitialContextFactoryDelegate.getInitialContext(WLInitialContextFactoryDelegate.java:366)
        at weblogic.jndi.Environment.getContext(Environment.java:315)
        at weblogic.jndi.Environment.getContext(Environment.java:285)
        at weblogic.jndi.WLInitialContextFactory.getInitialContext(WLInitialContextFactory.java:117)
        at javax.naming.spi.NamingManager.getInitialContext(NamingManager.java:667)
        at javax.naming.InitialContext.getDefaultInitCtx(InitialContext.java:288)
        at javax.naming.InitialContext.init(InitialContext.java:223)
        at javax.naming.InitialContext.<init>(InitialContext.java:197)
        at weblogic.deployment.jms.ForeignOpaqueReference.getReferent(ForeignOpaqueReference.java:182)
        at weblogic.jndi.internal.WLNamingManager.getObjectInstance(WLNamingManager.java:96)
        at weblogic.jndi.internal.ServerNamingNode.resolveObject(ServerNamingNode.java:377)
        at weblogic.jndi.internal.BasicNamingNode.resolveObject(BasicNamingNode.java:856)
        at weblogic.jndi.internal.BasicNamingNode.lookup(BasicNamingNode.java:209)
        at weblogic.jndi.internal.BasicNamingNode.lookup(BasicNamingNode.java:214)
        at weblogic.jndi.internal.BasicNamingNode.lookup(BasicNamingNode.java:214)
        at weblogic.jndi.internal.WLEventContextImpl.lookup(WLEventContextImpl.java:254)
        at weblogic.jndi.internal.WLContextImpl.lookup(WLContextImpl.java:411)
        at javax.naming.InitialContext.lookup(InitialContext.java:392)
        at weblogic.jms.common.CDS$2.run(CDS.java:486)
        at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:363)
        at weblogic.jms.common.CrossDomainSecurityManager.runAs(CrossDomainSecurityManager.java:131)
        at weblogic.jms.common.CDS.lookupDestination(CDS.java:480)
        at weblogic.jms.common.CDS.lookupDDAndCalloutListener(CDS.java:345)
        at weblogic.jms.common.CDS.ddLookup(CDS.java:1393)
        at weblogic.jms.common.CDS.access$600(CDS.java:41)
        at weblogic.jms.common.CDS$DDLookupTimerListener.timerExpired(CDS.java:1267)
        at weblogic.timers.internal.TimerImpl.run(TimerImpl.java:273)
        at weblogic.work.SelfTuningWorkManagerImpl$WorkAdapterImpl.run(SelfTuningWorkManagerImpl.java:528)
        at weblogic.work.ExecuteThread.execute(ExecuteThread.java:207)
        at weblogic.work.ExecuteThread.run(ExecuteThread.java:176)
Caused by: java.net.ConnectException: t3://127.0.0.1:7001: Destination unreachable; nested exception is:
        java.net.ConnectException: Connection refused; No available router to destination
        at weblogic.rjvm.RJVMFinder.findOrCreateInternal(RJVMFinder.java:216)
        at weblogic.rjvm.RJVMFinder.findOrCreate(RJVMFinder.java:170)
        at weblogic.rjvm.ServerURL.findOrCreateRJVM(ServerURL.java:153)
        at weblogic.jndi.WLInitialContextFactoryDelegate$1.run(WLInitialContextFactoryDelegate.java:345)
        at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:363)
        at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:146)
        at weblogic.jndi.WLInitialContextFactoryDelegate.getInitialContext(WLInitialContextFactoryDelegate.java:340)
        ... 29 more
Caused by: java.rmi.ConnectException: Destination unreachable; nested exception is:
        java.net.ConnectException: Connection refused; No available router to destination
        at weblogic.rjvm.ConnectionManager.bootstrap(ConnectionManager.java:470)
        at weblogic.rjvm.ConnectionManager.bootstrap(ConnectionManager.java:321)
        at weblogic.rjvm.RJVMManager.findOrCreateRemoteInternal(RJVMManager.java:254)
        at weblogic.rjvm.RJVMManager.findOrCreate(RJVMManager.java:197)
        at weblogic.rjvm.RJVMFinder.findOrCreateRemoteServer(RJVMFinder.java:238)
        at weblogic.rjvm.RJVMFinder.findOrCreateRemoteCluster(RJVMFinder.java:316)
        at weblogic.rjvm.RJVMFinder.findOrCreateInternal(RJVMFinder.java:205)
        ... 35 more

[ ... ] And when the server is back online

+++++++++++  Got JMSConnection ++++++++++

<22/06/2012#06:23:16.320#+0000> <Debug> <EjbMdbConnection> <BEA-000000> <[MDConnectionManager] MDB NewMessageBean: State = CONNECTED>
<22/06/2012#06:23:16.320#+0000> <Debug> <EjbMdbConnection> <BEA-000000> <[JMSConnectionPoller]

Conclusion


So as you can see, the problem is annoying, but the solution is not that hard to implement, as long as you know what to do. If you have any problem, I strongly suggest that you open a case to the Oracle Support. You could even find some traces of my case ;)

But if you have the chance to run one of the latest versions (WLS 10.3.5 and up), everything should be fine for you ! (I successfully tested it myself)

4 comments:

Kristof Jozsa said...

we have appearently had hit the same issue on weblogic 12c..

Anonymous said...

So the patch for weblogic did not help ? Your solution was to set trust between your domains ?

Maxence Button said...

Hi !

In fact you need both ! :)
If you take only one or the other, it won't solve your problem ... that was the tricky part.

Unknown said...

i have an issue to configure jms security in my MDB using annotation based. My MDB security implementation is working fine without any issues through ejb-jar.xml and weblogic-ejb-jar.xml configurations. But MDB is giving access denied when try to security configurations on MDB using @RolesAllowed and @DeclaredRoles annotations. we have successfully configured the security roles and the users in weblogic console. I have been struggling to solve this issue, but i could fix this issue.
I need guidance to implement the MDB security implementation using annotations.