Apache/Jboss via mod_jk loadbalancing/failover not seeing other node in cluster and mcast_addr wrong

 
Ranch Hand
Posts: 50
Hello, I have the following:

node1=host1 (running the JBoss webapp)
node2=host2, which is a VM (a duplicate of host1 running the JBoss webapp)

I followed all the instructions at:
http://www.redhat.com/docs/en-US/JBoss_Enterprise_Application_Platform/4.2.0.cp06/html/Server_Configuration_Guide/clustering-http.html#d0e20564

and when I start up Apache/JBoss on host2 I see the following:

1. Notice how initially mcast_addr=228.2.1.1, as it should be per $JBOSS_HOME/bin/run.conf, then it changes to mcast_addr=228.1.2.30 after the first "Cache is started!!" message:

JAVA_OPTS="-Djava.awt.headless=true -server -Xms1000m -Xmx1050m -XX:MaxPermSize=256m -Djboss.partition.name=AuthoringPartition -Djboss.partition.udpGroup=228.2.1.1 -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djava.net.preferIPv4Stack=true"

JBoss log:
2009-04-09 14:52:49,579 INFO [org.jboss.remoting.transport.socket.SocketServerInvoker] Invoker started for locator: InvokerLocator [socket://10.56.22.234:3873/]
2009-04-09 14:52:53,915 INFO [org.jboss.ws.server.ServiceEndpointManager] WebServices: jbossws-1.0.3.SP1 (date=200609291417)
2009-04-09 14:52:54,529 INFO [org.jboss.jmx.adaptor.snmp.agent.SnmpAgentService] SNMP agent going active
2009-04-09 14:52:54,733 INFO [org.jboss.cache.TreeCache] setting cluster properties from xml to: UDP(down_thread=false;enable_bundling=true;ip_ttl=2;loopback=false;max_bundle_size=64000;max_bundle_timeout=30;mcast_addr=228.2.1.1;mcast_port=45577;mcast_recv_buf_size=25000000;mcast_send_buf_size=640000;ucast_recv_buf_size=20000000;ucast_send_buf_size=640000;up_thread=false;use_incoming_packet_handler=true;use_outgoing_packet_handler=true):PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):MERGE2(down_thread=false;max_interval=100000;min_interval=20000;up_thread=false):FD_SOCK(down_thread=false;up_thread=false):FD(down_thread=false;max_tries=5;shun=true;timeout=20000;up_thread=false):VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(discard_delivered_msgs=true;down_thread=false;gc_lag=50;max_xmit_size=60000;retransmit_timeout=300,600,1200,2400,4800;up_thread=false;use_mcast_xmit=false):UNICAST(down_thread=false;timeout=300,600,1200,2400,3600;up_thread=false):pbcast.STABLE(desired_avg_gossip=50000;down_thread=false;max_bytes=400000;stability_delay=1000;up_thread=false):pbcast.GMS(down_thread=false;join_retry_timeout=2000;join_timeout=3000;print_local_addr=true;shun=true;up_thread=false):FC(down_thread=false;max_credits=2000000;min_threshold=0.10;up_thread=false):FRAG2(down_thread=false;frag_size=60000;up_thread=false):pbcast.STATE_TRANSFER(down_thread=false;up_thread=false)
2009-04-09 14:52:54,747 INFO [org.jboss.cache.TreeCache] interceptor chain is:
class org.jboss.cache.interceptors.CallInterceptor
class org.jboss.cache.interceptors.PessimisticLockInterceptor
class org.jboss.cache.interceptors.UnlockInterceptor
class org.jboss.cache.interceptors.ReplicationInterceptor
2009-04-09 14:52:54,747 INFO [org.jboss.cache.TreeCache] cache mode is REPL_ASYNC
2009-04-09 14:52:54,942 WARN [org.jgroups.JChannel] option GET_STATE_EVENTS has been deprecated (it is always true now); this option is ignored
2009-04-09 14:52:54,980 INFO [STDOUT]
-------------------------------------------------------
GMS: address is 10.56.22.234:39348
-------------------------------------------------------
2009-04-09 14:52:56,998 INFO [org.jboss.cache.TreeCache] viewAccepted(): [IP_node2:39348|0] [IP_node2:39348]
2009-04-09 14:52:57,007 INFO [org.jboss.cache.TreeCache] my local address is IP_node2:39348
2009-04-09 14:52:57,007 INFO [org.jboss.cache.TreeCache] state could not be retrieved (must be first member in group)
2009-04-09 14:52:57,007 INFO [org.jboss.cache.TreeCache] Cache is started!!
2009-04-09 14:52:57,169 INFO [org.jboss.cache.TreeCache] setting cluster properties from xml to: UDP(ip_mcast=true;ip_ttl=64;loopback=false;mcast_addr=228.1.2.30;mcast_port=17733;mcast_recv_buf_size=80000;mcast_send_buf_size=150000;ucast_recv_buf_size=80000;ucast_send_buf_size=150000):PING(down_thread=false;num_initial_members=3;timeout=2000;up_thread=false):MERGE2(max_interval=20000;min_interval=10000):FD(down_thread=false;max_tries=3;shun=true;timeout=2000;up_thread=false):FD_SOCK:VERIFY_SUSPECT(down_thread=false;timeout=1500;up_thread=false):pbcast.NAKACK(down_thread=false;gc_lag=50;max_xmit_size=8192;retransmit_timeout=600,1200,2400,4800;up_thread=false):UNICAST(down_thread=false;min_threshold=10;timeout=600,1200,2400;window_size=100):pbcast.STABLE(desired_avg_gossip=20000;down_thread=false;up_thread=false):FRAG(down_thread=false;frag_size=8192;up_thread=false):pbcast.GMS(join_retry_timeout=2000;join_timeout=5000;print_local_addr=true;shun=true):pbcast.STATE_TRANSFER(down_thread=true;up_thread=true)
2009-04-09 14:52:57,180 INFO [org.jboss.cache.TreeCache] setEvictionPolicyConfig(): [config: null]
2009-04-09 14:52:57,181 WARN [org.jboss.cache.TreeCache] No transaction manager lookup class has been defined. Transactions cannot be used
2009-04-09 14:52:57,181 INFO [org.jboss.cache.TreeCache] interceptor chain is:
class org.jboss.cache.interceptors.CallInterceptor
class org.jboss.cache.interceptors.PessimisticLockInterceptor
class org.jboss.cache.interceptors.UnlockInterceptor
2009-04-09 14:52:57,193 INFO [org.jboss.cache.TreeCache] cache mode is local, will not create the channel
2009-04-09 14:52:57,194 INFO [org.jboss.cache.eviction.LRUPolicy] Starting eviction policy using the provider: org.jboss.cache.eviction.LRUPolicy
2009-04-09 14:52:57,194 INFO [org.jboss.cache.eviction.LRUPolicy] Starting a eviction timer with wake up interval of (secs) 5
2009-04-09 14:52:57,195 INFO [org.jboss.cache.TreeCache] Cache is started!!
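
The two "setting cluster properties" lines in the log above may belong to two different TreeCache instances rather than one address changing. The second is followed by "cache mode is local, will not create the channel", so its mcast_addr would likely never be used. A minimal Python sketch for lining these up, assuming a log in the format shown above (summarize_caches and the shortened sample lines are illustrative and not part of JBoss):

```python
import re

# Hypothetical helper: pair each "setting cluster properties" line with the
# "cache mode is ..." line that follows it in a JBoss server log.
PROPS_RE = re.compile(r"setting cluster properties from xml to: (.*)")
MODE_RE = re.compile(r"cache mode is (\S+)")
ADDR_RE = re.compile(r"mcast_addr=([0-9.]+)")
PORT_RE = re.compile(r"mcast_port=(\d+)")

def summarize_caches(log_text):
    """Return one dict per cache configuration found in the log."""
    caches = []
    for line in log_text.splitlines():
        props = PROPS_RE.search(line)
        if props:
            addr = ADDR_RE.search(props.group(1))
            port = PORT_RE.search(props.group(1))
            caches.append({
                "mcast_addr": addr.group(1) if addr else None,
                "mcast_port": int(port.group(1)) if port else None,
                "mode": None,
            })
            continue
        mode = MODE_RE.search(line)
        if mode and caches and caches[-1]["mode"] is None:
            # "local," in the log carries a trailing comma; strip it.
            caches[-1]["mode"] = mode.group(1).rstrip(",")
    return caches

# Shortened versions of the log lines above (most protocol settings omitted).
sample = """\
INFO [org.jboss.cache.TreeCache] setting cluster properties from xml to: UDP(ip_ttl=2;mcast_addr=228.2.1.1;mcast_port=45577)
INFO [org.jboss.cache.TreeCache] cache mode is REPL_ASYNC
INFO [org.jboss.cache.TreeCache] setting cluster properties from xml to: UDP(ip_mcast=true;mcast_addr=228.1.2.30;mcast_port=17733)
INFO [org.jboss.cache.TreeCache] cache mode is local, will not create the channel
"""

for cache in summarize_caches(sample):
    print(cache)
```

If the cache with the unexpected address reports mode "local", the address difference is probably harmless and the missing cluster member has to be explained some other way.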


2. I see the following in my Apache error log:

[Thu Apr 09 14:52:24.593 2009] [3690:3097320544] [error] uri_worker_map_ext::jk_uri_worker_map.c (505): Could not find worker with name 'localhost' in uri map post processing.
[Thu Apr 09 14:52:24.593 2009] [3690:3097320544] [error] uri_worker_map_ext::jk_uri_worker_map.c (505): Could not find worker with name 'localhost' in uri map post processing.

3. When I start up the other node, node1, it doesn't see node2. I would expect to see the following after "viewAccepted():"

INFO [org.jboss.cache.TreeCache] viewAccepted(): [IP_node1:32963|1] [IP_node1:32963, IP_node2:35451]

but I see the following:

INFO [org.jboss.cache.TreeCache] viewAccepted(): [IP_node1:32963|1] [IP_node1:32963]
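
To check this across restarts without reading the log by eye, the member list can be pulled out of each viewAccepted() line. A small Python sketch, assuming the line format shown above (parse_view and cluster_formed are made-up names for illustration):

```python
import re

# Matches: viewAccepted(): [<creator>|<view id>] [<member>, <member>, ...]
VIEW_RE = re.compile(r"viewAccepted\(\): \[([^|\]]+)\|(\d+)\] \[([^\]]*)\]")

def parse_view(line):
    """Return the view creator, view id and member list, or None if no match."""
    match = VIEW_RE.search(line)
    if match is None:
        return None
    creator, view_id, members = match.groups()
    return {
        "creator": creator,
        "view_id": int(view_id),
        "members": [m.strip() for m in members.split(",") if m.strip()],
    }

def cluster_formed(line, expected_size=2):
    """True when the view on this line has at least expected_size members."""
    view = parse_view(line)
    return view is not None and len(view["members"]) >= expected_size

expected = "INFO [org.jboss.cache.TreeCache] viewAccepted(): [IP_node1:32963|1] [IP_node1:32963, IP_node2:35451]"
actual = "INFO [org.jboss.cache.TreeCache] viewAccepted(): [IP_node1:32963|1] [IP_node1:32963]"

print(cluster_formed(expected))
print(cluster_formed(actual))
```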


Any help would be much appreciated!! Thanks
 
Diego Bowen
Ranch Hand
Posts: 50
Sorry, I forgot to mention that this is JBoss 4.0.5.
 
Diego Bowen
Ranch Hand
Posts: 50
Here is a detailed setup for apache2/jboss4.0.5 failover cluster of two:

Hosts used for setup:
host1 aka H1
host2 aka H2


1. Configure Apache via httpd.conf (identical for both H1 and H2):


a. Within the httpd.conf add the following inside the IfModule status block:

# Add jkstatus for managing runtime data
<Location /jkstatus/>
JkMount status
Order deny,allow
Deny from all
Allow from 127.0.0.1
</Location>

b. Ensure that the following is in your IfModule jk_module block:

JkWorkersFile /usr/local/apache/conf/workers.properties
JkLogFile /usr/local/apache/logs/mod_jk.log
JkMountFile conf/uriworkermap.properties
# Change to debug while setting up
JkLogLevel info


# JkOptions indicates to send SSL KEY SIZE
JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories

# JkRequestLogFormat
JkRequestLogFormat "%w %V %T"


# Add shared memory.
# This directive is present with 1.2.10 and
# later versions of mod_jk and is needed for
# load balancing to work properly
JkShmFile logs/jk.shm

# Make sure the application mounts to your load balancer instead of localhost. Assuming "balanced" for now.
# Note: No need to mount application/* here because that is done in uriworkermap.properties, but if JBoss serves
# your JSPs etc. you need the following as well
JkMount /*.jsp balanced
........................
....................

2. Configure workers.properties. Assuming mod_jk 1.2.27 (identical for both H1 and H2):

a. Add the new load balancer worker to the existing workers list. Assuming it is named "balanced":

worker.list=balanced

b. Create a root template that all the other workers (node1 and node2) can reference:

worker.template.port=8009
worker.template.type=ajp13
worker.template.lbfactor=1
worker.template.socket_connect_timeout=10
worker.template.ping_timeout=10000
worker.template.ping_mode=A

# node1 (host1)
worker.node1.reference=worker.template
worker.node1.host=<IP_ADDRESS_H1>
worker.node1.redirect=node2

# node2 (host2)
worker.node2.reference=worker.template
worker.node2.host=<IP_ADDRESS_H2>
worker.node2.redirect=node1


# Load-balancing behaviour for balanced
worker.balanced.type=lb
worker.balanced.balance_workers=node1,node2
worker.balanced.sticky_session=true

# localhost (kept here so I wouldn't have to keep reverting this file to the pre-failover setup)
worker.localhost.reference=worker.template
worker.localhost.host=localhost

3. Create uriworkermap.properties. (identical for both H1 and H2):

# Simple worker configuration file
# Mount the Servlet context to the ajp13 worker
/jmx-console=balanced
/jmx-console/*=balanced
/web-console=balanced
/web-console/*=balanced
/application=balanced
/application/*=balanced
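
One thing worth checking for the "Could not find worker with name 'localhost'" error reported earlier in this thread: this is a guess, but it could come from a JkMount or uriworkermap.properties entry that still points at a worker missing from worker.list. In the files above, worker.localhost is defined but only "balanced" is in worker.list. A rough Python sketch of that cross-check, assuming simple key=value lines (the helper names and the /legacy/* mount are hypothetical):

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props

def unlisted_mount_targets(workers_text, urimap_text):
    """Return workers that a URI mapping points at but worker.list omits."""
    workers = parse_properties(workers_text)
    listed = {w.strip() for w in workers.get("worker.list", "").split(",") if w.strip()}
    mounts = parse_properties(urimap_text)
    return sorted({target for target in mounts.values() if target not in listed})

workers_properties = """\
worker.list=balanced
worker.balanced.type=lb
worker.balanced.balance_workers=node1,node2
worker.localhost.host=localhost
"""

# /legacy/* is a made-up leftover mount used only to trigger the check.
uriworkermap_properties = """\
# Mount the Servlet context to the load balancer
/application=balanced
/application/*=balanced
/legacy/*=localhost
"""

print(unlisted_mount_targets(workers_properties, uriworkermap_properties))
```

The same idea applies to JkMount lines in httpd.conf, for example the "status" worker used by the /jkstatus/ location, which would also need to be defined and listed.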


4. Configuring JBoss to work with mod_jk for each clustered JBoss node.

A. Adding "jvmRoute" to server.xml for both H1 & H2

JBOSS_HOME/server/PRJ/deploy/jboss-web.deployer/server.xml

<Engine name="jboss.web" defaultHost="localhost" jvmRoute="node#"> (where # is "1" for H1 and "2" for H2)

B. For single sign-on, make sure the "authenticator.SingleSignOn" valve (which does not support SSO across a cluster) is disabled,
and enable "ClusteredSingleSignOn" instead:

<Valve className="org.jboss.web.tomcat.tc5.sso.ClusteredSingleSignOn"/>
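
The jvmRoute value from step 4.A matters because, with sticky_session enabled, mod_jk picks the worker from the route suffix that follows the last dot in the session ID, so the suffix has to match a worker name in balance_workers. A short Python sketch of that lookup, using made-up session IDs (session_route and sticky_target are illustrative names, not mod_jk functions):

```python
def session_route(session_id):
    """Return the route suffix after the last '.', or None if there is none."""
    _, dot, route = session_id.rpartition(".")
    return route if dot and route else None

def sticky_target(session_id, balance_workers):
    """Return the worker a sticky request would map to, or None if no match."""
    route = session_route(session_id)
    return route if route in balance_workers else None

workers = ["node1", "node2"]

# Hypothetical JSESSIONID values for illustration only.
print(sticky_target("A1B2C3D4.node1", workers))
print(sticky_target("A1B2C3D4.node3", workers))
print(sticky_target("A1B2C3D4", workers))
```

A session ID with no suffix, or with a suffix that is not a worker name, usually means jvmRoute is missing or does not match workers.properties.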

5. On each node in the cluster, tell the JBoss Tomcat instance that mod_jk is in use:
JBOSS_HOME/server/PRJ/deploy/jboss-web.deployer/META-INF/jboss-service.xml
<attribute name="UseJK">true</attribute>



6. On each node in the cluster, replicate session data across the nodes.

NOTE: Our current JBoss 4.0.5 server configuration is based on the "default" configuration, which is why you have to copy over tc5-cluster.sar from the "all" configuration.

Copy $JBOSS_HOME/server/all/deploy/tc5-cluster.sar to your deploy directory.
Note: Starting from a stock jboss-service.xml, do the following:

a. Disable <attribute name="UseRegionBasedMarshalling">false</attribute>

b. Add <attribute name="UseMarshalling">false</attribute> above InactiveOnStartup

c. Remove BuddyReplicationConfig.

d. Disable UDP Cluster in favor of TCP Clustering. Use this version:

<!-- Alternate TCP stack: customize it for your environment, change bind_addr and initial_hosts -->
<config>
<TCP bind_addr="thishostIP" start_port="7810" loopback="true"
tcp_nodelay="true"
recv_buf_size="20000000"
send_buf_size="640000"
discard_incompatible_packets="true"
enable_bundling="true"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="false"
down_thread="false" up_thread="false"
use_send_queues="false"
sock_conn_timeout="300"
skip_suspected_members="true"/>
<TCPPING initial_hosts="thishostIP[7810],otherhostIP[7810]" port_range="3" timeout="3000"
down_thread="false" up_thread="false"
num_initial_members="3"/>
<MERGE2 max_interval="100000"
down_thread="false" up_thread="false" min_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<FD timeout="10000" max_tries="5" down_thread="false" up_thread="false" shun="true"/>
<VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="0"
retransmit_timeout="300,600,1200,2400,4800"
down_thread="false" up_thread="false"
discard_delivered_msgs="true"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="false" up_thread="false"
join_retry_timeout="2000" shun="true"
view_bundling="true"/>
<FC max_credits="2000000" down_thread="false" up_thread="false"
min_threshold="0.10"/>
<FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>
<pbcast.STATE_TRANSFER down_thread="false" up_thread="false" use_flush="false"/>
</config>

</attribute>
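
Since the TCP stack above depends on bind_addr and initial_hosts being edited by hand on every node, a quick consistency check can catch typos. A minimal Python sketch, assuming the host[port] list format used in the TCPPING element above. The rule that this node's own bind_addr[start_port] should appear in the list simply mirrors the posted config and is an assumption, not a documented JGroups requirement. The 192.0.2.x addresses are placeholders.

```python
import re

# One initial_hosts entry looks like: host[port]
HOST_RE = re.compile(r"^\s*([^\[\],\s]+)\[(\d+)\]\s*$")

def parse_initial_hosts(value):
    """Split 'hostA[7810],hostB[7810]' into [(host, port), ...]."""
    hosts = []
    for entry in value.split(","):
        match = HOST_RE.match(entry)
        if match is None:
            raise ValueError(f"malformed initial_hosts entry: {entry!r}")
        hosts.append((match.group(1), int(match.group(2))))
    return hosts

def check_tcp_stack(bind_addr, start_port, initial_hosts):
    """Return a list of human-readable problems (empty if none were found)."""
    hosts = parse_initial_hosts(initial_hosts)
    problems = []
    if (bind_addr, start_port) not in hosts:
        problems.append("bind_addr[start_port] is missing from initial_hosts")
    if len(hosts) < 2:
        problems.append("initial_hosts lists fewer than two nodes")
    return problems

print(check_tcp_stack("192.0.2.10", 7810, "192.0.2.10[7810],192.0.2.11[7810]"))
print(check_tcp_stack("192.0.2.10", 7810, "192.0.2.11[7810]"))
```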


7. On each node in the cluster, enable session replication in your application:

a. Add distributable tag in the web.xml descriptor.
<distributable/>

b. Define what triggers a session replication. Using "SET_AND_GET":

/usr/local/jboss/server/PRJ/deploy/jmx-console.war/WEB-INF/jboss-web.xml

<jboss-web>
<!-- Uncomment the security-domain to enable security. You will
need to edit the htmladaptor login configuration to setup the
login modules used to authentication users.
<security-domain>java:/jaas/jmx-console</security-domain>
-->

<!-- The downside of SET_AND_GET is that it can have significant performance
implications, if more than two nodes are used, since even reading immutable objects from the session
(e.g., strings, numbers) will mark the read attributes as needing to be replicated.
-->
<replication-config>
<replication-trigger>SET_AND_GET</replication-trigger>
<replication-granularity>SESSION</replication-granularity>
</replication-config>


</jboss-web>



8. For each node in the cluster, set parameters in the JBoss startup script:

Note: If you use custom start/stop scripts for JBoss, edit only run.conf, in the following manner:
-Dbind.address=THISHOSTIP
-Djboss.partition.name=MYPartition
-Djboss.partition.udpGroup=MulticastIP


9. For each node in the cluster, replace jboss-cache.jar with the one from the "all" server configuration.



10. Logs and Debugging

Add this to conf/jboss-log4j.xml to get additional JGroups logging:

<category name="org.jgroups">
<priority value="INFO"/>
</category>

To get JBoss to include additional debug logging for SSO, add this to conf/log4j.xml:

<category name="org.apache.catalina.core.ContainerBase">
<priority value="DEBUG"></priority>
</category>

11. Tests

a. Test whether the network is correctly configured for UDP multicast:

1. On one node, run the following command, replacing X.X.X.X with that node's IP address and YYY.Y.Y.Y with the multicast IP.
This instance will receive multicast packets.

java -cp jgroups-all.jar org.jgroups.tests.McastReceiverTest -mcast_addr YYY.Y.Y.Y -port 5555 -bind_addr X.X.X.X

2. On the other node, run the following command, replacing X.X.X.X with that node's IP address.
This instance will send multicast packets.

java -cp jgroups-all.jar org.jgroups.tests.McastSenderTest -mcast_addr YYY.Y.Y.Y -port 5555 -bind_addr X.X.X.X

3. On the instance sending packets, you can enter text followed by pressing enter.
You should see what you entered echoed on the JVM instance receiving packets.
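
Before running the sender and receiver, it may help to confirm that the value passed as -mcast_addr really is an IPv4 multicast address (224.0.0.0 through 239.255.255.255). A tiny Python sketch using only the standard library (is_usable_mcast_addr is a made-up name):

```python
import ipaddress

def is_usable_mcast_addr(addr):
    """True if addr parses as an IPv4 address inside the multicast range."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        # Not an IP address at all, e.g. a placeholder left unreplaced.
        return False
    return ip.version == 4 and ip.is_multicast

for candidate in ["228.2.1.1", "228.1.2.30", "10.56.22.234", "YYY.Y.Y.Y"]:
    print(candidate, is_usable_mcast_addr(candidate))
```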

b. Test for multicast traffic. Put the following text in a file called "config.xml" in $JBOSS_HOME on both
nodes and run the following:

Note: Changing the frag size to 1000 can work around a common issue where some networks drop packets over a certain
size.

Command for node1:

java -Djava.net.preferIPv4Stack=true -Dbind.address=node1IP -cp server/eol3/lib/jgroups.jar:lib/commons-logging.jar:lib/concurrent.jar org.jgroups.tests.LargeState -props ./config.xml -provider -size 10000000

Command for node2:

java -Djava.net.preferIPv4Stack=true -Dbind.address=node2IP -cp server/eol3/lib/jgroups.jar:lib/commons-logging.jar:lib/concurrent.jar org.jgroups.tests.LargeState -props ./config.xml

config.xml

<config>
<UDP mcast_addr="${jboss.partition.udpGroup:YYY.Y.Y.Y}"
mcast_port="45577"
ucast_recv_buf_size="20000000"
ucast_send_buf_size="640000"
mcast_recv_buf_size="25000000"
mcast_send_buf_size="640000"
loopback="false"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="true"
ip_ttl="${jgroups.mcast.ip_ttl:2}"
down_thread="false" up_thread="false"
enable_bundling="true"/>
<PING timeout="2000"
down_thread="false" up_thread="false" num_initial_members="3"/>
<MERGE2 max_interval="100000"
down_thread="false" up_thread="false" min_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<FD shun="true" up_thread="false" down_thread="false"
timeout="20000" max_tries="5"/>
<VERIFY_SUSPECT timeout="1500"
up_thread="false" down_thread="false"/>
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="50"
retransmit_timeout="300,600,1200,2400,4800"
down_thread="false" up_thread="false"
discard_delivered_msgs="true"/>
<UNICAST timeout="300,600,1200,2400,3600"
down_thread="false" up_thread="false"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="false" up_thread="false"
join_retry_timeout="2000" shun="true"/>
<FC max_credits="2000000" down_thread="false" up_thread="false"
min_threshold="0.10"/>
<FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>
<pbcast.STATE_TRANSFER down_thread="false" up_thread="false"/>
</config>


c. Test for TCP configuration. Put the following text in a file called "config.xml" in $JBOSS_HOME on both
nodes and run the following:


Command for node1:

/usr/local/java/bin/java -Djava.net.preferIPv4Stack=true -cp jgroups.jar:lib/commons-logging.jar:lib/concurrent.jar org.jgroups.tests.LargeState -props ./config.xml -provider -size 10000000

Command for node2:

/usr/local/java/bin/java -Djava.net.preferIPv4Stack=true -cp jgroups.jar:lib/commons-logging.jar:lib/concurrent.jar org.jgroups.tests.LargeState -props ./config.xml

config.xml:


<config>
<TCP bind_addr="THISHOSTIP" start_port="7810" loopback="true"
tcp_nodelay="true"
recv_buf_size="20000000"
send_buf_size="640000"
discard_incompatible_packets="true"
enable_bundling="true"
max_bundle_size="64000"
max_bundle_timeout="30"
use_incoming_packet_handler="true"
use_outgoing_packet_handler="false"
down_thread="false" up_thread="false"
use_send_queues="false"
sock_conn_timeout="300"
skip_suspected_members="true"/>
<TCPPING initial_hosts="THISHOSTIP[7810],OTHERHOSTIP[7810]" port_range="3"
timeout="3000"
down_thread="false" up_thread="false"
num_initial_members="3"/>
<MERGE2 max_interval="100000"
down_thread="false" up_thread="false" min_interval="20000"/>
<FD_SOCK down_thread="false" up_thread="false"/>
<FD timeout="10000" max_tries="5" down_thread="false" up_thread="false" shun="true"/>
<VERIFY_SUSPECT timeout="1500" down_thread="false" up_thread="false"/>
<pbcast.NAKACK max_xmit_size="60000"
use_mcast_xmit="false" gc_lag="0"
retransmit_timeout="300,600,1200,2400,4800"
down_thread="false" up_thread="false"
discard_delivered_msgs="true"/>
<pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
down_thread="false" up_thread="false"
max_bytes="400000"/>
<pbcast.GMS print_local_addr="true" join_timeout="3000"
down_thread="false" up_thread="false"
join_retry_timeout="2000" shun="true"
view_bundling="true"/>
<FC max_credits="2000000" down_thread="false" up_thread="false"
min_threshold="0.10"/>
<FRAG2 frag_size="60000" down_thread="false" up_thread="false"/>
<pbcast.STATE_TRANSFER down_thread="false" up_thread="false" use_flush="false"/>
</config>





 
ranger
Posts: 17347
The basic configuration that you have appears correct. It could always be a simple typo.

But what has me confused is that you start up your JBoss instances by calling "java" instead of using the run scripts.

On Linux, I would have run them as:

./run.sh -b <BindAddressOfTheHost> -c all

and that is all.

Mark
 
Diego Bowen
Ranch Hand
Posts: 50
I use a start/stop script within the $JBOSS_HOME directory that does some custom stuff and then runs $JBOSS_HOME/bin/run.sh. I edited $JBOSS_HOME/bin/run.conf to add:

-Dbind.address=THISHOSTIP
-Djboss.partition.name=MYPartition
-Djboss.partition.udpGroup=<mcast_addr>

to

JAVA_OPTS. Thanks
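
To confirm which of these properties actually ended up in JAVA_OPTS on a given node, the value can be split and inspected. A small Python sketch, assuming the JAVA_OPTS string has been copied out of run.conf (system_properties is an illustrative helper; the sample value is the one from the first post in this thread):

```python
import shlex

def system_properties(java_opts):
    """Return the -Dname=value pairs found in a JAVA_OPTS string as a dict."""
    props = {}
    for token in shlex.split(java_opts):
        if token.startswith("-D"):
            name, _, value = token[2:].partition("=")
            props[name] = value
    return props

java_opts = (
    "-Djava.awt.headless=true -server -Xms1000m -Xmx1050m -XX:MaxPermSize=256m "
    "-Djboss.partition.name=AuthoringPartition -Djboss.partition.udpGroup=228.2.1.1 "
    "-Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 "
    "-Djava.net.preferIPv4Stack=true"
)

props = system_properties(java_opts)
for name in ("bind.address", "jboss.partition.name", "jboss.partition.udpGroup"):
    print(name, "=", props.get(name, "<not set>"))
```

With the JAVA_OPTS value from the first post, bind.address comes back as not set, which may be worth comparing between the two nodes.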
 
Diego Bowen
Ranch Hand
Posts: 50
I think I should have mentioned that the above configuration works fine. I just wanted to include it here for reference.
 