I am currently having a problem with my JGroups configuration that causes thousands of messages to get stuck in NAKACK.xmit_table. In fact, all of them seem to end up in xmit_table, and another dump taken a few hours later indicates that they never leave either ...
This is the protocol stack configuration:
UDP(bind_addr=xxx.xxx.xxx.114;
bind_interface=bond0;
ip_mcast=true;ip_ttl=64;
loopback=false;
mcast_addr=228.1.2.80;mcast_port=45589;
mcast_recv_buf_size=80000;
mcast_send_buf_size=150000;
ucast_recv_buf_size=80000;
ucast_send_buf_size=150000):
PING(num_initial_members=3;timeout=2000):
MERGE2(max_interval=20000;min_interval=10000):
FD_SOCK:
FD(max_tries=5;shun=true;timeout=10000):
VERIFY_SUSPECT(timeout=1500):
pbcast.NAKACK(discard_delivered_msgs=true;gc_lag=50;retransmit_timeout=600,1200,2400,4800;use_mcast_xmit=true):
pbcast.STABLE(desired_avg_gossip=20000;max_bytes=400000;stability_delay=1000):
UNICAST(timeout=600,1200,2400):
FRAG(frag_size=8192):
pbcast.GMS(join_timeout=5000;print_local_addr=true;shun=true):
pbcast.STATE_TRANSFER
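To make the stack above easier to inspect, here is a minimal sketch that splits the old-style colon-separated stack string into protocols and their properties. It is plain Python and does not use any JGroups API; the helper names (`split_protocols`, `parse_protocol`, `parse_stack`) are made up for illustration, and the masked bind_addr is kept as posted.

```python
# Protocol stack copied from the configuration above (bind_addr stays masked).
STACK = (
    "UDP(bind_addr=xxx.xxx.xxx.114;bind_interface=bond0;ip_mcast=true;ip_ttl=64;"
    "loopback=false;mcast_addr=228.1.2.80;mcast_port=45589;"
    "mcast_recv_buf_size=80000;mcast_send_buf_size=150000;"
    "ucast_recv_buf_size=80000;ucast_send_buf_size=150000):"
    "PING(num_initial_members=3;timeout=2000):"
    "MERGE2(max_interval=20000;min_interval=10000):"
    "FD_SOCK:"
    "FD(max_tries=5;shun=true;timeout=10000):"
    "VERIFY_SUSPECT(timeout=1500):"
    "pbcast.NAKACK(discard_delivered_msgs=true;gc_lag=50;"
    "retransmit_timeout=600,1200,2400,4800;use_mcast_xmit=true):"
    "pbcast.STABLE(desired_avg_gossip=20000;max_bytes=400000;stability_delay=1000):"
    "UNICAST(timeout=600,1200,2400):"
    "FRAG(frag_size=8192):"
    "pbcast.GMS(join_timeout=5000;print_local_addr=true;shun=true):"
    "pbcast.STATE_TRANSFER"
)


def split_protocols(stack):
    """Split the stack on ':' characters that are outside parentheses."""
    parts, depth, current = [], 0, []
    for ch in stack:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == ":" and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return [p for p in parts if p]


def parse_protocol(spec):
    """Return (name, properties) for one 'NAME(key=value;...)' entry."""
    name, _, rest = spec.partition("(")
    body = rest.rstrip(")")
    props = {}
    for item in body.split(";"):
        if item.strip():
            key, _, value = item.partition("=")
            props[key.strip()] = value.strip()
    return name.strip(), props


def parse_stack(stack):
    """Return the protocols bottom-up, in the order they appear in the string."""
    return [parse_protocol(p) for p in split_protocols(stack)]


if __name__ == "__main__":
    for name, props in parse_stack(STACK):
        print(name, props)
```

Running it prints one line per protocol, which makes it easy to check, for example, which properties are set on pbcast.NAKACK and pbcast.STABLE.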
The startup messages ...
2010-03-01 23:40:05,358 INFO [org.jboss.cache.TreeCache] viewAccepted(): [xxx.xxx.xxx.35:51723|17] [xxx.xxx.xxx.35:51723, xxx.xxx.xxx.36:53088, xxx.xxx.xxx.115:32781, xxx.xxx.xxx.114:32934]
2010-03-01 23:40:05,363 INFO [org.jboss.cache.TreeCache] TreeCache local address is 10.35.191.114:32934
2010-03-01 23:40:05,393 INFO [org.jboss.cache.TreeCache] received the state (size=32768 bytes)
2010-03-01 23:40:05,509 INFO [org.jboss.cache.TreeCache] state was retrieved successfully (in 146 milliseconds)
... indicates that so far everything is in order.
Logs set to the WARN level do not indicate that anything is wrong, except for the occasional
2010-03-03 09:59:01,354 ERROR [org.jgroups.blocks.NotificationBus] exception=java.lang.IllegalArgumentException: java.lang.NullPointerException
which I assume is unrelated, since it was also seen earlier without the memory problem.
I have been digging through two memory dumps from one of the machines looking for oddities, but have found nothing so far, except perhaps some statistics from the different protocols.
UDP has
num_bytes_sent 53617832
num_bytes_received 679220174
num_messages_sent 99524
num_messages_received 99522
while NAKACK has ...
num_bytes_sent 0
num_bytes_received 0
num_messages_sent 0
num_messages_received 0
... xmit_table.
JChannel, ehcache TreeCache. , mcast- , , ? , , mcast .