fyi... for anyone that may find my initial post:
I think we have worked through our problem, which at this point we will categorize as 'bad architecture'. We had a hub & spoke architecture with over 800 spokes. The hub had a remote MDB pointing to each spoke, and each spoke had a remote MDB pointing to the hub. The main problem was mostly at the hub. Using Messaging w/Remoting bisocket meant that, if all spokes were connected to the hub, we ended up with over 2400 threads at the hub (each of the 800+ spoke connections getting a Connection Consumer thread, along with 2 separate control socket threads, since there is a firewall between hub and spokes), on top of everything else our hub server was trying to do.
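For anyone wanting to sanity-check their own setup, here is a rough sketch of the arithmetic behind that thread count. The class and method names are made up for illustration, and the per-spoke numbers (1 Connection Consumer thread, 2 control socket threads) are just what we observed in our environment, not something guaranteed by Messaging or Remoting.

```java
// Back-of-the-envelope estimate only. The per-spoke thread counts are
// the ones observed in our setup; yours may differ.
final class HubThreadEstimate {

    // Threads the hub needs just to hold connections to the given number of spokes.
    static int estimate(int spokes, int consumerThreadsPerSpoke, int controlThreadsPerSpoke) {
        return spokes * (consumerThreadsPerSpoke + controlThreadsPerSpoke);
    }

    public static void main(String[] args) {
        // 800 spokes, 1 Connection Consumer thread and 2 control socket threads each
        System.out.println(estimate(800, 1, 2)); // prints 2400
    }
}
```

The count grows linearly with the number of spokes, so every spoke we added cost the hub about three more threads before it did any real work.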
Data flowed correctly for some time, then things appeared to break, and apparently we were getting networking problems on port 4457. We have changed our architecture to no longer use the remote MDBs, and things seem to be much more stable.
Not sure exactly what in the flow was broken, but things are much more stable w/o the remote MDBs. It may just have been too many remote MDBs.