[JBoss JIRA] (SWSQE-117) Resolve "The node was low on resource: imagefs." on b20
by Filip Brychta (JIRA)
[ https://issues.jboss.org/browse/SWSQE-117?page=com.atlassian.jira.plugin.... ]
Filip Brychta resolved SWSQE-117.
---------------------------------
Resolution: Done
So here is what was causing the problem: the default kubelet hard-eviction threshold is imagefs.available<15%, and b20's imagefs was already 85% full.
The kubelet kept trying to pull the image but failed because the threshold had been reached.
There was not much to clean up, so I removed b20 from the cluster and created SWSQE-116 to reinstall it.
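For context, the failure mode boils down to a simple threshold check. The sketch below mirrors the values reported for b20; the real kubelet inspects the Docker storage filesystem (typically /var/lib/docker) rather than hard-coded percentages:

```shell
# Sketch of the kubelet's imagefs hard-eviction check. The 85% usage and the
# default threshold (imagefs.available<15%) mirror the b20 report; the real
# kubelet measures the Docker storage filesystem, e.g. /var/lib/docker.
USED_PCT=85                    # imagefs usage observed on b20
THRESHOLD_PCT=15               # default hard eviction: imagefs.available<15%
AVAIL_PCT=$((100 - USED_PCT))
if [ "$AVAIL_PCT" -le "$THRESHOLD_PCT" ]; then
  echo "node under imagefs pressure: image pulls fail and pods are evicted"
fi
```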
> Resolve "The node was low on resource: imagefs." on b20
> -------------------------------------------------------
>
> Key: SWSQE-117
> URL: https://issues.jboss.org/browse/SWSQE-117
> Project: Kiali QE
> Issue Type: Task
> Reporter: Filip Brychta
> Assignee: Filip Brychta
>
> Jeeva reported a problem with pods on b20 that failed to start because of "The node was low on resource: imagefs."
--
This message was sent by Atlassian JIRA
(v7.5.0#75005)
[JBoss JIRA] (SWSQE-116) Reinstall openshift production cluster
by Guilherme Baufaker Rêgo (JIRA)
[ https://issues.jboss.org/browse/SWSQE-116?page=com.atlassian.jira.plugin.... ]
Guilherme Baufaker Rêgo reassigned SWSQE-116:
---------------------------------------------
Assignee: Guilherme Baufaker Rêgo (was: Michael Foley)
> Reinstall openshift production cluster
> --------------------------------------
>
> Key: SWSQE-116
> URL: https://issues.jboss.org/browse/SWSQE-116
> Project: Kiali QE
> Issue Type: Task
> Reporter: Filip Brychta
> Assignee: Guilherme Baufaker Rêgo
>
> Currently there are the following problems with our OS cluster:
> * Docker on the nodes is using loopback devices, which is strongly discouraged for production use
> * the imagefs and nodefs partitions are too small, which is causing "The node was low on resource: imagefs" errors
> * the root passwords differ across nodes
> We need to:
> # investigate the recommended partitioning for production OS clusters (also consider which RAID levels are available on each blade)
> # reinstall the RHEL hosts with the recommended partitions
> # configure Docker storage as described in https://docs.openshift.com/container-platform/3.9/install_config/install/...
> # install OS 3.9
> # create Mojo documentation
[JBoss JIRA] (SWSQE-116) Reinstall openshift production cluster
by Filip Brychta (JIRA)
Filip Brychta created SWSQE-116:
-----------------------------------
Summary: Reinstall openshift production cluster
Key: SWSQE-116
URL: https://issues.jboss.org/browse/SWSQE-116
Project: Kiali QE
Issue Type: Task
Reporter: Filip Brychta
Assignee: Michael Foley
Currently there are the following problems with our OS cluster:
* Docker on the nodes is using loopback devices, which is strongly discouraged for production use
* the imagefs and nodefs partitions are too small, which is causing "The node was low on resource: imagefs" errors
* the root passwords differ across nodes
We need to:
# investigate the recommended partitioning for production OS clusters (also consider which RAID levels are available on each blade)
# reinstall the RHEL hosts with the recommended partitions
# configure Docker storage as described in https://docs.openshift.com/container-platform/3.9/install_config/install/...
# install OS 3.9
# create Mojo documentation
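The loopback-device problem above is typically addressed via docker-storage-setup before the reinstall. A sketch only: the device name is hypothetical and must match each blade's actual disk layout, and the driver choice should follow the linked OpenShift 3.9 docs rather than this example:

```shell
# /etc/sysconfig/docker-storage-setup -- sketch; /dev/sdb is a hypothetical
# dedicated block device, not one confirmed to exist on these blades.
DEVS=/dev/sdb                  # dedicated block device for Docker storage
VG=docker-vg                   # volume group to create on it
STORAGE_DRIVER=overlay2        # avoids loopback-backed devicemapper
```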
[JBoss JIRA] (SWSQE-114) Please setup persistent storage for the B11 OpenShift cluster
by Guilherme Baufaker Rêgo (JIRA)
[ https://issues.jboss.org/browse/SWSQE-114?page=com.atlassian.jira.plugin.... ]
Guilherme Baufaker Rêgo edited comment on SWSQE-114 at 4/4/18 11:29 AM:
------------------------------------------------------------------------
Created the following PVs (10 GB each):
{panel:title=PVs}
persistentvolume "cluster1-pv10g-1" created
persistentvolume "cluster1-pv10g-2" created
persistentvolume "cluster1-pv10g-3" created
persistentvolume "cluster1-pv10g-4" created
persistentvolume "cluster1-pv10g-5" created
persistentvolume "cluster1-pv10g-6" created
persistentvolume "cluster1-pv10g-7" created
persistentvolume "cluster1-pv10g-8" created
persistentvolume "cluster1-pv10g-9" created
persistentvolume "cluster1-pv10g-10" created
persistentvolume "cluster1-pv10g-11" created
persistentvolume "cluster1-pv10g-12" created
persistentvolume "cluster1-pv10g-13" created
persistentvolume "cluster1-pv10g-14" created
persistentvolume "cluster1-pv10g-15" created
persistentvolume "cluster1-pv10g-16" created
persistentvolume "cluster1-pv10g-17" created
persistentvolume "cluster1-pv10g-18" created
persistentvolume "cluster1-pv10g-19" created
persistentvolume "cluster1-pv10g-20" created
persistentvolume "cluster1-pv10g-21" created
persistentvolume "cluster1-pv10g-22" created
persistentvolume "cluster1-pv10g-23" created
persistentvolume "cluster1-pv10g-24" created
persistentvolume "cluster1-pv10g-25" created
persistentvolume "cluster1-pv10g-26" created
persistentvolume "cluster1-pv10g-27" created
persistentvolume "cluster1-pv10g-28" created
persistentvolume "cluster1-pv10g-29" created
persistentvolume "cluster1-pv10g-30" created
persistentvolume "cluster1-pv10g-31" created
persistentvolume "cluster1-pv10g-32" created
persistentvolume "cluster1-pv10g-33" created
persistentvolume "cluster1-pv10g-34" created
persistentvolume "cluster1-pv10g-35" created
persistentvolume "cluster1-pv10g-36" created
persistentvolume "cluster1-pv10g-37" created
persistentvolume "cluster1-pv10g-38" created
persistentvolume "cluster1-pv10g-39" created
persistentvolume "cluster1-pv10g-40" created
{panel}
was (Author: gbaufake):
Created the following PVs:
{panel:title=PVs}
persistentvolume "cluster1-pv10g-1" created
persistentvolume "cluster1-pv10g-2" created
persistentvolume "cluster1-pv10g-3" created
persistentvolume "cluster1-pv10g-4" created
persistentvolume "cluster1-pv10g-5" created
persistentvolume "cluster1-pv10g-6" created
persistentvolume "cluster1-pv10g-7" created
persistentvolume "cluster1-pv10g-8" created
persistentvolume "cluster1-pv10g-9" created
persistentvolume "cluster1-pv10g-10" created
persistentvolume "cluster1-pv10g-11" created
persistentvolume "cluster1-pv10g-12" created
persistentvolume "cluster1-pv10g-13" created
persistentvolume "cluster1-pv10g-14" created
persistentvolume "cluster1-pv10g-15" created
persistentvolume "cluster1-pv10g-16" created
persistentvolume "cluster1-pv10g-17" created
persistentvolume "cluster1-pv10g-18" created
persistentvolume "cluster1-pv10g-19" created
persistentvolume "cluster1-pv10g-20" created
persistentvolume "cluster1-pv10g-21" created
persistentvolume "cluster1-pv10g-22" created
persistentvolume "cluster1-pv10g-23" created
persistentvolume "cluster1-pv10g-24" created
persistentvolume "cluster1-pv10g-25" created
persistentvolume "cluster1-pv10g-26" created
persistentvolume "cluster1-pv10g-27" created
persistentvolume "cluster1-pv10g-28" created
persistentvolume "cluster1-pv10g-29" created
persistentvolume "cluster1-pv10g-30" created
persistentvolume "cluster1-pv10g-31" created
persistentvolume "cluster1-pv10g-32" created
persistentvolume "cluster1-pv10g-33" created
persistentvolume "cluster1-pv10g-34" created
persistentvolume "cluster1-pv10g-35" created
persistentvolume "cluster1-pv10g-36" created
persistentvolume "cluster1-pv10g-37" created
persistentvolume "cluster1-pv10g-38" created
persistentvolume "cluster1-pv10g-39" created
persistentvolume "cluster1-pv10g-40" created
{panel}
> Please setup persistent storage for the B11 OpenShift cluster
> -------------------------------------------------------------
>
> Key: SWSQE-114
> URL: https://issues.jboss.org/browse/SWSQE-114
> Project: Kiali QE
> Issue Type: QE Task
> Reporter: Kevin Earls
> Assignee: Guilherme Baufaker Rêgo
>
> For https://issues.jboss.org/browse/SWSQE-113 I need to create a project on an OpenShift cluster where I can run a persistent instance of Jenkins. [~mmahoney] recommended that I use B11, which is not currently configured with persistent storage.
> Note: I'd be happy to move to B21 or another cluster if that is a better solution.
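The 40 PVs listed above were presumably generated from near-identical definitions. The sketch below is hypothetical: the message only shows the `persistentvolume "..." created` output, so the backing store (assumed NFS here), server, and export paths are all assumptions:

```shell
# Hypothetical sketch generating 40 identical 10 GB PV definitions; piping
# the stream to 'oc create -f -' would print the "created" lines above.
for i in $(seq 1 40); do
  cat <<EOF
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cluster1-pv10g-$i
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  nfs:                          # assumption: NFS-backed storage
    server: nfs.example.com     # hypothetical server
    path: /exports/pv10g-$i     # hypothetical export path
EOF
done
```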
[JBoss JIRA] (SWSQE-114) Please setup persistent storage for the B11 OpenShift cluster
by Guilherme Baufaker Rêgo (JIRA)
[ https://issues.jboss.org/browse/SWSQE-114?page=com.atlassian.jira.plugin.... ]
Guilherme Baufaker Rêgo resolved SWSQE-114.
-------------------------------------------
Resolution: Done
Created the following PVs:
{panel:title=My title}
persistentvolume "cluster1-pv10g-1" created
persistentvolume "cluster1-pv10g-2" created
persistentvolume "cluster1-pv10g-3" created
persistentvolume "cluster1-pv10g-4" created
persistentvolume "cluster1-pv10g-5" created
persistentvolume "cluster1-pv10g-6" created
persistentvolume "cluster1-pv10g-7" created
persistentvolume "cluster1-pv10g-8" created
persistentvolume "cluster1-pv10g-9" created
persistentvolume "cluster1-pv10g-10" created
persistentvolume "cluster1-pv10g-11" created
persistentvolume "cluster1-pv10g-12" created
persistentvolume "cluster1-pv10g-13" created
persistentvolume "cluster1-pv10g-14" created
persistentvolume "cluster1-pv10g-15" created
persistentvolume "cluster1-pv10g-16" created
persistentvolume "cluster1-pv10g-17" created
persistentvolume "cluster1-pv10g-18" created
persistentvolume "cluster1-pv10g-19" created
persistentvolume "cluster1-pv10g-20" created
persistentvolume "cluster1-pv10g-21" created
persistentvolume "cluster1-pv10g-22" created
persistentvolume "cluster1-pv10g-23" created
persistentvolume "cluster1-pv10g-24" created
persistentvolume "cluster1-pv10g-25" created
persistentvolume "cluster1-pv10g-26" created
persistentvolume "cluster1-pv10g-27" created
persistentvolume "cluster1-pv10g-28" created
persistentvolume "cluster1-pv10g-29" created
persistentvolume "cluster1-pv10g-30" created
persistentvolume "cluster1-pv10g-31" created
persistentvolume "cluster1-pv10g-32" created
persistentvolume "cluster1-pv10g-33" created
persistentvolume "cluster1-pv10g-34" created
persistentvolume "cluster1-pv10g-35" created
persistentvolume "cluster1-pv10g-36" created
persistentvolume "cluster1-pv10g-37" created
persistentvolume "cluster1-pv10g-38" created
persistentvolume "cluster1-pv10g-39" created
persistentvolume "cluster1-pv10g-40" created
{panel}
> Please setup persistent storage for the B11 OpenShift cluster
> -------------------------------------------------------------
>
> Key: SWSQE-114
> URL: https://issues.jboss.org/browse/SWSQE-114
> Project: Kiali QE
> Issue Type: QE Task
> Reporter: Kevin Earls
> Assignee: Guilherme Baufaker Rêgo
>
> For https://issues.jboss.org/browse/SWSQE-113 I need to create a project on an OpenShift cluster where I can run a persistent instance of Jenkins. [~mmahoney] recommended that I use B11, which is not currently configured with persistent storage.
> Note: I'd be happy to move to B21 or another cluster if that is a better solution.
[JBoss JIRA] (SWSQE-114) Please setup persistent storage for the B11 OpenShift cluster
by Guilherme Baufaker Rêgo (JIRA)
[ https://issues.jboss.org/browse/SWSQE-114?page=com.atlassian.jira.plugin.... ]
Guilherme Baufaker Rêgo edited comment on SWSQE-114 at 4/4/18 11:28 AM:
------------------------------------------------------------------------
Created the following PVs:
{panel:title=PVs}
persistentvolume "cluster1-pv10g-1" created
persistentvolume "cluster1-pv10g-2" created
persistentvolume "cluster1-pv10g-3" created
persistentvolume "cluster1-pv10g-4" created
persistentvolume "cluster1-pv10g-5" created
persistentvolume "cluster1-pv10g-6" created
persistentvolume "cluster1-pv10g-7" created
persistentvolume "cluster1-pv10g-8" created
persistentvolume "cluster1-pv10g-9" created
persistentvolume "cluster1-pv10g-10" created
persistentvolume "cluster1-pv10g-11" created
persistentvolume "cluster1-pv10g-12" created
persistentvolume "cluster1-pv10g-13" created
persistentvolume "cluster1-pv10g-14" created
persistentvolume "cluster1-pv10g-15" created
persistentvolume "cluster1-pv10g-16" created
persistentvolume "cluster1-pv10g-17" created
persistentvolume "cluster1-pv10g-18" created
persistentvolume "cluster1-pv10g-19" created
persistentvolume "cluster1-pv10g-20" created
persistentvolume "cluster1-pv10g-21" created
persistentvolume "cluster1-pv10g-22" created
persistentvolume "cluster1-pv10g-23" created
persistentvolume "cluster1-pv10g-24" created
persistentvolume "cluster1-pv10g-25" created
persistentvolume "cluster1-pv10g-26" created
persistentvolume "cluster1-pv10g-27" created
persistentvolume "cluster1-pv10g-28" created
persistentvolume "cluster1-pv10g-29" created
persistentvolume "cluster1-pv10g-30" created
persistentvolume "cluster1-pv10g-31" created
persistentvolume "cluster1-pv10g-32" created
persistentvolume "cluster1-pv10g-33" created
persistentvolume "cluster1-pv10g-34" created
persistentvolume "cluster1-pv10g-35" created
persistentvolume "cluster1-pv10g-36" created
persistentvolume "cluster1-pv10g-37" created
persistentvolume "cluster1-pv10g-38" created
persistentvolume "cluster1-pv10g-39" created
persistentvolume "cluster1-pv10g-40" created
{panel}
was (Author: gbaufake):
Created the following PVs:
{panel:title=My title}
persistentvolume "cluster1-pv10g-1" created
persistentvolume "cluster1-pv10g-2" created
persistentvolume "cluster1-pv10g-3" created
persistentvolume "cluster1-pv10g-4" created
persistentvolume "cluster1-pv10g-5" created
persistentvolume "cluster1-pv10g-6" created
persistentvolume "cluster1-pv10g-7" created
persistentvolume "cluster1-pv10g-8" created
persistentvolume "cluster1-pv10g-9" created
persistentvolume "cluster1-pv10g-10" created
persistentvolume "cluster1-pv10g-11" created
persistentvolume "cluster1-pv10g-12" created
persistentvolume "cluster1-pv10g-13" created
persistentvolume "cluster1-pv10g-14" created
persistentvolume "cluster1-pv10g-15" created
persistentvolume "cluster1-pv10g-16" created
persistentvolume "cluster1-pv10g-17" created
persistentvolume "cluster1-pv10g-18" created
persistentvolume "cluster1-pv10g-19" created
persistentvolume "cluster1-pv10g-20" created
persistentvolume "cluster1-pv10g-21" created
persistentvolume "cluster1-pv10g-22" created
persistentvolume "cluster1-pv10g-23" created
persistentvolume "cluster1-pv10g-24" created
persistentvolume "cluster1-pv10g-25" created
persistentvolume "cluster1-pv10g-26" created
persistentvolume "cluster1-pv10g-27" created
persistentvolume "cluster1-pv10g-28" created
persistentvolume "cluster1-pv10g-29" created
persistentvolume "cluster1-pv10g-30" created
persistentvolume "cluster1-pv10g-31" created
persistentvolume "cluster1-pv10g-32" created
persistentvolume "cluster1-pv10g-33" created
persistentvolume "cluster1-pv10g-34" created
persistentvolume "cluster1-pv10g-35" created
persistentvolume "cluster1-pv10g-36" created
persistentvolume "cluster1-pv10g-37" created
persistentvolume "cluster1-pv10g-38" created
persistentvolume "cluster1-pv10g-39" created
persistentvolume "cluster1-pv10g-40" created
{panel}
> Please setup persistent storage for the B11 OpenShift cluster
> -------------------------------------------------------------
>
> Key: SWSQE-114
> URL: https://issues.jboss.org/browse/SWSQE-114
> Project: Kiali QE
> Issue Type: QE Task
> Reporter: Kevin Earls
> Assignee: Guilherme Baufaker Rêgo
>
> For https://issues.jboss.org/browse/SWSQE-113 I need to create a project on an OpenShift cluster where I can run a persistent instance of Jenkins. [~mmahoney] recommended that I use B11, which is not currently configured with persistent storage.
> Note: I'd be happy to move to B21 or another cluster if that is a better solution.
[JBoss JIRA] (JGRP-2260) UNICAST3 doesn't remove dead nodes from its tables
by Bela Ban (JIRA)
[ https://issues.jboss.org/browse/JGRP-2260?page=com.atlassian.jira.plugin.... ]
Bela Ban commented on JGRP-2260:
--------------------------------
Thanks, Rich!
> UNICAST3 doesn't remove dead nodes from its tables
> --------------------------------------------------
>
> Key: JGRP-2260
> URL: https://issues.jboss.org/browse/JGRP-2260
> Project: JGroups
> Issue Type: Bug
> Affects Versions: 4.0.10
> Environment: WildFly 12.0.0.Final
> Reporter: Rich DiCroce
> Assignee: Bela Ban
>
> Scenario: 2 WildFly instances clustered together. A ForkChannel is defined, with a MessageDispatcher on top. I start both nodes, then stop the second one. 6-7 minutes after stopping the second node, I start getting log spam on the first node:
> {quote}
> 12:47:04,519 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ee,RCD_GP (flags=0), site-id=DEFAULT, rack-id=null, machine-id=null)) JGRP000032: RCD_GP (flags=0), site-id=DEFAULT, rack-id=null, machine-id=null): no physical address for RCD_NMS (flags=0), site-id=DEFAULT, rack-id=null, machine-id=null), dropping message
> 12:47:06,522 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ee,RCD_GP (flags=0), site-id=DEFAULT, rack-id=null, machine-id=null)) JGRP000032: RCD_GP (flags=0), site-id=DEFAULT, rack-id=null, machine-id=null): no physical address for RCD_NMS (flags=0), site-id=DEFAULT, rack-id=null, machine-id=null), dropping message
> 12:47:08,524 WARN [org.jgroups.protocols.UDP] (TQ-Bundler-4,ee,RCD_GP (flags=0), site-id=DEFAULT, rack-id=null, machine-id=null)) JGRP000032: RCD_GP (flags=0), site-id=DEFAULT, rack-id=null, machine-id=null): no physical address for RCD_NMS (flags=0), site-id=DEFAULT, rack-id=null, machine-id=null), dropping message
> {quote}
> After some debugging, I discovered that the reason is that UNICAST3 is still trying to retransmit to the dead node. Its send_table still contains an entry for the dead node with state OPEN.
> After looking at the source code for UNICAST3, I have a theory about what's happening.
> * When a node leaves the cluster, down(Event) gets invoked with a view change, which calls closeConnection(Address) for each node that left. That sets the connection state to CLOSING.
> * Suppose that immediately after the view change is handled, a message with the dead node as its destination gets passed to down(Message). That invokes getSenderEntry(Address), which finds the connection... and sets the state back to OPEN.
> Consequently, the connection is never closed or removed from the table, so retransmit attempts continue forever even though they will never succeed.
> This issue is easily reproducible for me, although unfortunately I can't give you the application in question. But if you have fixes you want to try, I'm happy to drop in a patched JAR and see if the issue still happens.
> This is my JGroups subsystem configuration:
> {code:xml}
> <subsystem xmlns="urn:jboss:domain:jgroups:6.0">
>     <channels default="ee">
>         <channel name="ee" stack="main">
>             <fork name="shared-dispatcher"/>
>             <fork name="group-topology"/>
>         </channel>
>     </channels>
>     <stacks>
>         <stack name="main">
>             <transport type="UDP" socket-binding="jgroups" site="${gp.site:DEFAULT}"/>
>             <protocol type="PING"/>
>             <protocol type="MERGE3">
>                 <property name="min_interval">1000</property>
>                 <property name="max_interval">5000</property>
>             </protocol>
>             <protocol type="FD_SOCK"/>
>             <protocol type="FD_ALL2">
>                 <property name="interval">3000</property>
>                 <property name="timeout">8000</property>
>             </protocol>
>             <protocol type="VERIFY_SUSPECT"/>
>             <protocol type="pbcast.NAKACK2"/>
>             <protocol type="UNICAST3"/>
>             <protocol type="pbcast.STABLE"/>
>             <protocol type="pbcast.GMS">
>                 <property name="join_timeout">100</property>
>             </protocol>
>             <protocol type="UFC"/>
>             <protocol type="MFC"/>
>             <protocol type="FRAG3"/>
>         </stack>
>     </stacks>
> </subsystem>
> {code}
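The suspected race described in the report can be sketched as follows. This is hypothetical, heavily simplified Java, not the actual JGroups source (which keeps full per-connection entries, not a bare state map); it only shows how a concurrent send can resurrect a CLOSING connection:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Unicast3RaceSketch {
    enum State { OPEN, CLOSING }

    // Stand-in for UNICAST3's send_table.
    static final Map<String, State> sendTable = new ConcurrentHashMap<>();

    // View change handler: mark connections of departed members as CLOSING.
    static void closeConnection(String member) {
        sendTable.computeIfPresent(member, (m, s) -> State.CLOSING);
    }

    // down(Message) path: looks up (or revives) the sender entry --
    // this is the step that flips a CLOSING connection back to OPEN.
    static State getSenderEntry(String member) {
        return sendTable.compute(member, (m, s) -> State.OPEN);
    }

    public static void main(String[] args) {
        sendTable.put("RCD_NMS", State.OPEN);
        closeConnection("RCD_NMS");   // node left the view: state -> CLOSING
        getSenderEntry("RCD_NMS");    // late message to the dead node
        // Entry is OPEN again, so it is never reaped and retransmission
        // to the unreachable node continues indefinitely.
        System.out.println("RCD_NMS state: " + sendTable.get("RCD_NMS"));
    }
}
```

A fix along these lines would need getSenderEntry() to respect a CLOSING state (or remove the entry atomically) rather than reviving it.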