Gareth,
I've seen this before, and when I did it was because the agent didn't have the cluster-reader role.
So something happened with the command:
oc adm policy add-cluster-role-to-user cluster-reader system:serviceaccount:openshift-infra:hawkular-agent
I can tell you that I recently changed names so that all names in HOSA are consistent. Before, you would see some things named "hawkular-agent", others named "hawkular-openshift-agent", and others named "openshift-agent". Everything is now consistently named "hawkular-openshift-agent".
One of the changes is the name of the service account - it is now "hawkular-openshift-agent", not "hawkular-agent".
So if you are running newer code, check that. Your command above should be:
oc adm policy add-cluster-role-to-user cluster-reader system:serviceaccount:openshift-infra:hawkular-openshift-agent
Without that role, the agent will not work at all.
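A minimal sketch of how the role grant could be scripted so the service account name is set in one place. The helper sa_subject and the APPLY switch are hypothetical names used only for illustration; they are not part of HOSA or its Makefile. By default the script only prints the command it would run.

```shell
#!/bin/sh
# Build the fully qualified subject name for a service account.
# sa_subject is a hypothetical helper, not something shipped with HOSA.
sa_subject() {
    # $1 = namespace (project), $2 = service account name
    printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

NAMESPACE=openshift-infra
# Newer agent code uses this name; older code used "hawkular-agent".
SA_NAME=hawkular-openshift-agent
SUBJECT=$(sa_subject "$NAMESPACE" "$SA_NAME")

# APPLY is an assumed opt-in switch: without APPLY=1 nothing touches the cluster.
if [ "${APPLY:-0}" = "1" ]; then
    # List the service accounts so you can confirm which name actually exists.
    oc get serviceaccounts -n "$NAMESPACE"
    # Grant cluster-reader to the account the agent runs as.
    oc adm policy add-cluster-role-to-user cluster-reader "$SUBJECT"
else
    # Dry run: show the command instead of executing it.
    echo "oc adm policy add-cluster-role-to-user cluster-reader $SUBJECT"
fi
```

Run it once without APPLY to check that the printed subject matches the user named in the agent's error message, then rerun with APPLY=1 while logged in as a cluster admin.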
You can see the Makefile we use to deploy in our dev environments - see the "openshift-deploy" target here:
https://github.com/hawkular/hawkular-openshift-agent/blob/master/Makefile#L42-L46
----- Original Message -----
> Hi John,
>
> Just tried deploying latest agent but get an error. Am running the below
> steps to deploy the agent:
>
> oc adm policy add-cluster-role-to-user cluster-reader
> system:serviceaccount:openshift-infra:hawkular-agent
> oc create -f
> https://raw.githubusercontent.com/hawkular/hawkular-openshift-agent/master/deploy/openshift/hawkular-openshift-agent-configmap.yaml
> -n openshift-infra
> oc process -f
> https://raw.githubusercontent.com/hawkular/hawkular-openshift-agent/master/deploy/openshift/hawkular-openshift-agent.yaml
> IMAGE_VERSION=latest | oc create -n openshift-infra -f -
>
>
> But get this error when the agent starts:
>
> E0111 18:34:11.353256 1 node_event_consumer.go:72] Error obtaining
> information about the agent pod
> [openshift-infra/hawkular-openshift-agent-h1ynb]. err=User
> "system:serviceaccount:openshift-infra:hawkular-openshift-agent" cannot get
> pods in project "openshift-infra"
>
> I've also got a custom pod that has a configmap mounted, but the agent
> never gets past this error message so doesn't collect any metrics.
>
> Output of oc describe on my custom pod shows the volume mount:
>
> Volume Mounts:
> /var/run/configmap/hawkular-agent from hawkular-openshift-agent (rw)
>
> Any ideas?
>
> On Sat, Jan 7, 2017 at 1:44 AM, John Mazzitelli <mazz@redhat.com> wrote:
>
> > FYI: some new things went into HOSA (that's the Hawkular OpenShift Agent
> > for the uninitiated).
> >
> > 1. The agent now emits its own metrics and can monitor itself. Right now
> > it just emits some basic "go" metrics like memory usage, CPU usage, etc
> > along with one agent-specific one - a counter that counts the number of
> > data points it has collected in its lifetime. We'll add more metrics as we
> > figure out the things people want to see, but we have the infrastructure in
> > place now.
> >
> > 2. The agent is deployed as a daemonset. This means as new nodes are
> > brought online, an agent will go along with it (or so I'm told :)
> >
> > 3. The agent has changed the way it discovers what to monitor - it no
> > longer looks at annotations on pods to determine where the configmaps are
> > for those pods. Instead, it looks up volume declarations to see if there is
> > an agent configmap defined. This was done to be ready for the future when
> > new security constraints will be introduced in OpenShift which would have
> > broken our annotation approach. This approach using volumes should not hit
> > that issue.
> >
> > NOTE: If you are building the latest agent from master, we added some
> > dependencies so you have to update your dependencies via Glide by using the
> > "make update-deps" target prior to building from source.
> > _______________________________________________
> > hawkular-dev mailing list
> > hawkular-dev@lists.jboss.org
> > https://lists.jboss.org/mailman/listinfo/hawkular-dev
> >
>