Unable to monitor container Restarts #17

Open
bjmask opened this issue May 14, 2020 · 5 comments

Comments


bjmask commented May 14, 2020

Describe the bug

No alerts are triggered when restartCount > 0.

To Reproduce
apiVersion: argoproj.io/v1alpha1
kind: Notification
metadata:
  name: "dummy-notification"
  namespace: dev
spec:
  Namespace: dev
  monitorResource:
    Resource: pods
    Version: v1
  rules:
    - allConditions:
        - jsonPath: status/containerStatuses[0]/restartCount
          operator: gt
          value: "0"
      events:
        - message: "Condition Triggered : Pod ={{.metadata.name}} is being tested"
          emailSubject: "[ALERT] Argo Notification Condition Triggered {{.metadata.name}}"
          notificationLevel: info
          notifierNames:
            - slack
      name: "dummy-notification"
      initialDelaySec: 1
      throttleMinutes: 1
  notifiers:
    - name: slack
      slack:
        channel: alerts
        hookUrlSecret:
          key: hookURL
          name: my-slack-secret

Expected behavior

I expect a notification every time a pod's restart count is greater than 0.

  • Latest image built locally from master.

tiej-dr commented May 15, 2020

This might work?
- jsonPath: status/containerStatuses/*/restartCount


bjmask commented May 15, 2020

That does indeed work, and it will monitor all containers under a pod. But how can I extract the restart count for the ith container?

E.g., templating the restartCount in the message:

{{.status.containerStatuses.[0].restartCount}}


tiej-dr commented May 16, 2020

I had a similar problem addressing list elements; the syntax is not that intuitive. Try something like this:
{{ (index .status.containerStatuses 0).restartCount }}
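For anyone who wants to try the `index` form outside the cluster, here is a minimal sketch using Go's standard `text/template` package. It assumes the message field is rendered as a Go template (the `{{...}}` syntax in this thread suggests so, but that is not confirmed here). The `renderTemplate` helper and the `samplePod` data are hypothetical stand-ins, not part of the notifier.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// renderTemplate parses tmplText as a Go text/template and executes it
// against data. Hypothetical helper, used only for this illustration.
func renderTemplate(tmplText string, data interface{}) (string, error) {
	tmpl, err := template.New("message").Parse(tmplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

// samplePod returns a heavily trimmed, made-up pod object.
// Assumption: restart counts arrive as int64 values.
func samplePod() map[string]interface{} {
	return map[string]interface{}{
		"metadata": map[string]interface{}{"name": "demo-pod"},
		"status": map[string]interface{}{
			"containerStatuses": []interface{}{
				map[string]interface{}{"containerID": "docker://aaa", "restartCount": int64(3)},
				map[string]interface{}{"containerID": "docker://bbb", "restartCount": int64(0)},
			},
		},
	}
}

func main() {
	// The bracket form from the question does not parse as a Go template.
	_, err := renderTemplate("{{.status.containerStatuses.[0].restartCount}}", samplePod())
	fmt.Println("bracket form rejected:", err != nil)

	// The index form selects element 0, then reads its restartCount field.
	out, err := renderTemplate("{{ (index .status.containerStatuses 0).restartCount }}", samplePod())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("restartCount of container 0:", out)
}
```

Note that `index` fails at execution time if the list has fewer elements than the requested position, so this form is only safe when the container is known to exist.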

Or, to be more fail-safe, check all containers:

message: |
  Pod: {{.metadata.name}}
  {{- range $index, $container := .status.containerStatuses }}
    {{- if gt $container.restartCount <threshold> }}
    Container {{$container.containerID}} restarted {{ $container.restartCount }} times.
    {{ end -}}
  {{ end -}}


bjmask commented May 19, 2020

Should this be added to the README?


tiej-dr commented May 19, 2020

I'm not involved in this project, so I can't give you an answer to that one :)
