Notification if container is unhealthy > renew #40

Open
nlevee opened this issue Dec 22, 2018 · 13 comments

Comments

@nlevee

nlevee commented Dec 22, 2018

Is the listener capable of sending a notification when a container in a service becomes unhealthy and is then renewed?

Thanks

@thomasjpfan
Contributor

When a service becomes unhealthy, DFSL will send a notification once it becomes healthy again.

@nlevee
Author

nlevee commented Jan 2, 2019

This is not what happens with version 18.11.28-19: when my service is unhealthy, Docker Swarm kills the container and starts another one, but the listener doesn't send any notification.

Do you need more information?

@thomasjpfan
Contributor

Can you provide information about how your service is set up?

@nlevee
Author

nlevee commented Jan 2, 2019

My services are set up like this:

version: '3.4'

services:
  webapp-front-http:
    image: apache:latest
    ports:
      - "80"
    healthcheck:
      test: "curl -f http://127.0.0.1/server-status?auto || exit 1"
      interval: 60s
      timeout: 5s
      retries: 5
      start_period: 10s
    deploy:
      mode: replicated
      replicas: 3
      labels:
        com.df.consulName: 'front-http'
        com.df.stackName: 'regiecamp_webapp'
        com.df.scrapeNetwork: 'regiecamp_webapp_default'
        com.df.notify: 'true'
        com.df.port: '80'
      restart_policy:
        condition: on-failure
      update_config:
        order: start-first
        parallelism: 3

@thomasjpfan
Contributor

As a debugging option, can you listen to docker events by running: docker events -f type=service and see which events fire when:

  1. Service starts
  2. Service gets unhealthy
  3. Service becomes healthy

I didn't ask before: does your service become healthy at some point?
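
A minimal sketch for making that output easier to scan while going through the three phases above. It assumes the default docker events line format (timestamp, type, action, ID, then attributes in parentheses). summarize_service_events is a hypothetical helper name, not part of Docker or DFSL.

```shell
# Hypothetical helper: condense default `docker events` lines read from stdin
# into "timestamp service-name update-state". A "-" means the attribute was
# not present on that line.
summarize_service_events() {
  awk '{
    name = "-"; state = "-"
    for (i = 1; i <= NF; i++) {
      field = $i
      gsub(/[(),]/, "", field)                 # drop the attribute punctuation
      if (field ~ /^name=/) name = substr(field, 6)
      if (field ~ /^updatestate\.new=/) state = substr(field, 17)
    }
    print $1, name, state
  }'
}

# Usage (needs a running Docker daemon on a swarm manager):
#   docker events -f type=service | summarize_service_events
```

Run it in one terminal, trigger each phase in another, and note which lines appear (or fail to appear) for the service in question.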

@nlevee
Author

nlevee commented Jan 3, 2019

The container is killed by Docker and a new one is started after that, so the service becomes healthy once the new container is running.

I will try the debugging steps and give you feedback.

@nlevee
Author

nlevee commented Jan 3, 2019

So I tried the command:
docker events -f type=service

Sample of the output:

2019-01-03T00:11:04.055685338+01:00 service update hb5rrsl3fm2yc1gpajrsdutkp (name=xxxxxxxxx_webapp_webapp-api-php)
2019-01-03T00:11:04.080270513+01:00 service update hb5rrsl3fm2yc1gpajrsdutkp (name=xxxxxxxxx_webapp_webapp-api-php, updatestate.new=updating)
2019-01-03T00:11:12.256159085+01:00 service update hb5rrsl3fm2yc1gpajrsdutkp (name=xxxxxxxxx_webapp_webapp-api-php, updatestate.new=completed, updatestate.old=updating)
2019-01-03T00:11:22.021262027+01:00 service update ptq3swwaw43cn7knowy4664aj (name=xxxxxxxxx_webapp_webapp-front-php)
2019-01-03T00:11:22.031196553+01:00 service update ptq3swwaw43cn7knowy4664aj (name=xxxxxxxxx_webapp_webapp-front-php, updatestate.new=updating)
2019-01-03T00:11:29.965940816+01:00 service update ptq3swwaw43cn7knowy4664aj (name=xxxxxxxxx_webapp_webapp-front-php, updatestate.new=completed, updatestate.old=updating)
2019-01-03T00:12:17.609856908+01:00 service update wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http)
2019-01-03T00:12:17.617948834+01:00 service update p12hfbl977gqgaqrf6w6o2636 (name=xxxxxxxxx_webapp_webapp-api-http)
2019-01-03T00:12:17.630087290+01:00 service update p12hfbl977gqgaqrf6w6o2636 (name=xxxxxxxxx_webapp_webapp-api-http, updatestate.new=updating)
2019-01-03T00:12:17.641321163+01:00 service update wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http, updatestate.new=updating)
2019-01-03T00:13:26.301852250+01:00 service update wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http, updatestate.new=completed, updatestate.old=updating)
2019-01-03T00:13:26.412017055+01:00 service update p12hfbl977gqgaqrf6w6o2636 (name=xxxxxxxxx_webapp_webapp-api-http, updatestate.new=completed, updatestate.old=updating)

I cannot make my service unhealthy right now, but I tried killing a container in a service and nothing showed up in the service events. Is that normal?

@thomasjpfan
Contributor

When you kill the container, does the service start up again? If I recall, this would not fire a service event.
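
If a killed container produces no service event, one way to see what does fire is to watch container-level events instead. A rough sketch, assuming it runs on the node that hosts the container (container events are local to each node, unlike service events). is_notable_action and watch_container_events are hypothetical helper names, and this is only a debugging aid, not a description of how DFSL detects changes.

```shell
# Hypothetical helper: succeed for container actions that indicate a restart
# or a health change, fail for everything else.
is_notable_action() {
  case "$1" in
    die|start|"health_status: healthy"|"health_status: unhealthy") return 0 ;;
    *) return 1 ;;
  esac
}

# Stream container events from the local Docker daemon and print the notable
# ones. The swarm service name is read from the container label
# com.docker.swarm.service.name and is empty for non-swarm containers.
watch_container_events() {
  docker events \
    --filter type=container \
    --format '{{index .Actor.Attributes "com.docker.swarm.service.name"}}|{{.Actor.Attributes.name}}|{{.Action}}' |
  while IFS='|' read -r service container action; do
    if is_notable_action "$action"; then
      printf '%s service=%s container=%s\n' "$action" "${service:-none}" "$container"
    fi
  done
}

# Usage (needs a running Docker daemon):
#   watch_container_events
```

Killing a container while this runs should show whether a die event, followed by a start for the replacement, is emitted even though the service event stream stays silent.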

@sguilly

sguilly commented Jan 29, 2019

We have the same issue when a container exits and Docker restarts it.

@thomasjpfan
Contributor

@sguilly Could you provide more details about your issue?

@Mualig

Mualig commented Feb 4, 2019

Answering for @sguilly: when one of our containers exits (normal operation or killed by an error) and is restarted by Docker Swarm, the listener doesn't "see" the new container. The restart is set in docker-compose:

  version: '3.4'

  networks:
    proxy_proxy:
      external: true

  services:

    api:
      image: <our api>

      networks:
        - proxy_proxy

      deploy:
        mode: replicated
        replicas: 4

        update_config:
          parallelism: 1
          delay: 10s
          order: start-first
          failure_action: rollback
          monitor: 30s

        restart_policy:
          # no / any / on-failure
          condition: any
          delay: 30s
          max_attempts: 3000

        resources:
          limits:
            memory: 500M

        labels:
          - com.df.notify=true
          - com.df.scrapeNetwork=proxy_proxy
          - com.df.scrapePort=14000
          - com.df.env=production
          - com.df.metricType=api
          - com.df.alertName=errorsRate
          - "com.df.alertAnnotations=summary=API error rate is high"
          - "com.df.alertLabels=severity=high"
          - 'com.df.alertIf=(sum(rate(http_request_duration_ms_count{code=~"^5..$$"}[1m])) / sum(rate(http_request_duration_ms_count[1m]))) > 0.05'

      environment:
        NODE_ENV: production

If we restart the docker-flow-swarm-listener service, the container shows up. But if we don't restart the service, Prometheus displays the following error:

Get http://10.0.11.213:14000/metrics: dial tcp 10.0.11.213:14000: connect: no route to host

@thomasjpfan
Contributor

@Mualig Which docker version are you using?

@Mualig

Mualig commented Feb 6, 2019

We have multiple nodes in our swarm. Monitor is always deployed on a node with docker 18.06.1-ce (this version is on the majority of the nodes), but some are on docker 18.03.1-ce and one is on docker 18.09.0.
