Graceful Termination Queue Listen while scaling down #521

Open
mohavee opened this issue Feb 10, 2025 · 6 comments

mohavee commented Feb 10, 2025

Hello,

We are running queue workers in a Kubernetes environment where pods are short-lived and can be interrupted at any time. Currently, the yii\queue\cli\Queue::listen() method continuously listens for new messages until it receives a termination signal (SIGTERM, SIGINT, or SIGHUP). I found a related issue: https://github.com/yiisoft/yii2-queue/issues/399

When we push a long-running job to the queue and send a termination signal (e.g., Ctrl+C), the worker behaves correctly by finishing the current job before stopping. However, after the job is processed, the listen() method hangs instead of exiting. The process stops only once a new message is pushed to the queue, at which point it exits immediately.

Expected behavior for graceful termination when scaling down workers:

1. Termination signal + empty queue → the worker should stop immediately.
2. Termination signal during job processing → the worker should complete the current job and stop without continuing to listen for new messages.

Is there a way to achieve this behavior natively in Yii2 Queue?
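
To make the two expected cases concrete, here is a minimal sketch of a worker loop that behaves that way. It is not yii2-queue's implementation: the class `GracefulWorker`, its methods, and the `$reserve`/`$handle` callables are hypothetical names used only for illustration. The one assumption that matters is that reserving a message returns after a short timeout instead of blocking indefinitely, so the stop flag is re-checked even when the queue is empty.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical worker loop, for illustration only (not part of yii2-queue).
 */
final class GracefulWorker
{
    private bool $stopRequested = false;

    public function requestStop(): void
    {
        $this->stopRequested = true;
    }

    /** Translate SIGTERM/SIGINT/SIGHUP into a stop request, if pcntl is available. */
    public function installSignalHandlers(): void
    {
        if (!extension_loaded('pcntl')) {
            return; // no signal support in this PHP build
        }
        pcntl_async_signals(true);
        foreach ([SIGTERM, SIGINT, SIGHUP] as $signal) {
            pcntl_signal($signal, function (int $signo): void {
                $this->requestStop();
            });
        }
    }

    /**
     * @param callable $reserve fn(int $timeoutSeconds): mixed, returns null when no job arrived in time
     * @param callable $handle  fn(mixed $job): void
     * @return int number of jobs handled before stopping
     */
    public function run(callable $reserve, callable $handle, int $pollTimeout = 1): int
    {
        $handled = 0;
        while (!$this->stopRequested) {
            // Assumption: $reserve returns after at most $pollTimeout seconds.
            $job = $reserve($pollTimeout);
            if ($job === null) {
                continue; // empty queue: loop back and re-check the stop flag
            }
            $handle($job); // the current job always runs to completion
            $handled++;
        }
        return $handled;
    }
}
```

With this shape, a signal that arrives while the queue is empty ends the loop within one poll timeout, and a signal that arrives during a job ends the loop right after that job, without reserving another message. In a real worker you would call `installSignalHandlers()` once before `run()`.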

samdark (Member) commented Feb 10, 2025

I think the behavior you are describing is the correct one. How do you run queues? What's in your entry script/cmd?

mohavee (Author) commented Feb 10, 2025

The workers are running as a Kubernetes Deployment, scaled using HPA (Horizontal Pod Autoscaler).

The component:

'components' => [
    'queueService' => [
        'class' => yii\queue\amqp_interop\Queue::class,
        'vhost' => '',
        'host' => 'rabbitmq',
        'port' => 5672,
        'user' => '',
        'password' => '',
        'exchangeName' => 'event_sync_exchange',
        'queueName' => 'event_sync_queue',
        'driver' => yii\queue\amqp_interop\Queue::ENQUEUE_AMQP_LIB,
    ],
],

The command and args for the container are:

Command: /bin/bash  
Args: -c php yii queue-service/listen  

During downscaling, even with a properly configured terminationGracePeriodSeconds, the worker gets stuck and does not stop gracefully. Instead, it waits for the full termination period and ultimately ends with a SIGKILL.

EDITED: added component, changed the args.

samdark (Member) commented Feb 10, 2025

What's rabbit/leaflets?

mohavee (Author) commented Feb 10, 2025

I've updated the original comment.

samdark (Member) commented Feb 11, 2025

OK. That looks valid and is likely a bug. Can't dig into it right now myself though :(

samdark added the type:bug label Feb 11, 2025

s1lver (Member) commented Feb 12, 2025

This is a known issue for long handlers when running in K8S. It can also occur when using a RabbitMQ cluster and the accompanying HAProxy. The thing is that you need to tell the server that your connection is still alive. You can use the heartbeat option for this (I didn't see it in your config).

[
    ...,
    'heartbeat' => 10, // seconds
]

However, this may still not solve the connection failure problem for K8S and HAProxy: your handler may fail with an error, or lose its connection to the server and not reconnect (for example, if the timeouts on the k8s Ingress and HAProxy differ). In that case, the heartbeat frames sent at the configured interval will not fall within the shorter timeout.

You can additionally override setupBroker to handle such situations, e.g.:

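// Assumes this override lives in a subclass of yii\queue\amqp_interop\Queue
// that declares $retries and $retryInterval itself and imports \Throwable.
protected function setupBroker(): void
{
    if ($this->setupBrokerDone) {
        return;
    }

    // Shared across calls, so it keeps counting through the recursion below.
    static $reconnectAttempt = 0;
    try {
        parent::setupBroker();
    } catch (Throwable $e) {
        if ($reconnectAttempt < $this->retries) {
            // Drop the broken connection before trying again.
            $this->close();
            $reconnectAttempt++;

            if ($this->retryInterval > 0) {
                usleep($this->retryInterval); // usleep() takes microseconds
            }
            $this->open();
            $this->setupBroker();
        } else {
            throw $e;
        }
    }
}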
