core: transition event loop to max live iters soft threshold #7701

Closed

Conversation

matthewfala
Contributor

@matthewfala matthewfala commented Jul 13, 2023

Issue flagged by Leonardo and previously spotted by the team.

Problem

The event loop may not process all injected events before the end of an event loop cycle, because the max iteration limit can cause cleanup to run prematurely.

Old No Priority System

Before priority events were added, mk_event_wait was called once and all events were processed. Cleanup started only after all events had been processed.

Current Priority System

Now mk_event_wait_2 (non-blocking) is called many times to pick up new, higher priority events. Within one round of processing, if more than the max iteration count of events have been processed, the loop breaks and the remaining events are deferred to the next cycle. This is a problem: all picked-up events should be fully processed within one cycle of the event loop so that the remaining work is finished.

Problems with Current System

This is mainly a problem for injected events related to networking. Injected events are used to squeeze some work into the scheduler before any of the cleanup functions run (network-related cleanup functions, timeout-related IIRC). Capping the number of iterations can break this model: the injected work ends up being pushed after the cleanup functions, which means it arrives too late.
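For illustration only, here is a toy sketch in plain C (placeholder names and values, not the actual engine code) of how a hard iteration cap lets the cleanup functions run while picked-up or injected events are still pending:

#include <stdio.h>

#define TOY_MAX_ITER 10          /* stands in for FLB_ENGINE_LOOP_MAX_ITER */

int main(void)
{
    int pending = 25;            /* events triggered/injected this cycle */
    int iters = 0;

    /* hard cap: the loop breaks even though events are still pending */
    while (pending > 0) {
        if (iters >= TOY_MAX_ITER) {
            break;               /* the rest must wait for the next cycle */
        }
        pending--;               /* "process" one event */
        iters++;
    }

    /* cleanup runs here with 15 events still unprocessed: this is the
     * premature-cleanup problem described above */
    printf("cleanup with %d unprocessed events\n", pending);
    return 0;
}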

Solution & Updated System

Sloppy/Soft Prioritization
The solution is a simple one. We switch from a hard max iteration limit to a soft max iteration limit.

After reaching some limit on the event loop, the event loop stops picking up newly triggered events, but continues to process all already triggered events. It continues to accept incoming injected events.

The loop ends when there are no more events to process, which means it may take a few extra iterations to complete the remaining already-accepted work before exiting the event loop and moving on to the cleanup functions.
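Continuing the toy sketch from above (placeholder names again; injected events are not modeled separately here), the soft threshold caps only the admission of new events, and everything already admitted is drained before cleanup runs:

#include <stdio.h>

#define TOY_MAX_LIVE_ITER 10     /* stands in for the proposed soft limit,
                                    FLB_ENGINE_LOOP_MAX_LIVE_ITER */

/* toy stand-in for the non-blocking mk_event_wait_2 pick-up */
static int toy_wait_nonblocking(int *ready)
{
    if (*ready > 0) {
        (*ready)--;
        return 1;                /* one new event picked up */
    }
    return 0;
}

int main(void)
{
    int ready = 25;              /* events triggered this cycle     */
    int admitted = 0;            /* picked up but not yet processed */
    int live_iters = 0;

    while (1) {
        /* soft threshold: stop *admitting* new events past the limit */
        while (live_iters < TOY_MAX_LIVE_ITER &&
               toy_wait_nonblocking(&ready)) {
            admitted++;
            live_iters++;
        }

        if (admitted == 0) {
            break;               /* nothing admitted remains: cleanup is safe */
        }

        admitted--;              /* "process" one admitted event */
    }

    /* cleanup runs only now; every admitted event was processed and the
     * never-admitted events simply wait for the next cycle */
    printf("admitted pending: %d, never admitted: %d\n", admitted, ready);
    return 0;
}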

How the Solution Helps

This means that all accepted and injected work will be completed before any of the cleanup functions are run, which resolves the problem described above.

Side Effects

Sloppy Prioritization
This change slightly breaks the priority event loop abstraction. If new events are triggered after the live iteration limit has been reached, the priority event loop will not admit them for the current round of processing; it will continue to process the already admitted, possibly lower priority, events, and any newly triggered high priority events are only admitted in the next round.

The maintainers should carefully consider this.

Guarantees

  • All events picked up by mk_event_wait[_2], as well as all injected events, will be processed before the cleanup code runs

Non-Guarantees

  • Events are no longer guaranteed to be processed in priority order if the number of events exceeds FLB_ENGINE_LOOP_MAX_LIVE_ITER

Other Options

Strict Prioritization via Priority 0 Event Processing
One way to trade this off is to make all injected events Priority 0 and to process only priority 0 events after pausing event intake on the event loop.

How this helps: since only top-priority events are processed after intake is paused, we GUARANTEE that events are processed in order of priority, while also guaranteeing that high priority work admitted to the event loop or injected will be processed.

This may be a good compromise, so long as all injected events, and any other work that must be completed within the round, are of Priority 0.
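A toy sketch of this alternative (placeholder names, not Fluent Bit code): once intake pauses, only the priority 0 bucket, which under this scheme would hold all injected events, is drained before cleanup:

#include <stdio.h>

#define TOY_PRIO_LEVELS 8

int main(void)
{
    /* events still pending per priority level after intake is paused;
     * injected events would all land in bucket 0 under this scheme */
    int pending[TOY_PRIO_LEVELS] = {3, 0, 5, 0, 0, 2, 0, 1};
    int i;

    /* drain only priority 0 before the cleanup functions run */
    while (pending[0] > 0) {
        pending[0]--;            /* "process" one priority 0 event */
    }

    /* cleanup runs here; lower-priority buckets are left untouched and are
     * picked up again when the next cycle re-opens intake */
    for (i = 1; i < TOY_PRIO_LEVELS; i++) {
        if (pending[i] > 0) {
            printf("priority %d: %d event(s) deferred to next cycle\n",
                   i, pending[i]);
        }
    }
    return 0;
}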

Guarantees

  • All priority 0 events picked up by mk_event_wait[_2], and all priority 0 injected events, will be processed before the cleanup code
  • Events are guaranteed to be processed in order of priority

Non-Guarantees

  • Injected events and events picked up by mk_event_wait[_2] of priority greater than 0 are NOT guaranteed to be processed before the cleanup code (only priority 0 events)

Enter [N/A] in the box if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@matthewfala
Contributor Author

matthewfala commented Jul 18, 2023

Network Event Drop

Summary

Today I met with Leonardo to discuss his proposed solution to the network-related issues that this PR would resolve.
We think the problem can be resolved without modifying the event loop; however, this PR is kept open in case the attempted solutions do not resolve the detected problem.

Symptoms of Network Event Drop

In testing, Leonardo sent a large amount of data to Fluent Bit and noticed that some coroutines fail to be woken up after the timeout is hit.

He noticed that increasing FLB_ENGINE_LOOP_MAX_ITER resolves the problem:

#define FLB_ENGINE_LOOP_MAX_ITER 10 /* Max events processed per round */

Hypothesis of Network Coro Drop Problem Cause -- From Leonardo, addressed indirectly by this PR

Every 1.5 seconds, a network timeout detection function is called, possibly triggering:

prepare_destroy_conn(u_conn);

followed by:

mk_event_inject(u_conn->evl,

In usual circumstances this causes the following to happen:

  1. Connection is closed immediately in the prepare_destroy_conn function
  2. Connection struct memory is queued for deletion after a round of the event loop finishes
  3. Connection event is injected into the Event Loop
  4. Connection event is processed by the Event Loop
  5. Connection event memory (in connection struct) is freed by the cleanup functions

In the problematic case, the following will occur (note: only steps 4 and 5 are reversed):

  1. Connection is closed immediately in the prepare_destroy_conn function
  2. Connection struct memory is queued for deletion after a round of the event loop finishes
  3. Connection event is injected into the Event Loop
    (The max iteration limit is hit, triggering the cleanup functions to be called) (New!)
  4. Connection event memory (in connection struct) is freed by the cleanup functions (Reversed!)
  5. Connection event is processed by the Event Loop (Reversed!)

Analysis of the Cause

  1. We expect that the injected network wake up event occurs before the event cleanup code frees connection memory.
  2. This allows for the network co-routine to properly clean itself up and exit.

If the order switches and we free the event memory before processing it, then we may process a corrupted event.

Solution Recommendation - By Leonardo

Old responsibilities:

  1. Callback Timer - That triggers network timeouts and injects events
    1. Shut the connection socket
    2. Queue connection (event) memory for deletion
  2. Engine - Clean up functions
    1. Delete the connection (event) memory

Solution: migrate the connection event deletion responsibility from the Engine to the co-routine (after handling timeouts).

New responsibilities:

  1. Callback Timer - That triggers network timeouts and injects events
    1. Shut the connection socket (in the case of read/write)
    2. Queue connection (event) memory for deletion
  2. Engine - Clean up functions
    1. Delete the connection (event) memory
  3. Coroutine - That timed out
    1. Delete the connection (event) memory (after handling timeout)
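A toy sketch of the proposed ownership change (placeholder names only, not the actual Fluent Bit upstream code): the timeout callback still shuts the socket down and marks the connection, but no longer queues it for engine-side deletion; the coroutine frees the struct itself once it has handled the timeout:

#include <stdio.h>
#include <stdlib.h>

struct toy_conn {
    int fd;
    int timed_out;
};

/* timer path: mark the timeout (socket shutdown elided); note that the
 * old "queue for deletion" step is gone */
static void toy_timeout_callback(struct toy_conn *conn)
{
    conn->timed_out = 1;
}

/* coroutine path: after being woken by the injected event, handle the
 * timeout and take over the engine's old responsibility of freeing the
 * connection (event) memory */
static void toy_coroutine_resume(struct toy_conn *conn)
{
    if (conn->timed_out) {
        printf("coroutine: timeout handled, freeing conn fd=%d\n", conn->fd);
        free(conn);
    }
}

int main(void)
{
    struct toy_conn *conn = calloc(1, sizeof(*conn));
    if (conn == NULL) {
        return 1;
    }
    conn->fd = 42;

    toy_timeout_callback(conn);   /* timer fires                        */
    toy_coroutine_resume(conn);   /* injected event wakes the coroutine */
    return 0;
}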

Feasibility

Assessed

  • connection creation code - get_conn
  • TLS
  • Read/write

Out of these, Read/write is the hardest to handle in terms of network errors:

  • Relies on the socket getting shut down when the timeout is called. This is resolved by still shutting down the socket on timeout, but not immediately queuing the connection for deletion.

Notes

This solution may not fully resolve the event drop issue. This PR can serve as an alternate solution.

PR Modification notes:

It would be best to keep the priority event loop's performance gains by ensuring that new tasks and flush events are not triggered prematurely, before all events are processed. This can be done by guaranteeing that all events, except very low priority events, are processed before the end of an event loop cycle.

Leonardo notes that this further complicates the already complicated event loop, and that network-side solution alternatives should be fully explored before resorting to event loop changes.

Actions

Wait

@leonardo-albertovich
Collaborator

As per our previous conversation, I have opened PR #7728, which addresses the root cause of the issue, and I'd appreciate it if you took a look at it.

@matthewfala
Contributor Author

matthewfala commented Jul 25, 2023

The alternate solution, which changes the network code to work with the existing system, has been merged:
#7728

This PR is no longer needed
