Add CFP for flow log aggregation #65

Open · wants to merge 2 commits into `main`
New file: `hubble/CFP-37472-flowlog-aggregation.md` (150 additions, 0 deletions)
# CFP-37472: Support aggregation to manage flow log volume on disk

**SIG:** SIG-HUBBLE-API/Observability

**Begin Design Discussion:** 2025-02-06

**Cilium Release:** N/A

**Authors:** Anubhab Majumdar <[email protected]>, Neha Aggarwal <[email protected]>

**Status:** Implementable

## Summary

Persisting flow logs gives users the opportunity to retrospectively review network performance and security. However, saving every flow generated by the dataplane can incur significant cost and cause performance issues, especially in large and dense clusters. This CFP proposes adding an option to aggregate flows before persisting them, addressing these issues.

## Motivation

Users in cloud environments want to persist flows in databases for querying and drawing inferences. In such environments, there are costs associated with ingesting, storing, and querying these logs. Today, even with `--monitor-aggregation` set to `medium` and exporter filters applied, the volume of flow logs persisted on disk is enormous. One cause is the lack of a control that lets users perform some form of aggregation of the flow logs, the way the data plane and metrics allow through the `monitor-aggregation` and `source/destinationContext` options.

Providing an aggregation option as part of the dynamic exporter would allow users to set the verbosity of the logs they want to store, while still capturing the important signals from cluster networking. Consider a scenario where `client-1` and `client-2` are communicating with the pod `server` over TCP, and both client pods are on the same node. Over a 30-second period, we expect to see entries like the following in the log file stored on the node.


| source_pod | source_port | destination_pod | destination_port | protocol | flags |
|------------|-------------|-----------------|------------------|----------|---------|
| client-1 | 12345 | server | 80 | TCP | SYN |
| server | 80 | client-1 | 12345 | TCP | SYN-ACK |
| client-1 | 12345 | server | 80 | TCP | ACK |
| client-1 | 12345 | server | 80 | TCP | PSH |
| server | 80 | client-1 | 12345 | TCP | ACK |
| server | 80 | client-1 | 12345 | TCP | PSH |
| client-1 | 12345 | server | 80 | TCP | ACK |
| client-1 | 12345 | server | 80 | TCP | FIN |
| server | 80 | client-1 | 12345 | TCP | FIN-ACK |
| client-1 | 12345 | server | 80 | TCP | ACK |
| client-2 | 23456 | server | 80 | TCP | SYN |
| server | 80 | client-2 | 23456 | TCP | SYN-ACK |
| client-2 | 23456 | server | 80 | TCP | ACK |
| client-2 | 23456 | server | 80 | TCP | PSH |
| server | 80 | client-2 | 23456 | TCP | ACK |
| server | 80 | client-2 | 23456 | TCP | PSH |
| client-2 | 23456 | server | 80 | TCP | ACK |
| client-2 | 23456 | server | 80 | TCP | FIN |
| server | 80 | client-2 | 23456 | TCP | FIN-ACK |
| client-2 | 23456 | server | 80 | TCP | ACK |

However, we can glean the same level of information if the logs are aggregated as follows:

| source_pod | source_port | destination_pod | destination_port | protocol | ingress_flow_count | egress_flow_count |
|------------|-------------|-----------------|------------------|----------|--------------------|-------------------|
| client-1 | 12345 | server | 80 | TCP | 4 | 6 |
| client-2 | 23456 | server | 80 | TCP | 4 | 6 |

**Review comment (on lines +49 to +52, the aggregated table above):** Does this assume that we take the traffic direction into account, or just ignore the reply flows?


This has three major benefits:
- fewer disk writes (better disk space utilization on the host)
- lower external storage cost
- more efficient querying to draw inferences

## Goals

* Expose an additional configuration in the dynamic flow log exporter to allow aggregation based on specified fields from the [flow API](https://github.com/cilium/cilium/blob/main/api/v1/flow/flow.proto)

## Non-Goals

* Have pre-determined aggregation levels (e.g. low or medium)
* Aggregate at the cluster level instead of per node. For example, a flow between two pods on two different nodes will generate at least one entry on each node; this aggregation option won't combine both entries into one.

## Proposal

### Overview

We propose adding a new field, `fieldAggregate`, to the Hubble dynamic export configmap, similar to `fieldMask`. It will be an array of strings specifying fields of [Flow](https://github.com/cilium/cilium/blob/main/api/v1/flow/flow.proto).
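
For illustration, the new option could sit next to `fieldMask` in the per-exporter configuration. The struct and field names in this sketch are assumptions, not the actual Cilium dynamic exporter types:

```go
package exporter

// FlowLogConfig is an illustrative sketch of one entry in the dynamic
// export configmap; names and types are assumptions, not Cilium's code.
type FlowLogConfig struct {
	Name           string   `yaml:"name"`
	FilePath       string   `yaml:"filePath"`
	FieldMask      []string `yaml:"fieldMask"`
	FieldAggregate []string `yaml:"fieldAggregate"` // new: Flow fields to aggregate on
	IncludeFilters []any    `yaml:"includeFilters"`
	ExcludeFilters []any    `yaml:"excludeFilters"`
	End            string   `yaml:"end"`
}
```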


### Configmap

```yaml
hubble:
  export:
    dynamic:
      enabled: true
      config:
        enabled: true
        content:
          - name: "test001"
            filePath: "/var/run/cilium/hubble/test001.log"
            fieldMask: []
            fieldAggregate: []
            includeFilters: []
            excludeFilters: []
            end: "2023-10-09T23:59:59-07:00"
          - name: "test002"
            filePath: "/var/run/cilium/hubble/test002.log"
            fieldMask: ["source.namespace", "source.pod_name", "destination.namespace", "destination.pod_name", "verdict"]
            fieldAggregate: ["source.pod_name", "destination.pod_name"]
            includeFilters:
              - source_pod: ["default/"]
                event_type:
                  - type: 1
              - destination_pod: ["frontend/webserver-975996d4c-7hhgt"]
            excludeFilters: []
            end: "2023-10-09T23:59:59-07:00"
          - name: "test003"
            filePath: "/var/run/cilium/hubble/test003.log"
            fieldMask: ["source", "destination", "verdict"]
            fieldAggregate: ["source.pod_name", "destination.pod_name", "l4.TCP.destination_port"]
            includeFilters: []
            excludeFilters:
              - destination_pod: ["ingress/"]
```

**Review comment (on the `fieldAggregate` line of `test003`):** To expand a little bit on the question in https://github.com/cilium/design-cfps/pull/65/files#r1946717206:

If we use SQL `GROUP BY` as something with functionality similar to this aggregation, what happens to all the other fields that are not aggregated? When a `GROUP BY` is added to a SQL query, every field not included in the `GROUP BY` needs an aggregation function, so integers can be summed, values concatenated, and so on.

Using the table from the Motivation section as an example (quoted in full in the original comment): say we don't include the destination port in the aggregation and just use `source_pod` and `source_port`. How are `destination_pod`, `destination_port`, `protocol`, and `flags` going to be handled?

The `fieldAggregate` option would allow users to configure different levels of aggregation across different exporters. This flexibility is necessary because a user may want granular logs for sensitive namespaces and broader signals from others.
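
To make the per-exporter mechanism concrete, below is a minimal, self-contained sketch of bucketing flows by an aggregation key and flushing counts per window. The key hard-codes the fields from the `test003` example; a real implementation would extract whichever fields `fieldAggregate` lists from the `Flow` message, and every name in this sketch is hypothetical rather than the proposed implementation.

```go
package main

import (
	"fmt"
	"time"
)

// flowKey is the tuple built from the fields listed in fieldAggregate.
// Hard-coded here to the "test003" fields: source.pod_name,
// destination.pod_name, l4.TCP.destination_port.
type flowKey struct {
	srcPod  string
	dstPod  string
	dstPort uint32
}

// aggregator buckets observed flows by key and flushes one entry
// (with a count) per key at the end of each window.
type aggregator struct {
	window time.Duration
	counts map[flowKey]uint64
}

func newAggregator(window time.Duration) *aggregator {
	return &aggregator{window: window, counts: make(map[flowKey]uint64)}
}

// observe records one flow under its aggregation key instead of writing it out.
func (a *aggregator) observe(key flowKey) {
	a.counts[key]++
}

// flush emits a single line per key and resets the window.
// A real exporter would emit an aggregated Flow JSON entry here.
func (a *aggregator) flush() {
	for key, n := range a.counts {
		fmt.Printf("%s -> %s:%d flows=%d\n", key.srcPod, key.dstPod, key.dstPort, n)
	}
	a.counts = make(map[flowKey]uint64)
}

func main() {
	agg := newAggregator(30 * time.Second)
	// Two client pods talking to the same server, as in the Motivation tables.
	agg.observe(flowKey{"client-1", "server", 80})
	agg.observe(flowKey{"client-1", "server", 80})
	agg.observe(flowKey{"client-2", "server", 80})
	agg.flush()
}
```

How fields that are not part of the key should be handled (dropped, taken from the first flow in the window, or merged) is exactly the open question raised in the review comments above.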

## Impacts / Key Questions

### Impact: Log volume

This configuration can cut log volume by a significant amount, which allows more logs to be stored in the same disk space on the host and reduces disk I/O writes.

### Impact: Cost

Reduces the ingestion and storage cost of logs while still providing insight into cluster networking. Querying is also simplified because the data is already aggregated.

### Will this impact existing clusters?

No, this is not a breaking change. By default, the new option won't aggregate any logs. Adding fields to `fieldAggregate` will still write the same `Flow` JSON structure to the same location; only the number of entries written will differ. Any tooling built to ingest these logs will keep working as expected.

**Review comment:** From the Motivation section:

| source_pod | source_port | destination_pod | destination_port | protocol | ingress_flow_count | egress_flow_count |
|------------|-------------|-----------------|------------------|----------|--------------------|-------------------|
| client-1   | 12345       | server          | 80               | TCP      | 4                  | 6                 |
| client-2   | 23456       | server          | 80               | TCP      | 4                  | 6                 |

How are the `ingress_flow_count` and `egress_flow_count` computed with the current JSON flow structure?

**Author reply (@anubhabMajumdar, Feb 7, 2025):** I think we may not be able to count exactly how many "packets" were sent/received between two pods for x seconds. Since we are aggregating the flows, I was thinking of conveying to the user how many "flows" Hubble processed between the two pods in a period of time. I was thinking of extending `Flow` using the `extensions` field to convey this information.
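
For illustration only, one way to carry such counts would be to pack them into the `extensions` field that the `Flow` message already exposes as a `google.protobuf.Any`. The use of `structpb.Struct` as the payload and the field names below are assumptions of this sketch, not a decided design:

```go
package main

import (
	"fmt"

	"google.golang.org/protobuf/types/known/anypb"
	"google.golang.org/protobuf/types/known/structpb"
)

func main() {
	// Hypothetical payload: per-window flow counts for one aggregation key.
	counts, err := structpb.NewStruct(map[string]interface{}{
		"ingress_flow_count": 4,
		"egress_flow_count":  6,
	})
	if err != nil {
		panic(err)
	}
	// Wrap the payload in an Any, the type of the Flow "extensions" field.
	ext, err := anypb.New(counts)
	if err != nil {
		panic(err)
	}
	// A real implementation would set this on the aggregated flow before
	// JSON-encoding it, e.g. flow.Extensions = ext (assumed field name).
	fmt.Println(ext.GetTypeUrl())
}
```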


### Will this incur more resource usage?

Depending on the implementation, the aggregation mechanism would incur additional memory cost in the agent.
CPU usage should decrease, since most of the CPU time is spent on JSON serialization and fewer flows are written to disk.

## Alternatives

### Aggregation at dataplane

The eBPF programs collecting events from the kernel could possibly aggregate the events. But the dataplane is the source of all events: the same events are used to generate metrics, show detailed flows at a given instant, and produce historical flow logs. Restricting events at the source may improve one use case at the cost of another. For example, a user may want to see flows at a highly granular level using the Hubble CLI, but store them in an aggregated form in a database.

Also, if Hubble were run without the Cilium data plane, it would have to rely on that data plane to support any form of event consolidation.

### Aggregation during ingestion

Aggregation could instead be implemented in the tools that scrape the flow logs from the host before storing them externally. However, that would mean:
- the tool has to support aggregation based on JSON fields
- switching or upgrading the tool requires re-implementing the logic
- it still doesn't reduce disk I/O writes on the host or the storage consumed on the host

## Future Milestones

In the future, aggregation could be extended to the cluster level.