-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add CFP for flow log aggregation #65
base: main
Are you sure you want to change the base?
Add CFP for flow log aggregation #65
Conversation
Signed-off-by: Anubhab Majumdar <[email protected]>
cc: @cilium/sig-hubble |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the proposal makes a lot of sense and the design is sane. The approach of using fieldAggregate
is both intuitive (for users who are used to fieldMask
at least) and flexible.
The only caveat I can think of at the moment is potential pitfalls when combining fieldMask
and fieldAggregate
(e.g. exclude a field via fieldMask
but attempt to use it in fieldAggregate
). However, I think this is a problem that can be solved with specific checks.
|
||
### Will this impact existing clusters? | ||
|
||
No, this is not a breaking change. By default, the new option won't aggregate any logs. Adding fields to `fieldAggregate` will still write the same `Flow` json structure in the same location - only the number of entries written will differ. Any tooling built to ingest these logs will keep working as expected. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the Motivation section:
| source_pod | source_port | destination_pod | destination_port | protocol | ingress_flow_count | egress_flow_count |
|------------|-------------|-----------------|------------------|----------|--------------------|-------------------|
| client-1 | 12345 | server | 80 | TCP | 4 | 6 |
| client-2 | 23456 | server | 80 | TCP | 4 | 6 |
How are the ingress_flow_count
and egress_flow_count
computed with the current JSON flow structure?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we may not be able to count exactly how many "packets" were sent/received between two pods for x
seconds. Since we are aggregating the flows, I was thinking of conveying the information to user as to how many "flows" Hubble processed between the two pods in a period of time.
I was thinking of extending Flow using the extensions field to convey this information.
@rolinh I was thinking the |
Co-authored-by: Bill Mulligan <[email protected]> Signed-off-by: Anubhab Majumdar <[email protected]> Signed-off-by: Anubhab Majumdar <[email protected]>
fcee6dd
to
fde9c92
Compare
@rolinh @xmulligan @kaworu Thanks for the reviews! |
Do we need checks? if fieldMask is empty, shouldnt we add all fields, no? |
Let's way a couple more days to give time to other Hubble maintainers to have a look at the proposal.
I think we just need to ensure the relation from
Yes |
| source_pod | source_port | destination_pod | destination_port | protocol | ingress_flow_count | egress_flow_count | | ||
|------------|-------------|-----------------|------------------|----------|--------------------|-------------------| | ||
| client-1 | 12345 | server | 80 | TCP | 4 | 6 | | ||
| client-2 | 23456 | server | 80 | TCP | 4 | 6 | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this assume that we take the traffic direction into account, or just ignore the reply flows?
- name: "test003" | ||
filePath: "/var/run/cilium/hubble/test003.log" | ||
fieldMask: ["source", "destination","verdict"] | ||
fieldAggregate: ["source.pod_name", "destination.pod_name", "l4.TCP.destination_port"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To expand a little bit on the https://github.com/cilium/design-cfps/pull/65/files#r1946717206 question:
If we use SQL group_by as something that has similar functionality to aggregation, what happens to all the other fields not aggregated?
When a group_by is added to the sql query, all the other fields not included in the group_by need an aggregation function, so integers can be summed, values concatenated, etc.
Using your provided table as an example, lets say we don't include the destination port into the aggregation and just use source_pod, source_port
, how are destination_pod, destination_port, protocol and flags are going to be handled?
| source_pod | source_port | destination_pod | destination_port | protocol | flags |
|------------|-------------|-----------------|------------------|----------|---------|
| client-1 | 12345 | server | 80 | TCP | SYN |
| server | 80 | client-1 | 12345 | TCP | SYN-ACK |
| client-1 | 12345 | server | 80 | TCP | ACK |
| client-1 | 12345 | server | 80 | TCP | PSH |
| server | 80 | client-1 | 12345 | TCP | ACK |
| server | 80 | client-1 | 12345 | TCP | PSH |
| client-1 | 12345 | server | 80 | TCP | ACK |
| client-1 | 12345 | server | 80 | TCP | FIN |
| server | 80 | client-1 | 12345 | TCP | FIN-ACK |
| client-1 | 12345 | server | 80 | TCP | ACK |
| client-2 | 23456 | server | 80 | TCP | SYN |
| server | 80 | client-2 | 23456 | TCP | SYN-ACK |
| client-2 | 23456 | server | 80 | TCP | ACK |
| client-2 | 23456 | server | 80 | TCP | PSH |
| server | 80 | client-2 | 23456 | TCP | ACK |
| server | 80 | client-2 | 23456 | TCP | PSH |
| client-2 | 23456 | server | 80 | TCP | ACK |
| client-2 | 23456 | server | 80 | TCP | FIN |
| server | 80 | client-2 | 23456 | TCP | FIN-ACK |
| client-2 | 23456 | server | 80 | TCP | ACK |
Adding a proposal for
flowlog
aggregation in Hubble.