in_ebpf: initial version of loader plugin #9406

Closed

Collaborator

@niedbalski niedbalski commented Sep 20, 2024

This is a proposal for a proof of concept (POC) of an eBPF ingestor plugin. It uses libbpf to load and attach to an existing eBPF program and pulls events from a fixed-size ring buffer. These events are then fed into the log ingestion pipeline.

The event types are predefined in the fluent-bit codebase, and the eBPF program must follow these definitions when submitting events to the ring buffer. In the future, the event format needs to be serialized and extensible, so we can move toward compatibility with other eBPF collectors.

Additionally, I've added a fallback option to pass strings as event payloads without needing a specific event type.
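
The split between predefined event types and the raw-string fallback could look roughly like this on the user-space side. This is a minimal sketch, not the plugin's actual code: the struct layout is taken from the example eBPF program in this description, while decode_event, bounded_len, struct decoded_event, and the rule "a record of exactly sizeof(struct flb_ebpf_event) bytes is a typed event" are assumptions made for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define MAX_EVENT_LEN 128

/* Layout taken from the example eBPF program in this PR description. */
struct flb_ebpf_event {
    uint32_t pid;
    uint32_t event_type;
    char data[MAX_EVENT_LEN];
};

/* Hypothetical result type, used only for this sketch. */
struct decoded_event {
    int structured;          /* 1 = typed event, 0 = raw string fallback */
    uint32_t pid;
    uint32_t event_type;
    const char *data;
    size_t data_len;
};

/* Length of s up to the first NUL, never reading more than max bytes. */
static size_t bounded_len(const char *s, size_t max)
{
    const char *nul = memchr(s, '\0', max);

    return nul ? (size_t) (nul - s) : max;
}

/*
 * Hypothetical decoder. Assumption: a record whose size equals the struct
 * is a typed event; any record of at most MAX_EVENT_LEN bytes is treated
 * as a raw string. Returns 0 on success, -1 if the record is rejected.
 */
static int decode_event(const void *rec, size_t rec_sz,
                        struct decoded_event *out)
{
    if (rec == NULL || out == NULL || rec_sz == 0) {
        return -1;
    }

    if (rec_sz == sizeof(struct flb_ebpf_event)) {
        const struct flb_ebpf_event *ev = rec;

        out->structured = 1;
        out->pid = ev->pid;
        out->event_type = ev->event_type;
        out->data = ev->data;
        out->data_len = bounded_len(ev->data, MAX_EVENT_LEN);
        return 0;
    }
    else if (rec_sz <= MAX_EVENT_LEN) {
        out->structured = 0;
        out->pid = 0;
        out->event_type = 0;
        out->data = rec;
        out->data_len = bounded_len(rec, rec_sz);
        return 0;
    }

    return -1;
}
```

Because the struct is larger than MAX_EVENT_LEN (136 versus 128 bytes), the two cases cannot be confused by size alone in this sketch; a serialized format would remove the need for that size-based guess.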

Compile with:

cmake -D FLB_IN_EBPF=ON .

An example configuration is:

[INPUT]
    Name              ebpf
    bpf_object_file   ./ebpf_program.o
    bpf_program_name  handle_fs_event
    ringbuf_map_name  events

[INPUT]
    Name              ebpf
    bpf_object_file   ./ebpf_program.o
    bpf_program_name  handle_execve_event
    ringbuf_map_name  events

[OUTPUT]
    Name stdout
    Match *

[SERVICE]
    log_level trace

An example eBPF program used with this configuration:

#include <linux/types.h>

#include <bpf/bpf_helpers.h>
#include <linux/bpf.h>


struct trace_entry {
  short unsigned int type;
  unsigned char flags;
  unsigned char preempt_count;
  int pid;
};

struct trace_event_raw_sys_enter {
  struct trace_entry ent;
  long int id;
  long unsigned int args[6];
  char __data[0];
};


#define MAX_EVENT_LEN 128

// Event types enum
enum event_type {
    EVENT_FILESYSTEM = 0,
    EVENT_NETWORK = 1,
    EVENT_PROCESS = 2
};

// Base event structure sent by eBPF
struct flb_ebpf_event {
    __u32 pid;
    __u32 event_type;            // Event type as an enum
    char data[MAX_EVENT_LEN];     // Event-specific data (filename, network info, etc.)
};


// Define the ring buffer map
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 8192);
} events SEC(".maps");

// Hook for file open (Filesystem event)
SEC("tracepoint/syscalls/sys_enter_openat")
int handle_fs_event(struct trace_event_raw_sys_enter *ctx) {
    struct flb_ebpf_event *event;
    const char *filename = (const char *)ctx->args[1];

    // Reserve space in the ring buffer
    event = bpf_ringbuf_reserve(&events, sizeof(*event), 0);
    if (!event) {
        return 0;
    }

    // Fill event data (structured event)
    event->pid = bpf_get_current_pid_tgid() >> 32;
    event->event_type = EVENT_FILESYSTEM;
    bpf_probe_read_user_str(event->data, MAX_EVENT_LEN, filename);

    // Submit the structured event
    bpf_ringbuf_submit(event, 0);
    return 0;
}

// Function to send just a string (Raw String event)
SEC("tracepoint/syscalls/sys_enter_execve")
int handle_execve_event(struct trace_event_raw_sys_enter *ctx) {
    char *event;
    const char *cmd = (const char *)ctx->args[0];

    // Reserve space in the ring buffer for the string
    event = bpf_ringbuf_reserve(&events, MAX_EVENT_LEN, 0);
    if (!event) {
        return 0;
    }

    // Send the raw string (command)
    bpf_probe_read_user_str(event, MAX_EVENT_LEN, cmd);

    // Submit the raw string
    bpf_ringbuf_submit(event, 0);
    return 0;
}

char LICENSE[] SEC("license") = "GPL";

To compile this program, you need clang installed on your system. Then run:

clang -D__TARGET_ARCH_X86_64 -g -O2 -target bpf -c ebpf_program_example.c -o ebpf_program.o

With the sample configuration, the following outputs are produced:

[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.0] initializing
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.0] storage_strategy='memory' (memory only)
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.0] eBPF program 'handle_fs_event' loaded successfully from object file './ebpf_program.o' with ring buffer 'events'
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.1] initializing
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.1] storage_strategy='memory' (memory only)
[2024/09/20 18:05:47] [ info] [input:ebpf:ebpf.1] eBPF program 'handle_execve_event' loaded successfully from object file './ebpf_program.o' with ring buffer 'events'
[2024/09/20 18:05:47] [ info] [sp] stream processor started
[2024/09/20 18:05:47] [ info] [output:stdout:stdout.0] worker #0 started
[0] ebpf.0: [[1726848348.381941693, {}], {"pid"=>71947, "event_type"=>"filesystem", "event_data"=>"./ebpf_program.o"}]
[1] ebpf.0: [[1726848348.382495832, {}], {"pid"=>71947, "event_type"=>"filesystem", "event_data"=>"/sys/kernel/debug/tracing/events/syscalls/sys_enter_execve/id"}]
[2] ebpf.0: [[1726848348.382551540, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.pressure"}]
[3] ebpf.0: [[1726848348.382586076, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.current"}]
[4] ebpf.0: [[1726848348.382610182, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.min"}]
[5] ebpf.0: [[1726848348.382634648, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.low"}]
[6] ebpf.0: [[1726848348.382657849, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.swap.current"}]
[7] ebpf.0: [[1726848348.382679632, {}], {"pid"=>851, "event_type"=>"filesystem", "event_data"=>"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/session.slice/memory.stat"}]

Unknown events

[0] ebpf.0: [[1726848417.031536658, {}], {"event_type"=>"unknown", "event_data"=>"/usr/bin/ps"}]
[0] ebpf.0: [[1726848420.076250428, {}], {"event_type"=>"unknown", "event_data"=>"/usr/bin/cmake"}]
[0] ebpf.0: [[1726848422.176789034, {}], {"event_type"=>"unknown", "event_data"=>"/usr/bin/top"}]

^C[2024/09/20 18:07:16] [engine] caught signal (SIGINT)
[2024/09/20 18:07:18] [ warn] [engine] service will shutdown in max 5 seconds
[2024/09/20 18:07:18] [ info] [input] pausing ebpf.0
[2024/09/20 18:07:19] [ info] [engine] service has stopped (0 pending tasks)
[2024/09/20 18:07:19] [ info] [input] pausing ebpf.0
[2024/09/20 18:07:19] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2024/09/20 18:07:19] [ info] [output:stdout:stdout.0] thread worker #0 stopped

@cosmo0920
Contributor

I'm on an Ubuntu 22.04 box, so I needed to point clang at the architecture-dependent header files:

$ clang -D__TARGET_ARCH_X86_64 -g -O2 -target bpf -c ebpf_program_example.c -o ebpf_program.o -I /usr/include/x86_64-linux-gnu/  

Contributor

@cosmo0920 cosmo0920 left a comment

In the current code base, I was also concerned about how libbpf is linked:

$ ldd bin/fluent-bit
	linux-vdso.so.1 (0x00007ffeee7be000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x000078338b5c9000)
	libyaml-0.so.2 => /lib/x86_64-linux-gnu/libyaml-0.so.2 (0x000078338b5a8000)
	libsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 (0x000078338a139000)
	libbpf.so.0 => /lib/x86_64-linux-gnu/libbpf.so.0 (0x000078338a0ea000)
	libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x000078338a046000)
	libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3 (0x0000783389c00000)
	libcurl.so.4 => /lib/x86_64-linux-gnu/libcurl.so.4 (0x0000783389b59000)
	libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x0000783389b3d000)
	libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1 (0x0000783389a6e000)
	libsasl2.so.2 => /lib/x86_64-linux-gnu/libsasl2.so.2 (0x0000783389a53000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000783389a33000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000783389800000)
	/lib64/ld-linux-x86-64.so.2 (0x000078338b6e9000)
<snip>

This indicates that libbpf is linked as a shared object, so fluent-bit is not tainted by a non-Apache license such as a GNU-style license.

@niedbalski niedbalski changed the title [DRAFT] in_ebpf: initial version in_ebpf: initial version of loader plugin Nov 2, 2024
This is an initial proposal for a POC of an eBPF ingestor
plugin. It adds the ability to load and attach to
an existing eBPF program and consume events from a fixed-size
ring buffer; those events are then ingested into the log
ingestion buffer.

Event types are known and defined in the fluent-bit codebase, and
the eBPF program has to follow them when submitting events
to the ring buffer. In the future this must be serialized and
become an extensible part of the project as we make progress towards
compatibility with other eBPF collectors.

Also, I've implemented a fallback to allow strings to be passed as the
payload of the event, without following a specific event type.

Signed-off-by: Jorge Niedbalski <[email protected]>
Contributor

@cosmo0920 cosmo0920 left a comment

Basically, the direction of the in_ebpf implementation is correct.
I found several coding style issues, plus issues with how the tests display and assert their conditions and results.
So, I have marked this as "request changes" for now.

Comment on lines +163 to +170
    } else if (data_sz <= MAX_EVENT_LEN) {
        *event_type_str = FLB_IN_EBPF_EVENT_TYPE_UNKNOWN;
        *pid = 0;
        *event_data = (char *)data;
        *event_data_len = strlen(*event_data);
    } else {
        return -1;
    }
Contributor

We need to add a newline before else (the closing brace stays on its own line).
That is the Fluent Bit coding style.
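
For illustration, the requested brace placement looks like the sketch below: each closing brace ends its own line, and else or else if starts the next one. The enum mirrors the one in the example eBPF program; the helper name event_type_name and the strings "network" and "process" are assumptions (only "filesystem" and "unknown" appear in the output shown in this PR).

```c
enum event_type {
    EVENT_FILESYSTEM = 0,
    EVENT_NETWORK = 1,
    EVENT_PROCESS = 2
};

/* Hypothetical helper, written only to show the brace style. */
static const char *event_type_name(int type)
{
    if (type == EVENT_FILESYSTEM) {
        return "filesystem";
    }
    else if (type == EVENT_NETWORK) {
        return "network";
    }
    else if (type == EVENT_PROCESS) {
        return "process";
    }
    else {
        return "unknown";
    }
}
```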


/* Define default values */
#define FLB_IN_EBPF_DEFAULT_RINGBUF_MAP_NAME "events"
#define FLB_IN_EBPF_DEFAULT_POLL_MS "1000" // 1 second default poll timeout
Contributor

We need to use the /* */ style for one-line comments.

#define FLB_IN_EBPF_DEFAULT_RINGBUF_MAP_NAME "events"
#define FLB_IN_EBPF_DEFAULT_POLL_MS "1000" // 1 second default poll timeout
#define FLB_IN_EBPF_DEFAULT_ATTRIBUTE_NAME "payload"
#define FLB_IN_EBPF_DEFAULT_RINGBUF_SIZE "8192" // Default ring buffer size in bytes
Contributor

Ditto.
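
A sketch of the same defines with the requested comment style, assuming the names and values stay exactly as in the diff context quoted in this review and only the comment syntax changes:

```c
/* Define default values */
#define FLB_IN_EBPF_DEFAULT_RINGBUF_MAP_NAME "events"
#define FLB_IN_EBPF_DEFAULT_POLL_MS          "1000"    /* 1 second default poll timeout */
#define FLB_IN_EBPF_DEFAULT_ATTRIBUTE_NAME   "payload"
#define FLB_IN_EBPF_DEFAULT_RINGBUF_SIZE     "8192"    /* Default ring buffer size in bytes */
```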

TEST_CHECK(strcmp(event_type_str, FLB_IN_EBPF_EVENT_TYPE_PROCESS) == 0);
TEST_CHECK(pid == 5678);
TEST_CHECK(strcmp(event_data, "structured_event_data") == 0);
printf("test_extract_event_data_structured passed\n");
Contributor

Could you remove these debug prints from this unit test file?
We can confirm whether the tests succeeded by passing the -v option to the built bin/flb-rt-in_ebpf executable.
So, we don't need to print the status of each unit test.
Instead, we need to write each assertion carefully so that it checks the results and conditions we actually care about.

@niedbalski
Collaborator Author

This has been dismissed in favour of #9576

@niedbalski niedbalski closed this Nov 12, 2024
Labels
docs-required ok-package-test Run PR packaging tests