|
| 1 | +# Kmesh eBPF Unit Testing Framework |
| 2 | + |
| 3 | +## **1 Background** |
| 4 | + |
| 5 | +Currently, Kmesh needs a lightweight unit testing framework to test eBPF programs. This framework should be able to run tests for individual eBPF programs independently, without loading the entire Kmesh system, thereby improving testing efficiency and coverage. |
| 6 | + |
| 7 | +## **2 Design Approach** |
| 8 | + |
| 9 | +Kmesh loads and manages its eBPF kernel code through the cilium/ebpf library, so we can draw on the eBPF unit testing framework in the Cilium project and adapt it to the needs of the Kmesh project. |
| 10 | + |
| 11 | +### 2.1 Introduction to the Cilium eBPF Unit Testing Framework |
| 12 | + |
| 13 | +> Reference cilium v1.17: |
| 14 | +> |
| 15 | +> eBPF unit testing documentation: https://docs.cilium.io/en/v1.17/contributing/testing/bpf/#bpf-testing |
| 16 | +> |
| 17 | +> cilium/bpf/tests source code: https://github.com/cilium/cilium/tree/v1.17.0/bpf/tests |
| 18 | +
|
| 19 | +#### 2.1.1 Overview of cilium/bpf/tests |
| 20 | + |
| 21 | +The Cilium project uses a dedicated testing framework to verify the correctness of its BPF programs. This framework allows developers to write test cases, construct network packets, and verify the behavior of BPF programs in different scenarios. |
| 22 | + |
| 23 | +#### 2.1.2 Structure of cilium/bpf/tests Test Files |
| 24 | + |
| 25 | +Taking `xdp_nodeport_lb4_test.c` (which tests Cilium's XDP program for load balancing in an IPv4 environment) as an example, the core content of a typical test file is as follows: |
| 26 | + |
| 27 | +```c |
| 28 | +// https://github.com/cilium/cilium/blob/v1.17.0/bpf/tests/xdp_nodeport_lb4_test.c |
| 29 | +#include "common.h" |
| 30 | +#include "bpf/ctx/xdp.h" |
| 31 | + |
| 32 | +// Mock FIB (Forwarding Information Base) lookup function, populate source MAC and destination MAC |
| 33 | +#define fib_lookup mock_fib_lookup |
| 34 | + |
| 35 | +static const char fib_smac[6] = {0xDE, 0xAD, 0xBE, 0xEF, 0x01, 0x02}; |
| 36 | +static const char fib_dmac[6] = {0x13, 0x37, 0x13, 0x37, 0x13, 0x37}; |
| 37 | + |
| 38 | +long mock_fib_lookup(__maybe_unused void *ctx, struct bpf_fib_lookup *params, |
| 39 | + __maybe_unused int plen, __maybe_unused __u32 flags) |
| 40 | +{ |
| 41 | + __bpf_memcpy_builtin(params->smac, fib_smac, ETH_ALEN); |
| 42 | + __bpf_memcpy_builtin(params->dmac, fib_dmac, ETH_ALEN); |
| 43 | + return 0; |
| 44 | +} |
| 45 | + |
| 46 | +// Include BPF code directly |
| 47 | +#include "bpf_xdp.c" |
| 48 | + |
| 49 | +// Use tail call to execute the BPF program under test in the test code |
| 50 | +struct { |
| 51 | + __uint(type, BPF_MAP_TYPE_PROG_ARRAY); |
| 52 | + __uint(key_size, sizeof(__u32)); |
| 53 | + __uint(max_entries, 2); |
| 54 | + __array(values, int()); |
| 55 | +} entry_call_map __section(".maps") = { |
| 56 | + .values = { |
| 57 | + [0] = &cil_xdp_entry, |
| 58 | + }, |
| 59 | +}; |
| 60 | + |
| 61 | +// Build test xdp packet, can use PKTGEN macro to achieve the same effect |
| 62 | +static __always_inline int build_packet(struct __ctx_buff *ctx) { /* ... body elided ... */ } |
| 63 | + |
| 64 | +// Build packet, add frontend and backend, then tail call jump to entry point |
| 65 | +SETUP("xdp", "xdp_lb4_forward_to_other_node") |
| 66 | +int test1_setup(struct __ctx_buff *ctx) |
| 67 | +{ |
| 68 | + int ret; |
| 69 | + |
| 70 | + ret = build_packet(ctx); |
| 71 | + if (ret) |
| 72 | + return ret; |
| 73 | + |
| 74 | + lb_v4_add_service(FRONTEND_IP, FRONTEND_PORT, IPPROTO_TCP, 1, 1); |
| 75 | + lb_v4_add_backend(FRONTEND_IP, FRONTEND_PORT, 1, 124, |
| 76 | + BACKEND_IP, BACKEND_PORT, IPPROTO_TCP, 0); |
| 77 | + |
| 78 | + /* Jump into the entrypoint */ |
| 79 | + tail_call_static(ctx, entry_call_map, 0); |
| 80 | + /* Fail if we didn't jump */ |
| 81 | + return TEST_ERROR; |
| 82 | +} |
| 83 | + |
| 84 | +// Check test results |
| 85 | +CHECK("xdp", "xdp_lb4_forward_to_other_node") |
| 86 | +int test1_check(__maybe_unused const struct __ctx_buff *ctx) |
| 87 | +{ |
| 88 | + test_init(); |
| 89 | + |
| 90 | + void *data = (void *)(long)ctx->data; |
| 91 | + void *data_end = (void *)(long)ctx->data_end; |
| 92 | + |
| 93 | + if (data + sizeof(__u32) > data_end) |
| 94 | + test_fatal("status code out of bounds"); |
| 95 | + |
| 96 | + //... |
| 97 | + |
| 98 | + test_finish(); |
| 99 | +} |
| 100 | +``` |
| 101 | +
|
| 102 | +#### 2.1.3 Design of the cilium/bpf/tests Testing Framework |
| 103 | +
|
| 104 | +The Cilium eBPF testing framework centers on `common.h`, which provides the infrastructure, macros, and functions needed for testing: |
| 105 | +
|
| 106 | +##### Core Test Macros |
| 107 | +
|
| 108 | +- **TEST(name, body)**: Defines individual test cases for organizing independent test functionalities |
| 109 | +- **PKTGEN(progtype, name)**: Defines test segments for generating network packets |
| 110 | +- **SETUP(progtype, name)**: Defines the initialization phase of tests, such as setting up test environments and preconditions |
| 111 | +- **CHECK(progtype, name)**: Defines segments for verifying test results; each test needs at least one |
| 112 | +
|
| 113 | +##### Test Flow Control |
| 114 | +
|
| 115 | +- **test_init()**: Initializes the test environment, called at the beginning of a test |
| 116 | +- **test_finish()**: Completes the test and returns results, called at the end of a test |
| 117 | +- **test_fail(fmt, ...)**: Marks the current test as failed and provides a failure reason |
| 118 | +- **test_skip(fmt, ...)**: Skips the current test, commonly used when dependency conditions are not met |
| 119 | +
|
| 120 | +##### Assertion and Logging Mechanisms |
| 121 | +
|
| 122 | +- **assert(cond)**: Verifies that a condition is true; otherwise the test fails |
| 123 | +- **test_log(fmt, args...)**: Records test messages, using a `printf`-style format |
| 124 | +- **test_error(fmt, ...)**: Records errors and marks the test as failed |
| 125 | +- **test_fatal(fmt, ...)**: Records severe errors and terminates the test immediately |
| 126 | +- **assert_metrics_count(key, count)**: Verifies that a specific metric count matches the expected value |
| 127 | +
|
| 128 | +##### Test Result Management |
| 129 | +
|
| 130 | +The testing framework uses the following status codes to mark test results: |
| 131 | +
|
| 132 | +- **TEST_ERROR (0)**: Test execution encountered an error |
| 133 | +- **TEST_PASS (1)**: Test passed |
| 134 | +- **TEST_FAIL (2)**: Test failed |
| 135 | +- **TEST_SKIP (3)**: Test was skipped |
| 136 | +
|
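
As a minimal sketch of how a consumer might interpret these codes, the snippet below restates the four values listed above and maps them to labels. `test_status_name` is a hypothetical helper introduced here for illustration; it is not claimed to be part of `common.h`.

```c
/* Status codes as listed above. */
#define TEST_ERROR 0
#define TEST_PASS  1
#define TEST_FAIL  2
#define TEST_SKIP  3

/* Hypothetical helper (illustration only): map a status code to a label. */
static const char *test_status_name(int status)
{
	switch (status) {
	case TEST_ERROR: return "ERROR";
	case TEST_PASS:  return "PASS";
	case TEST_FAIL:  return "FAIL";
	case TEST_SKIP:  return "SKIP";
	default:         return "UNKNOWN";
	}
}
```

Because `TEST_ERROR` is 0, a result slot that was never written (assuming it starts out zeroed) reads as an error rather than as a pass.
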
| 137 | +##### Test Execution Flow |
| 138 | +
|
| 139 | +1. **Test Launch**: Execute the `make run_bpf_tests` command in the project root directory |
| 140 | +2. **Container Build**: Build a Docker test container to ensure consistency in the test environment |
| 141 | +3. **Test Compilation**: Compile eBPF test code using Clang |
| 142 | +4. **Test Coordination**: The Go testing framework manages the test lifecycle, including: |
| 143 | + - Loading compiled eBPF programs |
| 144 | + - Initializing the test environment |
| 145 | + - Executing test cases |
| 146 | + - Collecting test results |
| 147 | +
|
| 148 | +##### Communication Mechanism Between Go and eBPF |
| 149 | +
|
| 150 | +1. **Protocol Buffer Interface**: Defines structured message formats for communication between Go and eBPF test programs |
| 151 | +2. **Test Result Storage**: eBPF test programs encode results and store them in `suite_result_map` |
| 152 | +3. **Result Extraction and Parsing**: Go test code reads the map, decodes the results, and performs verification and reporting |
| 153 | +
|
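
To make the encoding step more concrete, here is a small plain-C sketch of the Protocol Buffers varint wire format, which is what an encoder has to emit for integer fields. The field number `STATUS_FIELD` and the helper names are assumptions made for illustration; the actual message schema and the encoder used by the tests may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode v as a Protocol Buffers varint; returns the number of bytes written. */
static size_t put_varint(uint8_t *buf, uint64_t v)
{
	size_t n = 0;

	while (v >= 0x80) {
		buf[n++] = (uint8_t)(v | 0x80); /* low 7 bits, continuation bit set */
		v >>= 7;
	}
	buf[n++] = (uint8_t)v;
	return n;
}

/* Decode a varint from buf (at most len bytes); returns bytes read, 0 on error. */
static size_t get_varint(const uint8_t *buf, size_t len, uint64_t *out)
{
	uint64_t v = 0;

	for (size_t i = 0; i < len && i < 10; i++) {
		v |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
		if (!(buf[i] & 0x80)) {
			*out = v;
			return i + 1;
		}
	}
	return 0;
}

/* Hypothetical field number for a status value; the real schema may differ. */
#define STATUS_FIELD 1
#define WIRE_VARINT  0

/* Write "field STATUS_FIELD = status" into buf (needs up to 6 bytes here). */
static size_t encode_status(uint8_t *buf, uint32_t status)
{
	size_t n = put_varint(buf, (STATUS_FIELD << 3) | WIRE_VARINT);

	return n + put_varint(buf + n, status);
}
```

On the Go side, a generated Protocol Buffers type would normally do the decoding; `get_varint` is included only to show that the byte layout round-trips.
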
| 154 | +##### Test Coverage |
| 155 | +
|
| 156 | +The Cilium project uses the [coverbee](https://github.com/cilium/coverbee) subproject to measure code coverage for eBPF programs. This provides coverage analysis for eBPF programs similar to what is available for user-space code: |
| 157 | +
|
| 158 | +- **Working Principle**: |
| 159 | + - Instruments the eBPF bytecode, assigning unique IDs to each line of code, and adding counter logic: `cover_map[line_id]++` |
| 160 | + - When the program executes, the counter for each accessed line of code increments |
| 161 | +
|
| 162 | +- **Coverage Analysis Workflow**: |
| 163 | + 1. The instrumented eBPF program collects execution count data during execution |
| 164 | +  2. A user-space program reads the coverage map (`cover_map`) |
| 165 | + 3. The collected data is associated with source code line numbers |
| 166 | + 4. Standard format coverage reports are generated |
| 167 | +
|
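
The counter idea can be illustrated with a plain-C analogy. This is only a sketch: coverbee instruments eBPF bytecode and stores counters in an eBPF map, whereas here `cover_map` is an ordinary array, and `COVER`, `classify`, and `covered_lines` are hypothetical names.

```c
#include <stddef.h>

#define NUM_LINES 4 /* hypothetical number of instrumented lines */

/* Stand-in for the coverage map: one counter per line ID. */
static unsigned long cover_map[NUM_LINES];

/* What the instrumentation conceptually inserts before each line. */
#define COVER(line_id) (cover_map[(line_id)]++)

/* Toy "program under test" with two branches. */
static int classify(int port)
{
	COVER(0);
	if (port == 80) {
		COVER(1);
		return 1;
	}
	COVER(2);
	if (port == 443) {
		COVER(3);
		return 2;
	}
	return 0;
}

/* User-space side: count how many line IDs were hit at least once. */
static size_t covered_lines(void)
{
	size_t hit = 0;

	for (size_t i = 0; i < NUM_LINES; i++)
		if (cover_map[i] > 0)
			hit++;
	return hit;
}
```

Dividing `covered_lines()` by `NUM_LINES` gives the line coverage ratio; mapping each line ID back to a source location is the step that turns the raw counters into a report.
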
| 168 | +##### Data Exchange Flow |
| 169 | +
|
| 170 | +``` |
| 171 | +[eBPF Test Program] → [Encode Results] → [suite_result_map] → [Go Test Runner] → [Decode & Report] |
| 172 | +``` |
| 173 | +
|
| 174 | +The Go testing framework is responsible for the final test report summary, including test pass rates, coverage statistics, and failed case analysis. |
| 175 | +
|
| 176 | +### 2.2 Design of the Kmesh eBPF Unit Testing Framework |
| 177 | +
|
| 178 | +#### 2.2.1 Requirement Analysis for Kmesh eBPF Unit Testing |
| 179 | +
|
| 180 | +Comparing the Cilium and Kmesh projects, we need to consider the following differences in designing the unit testing framework: |
| 181 | +
|
| 182 | +1. **Build System Differences**: |
| 183 | + - Cilium uses Clang to directly compile BPF code into bytecode |
| 184 | + - Kmesh uses the bpf2go tool provided by cilium/ebpf to compile BPF C code and convert it to Go code calls |
| 185 | +
|
| 186 | +2. **Code Maintenance Challenges**: |
| 187 | +   - Currently, Kmesh uses libbpf to maintain the BPF code under test, so two sets of compilation commands (bpf2go and unittest-makefile) must be maintained |
| 188 | +   - Whenever the core eBPF code changes, the test code must be updated to match, which makes maintenance costly |
| 189 | +
|
| 190 | +3. **Objectives**: |
| 191 | + - Design a testing framework closely integrated with the main code |
| 192 | + - Reduce duplicate maintenance overhead |
| 193 | + - Use the Golang testing framework for testing, facilitating integration into CI/CD workflows |
| 194 | +
|
| 195 | +#### 2.2.2 Design of the Kmesh eBPF Unit Testing Framework |
| 196 | +
|
| 197 | +Based on the analysis of the Cilium testing framework and the characteristics of the Kmesh project, we have designed the Kmesh eBPF unit testing framework, which includes the following main components: |
| 198 | +
|
| 199 | +##### Overall Architecture |
| 200 | +
|
| 201 | +The Kmesh eBPF unit testing framework adopts a layered design: |
| 202 | +
|
| 203 | +1. **eBPF Test Program Layer**: eBPF test code written in C, including test cases, test data, and verification logic |
| 204 | +2. **Go Test Driver Layer**: Loads eBPF programs, loads policies in user space, executes tests, and collects results |
| 205 | +3. **Result Communication Layer**: Uses Protocol Buffer-defined structures to transmit test results |
| 206 | +
|
| 207 | +##### Core Components |
| 208 | +
|
| 209 | +1. **eBPF Unit Test Structure**: |
| 210 | +   - **PKTGEN Section**: Generates test data packets to simulate network input |
| 211 | +   - **JUMP Section**: Configures the initial state, then tail-calls the BPF program under test |
| 212 | +   - **CHECK Section**: Verifies test results and performs assertion checks |
| 213 | +   - **Memory Data Exchange**: Transfers data between BPF programs and Golang user space through eBPF maps |
| 214 | +
|
| 215 | +2. **Go Test Driver**: |
| 216 | +   - **unittest Struct**: Represents a single eBPF unit test, containing the test name, the eBPF object file name, and a user-space setup function |
| 217 | +   - **Program Loader**: Uses the `cilium/ebpf` library to load compiled eBPF object files |
| 218 | +   - **Test Executor**: Calls BPF programs, passing test data and context |
| 219 | +   - **Result Parser**: Reads and parses test results from eBPF maps |
| 220 | +
|
| 221 | +3. **Result Formatting**: |
| 222 | + - Use the `SuiteResult` Protocol Buffer message to define the test result structure |
| 223 | + - Support test logs and test status (pass, fail, skip, error) |
| 224 | +
|
| 225 | +##### Test Execution Flow |
| 226 | +
|
| 227 | +1. **Test Loading**: Load eBPF Collection for each unittest object |
| 228 | +2. **Test Preparation**: Run the `setupInUserSpace` logic of the unittest object to initialize the test environment |
| 229 | +3. **Program Classification**: Categorize programs into `jump`, `check`, and `pktgen` types based on section names |
| 230 | +4. **Test Execution**: |
| 231 | + - If `setupInUserSpace` exists in the user-space Go unittest object, run it first to set up the test environment |
| 232 | + - If `pktgen` exists, run it to generate test data |
| 233 | + - Then run `jump` to execute the BPF program being tested |
| 234 | + - Finally run the `check` program to verify results |
| 235 | +5. **Result Collection**: Read test results from `suite_result_map` |
| 236 | +6. **Result Reporting**: Parse results and generate reports using the Go testing framework |
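
The classification and ordering steps above can be sketched in plain C as follows. The section naming convention (`<progtype>/test/<name>/<kind>`) and the helper names are assumptions made for illustration; the actual driver is written in Go and may match section names differently.

```c
#include <string.h>

enum prog_kind { PROG_UNKNOWN, PROG_PKTGEN, PROG_JUMP, PROG_CHECK };

/* Return 1 if s ends with suffix, 0 otherwise. */
static int has_suffix(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && strcmp(s + n - m, suffix) == 0;
}

/* Assumed naming convention: "<progtype>/test/<name>/<kind>". */
static enum prog_kind classify_section(const char *section)
{
	if (has_suffix(section, "/pktgen"))
		return PROG_PKTGEN;
	if (has_suffix(section, "/jump"))
		return PROG_JUMP;
	if (has_suffix(section, "/check"))
		return PROG_CHECK;
	return PROG_UNKNOWN;
}

/* Fill order[] with the program kinds to run, in sequence; returns the count.
 * pktgen is optional, while jump and check always run, in that order. */
static int plan_run(int have_pktgen, enum prog_kind order[3])
{
	int n = 0;

	if (have_pktgen)
		order[n++] = PROG_PKTGEN;
	order[n++] = PROG_JUMP;
	order[n++] = PROG_CHECK;
	return n;
}
```

Keeping classification separate from ordering means a test without a `pktgen` program needs no special casing beyond the single flag.
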