We are delighted to announce the release of Kmesh v1.1.0, a milestone achieved through the collective efforts of our global community over the past three months. Special recognition goes to the contributors from the LFX Project, whose dedication has been pivotal in driving this release forward.
Building on the foundation of v1.0.0, this release introduces significant enhancements to Kmesh’s architecture, observability, and ecosystem integration. The official Kmesh website has undergone a comprehensive redesign, offering an intuitive interface and streamlined documentation to empower both users and developers. Under the hood, we’ve refactored the DNS module and added metrics for long connections, providing deeper insights into more traffic patterns.
In Kernel-Native mode, we’ve reduced invasive kernel modifications. We have also replaced the BPF config map with global variables to simplify the underlying implementation. Compatibility with Istio 1.25 has been validated, ensuring interoperability with the latest Istio version. Notably, the long-standing flakiness of the TestKmeshRestart E2E test case has been resolved through extended investigation and a rework of the underlying BPF program, a clear step forward in runtime reliability.
Main Features
Website overhaul
The Kmesh official website has undergone a complete redesign, offering an intuitive user experience with improved documentation, reorganized content hierarchy and streamlined navigation. In addressing feedback from the previous iteration, we focused on key areas where user experience could be enhanced. The original interface presented some usability challenges that occasionally led to navigation difficulties. Our blog module in particular required attention, as its content organization and visual hierarchy impacted content discoverability and readability. From an engineering perspective, we recognized opportunities to improve the code structure through better component organization and more systematic styling approaches, as the existing implementation had grown complex to maintain over time.
To address these problems, we shifted to React with Docusaurus, a modern documentation framework that's much more developer-friendly. This allowed us to create modular components, eliminating redundant code through reusability. Docusaurus provides built-in navigation systems specifically designed for documentation and blogs, plus version-controlled documentation features. We've implemented multilingual support with both English and Chinese documentation, added advanced search functionality, and completely reorganized the content structure. The result is a dramatically improved experience that makes the Kmesh site more accessible and valuable for all users.
Long connection metrics
Before this release, Kmesh provided access logs at the establishment and termination of a TCP connection, with detailed information about the connection such as bytes sent and received, packets lost, RTT and retransmits. Kmesh also provided workload-specific and service-specific metrics such as bytes sent and received, lost packets, minimum RTT, and total connections opened and closed by a pod. These metrics were only updated after a connection was closed.
In this release, we implement access logs and metrics for TCP long connections, adding a continuous monitoring and reporting mechanism that captures detailed, real-time data throughout the lifetime of long-lived TCP connections. Access logs are reported periodically with information such as reporting time, connection establishment time, bytes sent and received, packet loss, RTT, retransmits and connection state. Metrics such as bytes sent and received, packet loss and retransmits are also reported periodically for long connections.
DNS refactor
Previously, the DNS process included the CDS refresh process. As a result, DNS was deeply coupled with kernel-native mode and could not be used in dual-engine mode.
In release 1.1 we refactored the DNS module of Kmesh. The items processed in the DNS refresh queue are now domains rather than structures containing CDS data, so the DNS module no longer needs to know which Kmesh mode is in use; it only resolves the hostnames it is given.
BPF config map optimization
Kmesh has eliminated the dedicated kmesh_config_map BPF map, which previously stored global runtime configurations such as BPF logging level and monitoring toggle. These settings are now managed through global variables. Leveraging global variables simplifies BPF configuration management, enhancing runtime efficiency and maintainability.
Optimise Kernel Native mode to reduce intrusive modifications to the kernel
The kernel-native mode requires a large number of intrusive kernel modifications to implement HTTP-based traffic control. Some of these modifications may have a significant impact on the kernel, which makes the kernel-native mode difficult to deploy and use in a real production environment.
To resolve this problem, we reworked the kernel changes used by kernel-native mode together with the related kernel module (ko) and eBPF programs. With this release, kernel 5.10 requires only four kernel modifications and kernel 6.6 requires only one. We intend to eliminate this last modification if possible, with the goal of eventually running kernel-native mode on unmodified kernels from version 6.6 onward.
Adopt Istio 1.25
Kmesh has verified compatibility with Istio 1.25 and added the corresponding E2E test to CI. Because the Kmesh community verifies three Istio versions in CI, the E2E test for Istio 1.22 has been removed from CI.
Critical Bug Fix
- kmeshctl install waypoint error (#1287)
- TestKmeshRestart flaky (#1192)
- TestServiceEntrySelectsWorkloadEntry flaky (#1352)
What's Changed
- improve xdp bpf log by @weli-l in #1158
- Ability to automatically push helm packages at publicize release by @LiZhenCheng9527 in #1174
- Can specify the out name for kmeshctl by @LiZhenCheng9527 in #1176
- fix DATA RACE in TestCertRoute by @lec-bit in #1168
- add scripts to change kmesh version automatically by @LiZhenCheng9527 in #1183
- adapt MAP_SIZE_OF_LISTENER into 8192 by @lec-bit in #1187
- fix Update mode failed by @lec-bit in #1188
- Bump google.golang.org/grpc from 1.69.0 to 1.69.4 by @dependabot in #1179
- Bump golang.org/x/net to address CVE-2024-45338 by @hzxuzhonghu in #1193
- Bump google.golang.org/protobuf from 1.36.1 to 1.36.3 by @dependabot in #1191
- fix kernel_enhanced lack of pkg general by @lec-bit in #1199
- bump version to 1.1-dev by @hzxuzhonghu in #1197
- Bump github.com/cilium/ebpf from 0.16.0 to 0.17.1 by @hzxuzhonghu in #1205
- Update meeting in README.md by @hzxuzhonghu in #1196
- add workload metrics by @LiZhenCheng9527 in #1105
- Bump the k8s-io group with 5 updates by @dependabot in #1201
- Fix typos by @hzxuzhonghu in #1219
- Improve Authz UX: Immediate Feedback & Status Subcommand by @ravjot07 in #1217
- chore: solved contruct_tuple typo #1221 by @yp969803 in #1223
- Modify the kmeshctl documentation according to make gen by @LiZhenCheng9527 in #1241
- add LiZhenCheng9527 to OWNERS by @LiZhenCheng9527 in #1244
- Using global variable to control bpf log level by @hzxuzhonghu in #1206
- adapt BPF_LOG in route_config.h by @lec-bit in #1220
- fix bookinfo issue 553 by @weli-l in #1245
- add sample yamls for useguide by @weli-l in #1248
- Bump github.com/cilium/ebpf from 0.17.1 to 0.17.3 by @dependabot in #1237
- Bump the k8s-io group with 5 updates by @dependabot in #1250
- new kernel adapt by @lec-bit in #1198
- Bump github.com/safchain/ethtool from 0.5.9 to 0.5.10 by @dependabot in #1252
- Bump github.com/go-jose/go-jose/v3 from 3.0.3 to 3.0.4 in the go_modules group by @dependabot in #1255
- Bump github.com/prometheus/client_golang from 1.20.5 to 1.21.0 by @dependabot in #1254
- feat: dump authorizationPolicy by @yp969803 in #1222
- Bump google.golang.org/grpc from 1.69.4 to 1.70.0 by @dependabot in #1258
- Bump istio.io/api from 1.24.2 to 1.24.3 by @dependabot in #1259
- optimizie xdp auth by @weli-l in #1256
- adapt doc by @lec-bit in #1268
- enable auth offload by default by @weli-l in #1274
- adapt bpf2go files in new kernel by @lec-bit in #1273
- pretty print bpf dump by @Kuromesi in #1279
- Proposal for tcp_long_connection_metrics by @yp969803 in #1224
- Make use of global variables instead of bpf config map by @hzxuzhonghu in #1263
- add channel for Rbac when Kmesh restart by @weli-l in #1275
- fix getKmeshWaypointImage tag by @silenceper in #1288
- fix possible e2e panic and update maxportnum warning by @Kuromesi in #1283
- Remove spammy log by @hzxuzhonghu in #1285
- Fix bpf log level by @hzxuzhonghu in #1291
- remove deprecated flag `--enable-bpf-log` in e2e test by @YaoZengzeng in #1290
- Refactor dns by @LiZhenCheng9527 in #1247
- eBPF unit test: add proposal by @sancppp in #1267
- update ubuntu 2404 by @hzxuzhonghu in #1302
- add e2e tests for isito 1.25 and remove e2e tests for isito 1.22 by @YaoZengzeng in #1297
- store original dst info in map_of_sock_storage and remove map_of_orig_dst by @YaoZengzeng in #1303
- Tcp long conn metrics and accesslogs (new) by @yp969803 in #1298
- Add ebpf ut framework & Add xdp ut by @sancppp in #1299
- fix build error:fatal error: 'encoder.h' file not found by @LiZhenCheng9527 in #1306
- fix: Updating connOpened metric for the conn report at intervals #1307 by @yp969803 in #1308
- Skip test on 5.15 temporarily because ubuntu20.04 runner is removed by @hzxuzhonghu in #1312
- fix kmesh daemon commands description by @LiZhenCheng9527 in #1314
- rfac: added gomft in makefile by @yp969803 in #1324
- Fix dead link by @hzxuzhonghu in #1320
- Fix: cgroup_skb pogs should respect enable_monitoring flag by @hzxuzhonghu in #1313
- feat: added new prometheus metrics for long conn by @yp969803 in #1305
- log for failed metrics enable/disable kmeshctl request by @yp969803 in #1327
- fix enhanced issue by @lec-bit in #1332
- fix helper func num by @lec-bit in #1336
- rfac: added config.h include in tcp_probe.h and cgroup_skb.h by @yp969803 in #1335
- rfac: initialized enable_monitoring to 1 by @yp969803 in #1337
- Fix kmesh manage, especially after kmesh restart, old xdp prog maybe … by @hzxuzhonghu in #1347
- fix workload processer panic if newWorkload addresses is nil when add serviceEntry by @LiZhenCheng9527 in #1345
- e2e test for l4 authorization by @YaoZengzeng in #1349
- Revert "e2e test for l4 authorization" by @hzxuzhonghu in #1353
- fix: values is not updated in tcp_conn map #1341 by @yp969803 in #1342
- Fix cgroup_skb/* get sk_storage failed by @hzxuzhonghu in #1350
- bash format added in make format by @yp969803 in #1329
- Update quick-start link by @sancppp in #1359
- added periodic report knob for long connection by @yp969803 in #1360
- eBPF unit test: add workload sockops ut by @sancppp in #1361
- prevent deletion of addresses that are already occupied by other objects by @YaoZengzeng in #1358
- remove BpfLoader.KmeshConfig to prevent writing using it, should alwa… by @hzxuzhonghu in #1351
- eBPF unit test: add general tc ut by @sancppp in #1362
- fix log cannot print in kernel 5.10 by bpf_trace_printk by @lec-bit in #1363
- fix: enableConnMetric knob at monitoring handler by @yp969803 in #1372
- Skip output accesslog on connection establishment by @hzxuzhonghu in #1368
- rfac: outputAccesslog func to skip accesslog at conn establishment by @yp969803 in #1376
- Fix: unexpected log when pod is shutting down by @hzxuzhonghu in #1383
- fix kmeshctl version report error by @LiZhenCheng9527 in #1371
- e2e for l4 auth by @YaoZengzeng in #1365
- Fix regression by socket tuple when c/s deployed in same node by @hzxuzhonghu in #1378
- use bytes_acked instead of delivered to get the sent bytes, it is acc… by @hzxuzhonghu in #1391
- build(deps): bump github.com/miekg/dns from 1.1.62 to 1.1.66 by @dependabot in #1380
- Fix authz during shutdown by @hzxuzhonghu in #1395
New Contributors
- @ravjot07 made their first contribution in #1217
- @yp969803 made their first contribution in #1223
- @silenceper made their first contribution in #1288
- @sancppp made their first contribution in #1267
Full Changelog: v1.0.0...v1.1.0