-
Notifications
You must be signed in to change notification settings - Fork 702
feat(ai-proxy): add modelMapping regexp support #2358
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Xijun Dai <[email protected]>
feat(ai-proxy): 新增基于正则表达式的模型映射功能变更文件
时序图sequenceDiagram
participant Client as HTTP客户端
participant Envoy as Envoy代理
participant Provider as 模型服务
Client->>Envoy: 发送/v1/chat/completions请求
Envoy->>Envoy: wasm插件处理模型名称
Envoy->>Envoy: 正则匹配模型映射规则(gpt.* → openai/gpt$1)
Envoy->>Provider: 转发请求到openrouter.ai服务
Provider-->>Envoy: 返回处理结果
Envoy-->>Client: 响应结果到客户端
💡 小贴士与 lingma-agents 交流的方式📜 直接回复评论
📜 在代码行处标记
📜 在讨论中提问
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔍 代码评审报告
📋 评审意见详情
💡 单文件建议
✅ 未发现需要特别关注的代码问题。
🚀 跨文件建议
以下是对代码架构和设计的综合分析,聚焦于跨文件交互、系统一致性和潜在优化空间。
🔍 1. 正则表达式在循环中重复编译导致性能问题
在循环中每次调用regexp.MustCompile(k)会重复编译正则表达式,导致不必要的CPU和内存开销。建议将所有模式预编译为regexp.Regexp对象并缓存,避免重复编译。
📌 关键代码:
if strings.Contains(k, "(") && strings.Contains(k, ")") {
re := regexp.MustCompile(k)
...
🔍 2. 匹配策略未模块化影响可扩展性
当前将前缀匹配和正则表达式匹配逻辑混合在同一个循环中,违反开闭原则。建议将匹配策略抽象为独立接口(如Strategy模式),便于未来添加新匹配方式(如通配符组合)而无需修改核心代码。
📌 关键代码:
for k, v := range modelMapping {
// 混合前缀匹配和正则表达式逻辑
...
🔍 3. 正则表达式可能引发ReDoS安全风险
允许任意正则表达式可能导致指数级时间复杂度匹配(如包含大量嵌套分组的模式)。建议对用户提供的正则表达式进行复杂度验证(如限制捕获组数量)或采用安全正则引擎。
🔍 4. 缺乏正则表达式模式合法性校验
当前未对正则表达式语法有效性进行校验,可能导致运行时panic(如语法错误的正则表达式)。建议在配置加载时预校验所有模式的合法性。
📌 关键代码:
re := regexp.MustCompile(k) // 语法错误的k会直接panic
💡 小贴士
与 lingma-agents 交流的方式
📜 直接回复评论
直接回复本条评论,lingma-agents 将自动处理您的请求。例如:
-
在当前代码中添加详细的注释说明。
-
请详细介绍一下你说的 LRU 改造方案,并使用伪代码加以说明。
📜 在代码行处标记
在文件的特定位置创建评论并 @lingma-agents。例如:
-
@Lingma-Agent 分析这个方法的性能瓶颈并提供优化建议。
-
@Lingma-Agent 对这个方法生成优化代码。
📜 在讨论中提问
在任何讨论中 @lingma-agents 来获取帮助。例如:
-
@Lingma-Agent 请总结上述讨论并提出解决方案。
-
@Lingma-Agent 请根据讨论内容生成优化代码。
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #2358 +/- ##
===========================================
+ Coverage 35.91% 46.07% +10.16%
===========================================
Files 69 81 +12
Lines 11576 13010 +1434
===========================================
+ Hits 4157 5995 +1838
+ Misses 7104 6670 -434
- Partials 315 345 +30 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
对应的文档也需要更新
} | ||
} | ||
|
||
if strings.Contains(k, "(") && strings.Contains(k, ")") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个的意思是说包含 capture group 的就是正则表达式吗?感觉是不是应该做的更通用一点,比如 ~
开头的就代表是正则表达式(正常模型名字不会以它开头吧)。这样替换的时候也不要求必须是走 capture group 了。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里之所以要走 capture group, 是需要获取到匹配的内容,并用于之后替换到目标模型名称上
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我是说“是不是必须要走 capture group”,而不是“是不是可以走 capture group”,也就是“走 capture group” 是“使用正则映射”的必要还是充分条件。目前这种基于括号的正则表达式判断方式要求的是前者,而你所描述的需求目前我只能看出对后者的要求,看不出前者。因此,建议用更合适的方式来进行这个判断。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我能想到的是必须要用 capture group, 或者有更好的方式实现也可以的
比如我这边的场景是希望能做到如下,以 openai 模型为例:
第一种: gpt-xxx 系列能映射到 openai/gpt-xxx
第二种: 反之 openai/gpt-xxx 也能映射到 gpt-xxx
如果是之前提到的以 ~ 开头代表正则式,第二种好像不太好配置
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
~
开头代表正则表达式只是影响 if strings.Contains(k, "(") && strings.Contains(k, ")") {
这句判定,以及实际做匹配的时候要移除开头的 ~
,其他的和你现在的是一样的。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
而且,有括号不代表一定是正则,建议这里还是用更明确的语义来进行配置。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
那么就是以 ~ 开头代表使用正则匹配,而不是判断是否带了 "()",但 capture group 的能力我觉得还是需要保留
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
…matching in modelMapping Signed-off-by: Xijun Dai <[email protected]>
Signed-off-by: Xijun Dai <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Ⅰ. Describe what this PR did
openrouter.ai 中模型调用是需要指定 Provider 前缀的,如 openai/gpt-4o-mini
我们有场景需要在 openai 不可用时,fallback 到 openrouter.ai,这时需要保持调用模型名不变,但需要加上 openai/前缀
所以希望是通过正则来做替换,而不是单纯的前缀匹配
比如 gpt(.*) 需要替换为 openai/gptXXX
Ⅱ. Does this pull request fix one issue?
Ⅲ. Why don't you add test cases (unit test/integration test)?
Ⅳ. Describe how to verify it
docker-compose.yaml
envoy.yaml