
Hi, when will the RWKV6 code be made public? #15


Open
15552549353 opened this issue Dec 18, 2024 · 7 comments

@15552549353

Hi, when will the RWKV6 code be made public?

@Yaziwel
Owner

Yaziwel commented Dec 24, 2024

The code is attached below. I suggest using a smaller T_Max inside the model, i.e., train with a small patch size, and only set T_Max to 512*512 at test time. Training will be faster that way.
Restore_RWKV6.zip
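
For reference, a minimal sketch of what this T_Max advice usually amounts to in RWKV-style code (the file paths, the `-DTmax` macro, and the token counts below are assumptions, not the contents of Restore_RWKV6.zip): T_Max is compiled into the WKV6 CUDA kernel, so a small T_Max during training keeps the kernel cheap, and the kernel is only rebuilt with the large T_Max for 512*512 testing.

```python
# Rough sketch of how T_Max is typically baked into the WKV CUDA kernel in
# RWKV-style code. Paths and the -DTmax macro are assumptions and may differ
# from the code in Restore_RWKV6.zip.
from torch.utils.cpp_extension import load

def build_wkv6_kernel(t_max: int):
    # JIT-compile the WKV6 op for a fixed maximum token count. A smaller
    # T_Max gives a cheaper kernel, which is why training on small patches
    # is faster.
    return load(
        name=f"wkv6_{t_max}",
        sources=["cuda/wkv6_op.cpp", "cuda/wkv6_cuda.cu"],
        verbose=True,
        extra_cuda_cflags=["-O3", "--use_fast_math", f"-DTmax={t_max}"],
    )

wkv6_train = build_wkv6_kernel(t_max=128 * 128)    # small training patches
# wkv6_test = build_wkv6_kernel(t_max=512 * 512)   # rebuild only for 512x512 testing
```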

@15552549353
Author

15552549353 commented Dec 26, 2024 via email

@Yaziwel
Owner

Yaziwel commented Dec 26, 2024

> Hi, for this RWKV6 model the FLOPs are 160.101827 G and the params are 29.617249 M. Isn't that amount of computation a bit too high? It doesn't really reflect RWKV's linear complexity.


RWKV6 does have higher computational cost than v4; I wrote this version a while back and haven't improved it since. For the 160 GFLOPs, was the input resolution 128×128? You can try tuning the model parameters, but the CUDA kernel is fixed.
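
For anyone who wants to reproduce these numbers, a small measurement sketch using thop (the 1-channel 128×128 input and the stand-in module are assumptions; swap in the actual model from Restore_RWKV6.zip):

```python
import torch
from thop import profile  # pip install thop

# Stand-in module so the snippet runs on its own; replace it with the RWKV6
# restoration model from Restore_RWKV6.zip (its class name is not given here).
model = torch.nn.Conv2d(1, 1, kernel_size=3, padding=1)

x = torch.randn(1, 1, 128, 128)  # assumed 1-channel 128x128 input
# Note: thop counts multiply-accumulates, which papers usually quote as FLOPs.
flops, params = profile(model, inputs=(x,))
print(f"FLOPs: {flops / 1e9:.3f} G | Params: {params / 1e6:.3f} M")
```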

@15552549353
Author

15552549353 commented Dec 26, 2024 via email

@15552549353
Author

15552549353 commented Dec 27, 2024 via email

@Newones11

Hi, when I run the code the build gets to 3/3 and then dies with "Aborted (core dumped)". What is the problem and how can I fix it? (One possible cause is sketched after the quote below.)

> The code is attached below. I suggest using a smaller T_Max inside the model, i.e., train with a small patch size, and only set T_Max to 512*512 at test time. Training will be faster that way. Restore_RWKV6.zip
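
Not a confirmed diagnosis, but a frequent cause of "Aborted (core dumped)" with RWKV CUDA ops is running a sequence longer than the T_Max the kernel was compiled with, so the in-kernel assert kills the process. A small guard like the sketch below (the T_MAX value is just an example) turns that into a readable Python error instead:

```python
T_MAX = 128 * 128  # whatever value the WKV kernel was actually compiled with

def check_seq_len(tokens: int, t_max: int = T_MAX) -> None:
    # Fail early in Python rather than letting the CUDA kernel assert and abort.
    if tokens > t_max:
        raise ValueError(
            f"sequence length {tokens} exceeds the compiled T_Max={t_max}; "
            "rebuild the kernel with a larger T_Max or use smaller patches"
        )

check_seq_len(tokens=128 * 128)  # e.g. a 128x128 input tokenized per pixel
```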

@Tiehr2000

Hello! I tried adding the RWKV6 block to a U-Net. Since the U-Net channel count doubles at every level, a fixed head_size becomes a problem. How should I handle this?
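
One common way to handle this (a hedged sketch, not code from this repo): keep head_size fixed and scale the number of heads with the channel count at each U-Net level, so channels == n_head * head_size still holds everywhere; the per-level channel counts just have to stay multiples of head_size.

```python
HEAD_SIZE = 64  # example value; use whatever head size the RWKV6 block expects

def heads_for(channels: int, head_size: int = HEAD_SIZE) -> int:
    # Each U-Net level keeps the same head_size but uses more heads as the
    # channel count doubles.
    assert channels % head_size == 0, (
        f"channels {channels} must be a multiple of head_size {head_size}"
    )
    return channels // head_size

for c in (64, 128, 256, 512):  # channels doubling per encoder level
    print(f"{c} channels -> {heads_for(c)} heads of size {HEAD_SIZE}")
```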
