[Bug] FE CPU 100% #50838

Open
2 of 3 tasks
wawa8900 opened this issue May 13, 2025 · 10 comments
Comments

@wawa8900

Search before asking

  • I had searched in the issues and found no similar issues.

Version

2.0.12

What's Wrong?

With almost no requests coming in, the FE's CPU usage exceeds 100%.
top:
Image

top -H:

Image

Image

What You Expected?

Normal CPU usage.

How to Reproduce?

No response

Anything Else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@XLPE
Contributor

XLPE commented May 13, 2025

It is possible that System.gc() was explicitly called due to an insufficient -Xmx setting or an improper allocation ratio between the young and old generations.
It is recommended to use the G1 garbage collector configuration from a newer version of Doris and increase the -Xmx setting based on the machine's memory.

@yiguolei
Contributor

Sometimes the cause is full GC. You can use the jstat command to check whether the FE is doing full GC.
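
As a rough illustration of that check, the sketch below parses text in the layout printed by `jstat -gcutil <pid> <interval_ms> <count>` and reports how much the full-GC counter grew during sampling. In that output, `O` is old-generation occupancy, `M` is metaspace occupancy, and `FGC` is the cumulative number of full GC events. The sample rows are made up for illustration (only the 98.17 in the `M` column echoes a value quoted in this thread), and the helper names are hypothetical.

```python
def parse_gcutil(text):
    """Parse `jstat -gcutil` style output into a list of dicts keyed by column name."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    samples = []
    for row in rows:
        sample = {}
        for name, value in zip(header, row):
            try:
                sample[name] = float(value)
            except ValueError:
                sample[name] = None  # jstat prints "-" for columns that do not apply
        samples.append(sample)
    return samples


def full_gc_delta(samples):
    """Number of full GC events between the first and the last sample."""
    return int(samples[-1]["FGC"] - samples[0]["FGC"])


# Illustrative data only: NOT taken from the reporter's machine.
SAMPLE = """
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00  96.88  12.34  99.10  98.17  95.00   1234   12.345    56   78.901   91.246
  0.00  96.88  50.00  99.20  98.17  95.00   1234   12.345    58   81.100   93.445
  0.00   0.00  80.00  99.30  98.17  95.00   1234   12.345    60   83.500   95.845
"""

samples = parse_gcutil(SAMPLE)
print("full GCs during sampling:", full_gc_delta(samples))
print("old gen occupancy (%):", [s["O"] for s in samples])
```

An `FGC` value that keeps climbing while `O` stays near 100% would be consistent with the full-GC explanation; a flat `FGC` would point elsewhere.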

@wawa8900
Author

The FE log keeps printing these entries. What do they mean?
2025-05-13 16:36:30,904 INFO (stateListener|80) [DatabaseTransactionMgr.replayUpsertTransactionState():2035] replay a committed transaction TransactionState. transaction id: 11208376, label: insert_b46ae313eff43d4_8b51c42ef4f9e791, db id: 10369, table id list: 35751446, callback id: -1, coordinator: FE: 172.177.2.10, transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1743057801345, commit time: 1743057801353, finish time: -1, reason:
2025-05-13 16:36:30,904 INFO (stateListener|80) [DatabaseTransactionMgr.replayUpsertTransactionState():2038] replay a visible transaction TransactionState. transaction id: 11208376, label: insert_b46ae313eff43d4_8b51c42ef4f9e791, db id: 10369, table id list: 35751446, callback id: -1, coordinator: FE: 172.177.2.10, transaction status: VISIBLE, error replicas num: 0, replica ids: , prepare time: 1743057801345, commit time: 1743057801353, finish time: 1743057801372, reason:
2025-05-13 16:36:30,904 INFO (stateListener|80) [LoadManager.replayCreateLoadJob():186] LOAD_JOB=35751450, msg={replay create load job}
2025-05-13 16:36:30,904 INFO (stateListener|80) [DatabaseTransactionMgr.replayUpsertTransactionState():2035] replay a committed transaction TransactionState. transaction id: 11208377, label: insert_cd10a0fb12634de1_9cc1776464edaa8e, db id: 10369, table id list: 1913141, callback id: -1, coordinator: FE: 172.177.2.10, transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1743057801377, commit time: 1743057801383, finish time: -1, reason:
2025-05-13 16:36:30,904 INFO (stateListener|80) [DatabaseTransactionMgr.replayUpsertTransactionState():2038] replay a visible transaction TransactionState. transaction id: 11208377, label: insert_cd10a0fb12634de1_9cc1776464edaa8e, db id: 10369, table id list: 1913141, callback id: -1, coordinator: FE: 172.177.2.10, transaction status: VISIBLE, error replicas num: 0, replica ids: , prepare time: 1743057801377, commit time: 1743057801383, finish time: 1743057801395, reason:
2025-05-13 16:36:30,904 INFO (stateListener|80) [LoadManager.replayCreateLoadJob():186] LOAD_JOB=35751451, msg={replay create load job}
2025-05-13 16:36:30,904 INFO (stateListener|80) [DatabaseTransactionMgr.replayUpsertTransactionState():2035] replay a committed transaction TransactionState. transaction id: 11208378, label: insert_c816ed03b3a043d8_a4591498edfc04d8, db id: 10369, table id list: 1912980, callback id: -1, coordinator: FE: 172.177.2.10, transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1743057801401, commit time: 1743057801406, finish time: -1, reason:
2025-05-13 16:36:30,904 INFO (stateListener|80) [DatabaseTransactionMgr.replayUpsertTransactionState():2038] replay a visible transaction TransactionState. transaction id: 11208378, label: insert_c816ed03b3a043d8_a4591498edfc04d8, db id: 10369, table id list: 1912980, callback id: -1, coordinator: FE: 172.177.2.10, transaction status: VISIBLE, error replicas num: 0, replica ids: , prepare time: 1743057801401, commit time: 1743057801406, finish time: 1743057801418, reason:
2025-05-13 16:36:30,904 INFO (stateListener|80) [LoadManager.replayCreateLoadJob():186] LOAD_JOB=35751452, msg={replay create load job}
2025-05-13 16:36:30,905 INFO (stateListener|80) [DatabaseTransactionMgr.replayUpsertTransactionState():2035] replay a committed transaction TransactionState. transaction id: 11208379, label: insert_21c0bbb1d30d480a_bbdad031908e57b2, db id: 10369, table id list: 11055514, callback id: -1, coordinator: FE: 172.177.2.10, transaction status: COMMITTED, error replicas num: 0, replica ids: , prepare time: 1743057801425, commit time: 1743057801444, finish time: -1, reason:
2025-05-13 16:36:30,905 INFO (stateListener|80) [DatabaseTransactionMgr.replayUpsertTransactionState():2038] replay a visible transaction TransactionState. transaction id: 11208379, label: insert_21c0bbb1d30d480a_bbdad031908e57b2, db id: 10369, table id list: 11055514, callback id: -1, coordinator: FE: 172.177.2.10, transaction status: VISIBLE, error replicas num: 0, replica ids: , prepare time: 1743057801425, commit time: 1743057801444, finish time: 1743057801461, reason:
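
The `prepare time`, `commit time`, and `finish time` fields in these lines look like epoch milliseconds. Decoding one shows how old a replayed transaction is compared with the moment it was logged. The sketch below is a minimal illustration: the regular expression and function name are hypothetical, the sample line is abbreviated from the log above, and the log timestamp is treated as UTC only to keep the arithmetic simple (the FE's real time zone is not stated in this thread).

```python
import re
from datetime import datetime, timezone

# Hypothetical pattern covering only the fields used below.
LINE_RE = re.compile(
    r"transaction id: (?P<txn>\d+),.*?"
    r"transaction status: (?P<status>\w+),.*?"
    r"prepare time: (?P<prepare>\d+)"
)


def parse_replay_line(line):
    """Extract transaction id, status, prepare time, and age from one replay log line."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    # The first 19 characters are the log timestamp, e.g. "2025-05-13 16:36:30".
    # Assumption: treated as UTC because the FE time zone is unknown.
    log_time = datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    prepare_ms = int(m.group("prepare"))
    prepare_utc = datetime.fromtimestamp(prepare_ms // 1000, tz=timezone.utc)
    return {
        "txn_id": int(m.group("txn")),
        "status": m.group("status"),
        "prepare_utc": prepare_utc,
        "age": log_time - prepare_utc,
    }


# Abbreviated copy of a line from the log above.
LINE = (
    "2025-05-13 16:36:30,904 INFO (stateListener|80) "
    "[DatabaseTransactionMgr.replayUpsertTransactionState():2038] "
    "replay a visible transaction TransactionState. transaction id: 11208376, "
    "label: insert_b46ae313eff43d4_8b51c42ef4f9e791, db id: 10369, "
    "transaction status: VISIBLE, error replicas num: 0, replica ids: , "
    "prepare time: 1743057801345, commit time: 1743057801353, "
    "finish time: 1743057801372, reason:"
)

info = parse_replay_line(LINE)
print(info["txn_id"], info["status"], info["prepare_utc"].date(), "age in days:", info["age"].days)
```

For this line the prepare time decodes to late March 2025, several weeks before the May 13 log timestamp, so the FE appears to be replaying old journal entries rather than handling new loads.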

@wawa8900
Author

The log entries above keep being printed continuously.

@wawa8900
Author

Also, after stopping the FE and starting it again, startup takes a long time to complete (http_port and query_port only start listening after a long while).

@wawa8900
Author

In reply to @yiguolei's suggestion to check for full GC with jstat:

Output of jstack:
Image

@XLPE
Contributor

XLPE commented May 13, 2025

In reply to @wawa8900's jstack output:

The M column is at 98.17%, which suggests increasing -Xmx. I recommend setting -Xmx to more than 16 GB.

@wawa8900
Author

wawa8900 commented May 13, 2025

In reply to @XLPE's recommendation to set -Xmx to more than 16 GB:

But the data volume is tiny: the BE storage directory totals only 1.6 GB. Does so little data really need 16 GB of memory?

I changed -Xmx to 16g and the behavior is the same.

While the FE is starting, it keeps printing the log below without pause, and startup is very slow: from running the start command until the HTTP port is listening takes at least half an hour.

Image

@wawa8900
Author

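The doris_meta directory under the FE is 32 GB. The BE data is only 1.6 GB, yet the FE meta is 32 GB, and there are not many tables (about 200 in total). Is this normal?
How can I see what this meta data contains?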

Image
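
One coarse way to see where the space goes is to total the size of each entry directly under the meta directory. The sketch below does only that and does not decode any Doris metadata. The path is a placeholder and the function name is hypothetical. As far as I understand the layout, an FE meta directory usually holds a `bdb` directory (the journal) and an `image` directory (checkpoints), but that is an assumption to verify on the actual machine.

```python
import os


def subdir_sizes(root):
    """Return {entry_name: total_bytes} for each immediate child of root."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    path = os.path.join(dirpath, name)
                    try:
                        if not os.path.islink(path):
                            total += os.path.getsize(path)
                    except OSError:
                        pass  # file vanished while scanning (a running FE may clean up files)
            sizes[entry.name] = total
        elif entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


if __name__ == "__main__":
    meta_dir = "/path/to/fe/doris_meta"  # placeholder path, adjust to the real meta_dir
    if os.path.isdir(meta_dir):
        for name, size in sorted(subdir_sizes(meta_dir).items(), key=lambda kv: -kv[1]):
            print(f"{size / 2**30:8.2f} GiB  {name}")
```

If nearly all of the 32 GB turns out to sit in the journal directory, that would fit a journal that keeps growing because no new image is being written.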

@XLPE
Contributor

XLPE commented May 13, 2025

These are normal logs. Check whether the CPU usage is still consistently high.
It may be that JVM memory is insufficient, so the FE fails to generate an image and the historical journal keeps growing.

3 participants