One cause of a high load average reported by top in Linux driver development

1. Explaining load average

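top describes the value as the "system load avg over the last 1, 5 and 15 minutes".

(See top(1) - Linux manual page (man7.org))

A fully loaded system does not show 1 but the number of logical CPUs: on a PC with 2 CPUs, full load corresponds to 2. The value is not capped there; anything above the CPU count means tasks are queuing for a CPU or blocked in uninterruptible sleep.

(See uptime(1) - Linux manual page (man7.org))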

       System load averages is the average number of processes that are
       either in a runnable or uninterruptable state.  A process in a
       runnable state is either using the CPU or waiting to use the CPU.
       A process in uninterruptable state is waiting for some I/O
       access, eg waiting for disk.  The averages are taken over the
       three time intervals.  Load averages are not normalized for the
       number of CPUs in a system, so a load average of 1 means a single
       CPU system is loaded all the time while on a 4 CPU system it
       means it was idle 75% of the time.
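Because the figure is not normalized, it can help to divide it by the CPU count before judging it. The following is a minimal user-space sketch, assuming a Linux system where /proc/loadavg is readable; the function names load_per_cpu and print_normalized_load are made up for this illustration.

```c
#include <stdio.h>
#include <unistd.h>   /* sysconf() */

/* Load average divided by the CPU count.
 * Returns -1.0 if ncpu is not positive. */
double load_per_cpu(double load, long ncpu)
{
    if (ncpu <= 0)
        return -1.0;
    return load / (double)ncpu;
}

/* Read the 1/5/15-minute averages from /proc/loadavg and print them
 * raw and per online CPU. Returns 0 on success, -1 on failure. */
int print_normalized_load(void)
{
    double avg[3];
    long ncpu = sysconf(_SC_NPROCESSORS_ONLN);
    FILE *f = fopen("/proc/loadavg", "r");

    if (f == NULL)
        return -1;
    if (ncpu <= 0 || fscanf(f, "%lf %lf %lf", &avg[0], &avg[1], &avg[2]) != 3) {
        fclose(f);
        return -1;
    }
    fclose(f);

    for (int i = 0; i < 3; i++)
        printf("load %.2f -> %.2f per CPU (%ld CPUs)\n",
               avg[i], load_per_cpu(avg[i], ncpu), ncpu);
    return 0;
}
```

Calling print_normalized_load() from main() prints one line per interval. A load of 1 on 4 CPUs gives 0.25 per CPU, which matches the "idle 75% of the time" case in the man page text.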

2. Analysis

The metric counts processes in two states: runnable (running or waiting for a CPU) and uninterruptible sleep.
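On a live system the same two states can be seen from user space: /proc/<pid>/stat reports them as R and D, whereas crash labels them RU and UN. The sketch below shows one way to pull the state letter out of a stat line; the helper names task_state_from_stat and counts_toward_load are assumptions made for this example.

```c
#include <string.h>

/* Extract the state letter from the contents of /proc/<pid>/stat.
 * The format is "pid (comm) state ...". comm may itself contain ')',
 * so the search starts from the last ')'.
 * Returns '\0' if the input does not match that shape. */
char task_state_from_stat(const char *stat_line)
{
    const char *p = strrchr(stat_line, ')');

    if (p == NULL || p[1] != ' ' || p[2] == '\0')
        return '\0';
    return p[2];
}

/* Nonzero if a task in this state is counted in the load average:
 * 'R' (running or runnable) or 'D' (uninterruptible sleep). */
int counts_toward_load(char state)
{
    return state == 'R' || state == 'D';
}
```

Feeding each /proc/<pid>/stat (or /proc/<pid>/task/<tid>/stat) file through these two helpers gives a quick list of the tasks currently contributing to the load average, without needing debuginfo packages.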

2.1 crash

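To inspect the state of kernel threads, use the command

crash

(It has to be installed first, together with some dependency packages.)

yum install crash

Running crash also requires the kernel symbol tables, that is, the debuginfo packages matching the running kernel, for example: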

kernel-debuginfo-3.10.0-957.el7.x86_64.rpm

kernel-debuginfo-common-x86_64-3.10.0-957.el7.x86_64.rpm

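Check whether a thread belonging to your own driver stays in the RU or UN state for long periods.

2.2 bt thread_id

Then use bt to find out where the RU or UN thread is executing.

3. UN (uninterruptible sleep) caused by sleep

(Fixes for the RU state are not discussed here; they probably require optimizing the driver's execution flow.)

Suppose bt shows that the stack is inside msleep().

This is the kernel function, not the user-space sleep function.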

msleep(10);

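Calling msleep() directly in a driver puts the thread in the UN state, which raises the load average.

If a kernel thread sleeps almost all the time and only occasionally does a little work, that single thread pushes the system load average toward 1.

while (1) {
	msleep(10);	/* uninterruptible sleep: the thread is in the UN state here */
	/* do a small amount of work */
}

Yet the CPU is not doing any heavy work at all, so the number misleads users.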
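To see why one such thread is enough, it helps to replay the averaging by hand. The kernel samples the number of runnable plus uninterruptible tasks roughly every 5 seconds and feeds it into a fixed-point exponential moving average. The sketch below is a simplified user-space simulation of the 1-minute average: the constants FSHIFT, FIXED_1 and EXP_1 are the kernel's, but the update truncates instead of rounding (rounding details differ between kernel versions), and the function names are made up for this illustration.

```c
#define FSHIFT   11                 /* bits of fixed-point precision */
#define FIXED_1  (1UL << FSHIFT)    /* 1.0 in fixed point */
#define EXP_1    1884UL             /* 1/exp(5s/1min) in fixed point */

/* One 5-second update of the moving average (truncating variant). */
unsigned long calc_load_step(unsigned long load, unsigned long decay,
                             unsigned long active_tasks)
{
    unsigned long active = active_tasks * FIXED_1;

    return (load * decay + active * (FIXED_1 - decay)) >> FSHIFT;
}

/* Simulate the 1-minute load average, starting from 0, with a constant
 * number of tasks in the RU or UN state for the given number of seconds. */
unsigned long simulate_load1(unsigned long active_tasks, unsigned int seconds)
{
    unsigned long load = 0;

    for (unsigned int t = 0; t < seconds; t += 5)
        load = calc_load_step(load, EXP_1, active_tasks);
    return load;
}

/* Convert the fixed-point value to the familiar decimal form. */
double load_to_double(unsigned long load)
{
    return (double)load / (double)FIXED_1;
}
```

With one thread that is in msleep() at practically every sample, load_to_double(simulate_load1(1, 900)) ends up above 0.99 after 15 simulated minutes, even though the thread uses almost no CPU time. With zero such tasks the value stays at 0.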

4. Solutions

Avoid msleep() for long waits in a driver wherever possible and use a different mechanism instead.

4.1. Wake up via a wait queue

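wait_queue_head_t event;	/* wait queue the worker thread sleeps on */

init_waitqueue_head(&event);	/* initialize once, before first use */

/* Worker thread: sleep interruptibly (not counted in the load average)
 * until there is work or a stop request.
 * queue_empty(bq) stands for the driver's own "no pending work" check. */
wait_event_interruptible(event, kthread_should_stop() || !queue_empty(bq));

/* Producer: call after queuing work so the sleeper re-checks its condition. */
wake_up(&event);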

4.2. If you really have to sleep, use an interruptible sleep function

msleep_interruptible(2000);

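The drawback is that a signal can wake the thread early, so it sleeps for less time than intended; in that case msleep_interruptible() returns the remaining time in milliseconds. This should rarely happen in practice.

(See delays - Information on the various kernel delay / sleep mechanisms, The Linux Kernel documentation)

4.3. Use a kernel timer

(Not tested)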

#include <linux/timer.h>

5. Summary

One cause of a high load average in a driver is long-term use of the kernel function msleep(). To avoid it, do not use uninterruptible sleep for long waits.

