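I: Overview
Compared with traditional approaches, debugging the kernel with QEMU+GDB has the following clear advantages:
- It greatly speeds up the study of kernel mechanisms
- It allows dynamic analysis of the kernel's execution flow (breakpoints can be set)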
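II: Environment setup steps
Before using this method, prepare the virtual machine to be debugged; see the relevant documentation for how to create one.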
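Step 1: Build the kernel and make sure the following config options are set correctly [run inside the VM]
1) Enable kernel debugging
Kernel hacking --->
[*] Kernel debugging
2) Disable KASLR (kernel address randomization); otherwise breakpoints will not be hit
Processor type and features --->
[ ] Randomize the address of the kernel image (KASLR)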
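The two menuconfig settings above end up as entries in the kernel .config file, so they can be checked before building. The following is a minimal sketch, not part of the original procedure: the function name check_debug_config is made up for illustration, and the check for CONFIG_DEBUG_INFO is an added assumption (gdb needs debug info in vmlinux to resolve source lines, but the steps above only mention Kernel debugging).

```shell
# check_debug_config FILE
# Prints one line per option and returns 1 if any setting looks wrong.
# (Hypothetical helper name; the option names are standard Kconfig symbols.)
check_debug_config() {
    cfg_file=$1
    cfg_rc=0
    # Options that should be built in for source-level debugging.
    # CONFIG_DEBUG_INFO is an assumption added here, see the lead-in.
    for cfg_opt in CONFIG_DEBUG_KERNEL CONFIG_DEBUG_INFO; do
        if grep -q "^${cfg_opt}=y\$" "$cfg_file"; then
            echo "ok: ${cfg_opt}=y"
        else
            echo "missing: ${cfg_opt}=y"
            cfg_rc=1
        fi
    done
    # KASLR must be off so runtime addresses match those in vmlinux.
    if grep -q '^CONFIG_RANDOMIZE_BASE=y$' "$cfg_file"; then
        echo "warning: CONFIG_RANDOMIZE_BASE=y (KASLR is enabled)"
        cfg_rc=1
    else
        echo "ok: CONFIG_RANDOMIZE_BASE is not enabled"
    fi
    return "$cfg_rc"
}
```

Typical use would be check_debug_config .config from the top of the kernel build tree. If KASLR is compiled in and rebuilding is not convenient, booting the guest with nokaslr on the kernel command line is a commonly used alternative.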
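Step 2: Build QEMU with debugging enabled [run on the host; only needed if you also want to debug the QEMU-side code, otherwise skip]
Add the --enable-debug option when running configure
Step 3: Update QEMU on the host [run on the host; only needed if you also want to debug the QEMU-side code, otherwise skip]
- Taking CentOS as an example, install the following packages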
qemu-kvm-common-ev-2.12.0-18.el7_6.7.1.x86_64.rpm
qemu-kvm-ev-2.12.0-18.el7_6.7.1.x86_64.rpm
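Step 4: Update the VM kernel [run inside the VM]
- Taking CentOS as an example, install the kernel package
rpm -Uvh kernel-3.10.0-957.5.1.el7.x86_64.rpm
Step 5: Modify the VM configuration to enable gdb debugging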
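- 1) VM started from the qemu command line
- If the VM is started from the qemu command line, add the following options to the command line
- -s
- Shorthand for -gdb tcp::1234; QEMU listens on port 1234, and GDB can connect with target remote localhost:1234
- -serial stdio
- Redirects the VM's emulated serial port to the console, so you can interact with the VM directly from the console
- 2) VM started from an XML definition
- If the VM is started from an XML definition, modify the XML as follows
- Change <domain type='kvm'> to <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
- Add the following qemu command-line arguments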
<qemu:commandline>
<qemu:arg value='-gdb'/>
<qemu:arg value='tcp::1234'/>
</qemu:commandline>
The complete configuration looks like this:
<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
…
…
<qemu:commandline>
<qemu:arg value='-gdb'/>
<qemu:arg value='tcp::1234'/>
</qemu:commandline>
</domain>
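For the command-line route in option 1), it can help to see -s and -serial stdio in the context of a full invocation. The sketch below only prints a candidate command line rather than running it. The function name qemu_debug_cmd, the memory size, and the virtio disk options are illustrative assumptions, and the binary name may differ on your host (on CentOS the qemu-kvm-ev package typically installs /usr/libexec/qemu-kvm).

```shell
# qemu_debug_cmd IMAGE
# Prints a QEMU command line for IMAGE with the gdb stub (-s) and the
# serial console (-serial stdio) enabled. Hypothetical helper: binary
# name, memory size and disk options are placeholders to adapt.
qemu_debug_cmd() {
    qemu_img=$1
    printf '%s\n' "qemu-system-x86_64 -enable-kvm -m 2048 -drive file=${qemu_img},format=qcow2,if=virtio -s -serial stdio"
}

# Optional: appending -S makes QEMU wait at startup until gdb connects
# and issues "continue", which is useful for debugging early boot code.
```

For example, qemu_debug_cmd buildenv-online-new.qcow2 prints a command that can be reviewed and then run by hand.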
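Step 6: Modify the VM's grub options to add serial console parameters
There are two ways to modify the VM's grub configuration:
- Offline modification [run on the host]
- 1. Mount the VM image on a local directory; -a specifies the image file, -i inspects the image and mounts its filesystems automatically, and the last argument is the mount point
- Command: guestmount -a <img file> -i <mount point>
- Example: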
[root@bj03-no_use-172e28e205e64 vms]# guestmount -a buildenv-online-new.qcow2 -i /mnt
- 2. Enter the mount point and edit the files that need to change
[root@bj03-no_use-172e28e205e64 vms]# cd /mnt/
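- 3. Edit the grub configuration and add the serial console parameters
- Add this parameter to the boot entry: console=ttyS0,115200
- 4. Unmount the image when finished
- Command: guestunmount <mount point>
- Example: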
[root@bj03-no_use-172e28e205e64 vms]# guestunmount /mnt
- Online modification [run inside the VM]
- Edit the grub configuration file directly inside the VM and add this parameter to the boot entry: console=ttyS0,115200
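The grub edit itself can be scripted. The following is a minimal sketch under the assumption that the guest uses GRUB 2 with a GRUB_CMDLINE_LINUX="..." line in /etc/default/grub, as CentOS 7 does. The function name add_serial_console is made up for illustration. It appends the console parameter once and leaves the file unchanged on later runs.

```shell
# add_serial_console FILE
# Appends console=ttyS0,115200 to the GRUB_CMDLINE_LINUX="..." line in
# FILE, unless a ttyS0 console argument is already present.
# (Hypothetical helper; assumes the value is enclosed in double quotes.)
add_serial_console() {
    grub_file=$1
    # Nothing to do if the argument is already there.
    if grep -q '^GRUB_CMDLINE_LINUX=.*console=ttyS0' "$grub_file"; then
        return 0
    fi
    # Insert the argument just before the closing quote of that line.
    sed '/^GRUB_CMDLINE_LINUX=/ s/"$/ console=ttyS0,115200"/' "$grub_file" > "$grub_file.new" || return 1
    # Write back through the original path to keep owner and permissions.
    cat "$grub_file.new" > "$grub_file" && rm -f "$grub_file.new"
}
```

Inside the VM this would be used as add_serial_console /etc/default/grub, followed by regenerating the boot menu (on a BIOS CentOS 7 guest, grub2-mkconfig -o /boot/grub2/grub.cfg). For the offline route the menu cannot be regenerated from the host this way, so the kernel line in the mounted grub.cfg has to be edited directly.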
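Step 7: Sync the VM kernel source to the host
Copy the entire kernel build directory from the VM to the host
Step 8: Start the VM and debug with gdb
- Taking a VM started from an XML definition as an example
- 1. virsh create xxx.xml
- 2. Connect gdb to the VM
- gdb vmlinux [this is the vmlinux produced by building the VM kernel]
- Example: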
gdb /root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/vmlinux
- (gdb) target remote :1234
Once connected to the VM, you can set breakpoints with the b command and start debugging.
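The connect-and-break sequence can be kept in a gdb command file so it does not have to be retyped for every session. The sketch below generates such a file. The function name write_gdb_cmds and the choice of vfs_read as the breakpoint are illustrative assumptions; any function present in vmlinux will do.

```shell
# write_gdb_cmds FILE SYMBOL [PORT]
# Writes a gdb command file that connects to the QEMU gdb stub, sets a
# breakpoint on SYMBOL, resumes the guest and prints a backtrace when
# the breakpoint is hit. PORT defaults to 1234 (what -s listens on).
write_gdb_cmds() {
    gdb_file=$1
    gdb_sym=$2
    gdb_port=${3:-1234}
    cat > "$gdb_file" <<EOF
target remote :$gdb_port
break $gdb_sym
continue
bt
EOF
}
```

A session could then be started with write_gdb_cmds /tmp/kdebug.gdb vfs_read followed by gdb -x /tmp/kdebug.gdb <path to vmlinux>, which leaves you at the (gdb) prompt after the backtrace is printed.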
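III: Questions and answers
- Question 1: How do I load the symbols of a VM kernel module in the host-side gdb?
- Taking virtio_blk.ko as an example
- 1. Inside the VM, get the address of the .text section
- Command: cat /sys/module/virtio_blk/sections/.text
- Example:
[root@wxb-compile ~]# cat /sys/module/virtio_blk/sections/.text
0xffffffffa0190000
- 2. Load the symbols in the host-side gdb
- gdb command: add-symbol-file <path of virtio_blk.ko> <.text addr>
- 3. Set a breakpoint
- gdb command: b <break point>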
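Steps 1 and 2 above can be combined into a small helper that reads the section address and prints the matching gdb command, ready to paste into the host-side gdb. This is a sketch only. The function name add_symbol_cmd is hypothetical, and the optional third argument (the sysfs root) exists just so the helper can be tried against a copy of the directory tree.

```shell
# add_symbol_cmd MODULE KO_PATH [SYSFS_ROOT]
# Run inside the VM (as root, since the sections files are root-only).
# Reads the .text load address of MODULE and prints the add-symbol-file
# command to use in gdb. KO_PATH is the path of the .ko file as seen
# from the host where gdb runs.
add_symbol_cmd() {
    sym_mod=$1
    sym_ko=$2
    sym_root=${3:-/sys/module}
    # Fail early if the module is not loaded or the file is unreadable.
    sym_text=$(cat "$sym_root/$sym_mod/sections/.text") || return 1
    printf 'add-symbol-file %s %s\n' "$sym_ko" "$sym_text"
}
```

For example, add_symbol_cmd virtio_blk <path of virtio_blk.ko> prints one add-symbol-file line. Module load addresses change across reboots and module reloads, so the command should be regenerated for each debugging session.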
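- Question 2: How do I fix the gdb error Remote 'g' packet reply is too long?
- Error message: Remote 'g' packet reply is too long
- Solution
1. Download the source: wget http://ftp.gnu.org/gnu/gdb/gdb-7.6.1.tar.gz
2. Unpack and modify the code
1) tar xf gdb-7.6.1.tar.gz
2) cd gdb-7.6.1
3) vi gdb/remote.c, then replace the code at "status 1" with the code at "status 2" (marked in red in the original figure)
3. Rebuild and install gdb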
IV: Usage example
- Example: tracing how virtio-blk notifies the backend
- Get the .text section addresses of the relevant modules (run inside the VM)
[root@wxb-compile ~]# cat /sys/module/virtio/sections/.text
0xffffffffa002b000
[root@wxb-compile ~]# cat /sys/module/virtio_ring/sections/.text
0xffffffffa0046000
[root@wxb-compile ~]# cat /sys/module/virtio_pci/sections/.text
0xffffffffa003f000
[root@wxb-compile ~]# cat /sys/module/virtio_blk/sections/.text
0xffffffffa00e8000
[root@wxb-compile ~]#
- Load the relevant module symbols in the host-side gdb (run in the host-side GDB)
(gdb) add-symbol-file /root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/block/virtio_blk.ko 0xffffffffa00e8000
add symbol table from file "/root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/block/virtio_blk.ko" at
.text_addr = 0xffffffffa00e8000
(y or n) y
Reading symbols from /root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/block/virtio_blk.ko...done.
(gdb) add-symbol-file /root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/virtio/virtio.ko 0xffffffffa002b000
add symbol table from file "/root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/virtio/virtio.ko" at
.text_addr = 0xffffffffa002b000
(y or n) y
Reading symbols from /root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/virtio/virtio.ko...done.
(gdb) add-symbol-file /root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/virtio/virtio_ring.ko 0xffffffffa0046000
add symbol table from file "/root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/virtio/virtio_ring.ko" at
.text_addr = 0xffffffffa0046000
(y or n) y
Reading symbols from /root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/virtio/virtio_ring.ko...done.
(gdb) add-symbol-file /root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/virtio/virtio_pci.ko 0xffffffffa003f000
add symbol table from file "/root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/virtio/virtio_pci.ko" at
.text_addr = 0xffffffffa003f000
(y or n) y
Reading symbols from /root/rpmbuild/BUILD/kernel-3.10.0-957.5.1.el7/linux-3.10.0-957.5.1.el7.x86_64/drivers/virtio/virtio_pci.ko...done.
(gdb)
- Set a breakpoint [run in the host-side GDB]
(gdb) b vp_notify
Breakpoint 1 at 0xffffffffa0040b70: file drivers/virtio/virtio_pci_common.c, line 45.
(gdb) c
Continuing.
- Issue an I/O operation [inside the VM]
[root@wxb-compile ~]# hexdump -C /dev/vda -n 12
- The breakpoint is hit [host-side GDB]
Breakpoint 1, vp_notify (vq=0xffff880819974000) at drivers/virtio/virtio_pci_common.c:45
45 {
(gdb)
- View the function call stack [host-side GDB]
(gdb) bt
#0 vp_notify (vq=0xffff880819974000) at drivers/virtio/virtio_pci_common.c:45
#1 0xffffffffa004607c in virtqueue_notify (_vq=0xffff880819974000) at drivers/virtio/virtio_ring.c:550
#2 0xffffffffa00e885e in virtio_queue_rq (hctx=0xffff880819e0e000, bd=0xffff880812aeb8d0) at drivers/block/virtio_blk.c:230
#3 0xffffffff813c8c24 in blk_mq_dispatch_rq_list (q=q@entry=0xffff880819829040, list=list@entry=0xffff880812aeb928, got_budget=got_budget@entry=true) at block/blk-mq.c:1205
#4 0xffffffff813ce6ae in blk_mq_do_dispatch_sched (hctx=hctx@entry=0xffff880819e0e000) at block/blk-mq-sched.c:203
#5 0xffffffff813cf29e in blk_mq_sched_dispatch_requests (hctx=hctx@entry=0xffff880819e0e000) at block/blk-mq-sched.c:300
#6 0xffffffff813c6c39 in __blk_mq_run_hw_queue (hctx=hctx@entry=0xffff880819e0e000) at block/blk-mq.c:1309
#7 0xffffffff813c6d98 in __blk_mq_delay_run_hw_queue (hctx=hctx@entry=0xffff880819e0e000, async=async@entry=false, msecs=msecs@entry=0) at block/blk-mq.c:1348
#8 0xffffffff813c6f43 in blk_mq_run_hw_queue (hctx=hctx@entry=0xffff880819e0e000, async=async@entry=false) at block/blk-mq.c:1385
#9 0xffffffff813cf891 in blk_mq_sched_insert_requests (q=q@entry=0xffff880819829040, ctx=ctx@entry=0xffff88081efe6c80, list=list@entry=0xffff880812aebac8, run_queue_async=run_queue_async@entry=false) at block/blk-mq-sched.c:527
#10 0xffffffff813c9ed9 in blk_mq_flush_plug_list (plug=plug@entry=0xffff880812aebbf8, from_schedule=from_schedule@entry=false) at block/blk-mq.c:1630
#11 0xffffffff813bd25e in blk_flush_plug_list (plug=plug@entry=0xffff880812aebbf8, from_schedule=from_schedule@entry=false) at block/blk-core.c:3524
#12 0xffffffff813bdac4 in blk_finish_plug (plug=plug@entry=0xffff880812aebbf8) at block/blk-core.c:3586
#13 0xffffffff8120b61f in read_pages (nr_pages=4, pages=0xffff880812aebbe8, filp=0xffff88080af873c0, mapping=0xffff88017f810a40) at mm/readahead.c:140
#14 __do_page_cache_readahead (mapping=mapping@entry=0xffff88017f810a40, filp=filp@entry=0xffff88080af873c0, offset=0, nr_to_read=4, lookahead_size=<optimized out>) at mm/readahead.c:202
#15 0xffffffff8120b83d in ra_submit (filp=0xffff88080af873c0, mapping=0xffff88017f810a40, ra=0xffff88080af874c0) at mm/readahead.c:251
#16 ondemand_readahead (mapping=mapping@entry=0xffff88017f810a40, ra=0xffff88080af874c0, filp=filp@entry=0xffff88080af873c0, hit_readahead_marker=hit_readahead_marker@entry=false, offset=offset@entry=0, req_size=req_size@entry=1) at mm/readahead.c:485
#17 0xffffffff8120bc67 in page_cache_sync_readahead (mapping=mapping@entry=0xffff88017f810a40, ra=ra@entry=0xffff88080af874c0, filp=filp@entry=0xffff88080af873c0, offset=offset@entry=0, req_size=req_size@entry=1) at mm/readahead.c:518
#18 0xffffffff811fd752 in do_generic_file_read (actor=0xffffffff811fb550 <file_read_actor>, desc=0xffff880812aebd80, ppos=0xffff880812aebe50, filp=0xffff88080af873c0) at mm/filemap.c:1764
#19 generic_file_aio_read (iocb=iocb@entry=0xffff880812aebe18, iov=iov@entry=0xffff880812aebe08, nr_segs=1, pos=<optimized out>) at mm/filemap.c:2124
#20 0xffffffff812e160c in blkdev_aio_read (iocb=0xffff880812aebe18, iov=0xffff880812aebe08, nr_segs=<optimized out>, pos=<optimized out>) at fs/block_dev.c:1667
#21 0xffffffff81298a53 in do_sync_read (filp=<optimized out>, buf=<optimized out>, len=<optimized out>, ppos=0xffff880812aebf18) at fs/read_write.c:441
#22 0xffffffff812994d2 in vfs_read (file=file@entry=0xffff88080af873c0, buf=buf@entry=0x7fd0e6d84000 <Address 0x7fd0e6d84000 out of bounds>, count=<optimized out>, count@entry=4096, pos=pos@entry=0xffff880812aebf18) at fs/read_write.c:465
#23 0xffffffff8129a3ca in SYSC_read (count=4096, buf=0x7fd0e6d84000 <Address 0x7fd0e6d84000 out of bounds>, fd=<optimized out>) at fs/read_write.c:576
#24 SyS_read (fd=<optimized out>, buf=140535202856960, count=4096) at fs/read_write.c:569
#25 <signal handler called>
#26 0x00007fd0e688df70 in ?? ()
#27 0x0000000000000000 in ?? ()
(gdb)