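Scenario Description
Overview
息壤·科研助手 (Research Assistant) is a one-stop research and training platform for university research scenarios. It can schedule many types of computing resources, deploys with one click, is accessible anytime and anywhere, and works out of the box with no configuration. Research Assistant supports model training, inference, and tuning in AI teaching scenarios.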
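Practice Content
Rebar is a key material in the construction industry. The sheer quantity involved, complex site conditions, and errors and omissions in manual counting all make rebar counting difficult, yet this step is critical to the whole construction process and has to be done quickly and accurately. This practice, "AI rebar counting", uses artificial intelligence to count rebar: a multi-object detection method from machine vision produces the count, which improves both labor efficiency and counting accuracy. Combined with a camera, the object detection algorithm counts rebar automatically; a person then corrects the small number of false detections, so the counting task is completed intelligently and efficiently.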
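Step Overview
This tutorial consists of the following steps:
- Select from the application store: choose the teaching and practice image you need in the Research Assistant application store
- Create a development machine: purchase a development machine and select the resource specification you need
- Access the development machine and start the practice: work through the teaching and practice content on the development machine
- Get the results: save the content generated by the practice to your local machine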
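Practice Steps
Select from the Application Store
Step 1: Open the Research Assistant console and click 【找应用】 (Find Applications).
Step 2: In 【找应用】, filter by the 【教学与实践】 (Teaching and Practice) type.
Step 3: Select the teaching and practice content you need. This tutorial uses "钢筋计数模型训练教学" (Rebar Counting Model Training Tutorial) as the example.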
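Create a Development Machine
Step 1: After you select the application, the page jumps to 【创建开发机】 (Create Development Machine).
During the purchase, configure the following:
【主机规格】 (Host specification): choose according to your needs. This example requires GPU resources, so select a specification with a GPU card.
Recommended configurations:
厦门4 (Xiamen 4): gn3.m1.12gb, CPU: 10 cores, memory: 20 GB
扬州7 (Yangzhou 7): gn3.m1.12gb, CPU: 10 cores, memory: 20 GB
【存储配置】-【科研文件】 (Storage configuration, Research files): select this as needed if you require external persistent storage.
【镜像框架】 (Image framework): selected by default; no change is needed.
Step 2: Click 【确认订单】 (Confirm Order) to finish creating the development machine and start it.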
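Log In to the Development Machine
Step 1: After the purchase completes, the development machine status shows 【启动中】 (Starting). Wait until the new development machine reaches 【运行中】 (Running), then click 【打开】 (Open) in the action column on the right.
Step 2: Clicking 【打开】 opens the development machine. The Jupyter home page opens the teaching and practice content by default; double-click the built-in notebook 【钢筋计数模型训练.ipynb】 in the left panel to open it.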
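Teaching and Practice
Practice notes:
Step 1: In the teaching and practice content, every operation is broken down and annotated. Running each step involves three sub-steps:
Sub-step 1: Select the corresponding code cell.
Sub-step 2: Click the arrow button to run the selected code cell. Note: state persists between code cells. For example, a variable defined in one cell remains available in the cells executed after it.
Sub-step 3: The [ ] marker on the left shows the execution status: empty means not yet executed, "*" means running, and a number is the execution count assigned once the cell has finished.
Step 2: After the first code cell finishes, run the remaining code cells of the teaching and practice content in order in the same way.
You can also click the double-arrow button to run the whole notebook in one go.
As the cells run in order, the ninth code cell performs the model training, and you can watch its training progress in real time.
Step 3: After model training completes, download the generated model to your local computer from the file browser on the left side of the page.
The model is saved in the directory: /home/rebar_count/model_snapshots
The tasks in the teaching content are explained below: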
Task 1: Environment and Data Preparation
1. Download the rebar dataset required for this practice
Input:
import os
# Download and unpack the code and dataset only if they are not already present.
if not os.path.exists('./rebar_count'):
    print('Downloading code and datasets...')
    os.system("wget -N -nv https://jiangsu-10.zos.ctyun.cn/bucket-7262/rebar_count.zip")
    os.system("unzip rebar_count.zip;")
    if os.path.exists('./rebar_count'):
        print('Download code and datasets success')
    else:
        print('Download code and datasets failed, please check the download url is valid or not.')
else:
    print('./rebar_count already exists')
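The cell above shells out to wget and unzip. If those tools are not available in your environment, the same download-and-unpack step can be sketched with the Python standard library. This is a minimal sketch, not part of the original notebook; the helper names unpack_archive and fetch_and_unpack are hypothetical.

```python
import os
import urllib.request
import zipfile


def unpack_archive(zip_path, dest_dir="."):
    """Extract zip_path into dest_dir and return the names of its members."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


def fetch_and_unpack(url, zip_path, dest_dir="."):
    """Download url to zip_path (skipped if the file already exists), then unpack it."""
    if not os.path.exists(zip_path):
        urllib.request.urlretrieve(url, zip_path)
    return unpack_archive(zip_path, dest_dir)
```

In the notebook this would be called as fetch_and_unpack("https://jiangsu-10.zos.ctyun.cn/bucket-7262/rebar_count.zip", "rebar_count.zip"), after which ./rebar_count should exist as in the original cell.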
2. Load the required Python modules
Input:
import os
import sys
import cv2
import time
import random
import torch
import numpy as np
from PIL import Image, ImageDraw
import xml.etree.ElementTree as ET
from datetime import datetime
from collections import OrderedDict
import torch.optim as optim
import torch.utils.data as data
import torch.backends.cudnn as cudnn
from torch.autograd import Variable  # used by train() in Task 3
sys.path.insert(0, './rebar_count/src')
from rebar_count.src.data import VOCroot, VOC_Config, AnnotationTransform, VOCDetection, detection_collate, BaseTransform, preproc
from models.RFB_Net_vgg import build_net
from layers.modules import MultiBoxLoss
from layers.functions import Detect, PriorBox
from utils.visualize import *
from utils.nms_wrapper import nms
from utils.timer import Timer
import matplotlib.pyplot as plt
%matplotlib inline
ROOT_DIR = os.getcwd()
seed = 0
cudnn.benchmark = False
cudnn.deterministic = True
torch.manual_seed(seed)  # set the random seed for the CPU
torch.cuda.manual_seed_all(seed)  # set the random seed for all GPUs
random.seed(seed)
np.random.seed(seed)
os.environ['PYTHONHASHSEED'] = str(seed)  # set the hash seed
Task 2: View Training Data Samples
1. Define a function that reads the annotation of a training sample
Input:
def read_xml(xml_path):
    '''Read the boxes and labels of all "steel" objects from an XML annotation file.'''
    tree = ET.parse(xml_path)
    root = tree.getroot()
    boxes = []
    labels = []
    for element in root.findall('object'):
        label = element.find('name').text
        if label == 'steel':
            bndbox = element.find('bndbox')
            xmin = bndbox.find('xmin').text
            ymin = bndbox.find('ymin').text
            xmax = bndbox.find('xmax').text
            ymax = bndbox.find('ymax').text
            boxes.append([xmin, ymin, xmax, ymax])
            labels.append(label)
    # The coordinates are read as strings; np.array converts them to float64.
    return np.array(boxes, dtype=np.float64), labels
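To make the expected annotation layout concrete, here is a minimal sketch that parses an inline annotation with the same element names that read_xml looks up (object, name, bndbox, xmin, ymin, xmax, ymax). The sample XML and its values are made up for illustration, and parse_steel_boxes is a hypothetical standard-library variant that returns plain lists instead of a NumPy array.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the layout read_xml expects; all values are made up.
SAMPLE_XML = """
<annotation>
  <filename>example.jpg</filename>
  <object>
    <name>steel</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>50</xmax><ymax>60</ymax></bndbox>
  </object>
  <object>
    <name>other</name>
    <bndbox><xmin>1</xmin><ymin>2</ymin><xmax>3</xmax><ymax>4</ymax></bndbox>
  </object>
</annotation>
"""


def parse_steel_boxes(xml_text):
    """Return [xmin, ymin, xmax, ymax] as floats for every object named 'steel'."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall('object'):
        if obj.find('name').text != 'steel':
            continue  # objects with any other label are ignored
        bndbox = obj.find('bndbox')
        boxes.append([float(bndbox.find(tag).text)
                      for tag in ('xmin', 'ymin', 'xmax', 'ymax')])
    return boxes


print(parse_steel_boxes(SAMPLE_XML))  # [[10.0, 20.0, 50.0, 60.0]]
```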
2. Display the original image and the annotation boxes
Input:
train_img_dir = './rebar_count/datasets/VOC2007/JPEGImages'
train_xml_dir = './rebar_count/datasets/VOC2007/Annotations'
files = os.listdir(train_img_dir)
files.sort()
for index, file_name in enumerate(files[:2]):
    img_path = os.path.join(train_img_dir, file_name)
    xml_path = os.path.join(train_xml_dir, file_name.split('.jpg')[0] + '.xml')
    boxes, labels = read_xml(xml_path)
    img = Image.open(img_path)
    # Resize so that the longer side is 2048 pixels, and scale the boxes to match.
    resize_scale = 2048.0 / max(img.size)
    img = img.resize((int(img.size[0] * resize_scale), int(img.size[1] * resize_scale)))
    boxes *= resize_scale
    plt.figure(figsize=(img.size[0]/100.0, img.size[1]/100.0))
    plt.subplot(2,1,1)
    plt.imshow(img)  # top: the original image
    img = img.convert('RGB')
    img = np.array(img)
    img = img.copy()
    for box in boxes:
        xmin, ymin, xmax, ymax = [int(v) for v in box]  # plain Python ints for OpenCV
        cv2.rectangle(img, (xmin, ymin), (xmax, ymax), (0, 255, 0), thickness=3)
    plt.subplot(2,1,2)
    plt.imshow(img)  # bottom: the image with the annotation boxes drawn in green
    plt.show()
Output: for each of the first two training images, the original image (top) and the same image with its annotation boxes (bottom).
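The display cell resizes each image so that its longer side is 2048 pixels and multiplies the box coordinates by the same factor. A minimal sketch of that arithmetic follows; the image size and box are hypothetical and scale_to_long_side is not a function from the notebook.

```python
def scale_to_long_side(size, boxes, target=2048.0):
    """Scale an image size and its boxes so the longer side becomes `target` pixels."""
    width, height = size
    factor = target / max(width, height)
    new_size = (int(width * factor), int(height * factor))
    new_boxes = [[coord * factor for coord in box] for box in boxes]
    return new_size, new_boxes


# Hypothetical 4096 x 3072 image with one box: the factor is 2048 / 4096 = 0.5.
new_size, new_boxes = scale_to_long_side((4096, 3072), [[100, 200, 300, 400]])
print(new_size)   # (2048, 1536)
print(new_boxes)  # [[50.0, 100.0, 150.0, 200.0]]
```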
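Task 3: Model Training
1. Define the training hyperparameters and the paths for saving models and logs
Input:
# Training hyperparameters
num_classes = 2  # the dataset has a single label, steel; with the background that makes 2 classes
max_epoch = 25  # the default is 1; values above 20 give better training results
batch_size = 4
ngpu = 1
initial_lr = 0.01
img_dim = 416  # input image size of the model
train_sets = [('2007', 'trainval')]  # training set to use
cfg = VOC_Config
rgb_means = (104, 117, 123)  # ImageNet per-channel means; this order is B, G, R
save_folder = './rebar_count/model_snapshots'  # where trained models are saved
if not os.path.exists(save_folder):
    os.mkdir(save_folder)
log_path = os.path.join('./rebar_count/logs', datetime.now().isoformat())  # where logs are saved
if not os.path.exists(log_path):
    os.makedirs(log_path)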
2. Build the model and define the optimizer and loss function
Input:
net = build_net('train', img_dim, num_classes=num_classes)
if ngpu > 1:
    net = torch.nn.DataParallel(net)
net.cuda()  # the code in this example can only be trained on a GPU
cudnn.benchmark = True  # overrides the cudnn.benchmark = False set when the modules were loaded
optimizer = optim.SGD(net.parameters(), lr=initial_lr,
                      momentum=0.9, weight_decay=0)  # optimizer
criterion = MultiBoxLoss(num_classes,
                         overlap_thresh=0.4,
                         prior_for_matching=True,
                         bkg_label=0,
                         neg_mining=True,
                         neg_pos=3,
                         neg_overlap=0.3,
                         encode_target=False)  # loss function
priorbox = PriorBox(cfg)
with torch.no_grad():
    priors = priorbox.forward()
    priors = priors.cuda()
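MultiBoxLoss is given overlap_thresh=0.4. In SSD-style detectors such a threshold is usually applied to the intersection over union (IoU, also called Jaccard overlap) between a prior box and a ground-truth box; that the loss in rebar_count follows this convention is an assumption here, not something the tutorial states. The sketch below only illustrates the overlap measure itself, with made-up boxes.

```python
def iou(box_a, box_b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


prior = [0.0, 0.0, 2.0, 2.0]                   # hypothetical prior box
far_truth = [1.0, 0.0, 3.0, 2.0]               # overlap 2 / 6
near_truth = [0.5, 0.0, 2.5, 2.0]              # overlap 3 / 5
print(iou(prior, far_truth) >= 0.4)            # False
print(iou(prior, near_truth) >= 0.4)           # True
```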
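3. Define the custom learning-rate function
Input:
def adjust_learning_rate(optimizer, gamma, epoch, step_index, iteration, epoch_size):
    """
    Learning-rate schedule: linear warm-up from 1e-8 to initial_lr over the
    first 10 epochs, then step decay by a factor of gamma per step_index.
    """
    if epoch < 11:
        lr = 1e-8 + (initial_lr-1e-8) * iteration / (epoch_size * 10)
    else:
        lr = initial_lr * (gamma ** (step_index))
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
    return lr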
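The schedule can be checked without an optimizer. The sketch below repeats the same two formulas as a standalone function (learning_rate is a hypothetical name) using initial_lr = 0.01, gamma = 0.2 and the 50 iterations per epoch shown in the training log of this tutorial; the printed values agree with the LR column of that log.

```python
INITIAL_LR = 0.01   # same value as initial_lr in the tutorial
EPOCH_SIZE = 50     # iterations per epoch, as shown in the training log
GAMMA = 0.2         # decay factor that train() passes as gamma


def learning_rate(epoch, iteration, step_index=0):
    """Linear warm-up during epochs 1-10, then INITIAL_LR * GAMMA ** step_index."""
    if epoch < 11:
        return 1e-8 + (INITIAL_LR - 1e-8) * iteration / (EPOCH_SIZE * 10)
    return INITIAL_LR * (GAMMA ** step_index)


for epoch, iteration in [(1, 0), (1, 10), (25, 1230)]:
    print('epoch %2d, iteration %4d: LR %.8f'
          % (epoch, iteration, learning_rate(epoch, iteration)))
# epoch  1, iteration    0: LR 0.00000001
# epoch  1, iteration   10: LR 0.00020001
# epoch 25, iteration 1230: LR 0.01000000
```

With max_epoch = 25 the decay steps at 25 and 35 epochs are never reached, so step_index stays 0 and the rate remains at 0.01 after the warm-up.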
4. Define the training function
Input:
def train():
    """
    Model training function. Prints a log line every 10 iterations; after
    epoch 20, saves the model once per epoch, plus once more when training ends.
    """
    net.train()
    loc_loss = 0
    conf_loss = 0
    epoch = 0
    print('Loading dataset...')
    dataset = VOCDetection(VOCroot, train_sets, preproc(img_dim, rgb_means, p=0.0), AnnotationTransform())
    epoch_size = len(dataset) // batch_size
    max_iter = max_epoch * epoch_size
    stepvalues = (25 * epoch_size, 35 * epoch_size)
    step_index = 0
    start_iter = 0
    lr = initial_lr
    for iteration in range(start_iter, max_iter):
        if iteration % epoch_size == 0:
            # Start of a new epoch: save the model of the epoch that just finished.
            if epoch > 20:
                torch.save(net.state_dict(), os.path.join(save_folder, 'epoch_' +
                           repr(epoch).zfill(3) + '_loss_'+ '%.4f' % loss.item() + '.pth'))
            batch_iterator = iter(data.DataLoader(dataset, batch_size,
                                                  shuffle=True, num_workers=1, collate_fn=detection_collate))
            loc_loss = 0
            conf_loss = 0
            epoch += 1
        load_t0 = time.time()
        if iteration in stepvalues:
            step_index += 1
        lr = adjust_learning_rate(optimizer, 0.2, epoch, step_index, iteration, epoch_size)
        images, targets = next(batch_iterator)
        images = Variable(images.cuda())
        targets = [Variable(anno.cuda()) for anno in targets]
        # forward
        t0 = time.time()
        out = net(images)
        # backprop
        optimizer.zero_grad()
        loss_l, loss_c = criterion(out, priors, targets)
        loss = loss_l + loss_c
        loss.backward()
        optimizer.step()
        t1 = time.time()
        loc_loss += loss_l.item()
        conf_loss += loss_c.item()
        load_t1 = time.time()
        if iteration % 10 == 0:
            print('Epoch:' + repr(epoch) + ' || epochiter: ' + repr(iteration % epoch_size) + '/' + repr(epoch_size)
                  + '|| Totel iter ' +
                  repr(iteration) + ' || L: %.4f C: %.4f||' % (
                      loss_l.item(),loss_c.item()) +
                  'Batch time: %.4f sec. ||' % (load_t1 - load_t0) + 'LR: %.8f' % (lr))
    # Save the final model after the last iteration.
    torch.save(net.state_dict(), os.path.join(save_folder, 'epoch_' +
               repr(epoch).zfill(3) + '_loss_'+ '%.4f' % loss.item() + '.pth'))
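Which checkpoints train() writes follows from its epoch counter alone, so it can be replayed without a model. The sketch below is a simplified stand-in for that bookkeeping (checkpoint_epochs and checkpoint_name are hypothetical helpers), using the same save condition and the same filename pattern as the training function.

```python
def checkpoint_epochs(max_epoch, epoch_size, save_after=20):
    """Replay the epoch counter of train() and list the epochs whose model is saved."""
    saved = []
    epoch = 0
    for iteration in range(max_epoch * epoch_size):
        if iteration % epoch_size == 0:
            if epoch > save_after:
                saved.append(epoch)   # model of the epoch that just finished
            epoch += 1
    saved.append(epoch)               # final save after the loop ends
    return saved


def checkpoint_name(epoch, loss):
    """Build a filename with the same pattern that train() uses."""
    return 'epoch_' + repr(epoch).zfill(3) + '_loss_' + '%.4f' % loss + '.pth'


print(checkpoint_epochs(25, 50))    # [21, 22, 23, 24, 25]
print(checkpoint_name(24, 0.9113))  # epoch_024_loss_0.9113.pth
```

The loss written into the filename is the loss of the single most recent batch at the moment of saving, not an average over the epoch.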
- Start training. Each epoch takes about 60 seconds and 25 epochs are run here, so training takes a while; please be patient.
Input:
t1 = time.time()
print('Start training: %d epochs in total, about 60 seconds per epoch' % max_epoch)
train()
print('training cost %.2f s' % (time.time() - t1))
Output:
Start training: 25 epochs in total, about 60 seconds per epoch
Loading dataset...
Epoch:1 || epochiter: 0/50|| Totel iter 0 || L: 3.5865 C: 4.3866||Batch time: 4.4935 sec. ||LR: 0.00000001
Epoch:1 || epochiter: 10/50|| Totel iter 10 || L: 4.1511 C: 3.8391||Batch time: 1.0780 sec. ||LR: 0.00020001
..... # intermediate output omitted #
Epoch:25 || epochiter: 30/50|| Totel iter 1230 || L: 0.3628 C: 0.5933||Batch time: 1.1411 sec. ||LR: 0.01000000
Epoch:25 || epochiter: 40/50|| Totel iter 1240 || L: 0.6436 C: 0.6199||Batch time: 1.5384 sec. ||LR: 0.01000000
training cost 1556.92 s
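The numbers in this log are consistent with each other, which the following short calculation shows. It uses only values that appear in the log above (50 iterations per epoch, 25 epochs, 1556.92 seconds); the variable names are chosen for this sketch.

```python
max_epoch = 25
epoch_size = 50          # from "epochiter: 0/50" in the log
total_seconds = 1556.92  # from "training cost 1556.92 s"

total_iterations = max_epoch * epoch_size
# train() logs when iteration % 10 == 0, and iterations run from 0 to total - 1.
last_logged_iteration = (total_iterations - 1) // 10 * 10

print(total_iterations)                             # 1250
print(last_logged_iteration)                        # 1240
print(round(total_seconds / max_epoch, 1))          # 62.3 seconds per epoch
print(round(total_seconds / total_iterations, 2))   # 1.25 seconds per iteration
```

The measured 62.3 seconds per epoch matches the "about 60 seconds" estimate, and 1240 is indeed the last iteration shown in the log.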
Task 4: Model Inference
1. Training is complete, so the model can now be tested. First, define the object detection class
Input:
cfg = VOC_Config
img_dim = 416
rgb_means = (104, 117, 123)
priorbox = PriorBox(cfg)
with torch.no_grad():
    priors = priorbox.forward()
    if torch.cuda.is_available():
        priors = priors.cuda()
class ObjectDetector:
    """
    Object detection class: runs the network on one image and applies NMS.
    """
    def __init__(self, net, detection, transform, num_classes=num_classes, thresh=0.01, cuda=True):
        self.net = net
        self.detection = detection
        self.transform = transform
        self.num_classes = num_classes
        self.thresh = thresh
        self.cuda = cuda and torch.cuda.is_available()  # use the GPU only if requested and available
    def predict(self, img):
        _t = {'im_detect': Timer(), 'misc': Timer()}
        scale = torch.Tensor([img.shape[1], img.shape[0],
                              img.shape[1], img.shape[0]])
        with torch.no_grad():
            x = self.transform(img).unsqueeze(0)
            if self.cuda:
                x = x.cuda()
                scale = scale.cuda()
            _t['im_detect'].tic()
            out = self.net(x)  # forward pass
            boxes, scores = self.detection.forward(out, priors)
            detect_time = _t['im_detect'].toc()
        boxes = boxes[0]
        scores = scores[0]
        # scale each detection back up to the image
        boxes *= scale
        boxes = boxes.cpu().numpy()
        scores = scores.cpu().numpy()
        _t['misc'].tic()
        all_boxes = [[] for _ in range(self.num_classes)]
        for j in range(1, self.num_classes):  # class 0 is the background
            inds = np.where(scores[:, j] > self.thresh)[0]
            if len(inds) == 0:
                all_boxes[j] = np.zeros([0, 5], dtype=np.float32)
                continue
            c_bboxes = boxes[inds]
            c_scores = scores[inds, j]
            c_dets = np.hstack((c_bboxes, c_scores[:, np.newaxis])).astype(
                np.float32, copy=False)
            keep = nms(c_dets, 0.2, force_cpu=False)
            c_dets = c_dets[keep, :]
            all_boxes[j] = c_dets
        nms_time = _t['misc'].toc()
        total_time = detect_time + nms_time
        return all_boxes, total_time
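The predict method filters overlapping detections with nms(c_dets, 0.2, ...). The sketch below shows the usual greedy form of non-maximum suppression on plain Python lists with the same 0.2 overlap threshold. It is an illustration of the technique only; the actual utils.nms_wrapper implementation may differ in details such as pixel conventions, and the detections here are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as [xmin, ymin, xmax, ymax, ...]."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(dets, iou_thresh=0.2):
    """dets: rows of [xmin, ymin, xmax, ymax, score]. Returns the indices to keep."""
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)
    keep = []
    for idx in order:
        # Keep a box only if it does not overlap any higher-scoring kept box too much.
        if all(iou(dets[idx], dets[k]) <= iou_thresh for k in keep):
            keep.append(idx)
    return keep


dets = [
    [0, 0, 10, 10, 0.9],
    [1, 1, 11, 11, 0.8],    # IoU with the first box is 81 / 119, so it is suppressed
    [20, 20, 30, 30, 0.7],  # no overlap with the first box, so it is kept
]
print(greedy_nms(dets))  # [0, 2]
```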
2. Define the inference network and load the trained model whose filename records the lowest loss
Input:
trained_models = os.listdir(os.path.join(ROOT_DIR, './rebar_count/model_snapshots'))  # directory containing the model files
lowest_loss = 9999
best_model_name = ''
for model_name in trained_models:
    if not model_name.endswith('pth'):
        continue
    # The filename has the form epoch_XXX_loss_Y.YYYY.pth; extract the loss value.
    loss = float(model_name.split('_loss_')[1].split('.pth')[0])
    if loss < lowest_loss:
        lowest_loss = loss
        best_model_name = model_name
best_model_path = os.path.join(ROOT_DIR, './rebar_count/model_snapshots', best_model_name)
print('loading model from', best_model_path)
net = build_net('test', img_dim, num_classes)  # build the inference network
state_dict = torch.load(best_model_path, map_location='cpu')  # load on the CPU; moved to the GPU below if available
new_state_dict = OrderedDict()
for k, v in state_dict.items():
    # Strip the 'module.' prefix that DataParallel adds to parameter names.
    head = k[:7]
    if head == 'module.':
        name = k[7:]
    else:
        name = k
    new_state_dict[name] = v
net.load_state_dict(new_state_dict)
net.eval()
print('Finish load model!')
if torch.cuda.is_available():
    net = net.cuda()
    cudnn.benchmark = True
else:
    net = net.cpu()
detector = Detect(num_classes, 0, cfg)
transform = BaseTransform(img_dim, rgb_means, (2, 0, 1))
object_detector = ObjectDetector(net, detector, transform)
Output:
loading model from /home/./rebar_count/model_snapshots/epoch_024_loss_0.9113.pth
Finish load model!
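The selection logic depends only on the filenames, so it can be tried in isolation. In the sketch below, pick_lowest_loss is a hypothetical helper that applies the same parsing as the cell above; apart from epoch_024_loss_0.9113.pth, which appears in the output above, the filenames are made up.

```python
def pick_lowest_loss(filenames):
    """Return (name, loss) of the .pth file whose name records the lowest loss."""
    best_name, best_loss = None, float('inf')
    for name in filenames:
        if not name.endswith('.pth'):
            continue  # skip anything that is not a checkpoint
        loss = float(name.split('_loss_')[1].split('.pth')[0])
        if loss < best_loss:
            best_name, best_loss = name, loss
    return best_name, best_loss


names = [
    'epoch_021_loss_1.2045.pth',   # hypothetical
    'epoch_024_loss_0.9113.pth',   # the file reported in the output above
    'epoch_025_loss_1.2635.pth',   # hypothetical
    'notes.txt',                   # hypothetical non-checkpoint file
]
print(pick_lowest_loss(names))  # ('epoch_024_loss_0.9113.pth', 0.9113)
```

Because the loss in each filename comes from a single batch, the file chosen this way is not guaranteed to be the checkpoint with the best accuracy on unseen images.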
- Test on images: output the position of each rebar and the total number of rebars in each image
Input:
test_img_dir = r'./rebar_count/datasets/test_dataset'  # directory of the images to predict
files = os.listdir(test_img_dir)
files.sort()
for i, file_name in enumerate(files[:2]):
    image_src = cv2.imread(os.path.join(test_img_dir, file_name))  # cv2 reads images in BGR order
    detect_bboxes, tim = object_detector.predict(image_src)
    image_draw = image_src.copy()
    rebar_count = 0
    for class_id, class_collection in enumerate(detect_bboxes):
        if len(class_collection) > 0:
            for k in range(class_collection.shape[0]):
                if class_collection[k, -1] > 0.6:  # count only detections with a score above 0.6
                    pt = class_collection[k]
                    cv2.circle(image_draw, (int((pt[0] + pt[2]) * 0.5), int((pt[1] + pt[3]) * 0.5)), int((pt[2] - pt[0]) * 0.5 * 0.6), (255, 0, 0), -1)
                    rebar_count += 1
    cv2.putText(image_draw, 'rebar_count: %d' % rebar_count, (25, 50), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 0), 3)
    plt.figure(i, figsize=(30, 20))
    plt.imshow(cv2.cvtColor(image_draw, cv2.COLOR_BGR2RGB))  # convert BGR to RGB for matplotlib
    plt.show()
Output: each test image is displayed with a filled circle on every detected rebar and the total count in the top-left corner.
The inference results show that the model counts the rebar fairly accurately.
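The counting rule in the test cell is simple enough to isolate: a detection is counted when its score exceeds 0.6, and it is drawn as a circle centered on the box with a radius of 0.6 times half the box width. The sketch below applies that rule to made-up detections; count_rebar is a hypothetical helper, not part of the notebook.

```python
def count_rebar(dets, score_thresh=0.6):
    """dets: rows of [xmin, ymin, xmax, ymax, score]. Returns (count, circles)."""
    circles = []
    for xmin, ymin, xmax, ymax, score in dets:
        if score > score_thresh:
            center = (int((xmin + xmax) * 0.5), int((ymin + ymax) * 0.5))
            radius = int((xmax - xmin) * 0.5 * 0.6)
            circles.append((center, radius))
    return len(circles), circles


# Hypothetical detections: one confident, one below the 0.6 threshold.
dets = [
    [100, 100, 140, 140, 0.95],
    [200, 100, 240, 140, 0.30],
]
print(count_rebar(dets))  # (1, [((120, 120), 12)])
```

Lowering score_thresh counts more detections at the cost of more false positives, which is where the manual correction described in the practice content comes in.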