Contents
- Extraction method
- Step 1. Build the DataFrame
- Step 2. Extract data from the DataFrame
- Step 3. Convert the dtype
- Example code
To pull data out of a pandas DataFrame and feed it to a torch model, proceed as follows:
Extraction method
Step 1. Build the DataFrame
df = pd.DataFrame(create_float((100, 5)))  # build a DataFrame with 100 rows and 5 columns
df['label'] = create_float((100, 1))  # add a label column of random floats
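Step 2. Extract data from the DataFrame
y = df['label']  # label values
x = df.drop(['label'], axis=1)  # every remaining column is a feature
x_train = torch.from_numpy(x.values)
y_train = torch.from_numpy(y.values[:, np.newaxis])  # index the ndarray, not the Series; shape (num, 1)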
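Note: if there is only a single predicted value, y_train must be built from an ndarray of shape (num, 1), not of shape (num,). The model outputs a tensor of shape (num, 1), and a target of shape (num,) would be broadcast against it when the loss is computed.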
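To make the shape note concrete, here is a minimal sketch of what happens when a (num, 1) output meets a (num,) target. The tensors are random stand-ins rather than data from this article, and the variable names are illustrative only.

```python
import torch

torch.manual_seed(0)

outputs = torch.randn(4, 1)         # stand-in for what nn.Linear(..., 1) returns for 4 samples
target_2d = torch.randn(4, 1)       # label tensor with shape (num, 1)
target_1d = target_2d.squeeze(1)    # same values, but shape (num,)

# Matching shapes: (4, 1) - (4, 1) gives one difference per sample
diff_ok = outputs - target_2d
# Mismatched shapes are broadcast: (4, 1) - (4,) gives a (4, 4) matrix
diff_broadcast = outputs - target_1d

print(diff_ok.shape)         # torch.Size([4, 1])
print(diff_broadcast.shape)  # torch.Size([4, 4])

# A mean squared error computed over the (4, 4) matrix averages every
# output against every label, which is not the intended per-sample loss
mse_ok = (diff_ok ** 2).mean()
mse_broadcast = (diff_broadcast ** 2).mean()
print(mse_ok.item(), mse_broadcast.item())
```

Keeping the label as a column of shape (num, 1) avoids this silent broadcasting.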
Step 3. Convert the dtype
x_train = x_train.float()  # convert to float32
y_train = y_train.float()
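The conversion is needed because NumPy produces float64 arrays by default and torch.from_numpy keeps that dtype, while the parameters of nn.Linear are float32 unless the default dtype has been changed. The sketch below illustrates this with a small random array; the names are illustrative and not part of the original example.

```python
import numpy as np
import torch
import torch.nn as nn

x_np = np.random.random_sample((4, 5))  # NumPy default dtype: float64
x64 = torch.from_numpy(x_np)            # dtype is carried over: torch.float64
x32 = x64.float()                       # converted copy: torch.float32

layer = nn.Linear(5, 1)                 # parameters are float32 by default
out = layer(x32)                        # dtypes match, so the forward pass works

print(x64.dtype, x32.dtype, layer.weight.dtype)

# Passing the float64 tensor is expected to fail with a dtype mismatch
try:
    layer(x64)
except RuntimeError as err:
    print('float64 input rejected:', err)
```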
Example code
Below is a small example that reads data from a DataFrame and then trains a model:
import torch
import torch.nn as nn
import numpy as np
import pandas as pd


class LinearRegressionModel(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(LinearRegressionModel, self).__init__()
        self.linear = nn.Linear(input_dim, output_dim)

    def forward(self, x):
        out = self.linear(x)
        return out


def create_float(shape=(1, 1)):
    # Return an ndarray of the given 2-D shape filled with random floats in [0, 1)
    return np.array([np.random.random_sample(shape[1]) for i in range(shape[0])])


if __name__ == '__main__':
    df = pd.DataFrame(create_float((100, 5)))  # build a DataFrame with 100 rows and 5 columns
    df['label'] = create_float((100, 1))
    # Extract x and y
    y = df['label']
    x = df.drop(['label'], axis=1)
    # Convert to tensors
    x_train = torch.from_numpy(x.values)
    y_train = torch.from_numpy(y.values[:, np.newaxis])  # index the ndarray, not the Series; shape (100, 1)
    x_train = x_train.float()  # convert to float32
    y_train = y_train.float()
    # Set up the model, hyperparameters, optimizer and loss function
    model = LinearRegressionModel(x.shape[1], 1)
    epochs = 1000
    learning_rate = 0.01
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    criterion = nn.MSELoss()
    for epoch in range(1, epochs + 1):
        # Reset the gradients at every iteration
        optimizer.zero_grad()
        # Forward pass
        outputs = model(x_train)
        # Compute the loss
        loss = criterion(outputs, y_train)
        # Backward pass
        loss.backward()
        # Update the weights
        optimizer.step()
        # Print the loss every 50 epochs
        if epoch % 50 == 0:
            print('epoch {}, loss {}'.format(epoch, loss.item()))
    # Predict (no gradients are needed for inference)
    with torch.no_grad():
        predicted = model(x_train)
    print(predicted)
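The three steps can also be folded into one function. The sketch below is one possible way to do that; dataframe_to_tensors and its label_col parameter are hypothetical names introduced here, not part of pandas or torch, and the demo DataFrame is random data.

```python
import numpy as np
import pandas as pd
import torch


def dataframe_to_tensors(df, label_col='label'):
    """Hypothetical helper: return (features, labels) as float32 tensors."""
    # Steps 2 and 3 combined: drop the label column and cast to float32
    features = df.drop(columns=[label_col]).to_numpy(dtype=np.float32)
    # Keep the label as a column vector of shape (num, 1)
    labels = df[label_col].to_numpy(dtype=np.float32).reshape(-1, 1)
    return torch.from_numpy(features), torch.from_numpy(labels)


# Step 1: a random DataFrame with 100 rows, 5 feature columns and a label
demo_df = pd.DataFrame(np.random.random_sample((100, 5)))
demo_df['label'] = np.random.random_sample(100)

x_demo, y_demo = dataframe_to_tensors(demo_df)
print(x_demo.shape, y_demo.shape, x_demo.dtype)
```

Casting to float32 on the NumPy side means no separate .float() call is needed afterwards.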