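Data in a pandas DataFrame can be processed in two ways: convert each column to a Python list and process the list, or process the data in place with the apply() method.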
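As a quick orientation, the two approaches can be sketched side by side. This is a minimal sketch, not the full examples: it assumes an in-memory DataFrame holding the sample values printed in the outputs of this post, in place of ./test.xlsx, and the names odd_list, is_odd and odd_series are introduced only for illustration.

```python
import pandas as pd

# Assumed sample data: stands in for the contents of ./test.xlsx.
df = pd.DataFrame({
    "num1": [1, 2, 3, 4, 6, 5, 7, 9],
    "num2": [11, 33, 22, 44, 55, 33, 88, 99],
})

# Approach 1: convert the column to a plain Python list and filter it.
odd_list = [i for i in df["num1"].tolist() if i % 2 == 1]

# Approach 2: stay in pandas and apply a function to each element.
is_odd = df["num1"].apply(lambda x: x % 2 == 1)
odd_series = df.loc[is_odd, "num1"]

print(odd_list)
print(odd_series.tolist())
```

Both approaches select the same values; the difference is whether the intermediate result is a list or a Series that still carries its row labels.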
Method 1: convert each column to a list, then process it
Example code 1:
import pandas as pd
# Read the data from the Excel file
df = pd.read_excel('./test.xlsx')
print(df)
print("*" * 50)
# Convert each DataFrame column to a Python list
df_lst1 = df['num1'].values.tolist()
print(df_lst1)
df_lst2 = df['num2'].values.tolist()
print(df_lst2)
print("*" * 50)
# Process each column separately: split it into odd and even values
df_lst1_odd = [i for i in df_lst1 if i % 2 == 1]
df_lst1_even = [i for i in df_lst1 if i % 2 == 0]
print(df_lst1_odd)
print(df_lst1_even)
df_lst2_odd = [i for i in df_lst2 if i % 2 == 1]
df_lst2_even = [i for i in df_lst2 if i % 2 == 0]
print(df_lst2_odd)
print(df_lst2_even)
print("*" * 50)
# Assemble the lists of unequal length into a DataFrame with from_dict.
# orient='index' turns each list into one row and pads shorter rows with NaN.
# Column names: 奇数 = odd, 偶数 = even
df = pd.DataFrame.from_dict(
    {"num1奇数": df_lst1_odd, "num1偶数": df_lst1_even, "num2奇数": df_lst2_odd, "num2偶数": df_lst2_even}, orient='index')
print(df)
print(df.T)
# Transpose so that each list becomes a column, then save to a CSV file
df.T.to_csv("./new_test.csv", index=False)
Output:
num1 num2
0 1 11
1 2 33
2 3 22
3 4 44
4 6 55
5 5 33
6 7 88
7 9 99
**************************************************
[1, 2, 3, 4, 6, 5, 7, 9]
[11, 33, 22, 44, 55, 33, 88, 99]
**************************************************
[1, 3, 5, 7, 9]
[2, 4, 6]
[11, 33, 55, 33, 99]
[22, 44, 88]
**************************************************
0 1 2 3 4
num1奇数 1 3 5 7.0 9.0
num1偶数 2 4 6 NaN NaN
num2奇数 11 33 55 33.0 99.0
num2偶数 22 44 88 NaN NaN
num1奇数 num1偶数 num2奇数 num2偶数
0 1.0 2.0 11.0 22.0
1 3.0 4.0 33.0 44.0
2 5.0 6.0 55.0 88.0
3 7.0 NaN 33.0 NaN
4 9.0 NaN 99.0 NaN
Contents of the output CSV file (new_test.csv): [screenshot not reproduced]
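The CSV text can also be checked without opening the file. The following is a minimal sketch that assumes the four lists printed in the output above rather than reading ./test.xlsx; the names columns, result and csv_text are illustrative. It relies on to_csv() returning the CSV text as a string when no path is given.

```python
import pandas as pd

# Lists as produced in example code 1 (values copied from its output).
columns = {
    "num1奇数": [1, 3, 5, 7, 9],
    "num1偶数": [2, 4, 6],
    "num2奇数": [11, 33, 55, 33, 99],
    "num2偶数": [22, 44, 88],
}

# Same construction as in the example: one row per list, then transpose.
result = pd.DataFrame.from_dict(columns, orient="index").T

# With no path argument, to_csv() returns the CSV text instead of writing a file.
csv_text = result.to_csv(index=False)
print(csv_text)
```

Missing values appear as empty fields in the CSV text, so the shorter columns simply end early.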
Example code 2 (unlike example code 1, the output also keeps the original data):
import pandas as pd
# Read the data from the Excel file
df = pd.read_excel('./test.xlsx')
print(df)
print("*" * 50)
# Convert each DataFrame column to a Python list
df_lst1 = df['num1'].values.tolist()
print(df_lst1)
df_lst2 = df['num2'].values.tolist()
print(df_lst2)
print("*" * 50)
# Process each column separately: split it into odd and even values
df_lst1_odd = [i for i in df_lst1 if i % 2 == 1]
df_lst1_even = [i for i in df_lst1 if i % 2 == 0]
print(df_lst1_odd)
print(df_lst1_even)
df_lst2_odd = [i for i in df_lst2 if i % 2 == 1]
df_lst2_even = [i for i in df_lst2 if i % 2 == 0]
print(df_lst2_odd)
print(df_lst2_even)
print("*" * 50)
# Assemble the original and filtered lists into a DataFrame with from_dict.
# orient='index' turns each list into one row and pads shorter rows with NaN.
# Column names: 奇数 = odd, 偶数 = even
df = pd.DataFrame.from_dict(
    {"num1": df_lst1, "num2": df_lst2, "num1奇数": df_lst1_odd, "num1偶数": df_lst1_even, "num2奇数": df_lst2_odd,
     "num2偶数": df_lst2_even}, orient='index')
print(df)
print(df.T)
# Transpose so that each list becomes a column, then save to a CSV file
df.T.to_csv("./new_test.csv", index=False)
Output:
num1 num2
0 1 11
1 2 33
2 3 22
3 4 44
4 6 55
5 5 33
6 7 88
7 9 99
**************************************************
[1, 2, 3, 4, 6, 5, 7, 9]
[11, 33, 22, 44, 55, 33, 88, 99]
**************************************************
[1, 3, 5, 7, 9]
[2, 4, 6]
[11, 33, 55, 33, 99]
[22, 44, 88]
**************************************************
0 1 2 3 4 5 6 7
num1 1 2 3 4.0 6.0 5.0 7.0 9.0
num2 11 33 22 44.0 55.0 33.0 88.0 99.0
num1奇数 1 3 5 7.0 9.0 NaN NaN NaN
num1偶数 2 4 6 NaN NaN NaN NaN NaN
num2奇数 11 33 55 33.0 99.0 NaN NaN NaN
num2偶数 22 44 88 NaN NaN NaN NaN NaN
num1 num2 num1奇数 num1偶数 num2奇数 num2偶数
0 1.0 11.0 1.0 2.0 11.0 22.0
1 2.0 33.0 3.0 4.0 33.0 44.0
2 3.0 22.0 5.0 6.0 55.0 88.0
3 4.0 44.0 7.0 NaN 33.0 NaN
4 6.0 55.0 9.0 NaN 99.0 NaN
5 5.0 33.0 NaN NaN NaN NaN
6 7.0 88.0 NaN NaN NaN NaN
7 9.0 99.0 NaN NaN NaN NaN
Contents of the output CSV file (new_test.csv): [screenshot not reproduced]
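The outputs above show every integer as a float (1.0, 2.0, ...) because the padded NaN values force a float column. One possible alternative, sketched here under the assumption that the sample values from the output stand in for ./test.xlsx, is to wrap each list in a pd.Series with the nullable "Int64" dtype. The names columns and wide are illustrative, and this is a substitute technique, not the method used in the examples.

```python
import pandas as pd

# Assumed sample values, copied from the printed output.
num1 = [1, 2, 3, 4, 6, 5, 7, 9]
num2 = [11, 33, 22, 44, 55, 33, 88, 99]

columns = {
    "num1": num1,
    "num2": num2,
    "num1奇数": [i for i in num1 if i % 2 == 1],
    "num1偶数": [i for i in num1 if i % 2 == 0],
    "num2奇数": [i for i in num2 if i % 2 == 1],
    "num2偶数": [i for i in num2 if i % 2 == 0],
}

# A dict of Series is aligned on the row labels 0..n-1, so shorter columns
# are padded automatically and no transpose is needed. The nullable "Int64"
# dtype keeps the values as integers next to the missing entries.
wide = pd.DataFrame(
    {name: pd.Series(values, dtype="Int64") for name, values in columns.items()}
)
print(wide)
```

The resulting frame already has one column per list, so it can be written with to_csv() directly.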
Method 2: use apply() with a user-defined function (the code calls it on each column, i.e. Series.apply())
Example code:
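import pandas as pd
# Read the data from the Excel file
df = pd.read_excel('./test.xlsx')
print(df)
print("*" * 50)
# Return x if it is odd; otherwise return None, which shows up as NaN in the result
def odd(x):
    if x % 2 == 1:
        return x
# Return x if it is even; otherwise return None, which shows up as NaN in the result
def even(x):
    if x % 2 == 0:
        return x
# Get each column separately (these are Series, not lists)
df_lst1 = df['num1']
df_lst2 = df['num2']
# Apply the user-defined functions element by element
df_lst1_odd = df['num1'].apply(odd)
df_lst1_even = df['num1'].apply(even)
df_lst2_odd = df['num2'].apply(odd)
df_lst2_even = df['num2'].apply(even)
# Assemble the original and filtered columns into a DataFrame with from_dict
# Column names: 奇数 = odd, 偶数 = even
df = pd.DataFrame.from_dict(
    {"num1": df_lst1, "num2": df_lst2, "num1奇数": df_lst1_odd, "num1偶数": df_lst1_even, "num2奇数": df_lst2_odd,
     "num2偶数": df_lst2_even}, orient='index')
print(df)
print(df.T)
# Transpose so that each Series becomes a column, then save to a CSV file
df.T.to_csv("./new_test.csv", index=False)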
Output:
num1 num2
0 1 11
1 2 33
2 3 22
3 4 44
4 6 55
5 5 33
6 7 88
7 9 99
**************************************************
0 1 2 3 4 5 6 7
num1 1.0 2.0 3.0 4.0 6.0 5.0 7.0 9.0
num2 11.0 33.0 22.0 44.0 55.0 33.0 88.0 99.0
num1奇数 1.0 NaN 3.0 NaN NaN 5.0 7.0 9.0
num1偶数 NaN 2.0 NaN 4.0 6.0 NaN NaN NaN
num2奇数 11.0 33.0 NaN NaN 55.0 33.0 NaN 99.0
num2偶数 NaN NaN 22.0 44.0 NaN NaN 88.0 NaN
num1 num2 num1奇数 num1偶数 num2奇数 num2偶数
0 1.0 11.0 1.0 NaN 11.0 NaN
1 2.0 33.0 NaN 2.0 33.0 NaN
2 3.0 22.0 3.0 NaN NaN 22.0
3 4.0 44.0 NaN 4.0 NaN 44.0
4 6.0 55.0 NaN 6.0 55.0 NaN
5 5.0 33.0 5.0 NaN 33.0 NaN
6 7.0 88.0 7.0 NaN NaN 88.0
7 9.0 99.0 9.0 NaN 99.0 NaN
Contents of the output CSV file (new_test.csv): [screenshot not reproduced]
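The same sparse layout can be produced without a user-defined function. The sketch below swaps in Series.where(), which keeps the values where a condition holds and puts NaN everywhere else. It assumes the sample values from the output in place of ./test.xlsx, and the name result is illustrative.

```python
import pandas as pd

# Assumed sample data: stands in for the contents of ./test.xlsx.
df = pd.DataFrame({
    "num1": [1, 2, 3, 4, 6, 5, 7, 9],
    "num2": [11, 33, 22, 44, 55, 33, 88, 99],
})

# Series.where(cond) keeps a value where cond is True and puts NaN elsewhere,
# which matches the effect of odd()/even() applied element by element.
result = pd.DataFrame({
    "num1": df["num1"],
    "num2": df["num2"],
    "num1奇数": df["num1"].where(df["num1"] % 2 == 1),
    "num1偶数": df["num1"].where(df["num1"] % 2 == 0),
    "num2奇数": df["num2"].where(df["num2"] % 2 == 1),
    "num2偶数": df["num2"].where(df["num2"] % 2 == 0),
})
print(result)
```

Because every column keeps the original row labels, each filtered value stays on the row it came from, which is why the table looks sparse.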
Fixing the sparse data in the CSV file above:
Example code:
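import pandas as pd
# Read the data from the Excel file
df = pd.read_excel('./test.xlsx')
print(df)
print("*" * 50)
# Return x if it is odd; otherwise return None, which shows up as NaN in the result
def odd(x):
    if x % 2 == 1:
        return x
# Return x if it is even; otherwise return None, which shows up as NaN in the result
def even(x):
    if x % 2 == 0:
        return x
# Get each column separately (these are Series, not lists)
df_lst1 = df['num1']
df_lst2 = df['num2']
# Apply the user-defined functions, then drop the NaN entries
df_lst1_odd = df['num1'].apply(odd).dropna()
df_lst1_even = df['num1'].apply(even).dropna()
df_lst2_odd = df['num2'].apply(odd).dropna()
df_lst2_even = df['num2'].apply(even).dropna()
# Assemble the original and filtered columns into a DataFrame with from_dict
# Note: the two original columns must be converted with list(); otherwise the
# Series are aligned by their original index labels and the output stays sparse
df = pd.DataFrame.from_dict(
    {"num1": list(df_lst1), "num2": list(df_lst2), "num1奇数": df_lst1_odd, "num1偶数": df_lst1_even,
     "num2奇数": df_lst2_odd, "num2偶数": df_lst2_even}, orient='index')
print(df)
print(df.T)
# Transpose so that each entry becomes a column, then save to a CSV file
df.T.to_csv("./new_test.csv", index=False)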
Output:
num1 num2
0 1 11
1 2 33
2 3 22
3 4 44
4 6 55
5 5 33
6 7 88
7 9 99
**************************************************
0 1 2 3 4 5 6 7
num1 1.0 2.0 3.0 4.0 6.0 5.0 7.0 9.0
num2 11.0 33.0 22.0 44.0 55.0 33.0 88.0 99.0
num1奇数 1.0 3.0 5.0 7.0 9.0 NaN NaN NaN
num1偶数 2.0 4.0 6.0 NaN NaN NaN NaN NaN
num2奇数 11.0 33.0 55.0 33.0 99.0 NaN NaN NaN
num2偶数 22.0 44.0 88.0 NaN NaN NaN NaN NaN
num1 num2 num1奇数 num1偶数 num2奇数 num2偶数
0 1.0 11.0 1.0 2.0 11.0 22.0
1 2.0 33.0 3.0 4.0 33.0 44.0
2 3.0 22.0 5.0 6.0 55.0 88.0
3 4.0 44.0 7.0 NaN 33.0 NaN
4 6.0 55.0 9.0 NaN 99.0 NaN
5 5.0 33.0 NaN NaN NaN NaN
6 7.0 88.0 NaN NaN NaN NaN
7 9.0 99.0 NaN NaN NaN NaN
Contents of the output CSV file (new_test.csv): [screenshot not reproduced]
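The sparseness comes from the row labels: filtering a Series keeps the original labels of the surviving values. A possible alternative to the list() conversion, sketched here as an assumption and not as the method used above, is to renumber each filtered Series from 0 with reset_index(drop=True). The helper keep() and the names kept_labels and compact are hypothetical, and the sample values stand in for ./test.xlsx.

```python
import pandas as pd

# Assumed sample data: stands in for the contents of ./test.xlsx.
df = pd.DataFrame({
    "num1": [1, 2, 3, 4, 6, 5, 7, 9],
    "num2": [11, 33, 22, 44, 55, 33, 88, 99],
})

# Filtering keeps the original row labels of the values that survive.
kept_labels = df["num1"][df["num1"] % 2 == 1].index.tolist()
print(kept_labels)

def keep(series, remainder):
    """Hypothetical helper: keep values with this remainder mod 2, relabelled from 0."""
    return series[series % 2 == remainder].reset_index(drop=True)

# After relabelling, the columns line up from row 0 and only the tail is padded.
compact = pd.DataFrame({
    "num1": df["num1"],
    "num2": df["num2"],
    "num1奇数": keep(df["num1"], 1),
    "num1偶数": keep(df["num1"], 0),
    "num2奇数": keep(df["num2"], 1),
    "num2偶数": keep(df["num2"], 0),
})
print(compact)
compact.to_csv("./new_test.csv", index=False)
```

With this layout no transpose is needed, since each Series already forms one column.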