Introduction
The model weight format conversion and sharding tool converts the weights of mainstream open-source models between the Huggingface and Megatron frameworks, and supports sharding Megatron model weights according to different DP, TP, and PP parallelism strategies.
How to use
It is recommended to run the tool from the model-convert image:
docker pull harbor.ctyuncdn.cn/ai-algorithm-dmx/nvidia-model-convert:v0.1.3
docker run -itd -u root --ipc=host --net=host --name=model-convert -v /data:/data harbor.ctyuncdn.cn/ai-algorithm-dmx/nvidia-model-convert:v0.1.3 \
/bin/bash
# attach a shell to the detached container, then switch to the tool directory
docker exec -it model-convert /bin/bash
cd /work/convert_tool
python model_convert.py --load_platform=huggingface --save_platform=megatron --common_config_path=config/llama-7b.json --tensor_model_parallel_size=2 --pipeline_model_parallel_size=4 --data_parallel_size=1 --use_distributed_optimizer --load_ckpt_path=/workspace/tmp/llama-7b-hf/ --save_ckpt_path=/tmp/llama-7b/tp2-pp4/
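The file passed via --common_config_path describes the model architecture to the converter. The JSON files shipped under config/ define the actual schema; the sketch below only illustrates what such a file might contain for llama-7b, and every field name in it is an assumption based on the public llama-7b architecture, not taken from the tool.

import json

# Hypothetical model description for --common_config_path.
# Field names are assumptions; consult the JSON files shipped
# under config/ for the schema the tool actually expects.
llama_7b = {
    "num_layers": 32,
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "max_position_embeddings": 2048,
    "vocab_size": 32000,
}

with open("config/my-llama-7b.json", "w") as f:
    json.dump(llama_7b, f, indent=2)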
Parameter description:
--load_platform : source model format, one of ['huggingface', 'megatron', 'mcore']
--save_platform : target model format, one of ['huggingface', 'megatron', 'mcore']
--load_ckpt_path : path to load the checkpoint from
--save_ckpt_path : path to save the checkpoint to
--common_config_path : path to the common model config
--megatron_path : base directory of the Megatron repository
--no_load_optim : do not convert the optimizer state
--no_save_optim : do not save the optimizer state
--model_type_custom : custom model type
--safetensors : save weights in the safetensors format
--torch_dtype : data type, one of ["float16", "float32", "bfloat16"]
--vocab_size : vocab size
--use_distributed_optimizer : use the distributed optimizer
--tensor_model_parallel_size : default=1, target tensor model parallel size
--pipeline_model_parallel_size : default=1, target pipeline model parallel size
--data_parallel_size : default=1, target data parallel size
--expert_parallel_size : default=None, target expert parallel size
--pad_vocab_size_to : default=None, pad the vocab size to this value
--num_layers_per_virtual_pipeline_stage : default=None, number of layers per virtual pipeline stage
--num_experts : default=None, number of experts in MoE (None means no MoE)
--separate-layernorm-and-collinear : separate layernorm and attention/mlp column-parallel linear layers
--huggingface_base_model_path : default=None, path to the huggingface base model, used to get the tokenizer files
Usage examples:
Converting a huggingface model to a megatron model
python model_convert.py \
--load_platform=huggingface \
--save_platform=megatron \
--common_config_path=config/llama2-7b.json \
--tensor_model_parallel_size=2 \
--pipeline_model_parallel_size=4 \
--data_parallel_size=1 \
--use_distributed_optimizer \
--load_ckpt_path=/data/models/LLaMA2-7B-Chat/ \
--save_ckpt_path=/work/models/mg/llama2-7b/tp2-pp4/
# Here, load_ckpt_path should be replaced with the source directory of the model to be converted
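After the conversion finishes, it is worth checking that the expected number of shards was written. Assuming the tool follows the standard Megatron-LM checkpoint layout (a release/ directory holding one mp_rank_{tp:02d}_{pp:03d} subdirectory per tensor/pipeline rank pair, as the load path in the next example suggests), a tp2-pp4 conversion should yield 8 shard directories; a minimal check:

import os

save_ckpt_path = "/work/models/mg/llama2-7b/tp2-pp4/"
tp, pp = 2, 4

# One shard directory per (tensor, pipeline) rank pair; with tp2-pp4
# that is 8 directories, mp_rank_00_000 through mp_rank_01_003.
# This layout is an assumption; adjust if the tool writes a different tree.
for tp_rank in range(tp):
    for pp_rank in range(pp):
        shard = os.path.join(save_ckpt_path, "release",
                             f"mp_rank_{tp_rank:02d}_{pp_rank:03d}")
        print(shard, "OK" if os.path.isdir(shard) else "MISSING")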
Converting a megatron model to a huggingface model
python model_convert.py \
--load_platform=megatron \
--save_platform=huggingface \
--common_config_path=config/llama2-7b.json \
--tensor_model_parallel_size=2 \
--pipeline_model_parallel_size=4 \
--data_parallel_size=1 \
--use_distributed_optimizer \
--load_ckpt_path=/work/models/mg/llama2-7b/tp2-pp4/release/ \
--save_ckpt_path=/work/models/hf/llama2-7b/ \
--huggingface_base_model_path=/data/models/LLaMA2-7B-Chat/ \
--safetensors
# Here, load_ckpt_path should be replaced with the source directory of the model to be converted; huggingface_base_model_path should be replaced with the original huggingface model directory (as downloaded from HF, including the weights, tokenizer, and related files)
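A quick sanity check for the converted huggingface checkpoint is to load it with the standard transformers API. Nothing below is specific to the conversion tool; it assumes the tokenizer files from --huggingface_base_model_path were placed alongside the converted weights (which the flag's description suggests) -- otherwise, load the tokenizer from the base model directory instead.

from transformers import AutoModelForCausalLM, AutoTokenizer

path = "/work/models/hf/llama2-7b/"

# Load the converted checkpoint; torch_dtype="auto" keeps the dtype
# recorded in the saved config.
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype="auto")

# Generate a few tokens to confirm the weights produce sensible output.
inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))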