Model partitioning:

python tools/model_partition.py --config_file tasks/medusa_llama/config/vicuna_7b_config.json

Run pipeline inference:

python pipeline_inference.py --world 4 --rank xxx --config_file tasks/medusa_llama/config/vicuna_7b_config.json --load_in_8bit
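The inference command takes a `--rank` argument, so one process is launched per pipeline stage. A minimal sketch of launching all ranks, assuming `--world 4` means ranks 0 through 3 and that all stages run on a single host (flag names are taken verbatim from the command above; adjust if ranks run on separate machines):

```bash
# Hypothetical single-host launch: one pipeline_inference.py process per rank.
# Assumes ranks 0..3 for --world 4; replace with per-machine launches if the
# pipeline stages are distributed across hosts.
for RANK in 0 1 2 3; do
  python pipeline_inference.py \
    --world 4 \
    --rank "$RANK" \
    --config_file tasks/medusa_llama/config/vicuna_7b_config.json \
    --load_in_8bit &
done
wait  # block until all rank processes exit
```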