Last updated: October 2, 2024, 15:43
Task for this section
Use OpenCompass to evaluate the internlm2-chat-1.8b model on the ceval dataset, document the reproduction steps, and take screenshots.
Walkthrough
Environment setup
NumPy 2.0 has since been released, so the NumPy version is pinned explicitly below. The image is Cuda11.7-conda, and the GPU is 10% of an A100.

    conda create -n opencompass python=3.10
    conda activate opencompass
    conda install pytorch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 pytorch-cuda=12.1 -c pytorch -c nvidia -y

    # Run the clone from /root/project so that the next cd resolves
    git clone -b 0.2.4 https://github.com/open-compass/opencompass
    cd /root/project/opencompass
    pip install -e .

    apt-get update
    apt-get install cmake
    pip install -r requirements.txt
    # Pin NumPy below 2.0
    pip install numpy==1.23.5
    pip install protobuf
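After the install, it can be worth confirming that the NumPy pin actually took effect, since a later `pip install` may pull in a newer release. The following is a minimal sketch using only the standard library; the helper names `major_version`, `is_below_major`, and `installed_numpy_ok` are hypothetical and not part of OpenCompass.

```python
from importlib import metadata
from typing import Optional


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.23.5'."""
    return int(version.split(".")[0])


def is_below_major(version: str, limit: int = 2) -> bool:
    """True when the major version is strictly below `limit`."""
    return major_version(version) < limit


def installed_numpy_ok() -> Optional[bool]:
    """Check the installed NumPy; return None if NumPy is not installed."""
    try:
        return is_below_major(metadata.version("numpy"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("NumPy below 2.0:", installed_numpy_ok())
```

Run it inside the `opencompass` conda environment; a result of `False` suggests rerunning `pip install numpy==1.23.5`.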
Dataset preparation

    # Unpack the evaluation datasets into /root/project/opencompass/data/
    cp /share/temp/datasets/OpenCompassData-core-20231110.zip /root/project/opencompass
    cd /root/project/opencompass
    unzip OpenCompassData-core-20231110.zip
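A quick way to confirm the unzip step is to check that a non-empty `data` directory now exists under the project root. This is a small sketch under the assumption, taken from the comment above, that the archive unpacks into `data/`; the function name `data_dir_ready` is hypothetical.

```python
from pathlib import Path


def data_dir_ready(project_root: str) -> bool:
    """Return True if <project_root>/data exists and holds at least one entry."""
    data_dir = Path(project_root) / "data"
    return data_dir.is_dir() and any(data_dir.iterdir())


if __name__ == "__main__":
    # Path used in this walkthrough; adjust it if the checkout lives elsewhere.
    print(data_dir_ready("/root/project/opencompass"))
```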
List all configurations related to InternLM and C-Eval

    python tools/list_configs.py internlm ceval

Result (screenshot)
Evaluating with command-line arguments
Open hf_internlm2_chat_1_8b.py under configs/models/hf_internlm/ in the opencompass folder and paste in the following code:

    from opencompass.models import HuggingFaceCausalLM

    models = [
        dict(
            type=HuggingFaceCausalLM,
            abbr='internlm2-1.8b-hf',
            path="/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b",
            tokenizer_path='/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b',
            model_kwargs=dict(
                trust_remote_code=True,
                device_map='auto',
            ),
            tokenizer_kwargs=dict(
                padding_side='left',
                truncation_side='left',
                use_fast=False,
                trust_remote_code=True,
            ),
            max_out_len=100,
            min_out_len=1,
            max_seq_len=2048,
            # batch_size=8,
            batch_size=16,
            run_cfg=dict(num_gpus=1, num_procs=1),
        )
    ]
Debug and run

    export MKL_SERVICE_FORCE_INTEL=1
    python run.py --datasets ceval_gen --models hf_internlm2_chat_1_8b --debug
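The same invocation can be assembled from Python, which makes it easier to swap the dataset or model name in a script. The sketch below only builds the argument list and the environment; the helpers `build_eval_command` and `build_env` are hypothetical names, and the flags are the ones used in the command above.

```python
import os
import shlex
from typing import Dict, List, Optional


def build_eval_command(datasets: str, models: str, debug: bool = True) -> List[str]:
    """Assemble the run.py invocation used in this walkthrough."""
    cmd = ["python", "run.py", "--datasets", datasets, "--models", models]
    if debug:
        cmd.append("--debug")
    return cmd


def build_env(base: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy an environment and set MKL_SERVICE_FORCE_INTEL=1, like the export above."""
    env = dict(os.environ if base is None else base)
    env["MKL_SERVICE_FORCE_INTEL"] = "1"
    return env


cmd = build_eval_command("ceval_gen", "hf_internlm2_chat_1_8b")
print(shlex.join(cmd))
# To actually launch it from the OpenCompass checkout:
#   import subprocess
#   subprocess.run(cmd, env=build_env(), check=True)
```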
Evaluating with a configuration file
Create the config file:

    cd /root/project/opencompass/configs
    conda activate opencompass
    touch eval_tutorial_demo.py

Put the following into eval_tutorial_demo.py:

    from mmengine.config import read_base

    with read_base():
        from .datasets.ceval.ceval_gen import ceval_datasets
        from .models.hf_internlm.hf_internlm2_chat_1_8b import models as hf_internlm2_chat_1_8b_models

    datasets = ceval_datasets
    models = hf_internlm2_chat_1_8b_models

Then run the evaluation from the repository root:

    cd /root/project/opencompass
    python run.py configs/eval_tutorial_demo.py --debug
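The results come out about the same as with the command-line approach. This is simply config inheritance; anyone who has worked with OpenMMLab will recognize this configuration style.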
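The inheritance idea can be pictured with a plain-Python analogy. This is not how mmengine implements `read_base`; it only mirrors the outcome, in which lists defined in base files are re-bound to the top-level names `datasets` and `models`. The dictionaries and the `compose_config` helper are hypothetical stand-ins, and the dataset entry is made up for illustration.

```python
# Hypothetical stand-ins for the two base config files (not real OpenCompass data).
CEVAL_BASE = {"ceval_datasets": [{"abbr": "ceval-demo-subset"}]}
MODEL_BASE = {"models": [{"abbr": "internlm2-1.8b-hf", "batch_size": 16}]}


def compose_config(dataset_base: dict, model_base: dict) -> dict:
    """Re-bind the inherited lists to the names the demo config exposes."""
    return {
        "datasets": list(dataset_base["ceval_datasets"]),
        "models": list(model_base["models"]),
    }


config = compose_config(CEVAL_BASE, MODEL_BASE)
print([m["abbr"] for m in config["models"]])
```

The demo config does the same thing in two assignments: it adds no settings of its own and only selects which inherited lists to evaluate.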
Summary