Error during evaluation
Hello.
I ran an evaluation as shown below, using a corpus that had already been chunked and a QA set of generated questions and answers.
from autorag.evaluator import Evaluator

evaluator = Evaluator(
    qa_data_path=qa_path,
    corpus_data_path=corpus_path,
    project_dir=project_dir,
)
evaluator.start_trial(yaml_path)
--------
However, it fails with the following error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[109], line 8
      1 from autorag.evaluator import Evaluator
      3 evaluator = Evaluator(
      4     qa_data_path=qa_path,
      5     corpus_data_path=corpus_path,
      6     project_dir=project_dir,
      7 )
----> 8 evaluator.start_trial(yaml_path)

File ~/anaconda3/envs/py312/lib/python3.12/site-packages/autorag/evaluator.py:138, in Evaluator.start_trial(self, yaml_path, skip_validation, full_ingest)
    133 from autorag.validator import Validator  # resolve circular import
    135 validator = Validator(
    136     qa_data_path=self.qa_data_path, corpus_data_path=self.corpus_data_path
    137 )
--> 138 validator.validate(yaml_path)
    140 os.environ["PROJECT_DIR"] = self.project_dir
    142 trial_name = self.__get_new_trial_name()

File ~/anaconda3/envs/py312/lib/python3.12/site-packages/autorag/validator.py:92, in Validator.validate(self, yaml_path, qa_cnt, random_state)
     85 sample_corpus_df.to_parquet(corpus_path.name, index=False)
     87 evaluator = Evaluator(
     88     qa_data_path=qa_path.name,
     89     corpus_data_path=corpus_path.name,
...
---> 57     raise ValueError(f"doc_id: {id_} not found in corpus_data.")
     58 else:
     59     return fetch_result[column_name].iloc[0]

ValueError: doc_id: 2f2352f1-b413-46aa-907a-f0ae35817d2f not found in corpus_data.
------------
As shown above, the error says that this doc_id could not be found in corpus_data.
However, when I look it up myself, the ID is actually there.
> corpus_df = pd.read_parquet(corpus_path)
> '2f2352f1-b413-46aa-907a-f0ae35817d2f' in list(corpus_df['doc_id'])
True
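
The single-ID check above could be extended to every ID that the QA set references, which would show whether this is an isolated case. This is only a minimal sketch: it assumes the QA file keeps its ground-truth document IDs in a `retrieval_gt` column of nested lists, and `flatten_ids` and `find_missing_doc_ids` are hypothetical helper names, not part of AutoRAG.

```python
from collections.abc import Iterable


def flatten_ids(value):
    """Yield every ID found in an arbitrarily nested list/array structure."""
    if isinstance(value, str):
        # A string is a single ID, not a container to iterate over.
        yield value
    elif isinstance(value, Iterable):
        for item in value:
            yield from flatten_ids(item)
    elif value is not None:
        # Any other non-None scalar is converted to its string form.
        yield str(value)


def find_missing_doc_ids(retrieval_gt_rows, corpus_doc_ids):
    """Return the sorted IDs referenced by the QA rows but absent from the corpus."""
    known = set(corpus_doc_ids)
    referenced = set(flatten_ids(retrieval_gt_rows))
    return sorted(referenced - known)


# Hypothetical usage with the files from this post
# (the "retrieval_gt" column name is an assumption):
# qa_df = pd.read_parquet(qa_path)
# corpus_df = pd.read_parquet(corpus_path)
# missing = find_missing_doc_ids(
#     qa_df["retrieval_gt"].tolist(), corpus_df["doc_id"].tolist()
# )
# print(len(missing), missing[:5])
```

An empty result would mean that every referenced ID exists in the full corpus file. Since the traceback passes through Validator.validate, which writes out a sampled corpus (sample_corpus_df), the lookup that fails may be running against that sample rather than the full file.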
What could be causing this error?