
Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}} #3

@LinZichuan

Description


Hi, when I run `bash scripts/run_generalqa_qwen3_judge.sh`, I get the following errors. Is this due to a missing GPT-4o API key?

```
(main_task pid=388584) ====================================================================================================
(main_task pid=388584) ====================================================================================================
(main_task pid=388584) Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}
(main_task pid=388584) messages: [{'role': 'system', 'content': [{'type': 'text', 'text': "You are an intelligent chatbot designed for evaluating the correctness of generative outputs for question-answer pairs.\nYour task is to compare the predicted answer with the correct answer and determine if they match meaningfully. Here's how you can accomplish the task:\n------\n##INSTRUCTIONS:\n- Focus on the meaningful match between the predicted answer and the correct answer.\n- Consider synonyms or paraphrases as valid matches.\n- Evaluate the correctness of the prediction compared to the answer."}]}, {'role': 'user', 'content': [{'type': 'text', 'text': 'I will give you a question related to an image and the following text as inputs:\n\n1. Question Related to the Image: Who is the author of this book?\nAnswer the question with a short phrase.\n2. Ground Truth Answer: Antonio Graceffo\n3. Model Predicted Answer: Antonio Graceffo\n\nYour task is to evaluate the model's predicted answer against the ground truth answer, based on the context provided by the question related to the image. Consider the following criteria for evaluation:\n- Relevance: Does the predicted answer directly address the question posed, considering the information provided by the given question?\n- Accuracy: Compare the predicted answer to the ground truth answer. You need to evaluate from the following two perspectives:\n(1) If the ground truth answer is open-ended, consider whether the prediction accurately reflects the information given in the ground truth without introducing factual inaccuracies. If it does, the prediction should be considered correct.\n(2) If the ground truth answer is a definitive answer, strictly compare the model's prediction to the actual answer. Pay attention to unit conversions such as length and angle, etc. As long as the results are consistent, the model's prediction should be deemed correct.\nOutput Format:\nYour response should include an integer score indicating the correctness of the prediction: 1 for correct and 0 for incorrect. Note that 1 means the model's prediction strictly aligns with the ground truth, while 0 means it does not.\nThe format should be "Score: 0 or 1"'}]}]
(main_task pid=388584) ====================================================================================================
(main_task pid=388584) Warning: Failed after 3 attempts
(main_task pid=388584) Warning: Failed after 3 attempts
(main_task pid=388584) Warning: Failed after 3 attempts
(main_task pid=388584) Warning: Failed after 3 attempts
(main_task pid=388584) Warning: Failed after 3 attempts
(main_task pid=388584) Warning: Failed after 3 attempts
(main_task pid=388584) Warning: Failed after 3 attempts
(main_task pid=388584) Warning: Failed after 3 attempts
(main_task pid=388584) Warning: Failed after 3 attempts
(main_task pid=388584) top 10 time consuming in reward fn: []
(main_task pid=388584) there are 512 invalid samples in this batch: ['f3a54872-c197-478b-a3b9-2739c5cca43c', '7ca01723-2205-4575-9a78-852d888451f6', '58bf17d8-3df0-4d16-9e89-bd00ec4d4e1d', 'dc26a6fa-ec1c-4cb7-b563-217c3c3fd459', '6fed4532-d31f-43dd-820a-6a53ff4ac96f']
(main_task pid=388584) total time: 23.259926319122314
(main_task pid=388584) ("Initial validation metrics: {'val/test_score/open_source/total': "
(main_task pid=388584) "np.float64(0.0), 'val/test_score/open_source/format_reward': "
(main_task pid=388584) "np.float64(0.0), 'val/test_score/open_source/acc_reward': np.float64(0.0)}")
(main_task pid=388584) step:0 - val/test_score/open_source/total:0.000 - val/test_score/open_source/format_reward:0.000 - val/test_score/open_source/acc_reward:0.000
```
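For what it's worth, a 404 `Resource not found` from the judge endpoint usually points to a wrong base URL or model/deployment name rather than a missing key (a missing or invalid key typically returns 401/403). A minimal sketch to sanity-check which URL an OpenAI-compatible client would POST to; the environment variable names below are assumptions, so check this repo's script/config for the actual ones:

```python
import os

def judge_chat_url(base_url: str) -> str:
    """Return the chat-completions URL an OpenAI-compatible client posts to."""
    return base_url.rstrip("/") + "/chat/completions"

if __name__ == "__main__":
    # OPENAI_API_BASE / OPENAI_BASE_URL are assumed names; check the repo's config.
    base = os.environ.get("OPENAI_API_BASE") or os.environ.get("OPENAI_BASE_URL")
    if not base:
        print("No base URL set - the client may default to api.openai.com")
    else:
        print("Judge requests will POST to:", judge_chat_url(base))
        # Verify this path actually exists on your server, e.g.:
        #   curl -i -X POST "$OPENAI_API_BASE/chat/completions" \
        #        -H "Authorization: Bearer $OPENAI_API_KEY" ...
```

Note that Azure OpenAI uses a different path shape (`/openai/deployments/<deployment-name>/chat/completions?api-version=...`), and there a 404 very often means the deployment name in the URL does not exist in the resource.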
