fix random answer bugs #2

DragonFive · 2024-12-31T05:14:20Z

This PR fixes two bugs:

The index2ans code has a bug: every row shares the last row's index2ans. As a result, neither the response nor the correct answer can be matched to a proper choice, so the answer falls back to a random one, as the code in parse_multi_choice_response shows.

    if len(candidates) == 0:  # still not get answer, randomly choose one.
        if default_answer is None:
            pred_index = random.choice(all_choices)
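A minimal sketch of how this kind of sharing can arise, and one way to avoid it. The helper `build_index2ans`, the toy `rows`, and the task tuples are hypothetical and only illustrate the pattern; the actual code in demo_api_call.py may differ.

```python
import string


def build_index2ans(options):
    """Map choice letters ('A', 'B', ...) to the option texts of one row."""
    letters = string.ascii_uppercase[:len(options)]
    return dict(zip(letters, options))


# Toy rows (hypothetical data, not taken from OmniBench).
rows = [
    {"options": ["cat", "dog"]},
    {"options": ["red", "blue"]},
]

# Buggy pattern: one dict object is reused and overwritten for every row,
# so every task ends up referencing the last row's options.
shared = {}
buggy_tasks = []
for row in rows:
    shared.clear()
    shared.update(build_index2ans(row["options"]))
    buggy_tasks.append((row, shared))

# Fixed pattern: build a fresh dict for each row.
fixed_tasks = [(row, build_index2ans(row["options"])) for row in rows]

print(buggy_tasks[0][1])  # first row sees the last row's options
print(fixed_tasks[0][1])  # first row sees its own options
```

With the buggy pattern, no option text of the first row can match its own response, so the parser reaches the random fallback shown above.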

The answer parsing code has a bug: it strips the period at the end of the correct answer, so the answer will be random because the following code no longer finds a match.

    for index, ans in index2ans.items():
        print(f"{index}: {ans.lower()}, {response.lower()}")
        if ans.lower() in response.lower():
            candidates.append(index)
            index_ans = False

For example, when my first row is:

args_list[0]:('question': 'What are the men doing?',
'audio_path': './OmniBench/mm_data/audio/2_009_four_people.mp3',
'image_path': './OmniBench/mm_data/image/2_009_four_people.png',
'index': 0,
'answer': 'The man in jeans is playing a crossword puzzle.',
'options': ['The man in jeans is taking notes from the newspaper.', 'The man in purple is reading the newspaper.', 'The man in jeans is playing a crossword puzzle.', 'The man on the table is doing a crossword puzzle.']}, 0, input_file='./OmniBench/dataset/batch-5_1142_20240817.jsonl',

),
{'A': 'The man in jeans is taking notes from the newspaper.', 'B': 'The man in purple is reading the newspaper.', 'C': 'The man in jeans is playing a crossword puzzle.', 'D': 'The man on the table is doing a crossword puzzle.'},
['A', 'B', 'C', 'D'])

Because of bug 1, the index2ans for all rows will be:

index2ans:{'A': 'Yes, both feature the same music, although the audio is a comedic rendition.', 'B': 'No, the audio is a different composition entirely.', 'C': 'Yes, but the audio is from a different section of the piece.', 'D': 'No, the audio is a similar but distinctly different Beethoven symphony.'}

After fixing bug 1, the index2ans for this row will be:

index2ans:{'A': 'The man in jeans is taking notes from the newspaper.', 'B': 'The man in purple is reading the newspaper.', 'C': 'The man in jeans is playing a crossword puzzle.', 'D': 'The man on the table is doing a crossword puzzle.'}

Because of bug 2, the correct answer becomes 'The man in jeans is playing a crossword puzzle', which cannot be matched against any option because it has no period at the end. The real answer is 'The man in jeans is playing a crossword puzzle.'
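The mismatch can be reproduced with the example row above. This is a sketch only: `find_candidates` is a hypothetical helper that mirrors the substring check quoted earlier, and `str.strip(".")` stands in for whatever step removes the trailing period in the real parsing code.

```python
# Options for the example row, keyed by choice letter (copied from the row above).
index2ans = {
    "A": "The man in jeans is taking notes from the newspaper.",
    "B": "The man in purple is reading the newspaper.",
    "C": "The man in jeans is playing a crossword puzzle.",
    "D": "The man on the table is doing a crossword puzzle.",
}
correct_answer = "The man in jeans is playing a crossword puzzle."


def find_candidates(response, index2ans):
    """Return the choice letters whose option text appears in the response."""
    return [
        index
        for index, ans in index2ans.items()
        if ans.lower() in response.lower()
    ]


# Assumption: simulates the parsing step that drops the trailing period (bug 2).
stripped_answer = correct_answer.strip(".")

print(find_candidates(stripped_answer, index2ans))  # [] -> random fallback
print(find_candidates(correct_answer, index2ans))   # ['C']
```

Because option C still ends with a period while the stripped answer does not, the option text is no longer a substring of the answer, and the candidate list stays empty.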


DragonFive added 3 commits December 30, 2024 20:36

Update demo_api_call.py

ecda863

the index2ans is same for different row, this commit fix tie

Update demo_api_call.py

609fc66

fix

fix correct_answer random

21a7f75

fix correct_answer random
