fix(mbpp): add special oracles #212

Merged
ganler merged 2 commits into master from add_mbpp_oracle
Jun 18, 2024

Conversation

@soryxie
Collaborator

@soryxie soryxie commented Jun 9, 2024

fix #210
Add special oracles for

  • Mbpp/581
  • Mbpp/558

Each of these problems has more than one solution that should be accepted.

  • Tested with base_input and plus_input
  • Tested with evaluate framework
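To make the idea concrete, here is a minimal sketch of what "more than one accepted solution" could look like as a checker. It is not the oracle code merged in this PR: the names `surface_area_candidates`, `digit_distance_candidates`, `oracle_581`, `oracle_558` and the `atol` default are hypothetical, and the two readings per problem are taken from the test script later in this thread. Non-negative integer inputs are assumed for Mbpp/558.

```python
import math


def surface_area_candidates(base, height):
    """Hypothetical: the two answers for Mbpp/581 that appear in this thread."""
    # Reading 1: `height` is treated as the slant height of each triangular face.
    slant_reading = base * base + 2 * base * height
    # Reading 2: `height` is the vertical height; derive the slant height, then round.
    slant = math.sqrt((base / 2) ** 2 + height ** 2)
    vertical_reading = round(base * base + 2 * base * slant)
    return slant_reading, vertical_reading


def oracle_581(output, base, height, atol=1e-6):
    """Accept the output if it matches either reading within `atol` (assumed tolerance)."""
    return any(abs(output - expected) <= atol
               for expected in surface_area_candidates(base, height))


def digit_distance_candidates(n1, n2):
    """Hypothetical: the two answers for Mbpp/558 (non-negative ints assumed)."""
    s1, s2 = str(n1), str(n2)
    # Reading 1: compare digits pairwise, stopping at the shorter number.
    truncated = sum(abs(int(a) - int(b)) for a, b in zip(s1, s2))
    # Reading 2: left-pad the shorter number with zeros before comparing.
    width = max(len(s1), len(s2))
    padded = sum(abs(int(a) - int(b))
                 for a, b in zip(s1.zfill(width), s2.zfill(width)))
    return truncated, padded


def oracle_558(output, n1, n2):
    """Accept the output if it equals either reading."""
    return output in digit_distance_candidates(n1, n2)
```

With inputs of equal digit length the two Mbpp/558 readings coincide, so the extra acceptance only matters when the numbers differ in length.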

@soryxie soryxie requested a review from ganler June 9, 2024 03:09
@ganler
Member

ganler commented Jun 10, 2024

@soryxie
Collaborator Author

soryxie commented Jun 17, 2024

> maybe also need to fix https://github.com/evalplus/evalplus/blob/master/tools/mbpp/to_original_fmt.py

I think it's not necessary to modify it. This PR merely adds two additional valid solutions, while the original canonical_solution is also valid :)

@soryxie
Collaborator Author

soryxie commented Jun 17, 2024

I have tested this on EvalPlus: the change does not affect other problems, and it allows more valid solutions to pass.

@ganler
Member

ganler commented Jun 18, 2024

> > maybe also need to fix https://github.com/evalplus/evalplus/blob/master/tools/mbpp/to_original_fmt.py
>
> I think it's not necessary to modify it. This PR merely adds two additional valid solutions, while the original canonical_solution is also valid :)

Because we have now added special oracles, we should also reflect them when exporting to the original format, just like:

if entry_point == "find_zero":
    imports.add("import math")
    aux_fn = inspect.getsource(_poly) + "\n"
    assertion = f"assert _poly(*candidate(*inp), inp) <= {atol}"

@soryxie
Collaborator Author

soryxie commented Jun 18, 2024

I see.
The original-format dataset now works with this test script:

# test 581
exec_code_0 = """\
def surface_Area(base, height):
    return (base * base) + (2 * base * height)
"""

exec_code_1 = """\
import math
def surface_Area(base_edge, height):
    slant_height = math.sqrt((base_edge / 2) ** 2 + height ** 2)
    base_area = base_edge ** 2
    lateral_area = 4 * (base_edge * slant_height) / 2
    total_surface_area = base_area + lateral_area
    return round(total_surface_area)
"""
exec(exec_code_0+data[581]['test'], globals())
exec(exec_code_1+data[581]['test'], globals())

# test 558
exec_code_0 = """\
def digit_distance_nums(n1, n2):
    return sum([abs(int(c1) - int(c2)) for c1, c2 in zip(str(n1), str(n2))])
"""

exec_code_1 = """\
def digit_distance_nums(num1: int, num2: int) -> int:
    str_num1 = str(num1)
    str_num2 = str(num2)

    max_length = max(len(str_num1), len(str_num2))

    padded_num1 = str_num1.zfill(max_length)
    padded_num2 = str_num2.zfill(max_length)

    return sum(abs(int(digit1) - int(digit2)) for digit1, digit2 in zip(padded_num1, padded_num2))
"""

exec(exec_code_0+data[558]['test'], globals())
exec(exec_code_1+data[558]['test'], globals())

@ganler ganler merged commit f86db47 into master Jun 18, 2024

Development

Successfully merging this pull request may close these issues.

MbppPlus has problems in reference solutions or instructions

2 participants