## TODO

- [x] Release dataset
- [x] Release evaluation code
- [ ] EvalAI server setup

## :fire: News
## Evaluation
You can evaluate your model by running our evaluation code [eval.py](evaluation/eval.py). Note that access to the GPT-4 API is required, as defined at line 387 of `eval.py`.
To use our example evaluation code, you need to define your model initialization function, such as:

```python
modelname_init()
```
at line 357 of `eval.py`, and the model answer function, such as:

```python
modelname_answer()
```
at line 226 of `eval.py`.
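For reference, here is a minimal sketch of what such a pair of functions might look like, assuming a Hugging Face `transformers` model. The names `mymodel_init` and `mymodel_answer`, the function signatures, and the checkpoint path are illustrative assumptions, not part of `eval.py`; adapt them to the actual call sites at lines 357 and 226.

```python
# Illustrative sketch only -- the signatures and checkpoint are assumptions,
# not the interface hard-coded in eval.py.
from transformers import AutoModelForCausalLM, AutoTokenizer

def mymodel_init():
    """Load the model and tokenizer once, before evaluation starts."""
    tokenizer = AutoTokenizer.from_pretrained("my-org/my-model")  # hypothetical checkpoint
    model = AutoModelForCausalLM.from_pretrained("my-org/my-model", device_map="auto")
    model.eval()
    return model, tokenizer

def mymodel_answer(model, tokenizer, question):
    """Return the model's answer string for a single dataset question."""
    inputs = tokenizer(question, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```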
Alternatively, you may prepare your model's results and submit them to the EvalAI server. The results file should be formatted as follows:
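The official submission schema will be fixed when the EvalAI challenge goes live (see the TODO above). Purely as an illustration, a results file for this kind of evaluation could be a JSON list pairing each question ID with the model's answer; the field names below are assumptions, not the official format:

```python
# Hypothetical results-file writer -- the official EvalAI schema is not
# shown in this README, so the field names here are assumptions.
import json

results = [
    {"question_id": "0001", "answer": "The model's answer to question 0001."},
    {"question_id": "0002", "answer": "The model's answer to question 0002."},
]

with open("results.json", "w") as f:
    json.dump(results, f, indent=2)
```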