Commit ad3d28d

add eval
1 parent 16bcab0 commit ad3d28d

3 files changed, 777 additions and 2 deletions

README.md

Lines changed: 28 additions & 2 deletions
@@ -13,7 +13,7 @@
 
 ## TODO
 - [x] Release dataset
-- [ ] Release evaluation code
+- [x] Release evaluation code
 - [ ] EvalAI server setup
 
 ## :fire: News
@@ -66,7 +66,33 @@ Each entry in the dataset contains the following fields:
 
 
 ## Evaluation
-To be released soon.
+
+You can do evaluation by running our evaluation code [eval.py](evaluation/eval.py). Note that access to the GPT-4 API is required, as defined in line 387 of `eval.py`.
+To use our example evaluation code, you need to define your model initialization function, such as:
+```python
+modelname_init()
+```
+at line 357 of eval.py, and the model answer function, such as:
+```python
+modelname_answer()
+```
+at line 226 of eval.py.
+
+Alternatively, you may prepare your model results and submit them to the EvalAI server. The model results format should be as follows:
+
+```json
+{
+"detailed_results": [
+{
+"video_id": "eng_vid1",
+"model_answer": "a</s>",
+},
+...
+]
+}
+```
+
+
 
 
 ## License Agreement
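
The results format that the README change describes for EvalAI submission can be produced with a few lines of Python. This is a minimal sketch, not part of the commit: the helper names `build_submission` and `write_submission` and the output path are hypothetical, and the only fields assumed are the two shown in the README sample (`video_id` and `model_answer`). Whether the server expects additional fields is not stated in the diff.

```python
import json


def build_submission(answers):
    """Wrap {video_id: model_answer} pairs in the layout shown in the README.

    `answers` is a hypothetical input shape chosen for this sketch.
    """
    return {
        "detailed_results": [
            {"video_id": video_id, "model_answer": model_answer}
            for video_id, model_answer in answers.items()
        ]
    }


def write_submission(answers, path):
    """Serialize the results to `path` as UTF-8 JSON (path is an assumption)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(build_submission(answers), f, ensure_ascii=False, indent=2)


# "eng_vid1" and "a</s>" are the sample values from the README itself.
example = build_submission({"eng_vid1": "a</s>"})
print(json.dumps(example, indent=2))
```

Generating the file with `json.dump` also avoids the trailing comma that appears after the `"model_answer"` entry in the README sample, which strict JSON parsers reject.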
