ADHA: A Benchmark for Recognizing Adverbs describing Human Actions in Videos

Bo Pang, Kaiwen Zha, Cewu Lu*

(* corresponding author:


ADHA (“Adverbs Describing Human Actions”) is the first benchmark for a new problem: recognizing human action adverbs (HAA). We regard this as a first step in moving computer vision from pattern recognition toward real AI. Key features of ADHA are a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and exhaustive labeling of the actions that occur simultaneously in each video. We conduct an in-depth analysis of how current effective models for action recognition and image captioning perform on adverb recognition, and the results show that these methods are not adequate for the task. Moreover, we incorporate expression knowledge into those models and show that it can significantly improve HAA recognition performance.

Three-Stream Model



PBLSTM results. “T1-F1” means task 1 with feature 1. “-e” means using expression knowledge. Task 2 does not recognize actions, so “Act” has no values for task 2.

Two-stream model results. “-S” means the spatial stream. “-M” means the motion stream. “-F” means the fusion of both streams. Task 2 does not recognize actions, so “Act” has no values for task 2.

Hybrid model results. “-H” means the hybrid model. Task 2 does not recognize actions, so “Act” has no values for task 2.


The code for the model is available on GitHub.


The ADHA dataset consists of 11,736 short videos with associated labels drawn from an action set of 32 actions and an adverb set of 51 adverbs. The dataset also provides tracking results for the target person in each video, produced with a semi-automatic annotation framework. In total, 16,716 persons are labeled.
Target Person
Second of video
Action-Adverb Pairs
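
To make the annotation contents concrete, here is a minimal Python sketch of how one labeled video could be represented in memory. It is not the released file format. The class names, field names, bounding-box layout, and the example values are all hypothetical, and reading "Second of video" as the clip duration is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical box layout: (x, y, width, height) in pixels.
Box = Tuple[float, float, float, float]


@dataclass
class PersonAnnotation:
    """Labels for one tracked target person (illustrative schema only)."""
    person_id: int
    # Tracking result: frame index -> bounding box of the target person.
    track: Dict[int, Box] = field(default_factory=dict)
    # Each pair is (action, adverb), e.g. ("walk", "slowly").
    action_adverb_pairs: List[Tuple[str, str]] = field(default_factory=list)

    def adverbs_for(self, action: str) -> List[str]:
        """Return every adverb attached to the given action, in label order."""
        return [adv for act, adv in self.action_adverb_pairs if act == action]


@dataclass
class VideoAnnotation:
    """All labeled persons in one short video (illustrative schema only)."""
    video_id: str
    duration_seconds: float  # assumed meaning of "Second of video"
    persons: List[PersonAnnotation] = field(default_factory=list)


# Made-up example record; none of these values come from the dataset.
example = VideoAnnotation(
    video_id="clip_0001",
    duration_seconds=3.0,
    persons=[
        PersonAnnotation(
            person_id=0,
            track={0: (10.0, 20.0, 50.0, 120.0)},
            action_adverb_pairs=[
                ("walk", "slowly"),
                ("walk", "calmly"),
                ("wave", "happily"),
            ],
        )
    ],
)

print(example.persons[0].adverbs_for("walk"))
```

Keeping persons as a list inside each video reflects the counts above: there are more labeled persons than videos, so some videos contain more than one target person.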

Action Set

brush_hair chew clap climb_stairs dive draw_sword drink eat
fall_floor hit hug kick kiss pick pour pullup
punch push run shake_hands shoot_bow shoot_gun walk wave
sit smoke stand swing_baseball sword sword_exercise talk throw

Adverb Set

promptly fast kindly carefully seriously barely easily slowly
quietly precisely gently surprisedly lightly heavily happily freely
sadly proudly comfortably calmly vigorously nervously reluctantly professionally
politely painfully angrily patiently bitterly incidentally frantically intently
gracefully flatly confidently weakly solemnly expertly inexorably triumphantly
hesitantly dramatically officially anxiously hard amazingly wearily clumsily
sweetly excitedly ironically
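
The two label sets above can be turned into index mappings for training. The following is a minimal sketch, assuming labels are indexed in the order listed on this page (the released files may use a different order) and that the adverbs of one action are encoded as a multi-hot vector. The names `ACTION_INDEX`, `ADVERB_INDEX`, and `encode_adverbs` are hypothetical helpers, not part of the released code.

```python
# Label sets copied from the lists above; index order is an assumption.
ACTIONS = """
brush_hair chew clap climb_stairs dive draw_sword drink eat
fall_floor hit hug kick kiss pick pour pullup
punch push run shake_hands shoot_bow shoot_gun walk wave
sit smoke stand swing_baseball sword sword_exercise talk throw
""".split()

ADVERBS = """
promptly fast kindly carefully seriously barely easily slowly
quietly precisely gently surprisedly lightly heavily happily freely
sadly proudly comfortably calmly vigorously nervously reluctantly professionally
politely painfully angrily patiently bitterly incidentally frantically intently
gracefully flatly confidently weakly solemnly expertly inexorably triumphantly
hesitantly dramatically officially anxiously hard amazingly wearily clumsily
sweetly excitedly ironically
""".split()

# Sanity checks against the sizes stated in the dataset description.
assert len(ACTIONS) == 32
assert len(ADVERBS) == 51

ACTION_INDEX = {name: i for i, name in enumerate(ACTIONS)}
ADVERB_INDEX = {name: i for i, name in enumerate(ADVERBS)}


def encode_adverbs(adverbs):
    """Return a multi-hot list of length 51 with a 1 for each given adverb."""
    vec = [0] * len(ADVERBS)
    for adv in adverbs:
        vec[ADVERB_INDEX[adv]] = 1  # raises KeyError for an unknown adverb
    return vec


target = encode_adverbs(["slowly", "calmly"])
print(ACTION_INDEX["walk"], sum(target))
```

A multi-hot target is used here because a single action may be described by more than one adverb at once; a model predicting one adverb per action would use a plain class index instead.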


Action Distribution

Adverb Distribution


To learn more about the dataset, you can download the related paper here.


                   title={Human Action Adverb Recognition: ADHA Dataset and A Hybrid Model},
                   author={Pang, Bo and Zha, Kaiwen and Lu, Cewu},
                   booktitle={arXiv preprint},


Click here to download the dataset and the related materials.