Describing Unseen Videos via Multi-Modal Cooperative Dialog Agents

Publication
European Conference on Computer Vision
Date