Latest News

14/05/2025

On May 13, Tencent announced that it had collaborated with Shanghai AI Lab, Fudan University, and Shanghai Intelligent Institute on a new research project, UnifiedReward-Think, which it describes as the first unified multimodal reward model with long-chain reasoning capabilities. According to the announcement, the reward model "learned to think" across a range of visual tasks, yielding significant improvements in the accurate assessment of complex visual generation and understanding tasks, in cross-task generalization, and in reasoning interpretability. The project has been fully open-sourced, including the model, datasets, training scripts, and evaluation tools. (Source: Interface)