In addition, we trained Phi-4-reasoning-vision-15B with skills that enable agents to interact with graphical user interfaces by interpreting screen content and selecting actions. With strong high-resolution perception and fine-grained grounding capabilities, Phi-4-reasoning-vision-15B is a compelling base model for training agentic models, such as those that navigate desktop, web, and mobile interfaces by identifying and localizing interactive elements like buttons, menus, and text fields. Because of its low inference-time requirements, it is well suited to interactive environments where low latency and compact model size are essential.
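To make the grounding-to-action step concrete, here is a minimal sketch of how an agent harness might turn a localized element into a click coordinate. It assumes the model's grounding output has already been parsed into a bounding box normalized to [0, 1]. That format, and the names `Box` and `click_point`, are illustrative assumptions rather than the model's documented output or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical normalized bounding box; each coordinate lies in [0, 1]."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def click_point(box: Box, screen_width: int, screen_height: int) -> tuple[int, int]:
    """Return the pixel at the center of a normalized box for a given screen size."""
    ordered_x = 0.0 <= box.x_min <= box.x_max <= 1.0
    ordered_y = 0.0 <= box.y_min <= box.y_max <= 1.0
    if not (ordered_x and ordered_y):
        raise ValueError("box coordinates must be ordered and lie in [0, 1]")

    # Scale the box center from normalized units to pixels.
    cx = (box.x_min + box.x_max) / 2 * screen_width
    cy = (box.y_min + box.y_max) / 2 * screen_height

    # Clamp so a box touching the right or bottom edge still yields a valid pixel index.
    x = min(int(cx), screen_width - 1)
    y = min(int(cy), screen_height - 1)
    return x, y


# Example: a button predicted in the lower-middle region of a 1920x1080 screen.
button = Box(x_min=0.25, y_min=0.5, x_max=0.75, y_max=1.0)
target = click_point(button, 1920, 1080)
```

Clicking the box center is a simple default. A real harness would also need to account for display scaling and for any resizing applied to the screenshot before it reaches the model.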