Tsinghua KEG Lab and Zhipu AI jointly launched CogAgent, a large image understanding model
Bit News: Tsinghua KEG Lab recently partnered with Zhipu AI to launch CogAgent, a new-generation large image understanding model. Built on the previously released CogVLM, the model works as a visual GUI agent: it perceives the GUI through the visual modality rather than text, which gives it a more comprehensive and direct view of the interface for planning and decision-making. CogAgent reportedly accepts high-resolution image input of 1120×1120 and offers visual question answering, visual grounding, and GUI agent capabilities. It is also reported to have ranked first in general capability on 9 classic image understanding benchmarks (including VQAv2, STVQA, DocVQA, TextVQA, MM-VET, and POPE).