Guotai Haitong: Grok 4 takes AI to the next level; watch the computing power and vertical-domain tracks

date
18/07/2025
GMT Eight
AI solution providers with vertical domain expertise and data barriers will stand out in competition.
Guotai Haitong said in a research report that on July 10, Grok 4 surpassed existing models with benchmark-topping, generation-skipping performance, marking xAI's lead into the next generation of AI. The firm expects this to push companies across the industry to adopt cutting-edge technology and accelerate innovation, moving the whole sector to a higher stage of development. Cloud service providers and data center operators should benefit directly from continued growth in demand for computing power, and AI solution providers with vertical-domain advantages and data barriers should stand out from competitors.
Guotai Haitong's main points are as follows:
Deep thinking combined with collective decision-making reshapes the paradigm of superhuman reasoning
Grok 4's reasoning capability represents a discontinuous leap: its pre-training compute and reasoning capability are more than ten times those of its predecessor, and its training scale is a hundred times that of Grok 2. On Humanity's Last Exam (HLE), a set of 2,500 doctoral-level problems spanning the natural sciences, engineering, and other disciplines, Grok 4 scored 45%, about twice the score of Gemini 2.5 Pro, the previous leader. According to the report, Grok 4 not only surpasses the academic capabilities of human researchers across the board but also sets new records, with perfect scores, in authoritative tests such as GPQA and AIME25. Grok 4 Heavy, the multi-agent version, combines deep thinking with collaborative error correction among agents and achieved a perfect score on AIME25. The report argues that reasoning at this level has rendered traditional human-designed tests meaningless, and that the model's expanding capabilities are advancing the discovery of new technologies and physical laws, with breakthrough scientific results expected within two years.
Closing the loop on real-world scenarios and validating cross-industry decision and execution capabilities
Grok 4 has made marked progress on real-world problems. The voice feature responds twice as fast, with latency cut in half; the British-accented voice Eve gives conversations a natural, expressive tone; and the report rates the user experience as significantly better than competitors'. In Vending-Bench, a test of vending machine management, Grok 4 reached a net worth of 4,694.15, more than double that of second-placed Claude Opus 4, demonstrating its ability to execute long-term strategy. Since the API with a 256K context window opened, Grok 4 has helped the Arc Institute screen millions of experimental data points to generate research hypotheses in biomedicine, has become a preferred tool in financial decision-making, and has even been used to develop a first-person shooter game in just 4 hours, demonstrating that it can integrate tools to solve complex cross-industry tasks end to end.
Focusing on pixel-level video generation to build a new ecosystem of human-machine collaborative perception
Multimodal capability remains Grok 4's clear weakness, particularly image understanding and generation, which need substantial improvement; the model has not yet reached human-level visual and auditory perception or interaction. The next generation of research and development will focus on video generation, using end-to-end "pixels in, pixels out" training to enable AI video creation on the X platform, with plans to launch next year a system, integrated with Unreal Engine, that automatically generates 3D assets for the gaming and film industries.
In the short term, the focus will be on strengthening specialized coding models and improving image recognition. The ultimate goal is a superintelligent agent that combines deep thinking, real-time response, and multimodal collaboration and that reshapes how humans and machines work together.
Risk warning: intensifying technological competition, insufficient computing power supply, and data privacy compliance risks.