Global Top 2, Domestic No.1! DingTalk AI Achieves Major Breakthrough, Outperforms OpenAI and Claude in DeepResearch Benchmark

Stock News
Nov 12

DingTalk's AI research system, Dingtalk-DeepResearch, scored 48.49 points on the DeepResearch Bench benchmark, ranking second globally and first in China and surpassing mainstream systems such as OpenAI and Claude. The system has been applied in complex scenarios such as manufacturing and supply chains, where it is reported to deliver industry-leading accuracy and robustness on heterogeneous tables, multi-stage reasoning, and multimodal generation tasks. The result marks a breakthrough both on an international benchmark and in real-world production use, placing Chinese enterprise AI technology in the top global tier.

The core of Dingtalk-DeepResearch is a multi-agent deep research framework designed for real enterprise scenarios. It integrates deep research generation, heterogeneous table parsing and reasoning, and multimodal report generation into a single system. The design mimics a team of specialists working together: some analyze tabular data, others generate reports, and others coordinate tool usage. A three-layer architecture (task-oriented agent layer, core engine layer, and data layer) supports parallel processing and multi-stage reasoning for complex tasks. For example, the system can automatically parse factory production tables with nested and merged cells and turn them into structured analysis reports.
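To make the merged-cell problem concrete, here is a minimal Python sketch of one common preprocessing step: forward-filling the blanks that vertically merged cells leave behind when a table is exported row by row. This is an illustration only, not DingTalk's parser; the function name `fill_merged_cells`, the `fill_columns` parameter, and the sample production data are all hypothetical.

```python
from typing import Optional


def fill_merged_cells(
    rows: list[list[Optional[str]]], fill_columns: set[int]
) -> list[list[str]]:
    """Forward-fill blanks left by vertically merged cells.

    Assumption: a merged cell appears as a value in its first row and as
    None in the rows below it. Only columns listed in fill_columns are
    filled; a None in any other column is treated as genuinely empty.
    """
    last_seen: dict[int, str] = {}
    filled: list[list[str]] = []
    for row in rows:
        new_row: list[str] = []
        for col, value in enumerate(row):
            if value is not None:
                last_seen[col] = value
            elif col in fill_columns and col in last_seen:
                value = last_seen[col]
            new_row.append("" if value is None else value)
        filled.append(new_row)
    return filled


# Hypothetical export of a production table: "Line A" spans two rows.
raw_rows = [
    ["Line A", "Press", "120"],
    [None, "Weld", "95"],
    ["Line B", "Paint", None],
]
clean_rows = fill_merged_cells(raw_rows, fill_columns={0})
```

After this step every row carries its own production line label, so later stages can group or aggregate rows without knowing how the cells were originally merged.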

To adapt to dynamic enterprise environments, the framework features an entropy-guided, memory-aware online learning mechanism, enabling continuous evolution without human intervention. This allows the system to learn from historical interactions and gradually adapt to different business processes and user preferences. For instance, when users repeatedly modify AI-generated report formats, the system autonomously learns and memorizes their preferences for subsequent outputs. These personalized preferences can be shared across teams or entire organizations, enhancing knowledge reuse and efficiency.
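The article does not describe how the entropy-guided mechanism works internally. As a toy illustration of the general idea, the Python sketch below uses Shannon entropy over a user's past format choices to decide whether a preference is consistent enough to remember. The class `PreferenceMemory`, its thresholds, and the format names are assumptions made for this example, not part of the DingTalk system.

```python
import math
from collections import Counter, defaultdict
from typing import Optional


def shannon_entropy(counts: Counter) -> float:
    """Entropy in bits of the empirical distribution given by counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (n / total) * math.log2(n / total) for n in counts.values() if n > 0
    )


class PreferenceMemory:
    """Hypothetical store that remembers a format only when choices are consistent."""

    def __init__(self, max_entropy_bits: float = 0.9, min_observations: int = 3) -> None:
        self.max_entropy_bits = max_entropy_bits
        self.min_observations = min_observations
        self._choices: dict[str, Counter] = defaultdict(Counter)

    def observe(self, user: str, report_format: str) -> None:
        """Record the format the user ended up with after editing a report."""
        self._choices[user][report_format] += 1

    def preferred_format(self, user: str) -> Optional[str]:
        """Return the dominant format, or None if evidence is thin or too mixed."""
        counts = self._choices.get(user)
        if counts is None or sum(counts.values()) < self.min_observations:
            return None
        if shannon_entropy(counts) > self.max_entropy_bits:
            return None  # choices are too evenly split to commit to one
        return counts.most_common(1)[0][0]


memory = PreferenceMemory()
for chosen in ["table", "table", "bullets", "table", "table"]:
    memory.observe("alice", chosen)
for chosen in ["table", "bullets", "table", "bullets"]:
    memory.observe("bob", chosen)
```

In this sketch, low entropy means the user keeps making the same choice, so the preference is applied to later outputs; high entropy means the history is still ambiguous and nothing is committed yet.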

To ensure output quality, Dingtalk-DeepResearch incorporates the DingAutoEvaluator assessment system, which conducts multi-dimensional "quality checks" on generated reports, covering data accuracy, logical coherence, and tool usage standards. If issues are detected, the system automatically feeds them back into the training process for model optimization, forming a closed-loop improvement cycle.
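The closed loop described above can be sketched as a small quality gate: each report is scored on several dimensions, and any report that falls below a threshold is queued as feedback for later optimization. This is a simplified illustration under stated assumptions, not the DingAutoEvaluator API; `QualityGate`, the two check functions, and the report fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable

Check = Callable[[dict], float]  # maps a report to a score in [0, 1]


def data_accuracy(report: dict) -> float:
    """Fraction of cited values that match the source data."""
    cited, source = report["cited_values"], report["source_values"]
    if not cited:
        return 1.0
    matches = sum(1 for key, value in cited.items() if source.get(key) == value)
    return matches / len(cited)


def tool_usage(report: dict) -> float:
    """Fraction of tool calls that completed successfully."""
    calls = report["tool_calls"]
    if not calls:
        return 1.0
    return sum(1 for call in calls if call["ok"]) / len(calls)


@dataclass
class QualityGate:
    checks: dict[str, Check]
    threshold: float = 0.8
    feedback_queue: list[dict] = field(default_factory=list)

    def evaluate(self, report: dict) -> dict[str, float]:
        """Score every dimension and queue the report if any dimension fails."""
        scores = {name: check(report) for name, check in self.checks.items()}
        failed = [name for name, score in scores.items() if score < self.threshold]
        if failed:
            self.feedback_queue.append(
                {"report_id": report["id"], "failed": failed, "scores": scores}
            )
        return scores


gate = QualityGate(checks={"data_accuracy": data_accuracy, "tool_usage": tool_usage})
scores = gate.evaluate(
    {
        "id": "r-1",
        "cited_values": {"output": 120, "defects": 3},
        "source_values": {"output": 120, "defects": 4},
        "tool_calls": [{"ok": True}],
    }
)
```

Here the report misquotes one of two values, so it fails the data accuracy check and lands in `feedback_queue`, where a training or review process could pick it up.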

Dingtalk-DeepResearch is now deployed in production business scenarios. In supply chain management, it analyzes complex cross-departmental tabular data to produce procurement recommendations. In manufacturing, it converts raw equipment data into visual analysis reports that support predictive maintenance and decision-making. According to the company, all core functions have been validated on international benchmarks.

DingTalk's CTO Zhu Hong stated, "Dingtalk-DeepResearch combines adaptive optimization and multimodal reasoning to create a flexible, enterprise-grade AI framework capable of handling complex and evolving real-world tasks. This technology is accelerating deployment in AI search, AI tables, automated workflows, and Agent platforms, bringing cutting-edge AI closer to practical production needs and delivering tangible value to enterprises."

