Agentic AI Statistics: 2025 Report

Last updated: June 11, 2025

Our agency conducted a research study on the rise of autonomous AI agents – their use cases, usage statistics, strengths, and weaknesses. The original study ran from January 14, 2025 to May 12, 2025, but our team has continued to update this report as new information becomes available.

The study consisted of a survey of more than 6,100 agentic AI users, whom we asked a series of questions over a three-month period. We segmented the responses into seven statistical categories:

Top AI Agents By MAUs

We listed and rank-ordered the top agentic AI systems by monthly active users as of June 2025.

Task Completion Rate By Platform

We evaluated each agentic AI platform on its performance when asked to complete complex, multi-step tasks.

Research Depth Sources Per Task

We had respondents ask each agentic AI system which sources it used as it performed a task, and recorded how many sources each system relied on.

Trust For Agentic Vs Manual Search

When respondents needed to research a subject, we had them ask the agentic AI bot to perform the search and also perform the same search manually. We then asked which set of results they trusted more.

Time Efficiency Of Agentic Tools

We requested that respondents measure the time it took them to complete the series of tasks manually vs. having the agentic AI bot do it for them.

Most Refused Task Types

We noted the situations where AI agents refused to perform tasks.

User Satisfaction By Task Type

We had respondents rate their satisfaction across several different task types.

Below, you can find the results of our study, which comprise some of the early research on how autonomous AI agents are being used by businesses and consumers.

The Top Autonomous AI Agents of 2025

In this section, we list the top autonomous AI agents by number of active users as of June 2025. Monthly active users (MAUs) are one of the strongest indicators of engagement and adoption, and the growth rate of MAUs over time reveals whether a platform is gaining traction in the market or losing momentum.

To create this list, we compiled user data from each of the most popular autonomous AI agents, using founder interviews, first-party published claims, and third-party research.

| Rank | Autonomous AI Agent | Description | Creator / Platform | Monthly Active Users (Estimated) | Quarterly Growth |
|------|---------------------|-------------|--------------------|----------------------------------|------------------|
| 1 | OpenAI Code Interpreter | Executes complex data and math tasks, integrated into ChatGPT for analytics and CSV parsing. | OpenAI / GPT-4 | 2.4 million | +19% |
| 2 | AutoGPT | Autonomous agent chaining LLM calls to execute tasks via memory and reasoning loops. | Toran Bruce Richards / GPT-4 or GPT-3.5 | 2.1 million | +19% |
| 3 | Google Project Astra | Real-time, multimodal assistant with computer vision and environment awareness. | Google / Google Gemini | 846,000 | +17% |
| 4 | Google Project Mariner | Browser-native agent automating web tasks by simulating human interactions via Chrome extension. | Google / Google Gemini | 516,000 | +14% |
| 5 | Claude Computer Use | Desktop-native agent performing browser and OS-level actions like clicking and typing. | Anthropic / Claude 3.5 | 317,000 | +11% |
| 6 | Adept ACT-1 | Agent controlling software tools by observing UI and simulating user input. | Adept AI / Proprietary multimodal model | 129,000 | +11% |
| 7 | OpenDevin | Open-source software engineering agent planning, coding, debugging, and testing in dev environments. | Community-driven / Model-agnostic | 54,000 | +9% |
| 8 | GPT-Engineer | AI agent writing complete software projects from a spec file, planning and coding autonomously. | Community-driven / OpenAI and Anthropic models | 31,000+ | +12% |

Task Performance and Completion Rates

A core focus of this study was evaluating agentic system performance on complex, multi-step tasks. Five types of tasks were assigned to 487 users, including itinerary planning, multi-vendor purchasing, financial budgeting, and comparative analysis.

| Platform | Task Completion Rate |
|----------|----------------------|
| Claude Computer Use | 86% |
| AutoGPT | 81% |
| OpenAI Code Interpreter | 73% |
| Google Project Mariner | 69% |
| Google Project Astra | 65% |

The mean completion rate across the five platforms was 74.8%. Claude Computer Use led with 86% successful task completions without human intervention, followed by AutoGPT (81%) and OpenAI Code Interpreter (73%). Tasks such as single-vendor comparison and travel planning achieved the highest completion success (87%).

Tasks involving legal interpretations and niche SaaS comparisons showed the highest failure or partial-completion rates. Notably, only 18% of users felt the need to follow up on successful completions, indicating high trust in agent responses.

Research Depth: Sources Per Task

To assess whether autonomous AI agents truly provide academic-quality research support, users were asked to identify how many sources were cited by the platforms for each task. We also noted the minimum and maximum number of sources used by each AI agent across the entire experiment. 

| Platform | Median Sources | Source Range | Notes |
|----------|----------------|--------------|-------|
| AutoGPT | 7 | 3–15 | Iteratively searches the web and other resources to fulfill complex objectives. |
| Google Project Mariner | 5 | 2–8 | Automates web tasks by navigating and extracting information from multiple web pages. |
| Google Project Astra | 4 | 2–7 | Utilizes multimodal inputs, including visual and auditory data, to gather contextual information. |
| Claude Computer Use | 2 | 1–4 | Primarily interacts with local applications and files; may access web sources if instructed. |
| OpenAI Code Interpreter | 2 | 1–4 | Processes user-uploaded files and data; may access additional sources if browsing is enabled. |

Our team’s main observation from this data was that the most-used AI agents tended to draw from the most sources; however, on average, today’s AI agents still fall short of the research depth of a skilled human researcher.

Trust Gap Between Agentic and Manual Search

Trust is a key dimension of user satisfaction when people use AI agents for search and discovery tasks. We asked users to rate their trust in manual search results versus agentic results for the same tasks. The results were as follows:

| Trust Preference | Percentage of Users |
|------------------|---------------------|
| Trusted Manual Results More | 54% |
| Trusted Agentic Results More | 34% |
| Trusted Both Equally | 13% |

Manual search results were more trusted by a significant margin (20 points). Among users with technical backgrounds, the trust gap in favor of manual search widened to 37 points, with these respondents most often citing AI hallucinations and weak sourcing.

Time Efficiency of Agentic Tools

Time savings will be a key factor in the adoption of agentic AI tools by both businesses and individuals. We asked users to execute a range of tasks both manually and with an AI agent, then compared the time spent in order to gauge the current state of agentic tools.

| Task Type | Agentic Time | Manual Time | Time Saved (%) |
|-----------|--------------|-------------|----------------|
| Trip Planning | 9.2 minutes | 38.5 minutes | 76% |
| SaaS Comparative Analysis | 8.7 minutes | 27.0 minutes | 68% |
| Budget Optimization | 6.1 minutes | 21.3 minutes | 71% |
| Learning Recommendations | 5.3 minutes | 14.6 minutes | 64% |
| B2B Vendor Sourcing | 10.0 minutes | 22.4 minutes | 55% |

Across all task types, using an AI agent saved an average of 66.8% of the time required to complete the same task manually, highlighting one of the clearest benefits of agentic AI.
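The per-task percentages and the 66.8% average can be reproduced from the raw timings; here is a minimal sketch in Python, with the values hard-coded from the table above:

```python
# Agentic vs. manual completion times (minutes), from the table above.
timings = {
    "Trip Planning": (9.2, 38.5),
    "SaaS Comparative Analysis": (8.7, 27.0),
    "Budget Optimization": (6.1, 21.3),
    "Learning Recommendations": (5.3, 14.6),
    "B2B Vendor Sourcing": (10.0, 22.4),
}

# Percentage of time saved per task, rounded to whole percents.
saved = {
    task: round((manual - agentic) / manual * 100)
    for task, (agentic, manual) in timings.items()
}

# Mean of the rounded per-task savings across the five task types.
average_saved = sum(saved.values()) / len(saved)  # 66.8
```

Note that the 66.8% figure is the mean of the rounded per-task percentages; averaging the unrounded values would give a fractionally different result.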

Most-Refused Agentic Task Types

As much as we hope to rely on AI agents, they won’t do everything. High task refusal rates pose a significant barrier to adoption of agentic AI tools and, conversely, ensure an ongoing need for human involvement in industries such as law and medicine. Our study found that approximately 8.9% of user requests were rejected outright by agentic platforms, most often due to ethical concerns, insufficient information, or speculative content. The table below shares the most common types of rejected requests.

| Task Type | Refusal Rate | Refusal Reason |
|-----------|--------------|----------------|
| Legal Counsel | 32% | Interpreting laws or offering personalized legal advice falls outside most AI agents’ regulatory boundaries, as doing so may constitute unauthorized practice of law. |
| Reverse Engineering | 21% | Reverse engineering AI algorithms, decompiling security- or copyright-protected software, or analyzing proprietary firmware are all against most AI agents’ ethical and legal standards. |
| Financial Investment Guidance | 18% | Recommending specific stocks, constructing portfolios, or making personalized investment decisions is considered high-risk and typically restricted by AI agents to avoid violating financial regulations or offering unlicensed advice. |
| Speculative Predictions | 15% | Most AI agents discourage forecasting market trends, political outcomes, or future events, as it often leads to unreliable outputs and misrepresents the system’s capabilities. |
| Health Risk Assessments | 14% | Diagnosing conditions or offering personalized medical guidance is explicitly limited in most AI systems to comply with healthcare regulations like HIPAA or FDA guidance. |

Refusal rates varied across platforms: Google Project Astra rejected the highest share of the queries tested (11.4%), while Claude Computer Use was the most permissive (6.8%).

User Satisfaction by Task Type

We analyzed user satisfaction on a 1–10 scale (1 = very unsatisfied, 10 = very satisfied) across six task categories in order to gauge how effectively AI agents completed each type of task:

  • Informational: Tasks wherein the AI agent is asked to provide defined information, such as simple definitional queries or explanations of topics that require little to no judgment
  • Comparative: Tasks asking the AI agent to provide a comparison of two or more items
  • Navigational: Tasks asking the AI agent to open another program or app and complete a subtask within that program or app
  • Exploratory: Tasks that help with open-ended discovery or brainstorming
  • Transactional: Tasks wherein the AI agent completes a purchase or another transaction
  • Generative: Tasks wherein the AI agent creates documents, images, code segments, or other content

| Task Type | Example | Avg. Satisfaction (1–10) |
|-----------|---------|--------------------------|
| Informational | “What is quantum computing?” | 8.2 |
| Comparative | “Compare the iPhone 16 to the iPhone 16 Pro” | 7.9 |
| Navigational | “Open Spotify and play my Release Radar.” | 7.5 |
| Exploratory | “What are some fun activities to do between meetings on a business trip to DC?” | 7.1 |
| Transactional | “Book a flight from JFK to MIA on JetBlue next Tuesday morning.” | 6.3 |
| Generative | “Create a calculator that tells me the ROI a company would get from switching its CRM.” | 5.8 |
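The ranking of task types by satisfaction can be reproduced programmatically; a minimal sketch, with the scores hard-coded from the table above:

```python
# Average satisfaction scores (1–10) by task type, from the table above.
satisfaction = {
    "Informational": 8.2,
    "Comparative": 7.9,
    "Navigational": 7.5,
    "Exploratory": 7.1,
    "Transactional": 6.3,
    "Generative": 5.8,
}

# Rank task types from most to least satisfying.
ranked = sorted(satisfaction, key=satisfaction.get, reverse=True)

best, worst = ranked[0], ranked[-1]  # "Informational", "Generative"
```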

In our study, informational tasks scored highest, largely because basic information discovery has been refined through mass generative AI chatbot usage since late 2022. Generative and transactional tasks scored lowest due to frequent errors, as well as agentic AI’s relative newness, which means these systems have seen comparatively little training and personalization.

Evan Bailyn

Evan Bailyn is a best-selling author and award-winning speaker on the subjects of SEO and AI-powered Search. Contact Evan here.