Does AI-generated content work for SEO? We conducted a 6-month study comparing AI-generated and human-generated content from June to December 2023 to answer this question.
How Effective is AI-Generated Content for SEO?
Below, we lay out our study’s design, strengths & limitations, results, conclusions, and key findings.
We designed our study as a direct comparison of human-generated content vs AI-generated content on an established website. The website we chose is the home of a well-established company, has thousands of inbound links, and has an Ahrefs Domain Rating in the high 70s. It contains several subdomains. The subdomain we chose as our control publishes only human-generated content and has existed for just over 6 years. The subdomain we chose as our experiment began producing AI-generated content at the start of our study but published human-generated content for the 27 months prior. Thus, the comparison tracks two subdomains that were engaged in roughly the same activity until the experimental subdomain changed course and began publishing AI-generated content only.
Study Strengths and Limitations
We made the following observations about the integrity and limitations of our study:
- The articles produced on the control and experimental subdomain were all approximately the same length (avg of 1,358 words + 2 images for control; 1,192 words + 2 images for experimental).
- While the experimental subdomain had existed for a shorter period of time than the control (~2 years vs ~6 years), the difference is de minimis as research shows that Google confers full ranking capability to domains that consistently publish content for 2 years.
- The AI-generated content was lightly edited by humans to correct errors and include calls-to-action for products; thus, this comparison between AI-generated and human-generated content is not pure. We estimate 6% of every “AI-generated” article was written by a human.
- The keywords targeted by both the control and experimental subdomains were roughly equivalent in competitiveness, ranging from 40 to 60 on Ahrefs’ Keyword Difficulty scale. However, keywords could cluster around the top or bottom of that range within a given month, which would influence that month’s average ranking. It is unlikely that this effect would significantly impact the key findings of this study.
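The keyword clustering concern in the last point can be checked with a few lines of code. The sketch below is only an illustration: the record layout, the function names, and the sample values are hypothetical and are not data or tooling from the study. It computes the mean keyword difficulty per month for each subdomain and the difference between the two, so that months with a lopsided keyword mix stand out.

```python
from collections import defaultdict
from statistics import mean


def monthly_mean_difficulty(records):
    """Return {(month, group): mean keyword difficulty} for (month, group, kd) records."""
    buckets = defaultdict(list)
    for month, group, kd in records:
        buckets[(month, group)].append(kd)
    return {key: mean(values) for key, values in buckets.items()}


def difficulty_gaps(records):
    """Return {month: control mean KD minus experimental mean KD}.

    Months missing either group are skipped.
    """
    means = monthly_mean_difficulty(records)
    months = sorted({month for month, _group in means})
    return {
        m: means[(m, "control")] - means[(m, "experimental")]
        for m in months
        if (m, "control") in means and (m, "experimental") in means
    }


# Placeholder values for illustration only; NOT data from the study.
sample = [
    (1, "control", 42), (1, "control", 58),
    (1, "experimental", 45), (1, "experimental", 55),
    (2, "control", 41), (2, "control", 43),
    (2, "experimental", 57), (2, "experimental", 59),
]

gaps = difficulty_gaps(sample)
for month, gap in gaps.items():
    print(f"Month {month}: KD gap (control - experimental) = {gap}")
```

A gap near zero suggests the two subdomains targeted comparably difficult keywords that month; a large gap in either direction would be a reason to read that month’s average ranking with caution.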
The first chart compares the average ranking of newly-published content on the control (human-generated) subdomain to the experimental (AI-generated) subdomain.
[Chart: Average Ranking of Newly-Published Content*. Series: Avg Ranking – Control; Avg Ranking – Experimental]
*Ranking snapshots taken 1 week after publish date
The next chart tracks the average ranking of 8 articles published during the first month of the campaign on both the control and experimental subdomains.
[Chart: Average Ranking of First Month Content. Series: Avg Ranking – Control; Avg Ranking – Experimental]
Our research team spent several weeks studying the data and ultimately reached the following conclusions:
- The AI-generated content ranked similarly to the human-generated content in months 1-3, but began to lose ground in Month 4, and continued to lose ground until the end of our study.
- By Month 6, new AI-generated content was ranking, on average, 3 positions lower than new human-generated content. Moreover, the trajectory of the lines in the first graph indicates that the gap was likely to continue widening.
- Over the course of the study, AI-generated content continued to rank lower, concluding the 6-month period 1.5 positions lower than where it started. By comparison, human-generated content ranked 0.4 positions higher than where it started by the end of the study.
- The trajectory of ranking changes for first-month content indicates that AI-generated content would continue to rank lower had the study period been extended.
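The two measures used in these conclusions, the gap between the groups in a given month and each group’s net movement from start to finish, are simple arithmetic on the monthly average rankings. The following is a minimal sketch of that arithmetic, assuming each series is a list of monthly average positions where a smaller number is a better ranking. The function names and the sample values are hypothetical placeholders and do not reproduce the study’s data.

```python
def ranking_gap(control, experimental):
    """Per-month gap: experimental average rank minus control average rank.

    A positive value means the experimental content ranks lower (worse).
    """
    return [round(e - c, 2) for c, e in zip(control, experimental)]


def net_change(series):
    """Positions gained between the first and last month.

    Positive means the content moved up (its rank number got smaller);
    negative means it moved down.
    """
    return round(series[0] - series[-1], 2)


# Placeholder monthly averages for illustration only; NOT data from the study.
control_avg = [10.0, 9.0, 8.0]
experimental_avg = [10.0, 11.0, 13.0]

print("Gap by month:", ranking_gap(control_avg, experimental_avg))
print("Control net change:", net_change(control_avg))
print("Experimental net change:", net_change(experimental_avg))
```

Because a lower position number is a better ranking, the sign conventions are easy to mix up; the docstrings state which direction each function treats as positive.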
The key findings of our study are at the top of this article. These findings are the results of our team’s analysis of the study’s data and conclusions.
Overall, AI-generated content is treated similarly to human-generated content on a domain that has already earned Google’s trust. But this effect seems to last only 1-2 months. Afterwards, the AI-generated content loses ground slowly but consistently.
It is unlikely Google recognized the AI-generated content and docked its overall trust score, as the company recently stated that it is indifferent to how content is created as long as it’s high quality. The more likely reason for the AI-generated content’s lower rankings is that its quality is below that of human-generated content (assuming the human-generated content was written by an expert and follows best practices).
Finally, our team added that AI chatbots such as ChatGPT, Bard, and Grok are very useful for content creation, just not as a writing tool. They found the best uses of generative AI to be related to research prior to writing, including: (1) training writers on their target audience and personas; (2) summarizing complex primary source materials; and (3) generating charts, graphs, and tables.
If you would like a PDF copy of this study, reach out to us through our contact form.