The usage of generative artificial intelligence (AI) tools based on large
language models, including ChatGPT, Bard, and Claude, for text generation has
many exciting applications with the potential for phenomenal productivity
gains. One issue is authorship attribution when using AI tools. This is
especially important in an academic setting where the inappropriate use of
generative AI tools may hinder student learning or stifle research by creating
a large amount of automatically generated derivative work. Existing plagiarism
detection systems can trace the source of submitted text but are not yet
equipped with methods to accurately detect AI-generated text. This paper
introduces the idea of direct origin detection and evaluates whether generative
AI systems can recognize their output and distinguish it from human-written
texts. We argue why current transformer-based models may be able to detect
their own generated text and perform a small empirical study using zero-shot
learning to investigate whether that is the case. Results reveal varying
capabilities of AI systems to identify their generated text. Google’s Bard
model exhibits the largest capability of self-detection with an accuracy of
94%, followed by OpenAI’s ChatGPT with 83%. In contrast, Anthropic’s
Claude model does not seem able to self-detect.
Analysis of Authorship Attribution with Generative AI Tools
In recent years, generative artificial intelligence (AI) tools such as ChatGPT, Bard, and Claude have proven capable of generating human-like text, opening up new applications across many industries. Their use, however, raises important questions.
One such consideration is the issue of authorship attribution when using AI tools. This becomes particularly critical in academic settings, where the authenticity and originality of work are highly valued. The ability to trace the origin of text generated by AI is crucial to avoid plagiarism and maintain academic integrity.
Currently, plagiarism detection systems are not equipped to accurately detect AI-generated text. Therefore, it is essential to explore new methods and approaches that enable the identification of AI-generated content. This paper proposes the concept of direct origin detection and seeks to evaluate whether generative AI systems can recognize their own output and differentiate it from human-written texts.
The interdisciplinary nature of this research is evident. It encompasses elements from computer science, linguistics, and education. The development and evaluation of AI models require expertise in natural language processing and machine learning. Simultaneously, understanding the impact on student learning and the academic research landscape necessitates insights from education and pedagogy experts.
One interesting aspect of this study is the use of transformer-based models. Transformers have revolutionized natural language processing due to their ability to capture contextual dependencies efficiently. The authors propose that these transformer-based models may have the potential to self-detect their generated text.
The empirical study, conducted using zero-shot learning, shows that the ability to self-detect varies across AI systems. Google’s Bard model achieves the highest self-detection accuracy at 94%, followed by OpenAI’s ChatGPT at 83%. Anthropic’s Claude model, in contrast, seems to lack self-detection abilities, suggesting room for improvement.
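To make the evaluation setup concrete, the following is a minimal sketch of how a zero-shot self-detection accuracy could be computed. It is not the authors' code: the prompt wording, the `ask_model` callable (a stand-in for whatever chat API is used), the yes/no parsing rule, and the sample data are all assumptions introduced here for illustration.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical prompt wording; the source does not give the exact prompt.
PROMPT_TEMPLATE = (
    "Was the following text generated by you? Answer 'yes' or 'no'.\n\n"
    "Text:\n{text}"
)


def parse_answer(reply: str) -> bool:
    """Return True if the reply starts with 'yes' (model claims authorship)."""
    return reply.strip().lower().startswith("yes")


def self_detection_accuracy(
    ask_model: Callable[[str], str],
    samples: Iterable[Tuple[str, bool]],
) -> float:
    """Fraction of samples where the model's yes/no answer matches the label.

    Each sample is (text, is_own_output): True if the text was generated by
    the model under test, False if it was written by a human.
    """
    correct = 0
    total = 0
    for text, is_own_output in samples:
        # Zero-shot: the model sees only the question and the text,
        # with no labeled examples in the prompt.
        reply = ask_model(PROMPT_TEMPLATE.format(text=text))
        if parse_answer(reply) == is_own_output:
            correct += 1
        total += 1
    if total == 0:
        raise ValueError("no samples to evaluate")
    return correct / total


def always_yes(prompt: str) -> str:
    """Stand-in for a real chat API call; always claims authorship."""
    return "Yes, I wrote this."


# Illustrative, balanced placeholder data (not from the study).
samples = [
    ("placeholder text a", True),
    ("placeholder text b", False),
    ("placeholder text c", True),
    ("placeholder text d", False),
]

accuracy = self_detection_accuracy(always_yes, samples)
```

With the balanced placeholder data above, the `always_yes` stand-in scores 0.5. On a balanced test set, an accuracy near 50% is therefore what a model with no self-detection ability would produce, which is the baseline against which figures such as 94% and 83% should be read.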
Overall, this research opens up important avenues for ensuring the responsible use of generative AI tools. By developing techniques for authorship attribution within AI-generated text, academia can protect its integrity and foster meaningful student learning. Further exploration of this area could involve refining detection methods, understanding the limitations of different AI models, and exploring ways to incorporate such tools in educational environments while maintaining ethical practices.