In this article, we explore the use of large language models like ChatGPT as auditors for causal networks. Causal networks are commonly used to model complex relationships between variables in various fields, but they often contain erroneous edges. Correcting these networks typically requires domain expertise that may not be readily available. Our proposed method involves presenting ChatGPT with a causal network, one edge at a time, to gain insights about edge directionality, potential confounders, and mediating variables. We analyze ChatGPT’s perspectives on each causal link and generate visualizations summarizing these viewpoints for human analysts to make informed decisions. By integrating large language models, automated causal inference, and human expertise, we aim to develop comprehensive causal models for any scenario. This paper introduces early results with a prototype of our approach.
Abstract: Causal networks are widely used in many fields, including epidemiology, social science, medicine, and engineering, to model the complex relationships between variables. While it can be convenient to infer these models algorithmically from observational data, the resulting networks are often plagued by erroneous edges. Auditing and correcting these networks may require domain expertise that is frequently unavailable to the analyst. We propose the use of large language models such as ChatGPT as auditors for causal networks. Our method presents ChatGPT with a causal network, one edge at a time, to produce insights about edge directionality, possible confounders, and mediating variables. We ask ChatGPT to reflect on various aspects of each causal link, and we then produce visualizations that summarize these viewpoints so that the human analyst can direct the edge, gather more data, or test further hypotheses. We envision a system in which large language models, automated causal inference, and the human analyst and domain expert work hand in hand as a team to derive holistic and comprehensive causal models for any given scenario. This paper presents first results obtained with an emerging prototype.
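
The edge-by-edge auditing loop described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the prototype's implementation: the question templates, the function names `build_edge_prompts` and `audit_network`, the stand-in `echo_model`, and the example edges are all hypothetical, and the abstract does not specify the actual prompts. The language model is passed in as a plain callable so that no particular API is assumed.

```python
from typing import Callable, Dict, List, Tuple

# An edge (a, b) reads "a causes b", as proposed by a causal discovery algorithm.
Edge = Tuple[str, str]

# Hypothetical question templates, one per aspect named in the abstract.
QUESTIONS: Dict[str, str] = {
    "direction": "Is it more plausible that '{a}' causes '{b}', or that '{b}' causes '{a}'?",
    "confounders": "Which variables could be common causes of both '{a}' and '{b}'?",
    "mediators": "Which variables could lie on a causal path from '{a}' to '{b}'?",
}


def build_edge_prompts(edge: Edge) -> Dict[str, str]:
    """Return one prompt per audited aspect for a single edge."""
    a, b = edge
    return {aspect: template.format(a=a, b=b) for aspect, template in QUESTIONS.items()}


def audit_network(edges: List[Edge], ask: Callable[[str], str]) -> Dict[Edge, Dict[str, str]]:
    """Present the network one edge at a time and collect the model's replies."""
    report: Dict[Edge, Dict[str, str]] = {}
    for edge in edges:
        prompts = build_edge_prompts(edge)
        report[edge] = {aspect: ask(prompt) for aspect, prompt in prompts.items()}
    return report


def echo_model(prompt: str) -> str:
    """Stand-in for a real language model call (assumption: replace with an actual client)."""
    return f"[model reply to: {prompt}]"


# Illustrative edges only; they are not taken from the paper.
edges = [("smoking", "lung cancer"), ("income", "life expectancy")]
report = audit_network(edges, echo_model)
```

The resulting `report` maps each edge to the replies for each aspect, which is the kind of per-edge record a visualization for the human analyst could be built from.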