Automatic presupposition extraction using large language models: A study on implicit questioning techniques in news interviews
The extension of corpus-based analyses to the field of pragmatics, while mutually beneficial, has faced challenges unique to pragmatics compared to other fields of linguistics. This can be attributed to various issues, including highly subjective and time-consuming annotation and the emphasis on context rather than surface-level structures (Garassino et al., 2022). However, new technologies being developed in AI and natural language processing could alleviate the impact of these issues. In a recent preprint, Yu et al. (2023) test whether prompt-engineered large language models (LLMs) can automatically annotate speech acts in English-language apologies, with promising results, showing that adopting generative AI models alongside human quality assurance can reduce the time and resources required for complex and subjective annotation tasks.
One specific discourse context that has seen primarily small-scale qualitative analyses but could benefit from a quantitative approach that retains the depth of analysis is broadcast news interviews (Lehman-Wilzig, 2022). In these interviews, journalists take on the conventionalized role of interviewer, tasked with remaining neutral while also progressing their own agenda (Clayman and Heritage, 2002; Feldman, 2022). My project aims to explore, through the annotation of presuppositions in interviewer questions, how journalists use pragmatic tools to hide subjective content when questioning in news interviews. Furthermore, I propose to apply a pre-constructed LLM as the first annotation stage to alleviate the time-consuming task of presupposition extraction. Following this initial step, the annotations will be checked and corrected to create a dataset of interviewer questions with corresponding presuppositions for further analysis.
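As a concrete illustration of what such a first annotation stage could look like, the sketch below prompts an LLM to list the presuppositions carried by a single interviewer question, with the output then passed to human checking. The model name, prompt wording, and output format are illustrative assumptions for this sketch, not the study's actual protocol; it assumes the openai Python client and an API key in the environment.

```python
# Minimal sketch of an LLM-based first annotation stage for presupposition
# extraction. Model name and prompt are placeholder assumptions.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

SYSTEM_PROMPT = (
    "You are a linguistic annotator. Given an interviewer's question, "
    "list every presupposition it carries (content treated as taken for "
    "granted rather than asserted). Return one presupposition per line."
)

def extract_presuppositions(question: str) -> list[str]:
    """Return the model's candidate presuppositions for one question."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        temperature=0,        # deterministic output for annotation tasks
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    text = response.choices[0].message.content or ""
    return [line.strip("- ").strip() for line in text.splitlines() if line.strip()]

if __name__ == "__main__":
    # Candidate annotations would then be checked and corrected by a human.
    for p in extract_presuppositions(
        "When did you stop misleading the public about the figures?"
    ):
        print(p)
```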
The results of this study will shed light on the experimental technique of prompt-engineering LLMs to assist in specific usage-based pragmatic annotation tasks, as well as on what techniques journalists adopt during news interviews to push their agendas while maintaining a neutral image.
References
Clayman, S., & Heritage, J. (2002). The news interview: Journalists and public figures on the air. Cambridge University Press.
Feldman, O. (2022). Introduction: Political Interviews—An Analytical Model. In Adversarial Political Interviewing: Worldwide Perspectives During Polarized Times (pp. 1–21). Springer Nature Singapore.
Garassino, D., Brocca, N., & Masia, V. (2022). Is implicit communication quantifiable? A corpus-based analysis of British and Italian political tweets. Journal of Pragmatics, 194, 9–22.
Lehman-Wilzig, S. (2022). Political Interviewing Research: Commonalities, Contrasts, Conclusions & Critiques. In Adversarial Political Interviewing: Worldwide Perspectives During Polarized Times (pp. 379–392). Springer Nature Singapore.
Yu, D., Li, L., Su, H., & Fuoli, M. (2023). Assessing the potential of AI-assisted pragmatic annotation: The case of apologies. https://arxiv.org/abs/2305.08339
What was the heuristic basis for using ChatGPT for this task in the first place? I mean, was it assumed ChatGPT would be able to handle the task or was there a particular substantiated basis? Or otherwise were you merely experimenting to test it out? Since ChatGPT is not a thinking or reasoning platform but merely a text pattern repeater, it would not seem suited to this task – and from the video poster results, it seems it ended up doing a poor job, right? Thanks.
Hi Nicholas, thank you for your questions. Yes, this work is definitely more experimental, as research on the annotation capabilities of large language models (LLMs) is new and ongoing. However, there are some promising results coming out that show LLMs can match human annotators on various tasks, including a pragmatic task as shown by Yu et al. (2023).
As for the results, baseline ChatGPT did not extract presuppositions as defined in linguistics, but including few-shot examples and splitting the complex task into two smaller tasks improved the results. I also want to stress that this would likely be the same with untrained human annotators, who also require detailed guidelines, examples, and well-defined tasks to achieve consistent and high-quality annotations. So, it will be interesting to continue exploring the possibilities and limitations of these LLMs across varying pragmatic tasks.
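For readers curious what those two refinements could look like in practice, here is a hypothetical sketch (not the actual prompts or task split used in the study), again assuming the openai Python client: step one asks the model to identify presupposition triggers with a few-shot example for guidance, and step two turns the identified triggers into explicit presuppositions.

```python
# Hypothetical illustration of (1) few-shot examples in the prompt and
# (2) splitting extraction into two smaller subtasks: trigger identification,
# then presupposition formulation. Prompts, model name, and the particular
# split are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()

FEW_SHOT = (
    "Question: Why did you delay publishing the report?\n"
    "Triggers: 'delay publishing the report' (change-of-state framing), "
    "'the report' (definite description)\n"
    "Presuppositions: You delayed publishing the report. There is a report.\n"
)

def ask(instruction: str, text: str) -> str:
    """One chat completion call with a task instruction and an input text."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        temperature=0,
        messages=[
            {"role": "system", "content": instruction},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content or ""

def two_step_extraction(question: str) -> str:
    # Step 1: find presupposition triggers, guided by a few-shot example.
    triggers = ask(
        "Identify presupposition triggers (definite descriptions, factive and "
        "change-of-state verbs, wh-framing, etc.) in the question.\n\n"
        "Example:\n" + FEW_SHOT,
        question,
    )
    # Step 2: turn the identified triggers into explicit presuppositions.
    return ask(
        "Given a question and its presupposition triggers, state each "
        "presupposition as a standalone declarative sentence.",
        f"Question: {question}\nTriggers: {triggers}",
    )
```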