XI ICCEES World Congress

Lies in the eyes of AI: Auditing how LLM-powered chatbots deal with disinformation about Russia's war in Ukraine across languages

Wed 23 Jul, 11:00am (15 mins)
Where: Room 5
Presenter:

Authors

Mykola Makhortykh¹; Victoria Vziatysheva²; Maryna Sydorova²; Ani Baghumyan²; Elizaveta Kuznetsova³

¹ University of Bern, Switzerland; ² Institute of Communication and Media Studies, Switzerland; ³ Weizenbaum Institute, Germany

Discussion

The rise of generative artificial intelligence, in particular large language models (LLMs), is leading to major transformations in how individuals engage with information. The ability of LLMs to interpret human input and generate textual outputs opens new possibilities for digital platforms to interact with users who want to stay up to date on important societal developments, including ongoing armed conflicts. Besides enabling individuals to learn more about these topics through a dialogue-like and personalised format of information delivery, LLM-powered applications may make users more accepting of (counter-attitudinal) information because of their anthropomorphic qualities. These implications are particularly important for topics targeted by disinformation: here, LLM-powered applications can open new possibilities for debunking false claims but also for amplifying their distribution.

This ambiguous role of LLMs is at least partially attributable to the difficulty of assessing their impact on the spread of disinformation, given LLMs' high complexity and lack of transparency. Existing research suggests that the performance of LLMs can be affected by different user-side factors (e.g. the language of the prompt) and system-side factors (e.g. the built-in stochasticity of LLMs' outputs). Under these circumstances, there is an urgent need for empirical assessments of how LLM-powered applications deal with disinformation claims, including those instrumentalized by authoritarian regimes such as Russia's, how this process is affected by specific factors, and what implications it may have for the effectiveness of present and future disinformation campaigns.

To this end, we conduct AI audits of three LLM-powered chatbots - Bard, Copilot, and Perplexity AI - using 28 prompts about the ongoing war in Ukraine, a conflict accompanied by a large number of false claims made by the Kremlin. The selection of chatbots reflects our interest in comparing the performance of chatbots designed by major tech giants (Google and Microsoft) with a startup-made chatbot; furthermore, all three chatbots are integrated with web search engines, which are among the most commonly used and trusted digital platforms. All prompts were related to common disinformation claims, such as Ukraine being ruled by Nazis or developing biological weapons to attack Russia, and were translated into English, Ukrainian, and Russian to examine the effect of prompt language on chatbot performance. To examine the impact of stochasticity, four research assistants generated chatbot outputs at around the same time with the same set of prompts. The resulting 1,008 responses (28 prompts × 3 languages × 3 chatbots × 4 assistants) were examined using qualitative content analysis to evaluate their accuracy and the presence of debunking claims. Preliminary findings indicate major differences in the quality of outputs depending on the prompt language, as well as a substantial impact of stochastic factors on chatbot performance.
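As a minimal sketch of the audit design described above: the chatbot names and counts are taken from the abstract, but all identifiers (e.g. query_chatbot) and the enumeration itself are hypothetical illustrations, not the authors' actual tooling.

    from itertools import product

    # Illustrative audit matrix mirroring the design reported in the abstract:
    # 28 prompts x 3 languages x 3 chatbots x 4 assistants = 1,008 responses.
    CHATBOTS = ["Bard", "Copilot", "Perplexity AI"]
    LANGUAGES = ["en", "uk", "ru"]   # English, Ukrainian, Russian
    N_PROMPTS = 28                   # disinformation-related prompts
    N_ASSISTANTS = 4                 # parallel runs to capture stochasticity

    runs = list(product(range(N_PROMPTS), LANGUAGES, CHATBOTS, range(N_ASSISTANTS)))
    assert len(runs) == 1008         # matches the 1,008 responses reported above

Running each combination at roughly the same time, as the abstract describes, holds the prompt and moment of collection constant, so that differences across the four parallel runs can be attributed to the stochasticity of the chatbots' outputs.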
