Conspiracies: Causal Reasoning and Online science scepticism

Causal Reasoning and Online science scepticism (CROSS)

Recent work has applied automated narrative graph extraction to diverse topics, including anti-vaccine narratives on parenting message boards and character relationships in book reviews. However, extracting narrative graphs depends on an intricate pipeline of discrete Natural Language Processing (NLP) tasks, many of which were underdeveloped for Danish. By improving model performance on each of these tasks, the CROSS project has developed a comparable narrative graph extraction pipeline for Danish. The pipeline incorporates advanced techniques for knowledge graph extraction, allowing researchers to extract meaningful relationships between the entities mentioned in the aggregated texts. It plays a crucial role in the CROSS project by enabling richer interpretation of large Danish text corpora and facilitating the analysis of narrative structures, particularly those found in conspiracy theories and unverified stories.

Identifying the emergence and spread of conspiracy theories on social media and in the news media is vital in today's digital age, where the rapid spread of misinformation through online platforms poses a significant threat to democratic institutions. American studies highlight the undermining nature of conspiracy theories, which makes the automatic discovery of narrative frameworks highly relevant for public safety and for safeguarding democratic institutions.

By applying similar research methods in the Danish context, the project aims to contribute to these efforts by creating a Danish version of the NLP pipeline. The Danish pipeline's flexibility also allows it to be applied in other contexts, further enhancing its value.

Role of Center for Humanities Computing

Data collection

In this collaboration, Center for Humanities Computing provides comprehensive data collection services. It gathers data from diverse sources, including Twitter and Danish newspapers, ensuring a wide range of relevant information for analysis.

Creating the Pipeline

Center for Humanities Computing has also played a key role in developing the pipeline, creating machine learning pipelines that capture the relationships between actors and their interactions. This model enhances the pipeline's ability to analyse and interpret the narrative structures of conspiracy theories by visually representing the connections and dependencies between the entities mentioned in the text.

Improving Coreference Resolution

Recognizing the lack of a suitable coreference model for Danish, Center for Humanities Computing (CHC) took on the responsibility of developing a Danish model for coreference resolution. CHC trained it specifically to identify references to the same actants or entities within Danish text. This gives the pipeline a more precise understanding of how the different mentions throughout a text relate to one another.
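To illustrate what coreference resolution contributes to the pipeline, the sketch below replaces later mentions of an entity with its first mention. It is a minimal example, not CHC's model: the function name `resolve_mentions`, the cluster format (lists of token spans) and the Danish example sentence are all assumptions made for illustration, and the clusters are written by hand rather than predicted.

```python
from typing import List, Tuple

# A mention is a half-open token span [start, end). Hypothetical format.
Span = Tuple[int, int]


def resolve_mentions(tokens: List[str], clusters: List[List[Span]]) -> List[str]:
    """Replace each later mention in a cluster with the cluster's first mention."""
    replacements = {}
    for cluster in clusters:
        first_start, first_end = cluster[0]
        canonical = tokens[first_start:first_end]
        for start, end in cluster[1:]:
            replacements[start] = (end, canonical)

    resolved = []
    i = 0
    while i < len(tokens):
        if i in replacements:
            end, canonical = replacements[i]
            resolved.extend(canonical)  # substitute the canonical mention
            i = end                     # skip the tokens of the replaced mention
        else:
            resolved.append(tokens[i])
            i += 1
    return resolved


# Invented example: "Lægen" (the doctor) and "hun" (she) refer to the same entity.
tokens = "Lægen sagde , at hun ville svare".split()
clusters = [[(0, 1), (4, 5)]]  # hand-written cluster, not model output
resolved = resolve_mentions(tokens, clusters)
```

With mentions resolved this way, relationships attached to a pronoun can be attributed to the named actant in later steps.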

Triplet Extraction

The pipeline incorporates triplet extraction, which entails identifying the subject, predicate, and object within sentences or expressions that contain valuable structured information. CHC's experts have contributed by annotating triplets, labelling selected sentences as "gold standard" examples. Training on these examples enables the model to recognize similar patterns in other texts, improving its capacity to extract relevant information and establish meaningful relationships between entities.
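As a rough illustration of how extracted triplets can feed a narrative graph, the sketch below groups (subject, predicate, object) triplets into an adjacency list keyed by subject. It assumes the triplets have already been extracted; the function name `build_narrative_graph` and the Danish example triplets are invented for this example and do not come from the CROSS data or code.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (subject, predicate, object)
Triplet = Tuple[str, str, str]


def build_narrative_graph(triplets: List[Triplet]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each subject to its distinct (predicate, object) edges, in first-seen order."""
    graph = defaultdict(list)
    for subject, predicate, obj in triplets:
        edge = (predicate, obj)
        if edge not in graph[subject]:  # drop exact duplicate edges
            graph[subject].append(edge)
    return dict(graph)


# Invented triplets for illustration only.
triplets = [
    ("regeringen", "lukker", "skolerne"),
    ("forskerne", "anbefaler", "vaccinen"),
    ("regeringen", "anbefaler", "vaccinen"),
    ("regeringen", "lukker", "skolerne"),  # duplicate, ignored
]
graph = build_narrative_graph(triplets)
```

Each key in `graph` is an actant, and each edge records what that actant does to which entity, which is the kind of structure a narrative graph visualises.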

Project affiliation

School of Communication and Culture

Funding

AUFF starting grant

Revised 05.03.2025

Line Ejby Sørensen