Chinese Twitter Diplomacy

The way in which Chinese diplomats communicate has changed recently, from being mostly reactive and pragmatic to being rhetorically more combative.
This has generated strong academic, media, and policy interest. However, less academic attention has been devoted to the use of social media in China’s new diplomatic communication strategy. By analyzing the recent use of Twitter by China’s Ministry of Foreign Affairs (MFA), this article explores the initial digitalization of China’s public diplomacy from November 2019 to February 2021 (during the Covid-19 pandemic).

Approaching Twitter as both a virtual network structure and an interactive strategic communication process, the study collected approximately 61,000 tweets from Chinese diplomats and 282,000 tweets from Chinese official media and used data analytics to examine how the MFA augmented its digital diplomatic presence by responding, reposting/retweeting, mentioning, and hashtagging.
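
As an illustration of how retweeting, mentioning, and hashtagging can be read as network edges, the following is a minimal sketch rather than the project's actual pipeline. It assumes tweets are available as (author, text) pairs and parses the text with regular expressions; the handles and tweet texts are invented for the example, and the function name `interaction_edges` is hypothetical. In practice the structured entity metadata returned by the API would likely be a more reliable source than text parsing, and replies are better identified from that metadata.

```python
import re
from collections import Counter

# Simple patterns for Twitter-style entities (an approximation of the real rules).
MENTION_RE = re.compile(r"@(\w{1,15})")
HASHTAG_RE = re.compile(r"#(\w+)")
RETWEET_RE = re.compile(r"^RT @(\w{1,15}):")


def interaction_edges(author, text):
    """Return (source, target, kind) edges implied by one tweet's text."""
    edges = []
    retweet = RETWEET_RE.match(text)
    if retweet:
        # The rest of a retweet was written by the original author, so stop here.
        edges.append((author, retweet.group(1).lower(), "retweet"))
        return edges
    for handle in MENTION_RE.findall(text):
        edges.append((author, handle.lower(), "mention"))
    for tag in HASHTAG_RE.findall(text):
        edges.append((author, tag.lower(), "hashtag"))
    return edges


# Invented sample data; these handles do not refer to real accounts in the study.
tweets = [
    ("diplomat_a", "RT @media_b: Opening remarks at the press briefing"),
    ("diplomat_a", "Thanks @media_b and @diplomat_c for the coverage #Briefing"),
    ("diplomat_c", "Statement now online #briefing #Cooperation"),
]

# Count how often each edge occurs across the sample.
edge_counts = Counter(
    edge for author, text in tweets for edge in interaction_edges(author, text)
)
for (source, target, kind), count in sorted(edge_counts.items()):
    print(source, kind, target, count)
```

Counting edges by kind in this way gives a weighted edge list that could be loaded into a network analysis tool to inspect who amplifies whom.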

Discourse analysis was also used to investigate how Chinese diplomats selected topics to generate, diffuse, and affect hegemonic discourses.

The role of the Center for Humanities Computing (CHC)

Data engineering

For this study, CHC handled data extraction and data engineering, using Twitter’s API v2 with scientific access to collect a comprehensive dataset of tweets from 34 Chinese diplomatic handles and 12 official media handles for the period from Nov. 1, 2019 to Feb. 28, 2021. After data cleaning, we created a dataset of 343,148 tweets (approximately 61,000 tweets from Chinese diplomats plus 282,000 tweets from Chinese official media) for further analysis.
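
The cleaning steps are not described in detail here, so the following is only a sketch of one plausible step: keeping a single copy of each tweet whose timestamp falls inside the study window. It assumes records shaped like API v2 tweet objects with `id` and `created_at` fields; the function name `clean_tweets` and the sample records are hypothetical.

```python
from datetime import datetime, timezone

# Study window from the text: Nov. 1, 2019 to Feb. 28, 2021 (UTC assumed).
WINDOW_START = datetime(2019, 11, 1, tzinfo=timezone.utc)
WINDOW_END = datetime(2021, 3, 1, tzinfo=timezone.utc)  # exclusive, so Feb. 28 is included


def clean_tweets(records):
    """Keep one copy of each tweet whose timestamp falls inside the study window."""
    seen_ids = set()
    cleaned = []
    for record in records:
        # Timestamps look like 2019-11-01T00:00:00.000Z; make the UTC offset explicit.
        created = datetime.fromisoformat(record["created_at"].replace("Z", "+00:00"))
        if not (WINDOW_START <= created < WINDOW_END):
            continue
        if record["id"] in seen_ids:
            continue  # drop duplicates, e.g. from overlapping collection runs
        seen_ids.add(record["id"])
        cleaned.append(record)
    return cleaned


# Invented sample records for illustration only.
records = [
    {"id": "1", "created_at": "2019-10-31T23:59:59.000Z", "text": "too early"},
    {"id": "2", "created_at": "2019-11-01T00:00:00.000Z", "text": "first day"},
    {"id": "2", "created_at": "2019-11-01T00:00:00.000Z", "text": "first day"},
    {"id": "3", "created_at": "2021-02-28T23:59:59.000Z", "text": "last day"},
    {"id": "4", "created_at": "2021-03-01T00:00:00.000Z", "text": "too late"},
]
kept = clean_tweets(records)
print([record["id"] for record in kept])
```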

Training of machine learning models and data modeling

This study employed mixed methods, drawing on a Human-In-The-Loop (HITL) approach to machine learning to deduce and analyze the structure of China’s initial diplomatic communication network on Twitter from late 2019 to early 2021.

CHC developed different computational designs by using a Human-In-The-Loop (HITL) approach to machine learning. This enabled exploration of the digitalization of China’s public diplomacy as it transformed into digital communication with the global public. The HITL approach to lexical modelling is by definition iterative, meaning that multiple models were trained both formally, for hyperparameter optimization, and conceptually, for human interpretability.

CHC used Latent Dirichlet Allocation (LDA), a generative probabilistic model for discovering topics in large document collections, to train a topic model on our datasets and uncover hidden semantic structures. LDA made it possible to categorize the 239,943 documents into topics and then extract representative tweets for each topic, supporting an informed manual analysis of the numerous discourses unfolding across the dataset. While applications of natural language processing (NLP) techniques to social media data have grown rapidly over the last decade, our mode of application differed because it drew heavily on a Human-In-The-Loop (HITL) approach to machine learning.
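
To make the topic-modelling step more concrete, here is a small self-contained sketch of LDA fitted by collapsed Gibbs sampling, followed by a helper that picks the documents most strongly associated with a topic. This is an assumption-laden toy, not the project's implementation: the inference method, the hyperparameters `alpha` and `beta`, the function names, and the tokenized sample documents are all chosen for illustration, and a real analysis of this scale would normally rely on an established library.

```python
import random


def fit_lda(docs, num_topics, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; returns (doc_topic, topic_word) counts."""
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [dict.fromkeys(vocab, 0) for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    # Give every token a random initial topic and record the counts.
    for d, doc in enumerate(docs):
        doc_assign = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assign.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
        assignments.append(doc_assign)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][word] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][word] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def representative_docs(doc_topic, topic, top_n=2):
    """Indices of the documents with the largest share of tokens in `topic`."""
    shares = [counts[topic] / max(sum(counts), 1) for counts in doc_topic]
    return sorted(range(len(shares)), key=lambda d: shares[d], reverse=True)[:top_n]


# Invented, pre-tokenized toy documents standing in for cleaned tweets.
docs = [
    ["vaccine", "cooperation", "health", "vaccine"],
    ["health", "pandemic", "vaccine", "cooperation"],
    ["trade", "tariff", "economy", "trade"],
    ["economy", "tariff", "trade", "growth"],
]
doc_topic, topic_word = fit_lda(docs, num_topics=2)
top_docs = {k: representative_docs(doc_topic, k) for k in range(2)}
for k, indices in top_docs.items():
    top_words = sorted(topic_word[k], key=topic_word[k].get, reverse=True)[:3]
    print("topic", k, "top words:", top_words, "representative docs:", indices)
```

In a Human-In-The-Loop setting, output like this would be read by a researcher, who would then adjust the number of topics or the hyperparameters and refit until the topics are interpretable.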

Check out the source code repository for the Chinese Twitter Diplomacy project.

Project affiliation


Chinese Twitter Diplomacy was funded by the Aarhus University Research Foundation (AUFF) under Research Grant AUFF-E-2020-9-1.
