Using GPT-3.5 to Summarise UK's Parliamentary Mentions of Hong Kong

What do UK MPs think about Hong Kong? Are their views on the city more positive or negative? How do they perceive its political development in recent years? Furthermore, do they welcome British National (Overseas) immigrants?

In a previous post, I introduced my project which curates a dataset encompassing all mentions of “Hong Kong” in UK parliamentary speeches. Despite the richness of information these speeches offer, their vast quantity presents a significant challenge for analysis. From June 2001 to December 2023, 1,272 speakers made 7,156 speeches mentioning “Hong Kong” across all UK parliaments, amounting to a total of 2,909,743 words — nearly three times the length of the entire Harry Potter series combined.

While human analysts might require hundreds of hours to sift through three million words, large language models (LLMs) can accomplish this task in minutes, if not seconds. In this project, I have employed OpenAI's GPT-3.5 (specifically gpt-3.5-turbo-0125) to generate concise summaries for each of the 1,272 parliamentary speakers who have mentioned "Hong Kong" over the past two decades. By distilling extensive parliamentary speech data into succinct, accessible bullet points, I aim to enable readers with similar interests to grasp individual positions without having to peruse every speech. You can use the tool at the top of this page to search for an MP and view their summary.

Prompt set-up

The prompt below is crafted to direct GPT-3.5's focus towards content specifically relevant to Hong Kong. On occasion, a speech mentions "Hong Kong" only tangentially. The prompt instructs GPT-3.5 to overlook such peripheral mentions and to summarise viewpoints that bear a stronger connection to Hong Kong:

Write a concise summary of [speaker name]'s view on Hong Kong based on the following speeches in UK parliaments:
[speech content]
Summary of the speaker's view on Hong Kong in markdown bullet points:
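For concreteness, here is a minimal sketch of what a single summarisation call might look like with the openai Python SDK. The prompt template mirrors the one above; the `summarise()` helper name, the single user-role message, and reading the API key from the environment are my assumptions rather than details from the original pipeline:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Template mirroring the prompt shown above (variable names are illustrative)
PROMPT = (
    "Write a concise summary of {speaker}'s view on Hong Kong "
    "based on the following speeches in UK parliaments:\n"
    "{speeches}\n"
    "Summary of the speaker's view on Hong Kong in markdown bullet points:"
)

def summarise(speaker: str, speeches: str) -> str:
    """Ask gpt-3.5-turbo-0125 to summarise one batch of speeches."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-0125",
        messages=[{"role": "user",
                   "content": PROMPT.format(speaker=speaker, speeches=speeches)}],
    )
    return response.choices[0].message.content
```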

The maximum length of speech content processed by GPT-3.5 is capped at 5000 tokens (~3700 words). Should a speaker's total token count exceed this limit, the summary is generated using a two-tier summarisation method, sketched below. First, all speeches are split into segments that fit within the 5000-token limit. Next, GPT-3.5 generates an intermediate summary for each segment. Finally, GPT-3.5 synthesises these intermediate summaries into a single comprehensive summary.
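A rough sketch of that two-tier logic, reusing the `summarise()` helper from the snippet above. The greedy whole-speech packing, the tiktoken cl100k_base encoding, and the function names are my assumptions; a single speech longer than the limit would need further splitting, which I omit here:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # the tokeniser used by gpt-3.5-turbo
TOKEN_LIMIT = 5000

def split_into_segments(speeches: list[str]) -> list[str]:
    """Greedily pack whole speeches into segments of at most TOKEN_LIMIT tokens."""
    segments, current, used = [], [], 0
    for speech in speeches:
        n_tokens = len(enc.encode(speech))
        if current and used + n_tokens > TOKEN_LIMIT:
            segments.append("\n\n".join(current))
            current, used = [], 0
        current.append(speech)
        used += n_tokens
    if current:
        segments.append("\n\n".join(current))
    return segments

def two_tier_summary(speaker: str, speeches: list[str]) -> str:
    """Summarise directly if everything fits; otherwise summarise the summaries."""
    segments = split_into_segments(speeches)
    if len(segments) == 1:
        return summarise(speaker, segments[0])                     # single tier
    intermediates = [summarise(speaker, seg) for seg in segments]  # tier one
    return summarise(speaker, "\n\n".join(intermediates))          # tier two
```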

Given that the summarisation task relies on the content provided in the prompt, the likelihood of GPT-3.5 inserting "hallucinations" into the summary is minimised. However, while GPT-3.5's summaries generally read as fluent and coherent, some assessments have found that they may not always be "perfectly faithful to input reviews and over-generalizes certain viewpoints" (p. 9 of Prompted Opinion Summarization with GPT-3.5).

Therefore, to keep each speaker's summary transparent and traceable, you can download the full inputs and outputs for that speaker's intermediate and final summaries by clicking the download button (it appears once you have selected a speaker). For those interested in obtaining all records in a single download, please visit this project's GitHub page and download the file named 'speech_summaries.csv'.
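If you want to explore that file programmatically, loading it is straightforward; since I have not reproduced the file's schema here, it is safest to inspect the columns first rather than assume their names:

```python
import pandas as pd

# Load the full record of inputs and outputs downloaded from the GitHub page
df = pd.read_csv("speech_summaries.csv")

# Inspect the available columns and a few rows before filtering
print(df.columns.tolist())
print(df.head())
```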

Next steps

The approach to summarising each speaker's viewpoints has several limitations worth noting. Firstly, it does not consider how a speaker's views evolve over time: the summaries treat all speeches as though they were delivered concurrently, disregarding any development or change in perspective. Secondly, parliamentary speeches often contribute to broader debates, and without the context of other speakers' contributions within the same discussion, the summaries can miss significant meanings and nuances, obscuring the full scope of the debate.

Additionally, without directing the model to answer specific questions, it is likely to overlook subtle yet politically significant nuances. In politics, the implications of seemingly minor semantic differences can be profound. For instance, while there is a general consensus among UK MPs on the need for action in response to Hong Kong’s political situation, their approaches can vary significantly — from those advocating for a more assertive stance to those favouring a measured response. The primary distinction often lies in the intensity of the language used to describe China’s actions. Without guiding the model to discern these subtleties, the resulting summaries may miss these critical distinctions.

Moving forward, I aim to refine the process by formulating more precise policy questions and segmenting the summaries chronologically. At present, the lack of quantitative data makes it hard to present an overview of the speech content. Addressing this could involve employing LLMs to score texts against specific criteria and scales, as sketched below. I welcome any suggestions or ideas on how to enhance this project, so please feel free to reach out!
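As a taste of what such scoring might look like, here is one possible shape for it, reusing the OpenAI client pattern from the earlier sketch. The criterion, the 1-to-5 scale, and the JSON output format are all illustrative choices rather than a settled design:

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# An illustrative scoring criterion; doubled braces render literally in .format()
SCORING_PROMPT = (
    "On a scale of 1 (very measured) to 5 (very assertive), rate how strongly "
    "the following speech criticises China's actions in Hong Kong. "
    'Reply with JSON, e.g. {{"score": 3, "reason": "..."}}.\n\nSpeech:\n{speech}'
)

def score_speech(speech: str) -> dict:
    """Ask the model for a structured score on one illustrative criterion."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-0125",
        messages=[{"role": "user", "content": SCORING_PROMPT.format(speech=speech)}],
        response_format={"type": "json_object"},  # request valid JSON back
    )
    return json.loads(response.choices[0].message.content)
```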