ChatGPT and the public service

By Ong Li Min and Jason Grant Allen

Public sector organisations in Singapore and beyond are trialling versions of ChatGPT and other generative AI tools to improve productivity and drive citizen-state communications. Ong Li Min, Research Associate at the SMU Centre for AI & Data Governance, and Associate Professor Jason Grant Allen, Director of the Centre, share their views on the risks and possibilities of generative AI within the public sector.

ChatGPT promises an increase in productivity within the public sector, but governments need to ensure that potential harms are avoided.

Following the launch of OpenAI’s ChatGPT late last year, an unprecedented number of people have used a Generative AI (GenAI) tool for the first time: almost 30 per cent of professionals have tried ChatGPT at work, as it can be usefully harnessed to make various professional and expert tasks easier and quicker.

Beyond the publicly available version of ChatGPT and its competitors, a number of bespoke “enterprise” tools are currently in development across numerous industry verticals. Recently, it was announced that a version of ChatGPT was being developed for use within the Singapore civil service to assist in research and the crafting of speeches. Following the news that ChatGPT could be used to assist up to 90,000 Singapore civil servants in research and speech-writing, the Ministry of Communications and Information has clarified in parliament (9 May) that usage guidelines have been introduced in the public service.

Public sector use-cases of AI raise unique concerns affecting the citizen-state relationship, particularly accountability and legitimacy. The deployment of AI for public service delivery is not new. Many are familiar with ‘Ask Jamie’, the Singapore Government’s virtual assistant driven by natural language processing, which answers public queries such as what MediShield Life is about. It now also facilitates personalised transactions, such as directing the user to the personal tax portal to file taxes. In this way, the chatbot has made interactions between citizens and the state more intuitive and seamless. Meanwhile, GenAI has been percolating into public service delivery: in Australia, MPs have delivered speeches in parliament that were partially written by ChatGPT; in Colombia, a judge reportedly used ChatGPT in a case concerning a vulnerable minor.

To understand the risks and opportunities of using the next generation of GenAI like ChatGPT, it is helpful to understand how large language models (LLMs) work. LLMs use a layered architecture of interconnected nodes, or “neurons”, to process and transform input data. Each neuron performs a simple computation and passes its output to other neurons in the next layer. Through a process called “training”, the network adjusts the strength of connections between neurons to learn patterns and relationships in the data. This allows it to make predictions on data it has not yet seen.
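To make the idea of layers, neurons and training concrete, here is a minimal sketch in Python. It is a toy illustration only, not how ChatGPT is built: the class name `TinyNetwork`, the task (deciding whether two numbers sum to more than 1), the training data and the learning rate are all assumptions chosen for the example, and real LLMs have billions of connections rather than a handful.

```python
import math
import random


def sigmoid(z):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-z))


class TinyNetwork:
    """Hypothetical toy network: 2 inputs -> hidden neurons -> 1 output neuron."""

    def __init__(self, hidden=3, seed=0):
        rng = random.Random(seed)
        # Connection strengths ("weights") start out random.
        self.w1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Each hidden neuron computes a weighted sum of the inputs and squashes it.
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + b)
             for w, b in zip(self.w1, self.b1)]
        # The output neuron does the same with the hidden layer's outputs.
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def loss(self, data):
        # Mean squared error between predictions and the known answers.
        return sum((self.predict(x) - t) ** 2 for x, t in data) / len(data)

    def train_step(self, data, lr=0.5):
        """One round of "training": nudge every weight to reduce the error."""
        n, hidden = len(data), len(self.w2)
        gw1 = [[0.0, 0.0] for _ in range(hidden)]
        gb1, gw2, gb2 = [0.0] * hidden, [0.0] * hidden, 0.0
        for x, t in data:
            h, y = self.forward(x)
            dy = 2 * (y - t) * y * (1 - y) / n  # error signal at the output
            for j in range(hidden):
                dh = dy * self.w2[j] * h[j] * (1 - h[j])  # signal passed back
                gw2[j] += dy * h[j]
                gw1[j][0] += dh * x[0]
                gw1[j][1] += dh * x[1]
                gb1[j] += dh
            gb2 += dy
        for j in range(hidden):
            self.w2[j] -= lr * gw2[j]
            self.b1[j] -= lr * gb1[j]
            self.w1[j][0] -= lr * gw1[j][0]
            self.w1[j][1] -= lr * gw1[j][1]
        self.b2 -= lr * gb2


# Assumed toy data: label is 1 when the two numbers sum to more than 1.
TRAINING_DATA = [
    ((0.0, 0.0), 0), ((0.2, 0.3), 0), ((0.5, 0.1), 0), ((0.1, 0.6), 0),
    ((0.9, 0.8), 1), ((1.0, 0.6), 1), ((0.7, 1.0), 1), ((0.6, 0.9), 1),
]

net = TinyNetwork()
loss_before = net.loss(TRAINING_DATA)
for _ in range(2000):
    net.train_step(TRAINING_DATA)
loss_after = net.loss(TRAINING_DATA)

print("error before training:", round(loss_before, 4))
print("error after training: ", round(loss_after, 4))
# A point the network has never seen during training.
print("prediction for (0.95, 0.9):", round(net.predict((0.95, 0.9)), 3))
```

Nothing in the code states the rule "sum greater than 1"; whatever the network ends up knowing is stored only as adjusted numbers, which is one reason the reasoning behind an individual answer is hard to inspect.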

ChatGPT uses this structure to string words together, in response to a prompt or query, based on a historical data set to produce plausible sentences. Each response is unique to each interaction and the history of questions and answers that precedes it. The derivation of each answer from the data set is largely opaque, and the output is still highly unreliable.

Hallucinations and beyond

LLMs are essentially probability engines that do not actually “comprehend” the text they produce, but simply predict the most likely next word in a sentence based on the training data. Sometimes the answers they give are seriously flawed, but stated in a convincing, authoritative tone. While such tools can be used to summarise text or generate a draft letter of decision, their current state of development precludes using them to determine the applicable law, let alone to take a decision on an application.
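The “probability engine” idea can be illustrated with a deliberately crude sketch. The Python below counts which word follows which in a tiny made-up corpus and then picks the most frequent continuation. This is an assumption-laden simplification: real LLMs use neural networks over sub-word tokens and long contexts rather than word-pair counts, and the corpus and function names here are hypothetical.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follow_counts[current][nxt] += 1
    return follow_counts


def next_word_probabilities(model, word):
    """Turn the raw counts for one word into probabilities."""
    counts = model.get(word.lower())
    if not counts:
        return {}  # the word never appeared in the training text
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def most_likely_next(model, word):
    """Predict the single most probable next word, or None if unknown."""
    probs = next_word_probabilities(model, word)
    return max(probs, key=probs.get) if probs else None


# Made-up training text, for illustration only.
corpus = (
    "the officer approved the application "
    "the officer approved the permit "
    "the officer reviewed the application"
)
model = train_bigram_model(corpus)

print(most_likely_next(model, "officer"))  # approved
print(most_likely_next(model, "judge"))    # None
```

The model continues “officer” with “approved” because that pairing was most frequent in its data, not because any application was in fact approved. Fluent but ungrounded output of this kind is the root of the hallucination problem.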

Indiscriminate or careless use of ChatGPT in spheres such as journalism and policy research risks devaluing information and trust in our information society, driving us further into a “post-truth” world. The consequences of misinformation being relied on by public decision-makers are potentially even more grave. For example, if the use of false legal citations in court goes undetected, this could threaten the integrity of the legal system. Likewise, research loaded with misinformation could adversely affect policy and regulation. The use of GenAI means that the number of “facts” to fact-check is going to rise exponentially, and many of them cannot be attributed to a human author. Rectifying misleading “alternative facts” and holding actors to account will be an uphill challenge.

These are, for the most part, avoidable harms. Academic, industry, government, and civil society bodies are considering the necessary guardrails and usage policies. The SMU Centre for AI & Data Governance is working on a set of principles and emerging practices for the responsible integration of GenAI into human workflows in different sectors, and on some guardrails for the development of more trustworthy enterprise systems.

Dehumanising exercises of public authority and obfuscating accountability

Even once enterprise-grade systems are created, however, some deeper considerations remain in the public sector. The backlash against a condolence message partly generated by ChatGPT after a tragic shooting at an American university demonstrates that communities expect certain communications to be performed by a human.

The perceived acceptability of using AI in public service communications might influence perceptions of the legitimacy of government programmes, too. Courts in the US and Canada are already dealing with the use of AI in administrative decision-making, such as the determination of work permit applications. Crucially, we might not want certain decisions that attract serious consequences to be made by a “black box” machine. One recent exploration of the nature of the judicial function, for example, stresses a number of distinctively human elements in the judging process. For its part, OpenAI has prohibited the use of GPT-4 in contexts of “high risk government decision making, including law enforcement and criminal justice, migration and asylum”.

Legal accountability mechanisms depend on examining the state of mind of the individual. The “problem of many hands” can make it difficult to attribute a decision or action to a particular person within an organisation. When AI applications are integrated into a workflow, questions of responsibility and accountability become even harder. The Singapore Government’s stance that individual public officers are responsible for their own use of ChatGPT might bring clarity to some issues, but systematic incorporation of such tools into the public service will require deeper and more detailed reflection in due course.

The future of human-machine intelligence

Given that ChatGPT is a general-purpose tool, it is the user who determines how they will use the programme. Human intervention to check the generated output is necessary to mitigate potential risks. The AI governance literature stresses keeping a “human in the loop” (ideally, in a participatory role) in the automated workflow. Although the “loop” metaphor may not be apposite for every type of AI system, human supervision is an important principle and will help in situations where it is necessary to attribute liability for an adverse outcome.

In our view, the problem is not the use of a GenAI tool itself but how it is used. An LLM combined with a search engine can be a valuable time-saver for research and preliminary drafting. Is the tool being used to generate ideas, frame problems, or search for facts? Is the user qualified and equipped to validate the output? How will the output be used? What kind of impact might it have, including on the public? As risks vary according to the use, usage policies and guidance will need to be increasingly tailored to the use-case to capture benefits while avoiding risks of harm.

Ideally, disruptive technology should be used to free up time for human intelligence and enhance quality of life. As historian Louis Hyman suggested, workers who are able to automate tedious tasks are vital because they are then free to do “more complicated, more rewarding, more human work”. Moreover, as Sam Altman said in Singapore recently, GenAI systems will require both global and local democratic input to align them with people’s values, history and culture. At the same time, tech needs to give users much more control. Thus, for all its current imperfections, this new wave of GenAI represents an opportunity to augment, rather than replace, human intelligence. While short-sighted actors might use language models to replace their workforce, the holistic approach is to think about how best to work with machines.

The interaction between human and machine in workflows deserves as much attention as AI standards and verification schemes. We should design human-machine composite systems that leverage machines’ competencies to free up human intelligence, including empathy, moral discretion, and reasoning. As David de Cremer and Garry Kasparov argue, the real question is: how can human intelligence work with artificial intelligence to produce augmented intelligence?

ONG Li Min is a Research Associate at the SMU Centre for AI & Data Governance.

Jason Grant ALLEN is an Associate Professor of Law at SMU Yong Pung How School of Law and Director of the SMU Centre for AI & Data Governance.

This research is supported by the National Research Foundation, Singapore under its Emerging Areas Research Projects (EARP) Funding Initiative. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not reflect the views of the National Research Foundation, Singapore.

A shorter version of this commentary was published in The Straits Times on 6 July. This article was first published on the SMU Centre for AI and Data Governance's Medium page.