Editorial

Q&A: Automated Intelligence talks harnessing data in government

We caught up with Automated Intelligence to talk about the challenges facing government departments looking to find value in their ‘digital heap’ of data, and how a spring clean paid off for the Cabinet Office.

Posted 6 September 2023 by Christine Horton


At a high level, what are some of the challenges government departments face in relation to their data?

Government departments face numerous challenges in relation to their data.

The volume and variety of data created over the last few decades have grown exponentially. This raises many concerns for government departments trying to make sense of what data they have, where it is stored and what they should delete or retain. Departments also collect and manage data in separate silos, which can make it difficult to share information across departments and gain a comprehensive view of operations.

Ensuring the quality of data can be a challenge, particularly when data is collected from multiple sources. Poor data quality can lead to incorrect or misleading insights and decisions.

Government departments are responsible for safeguarding sensitive data, such as personal information, medical records, and financial data. Ensuring the security and privacy of this data can be a complex and ongoing challenge. Departments need to continually adapt to regulatory change, such as the GDPR or the obligation to transfer records for permanent preservation to The National Archives (TNA), in order to remain compliant.

Data often resides in platforms that are no longer actively used: expensive to maintain, requiring specialist skills and difficult to access. In other cases, data has been moved across systems to avoid obsolescence and has lost its original metadata or context. Many government departments still rely on legacy systems that were not designed to handle modern data volumes or are difficult to integrate with newer technologies.

Policy doesn’t change as fast as technology. In recent years, technology has been a game-changer, delivering efficiencies in productivity, collaboration, capacity and cost, alongside enhanced governance and security. However, the reality is that technology, approaches and policy haven’t been applied consistently or effectively to managing the lifecycle of born-digital information.

Government departments may not have the resources, such as skilled personnel or funding, to effectively manage and analyse their data. This can lead to inefficiencies and missed opportunities for insights.

A lack of budget, prioritisation and the correct methodology (such as a digital appraisal methodology) leads to continuous problems. The perceived benefit of low-cost storage has been wiped out by sheer volume and risk, and there are alternatives that can drive the business case for reallocating budget and re-prioritising.

Finally, government departments may use different systems, formats or languages for their data, making it difficult to combine data from different sources or analyse data across departments.

You describe the ‘digital heap’ within public sector organisations – can you explain what is this and how can it cause problems?

The term “digital heap” typically refers to the massive amount of digital data that is generated and accumulated by individuals, organisations and systems on a daily basis. This data can come from a wide range of sources, including social media posts, emails, website traffic, search queries and more. It’s an uncontrolled and largely unorganised volume of digital files and other information that accumulates in data systems when it is not looked after. Tackling this digital heap is, then, a top priority.

As new solutions for document creation and storage emerged, the digital heap became increasingly dispersed and unstructured, and of course, difficult to manage. Even today, most staff do not have the time or the inclination to retrospectively manage the information they create on a daily basis. Without implementing automation and intelligence to manage information governance, the digital heap simply continues to grow. This can have huge implications for government in terms of managing its data, complying with regulations and the associated costs. On the one hand, this data can be used to gain insights, make informed decisions and drive innovation. On the other hand, managing and analysing the digital heap can be overwhelming, requiring sophisticated tools and techniques to extract meaningful information from the vast amounts of data available.

It’s particularly problematic because a tension exists in legislation between keeping information and ensuring it is not over-retained.

What do disruptive technologies, such as ChatGPT, bring to the conversation?

Disruptive technologies such as ChatGPT can help with government data management challenges in several ways:

Automated data analysis: ChatGPT can be used to automate the analysis of large amounts of unstructured data. This can help government departments gain insights more quickly and efficiently than traditional manual analysis methods.

Natural language processing: ChatGPT and other natural language processing technologies can help government departments understand and analyse written or spoken language, making it easier to extract insights from textual data.

Personalisation: ChatGPT can be trained on specific datasets to provide personalised insights and recommendations. This can be particularly useful for government departments that collect large amounts of data on individual citizens.

Improved decision-making: By automating data analysis and providing insights in real time, ChatGPT can help government departments make more informed and timely decisions, while also increasing staff productivity and reducing costs.

In short, by automating data analysis and other tasks, disruptive technologies such as ChatGPT can provide powerful tools for government departments to manage their data more effectively, leading to better decision-making, increased efficiency and improved citizen services.

You also say that the data behind the large language model is key to discovering informed insights – can you expand on that?

By using large language models to analyse and interpret data, researchers and businesses can gain valuable insights that might be difficult or impossible to detect through other means. This can lead to better decision-making, more accurate predictions, and a deeper understanding of complex phenomena. Additionally, large language models can help automate tasks that might otherwise require significant human intervention, allowing researchers and businesses to scale their operations and achieve their goals more efficiently.

Can you talk a bit about your work with the Cabinet Office? What challenges was it facing around managing its data?

We deployed our cloud-based data analytics software solution, Datalift, to review, analyse and cleanse the entire department’s metadata. This involved processing 11 million files held in 170 different formats to locate, retrieve and appraise its records, providing the opportunity to address its digital heap by automatically recommending content that should be kept or deleted.
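The appraisal step described above, automatically recommending content to keep or delete based on file metadata, can be sketched as a simple rule pass over metadata records. This is a hypothetical illustration, not Datalift's actual logic: the format list, retention threshold and rules below are all assumptions for the sake of the example.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class FileMeta:
    path: str
    fmt: str            # file format extension, e.g. "docx", "msg"
    last_modified: date
    size_bytes: int

RETENTION_YEARS = 10                       # illustrative retention threshold
EPHEMERAL_FORMATS = {"tmp", "bak", "log"}  # assumed low-value formats

def appraise(f: FileMeta, today: date) -> str:
    """Return 'delete', 'keep' or 'review' for one file's metadata."""
    if f.fmt in EPHEMERAL_FORMATS:
        return "delete"        # transient artefacts: recommend disposal
    age_years = (today - f.last_modified).days / 365.25
    if age_years > RETENTION_YEARS:
        return "review"        # candidate for TNA transfer or disposal
    return "keep"              # still within the retention period

files = [
    FileMeta("minutes/2001/jan.doc", "doc", date(2001, 1, 15), 40_000),
    FileMeta("tmp/cache.bak", "bak", date(2022, 3, 1), 1_000),
    FileMeta("policy/2021/brief.docx", "docx", date(2021, 6, 9), 80_000),
]

recommendations = {f.path: appraise(f, date(2023, 9, 6)) for f in files}
print(recommendations)
```

In practice a pass like this runs over millions of metadata records without opening file contents, which is what makes appraisal at the scale described (11 million files) tractable; the real rules would come from the department's retention schedule rather than hard-coded constants.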

The Cabinet Office struggled with the sheer volume of its data. With such large volumes, it became difficult to identify where its data was held and what it contained, and managing it was a time-consuming process that was hard to sustain with limited resources.

The Cabinet Office’s DKIM team needed to create a secure method to sort and move information from its digital heap to a controlled environment where it could be reviewed for retention or remediation.

The Public Records Act (PRA), which mandates the transfer of records of historical value to The National Archives (TNA) for permanent preservation, and the principle of storage limitation under the General Data Protection Regulation (GDPR) were further challenges that added to the complex nature of its data estate.

How did you help the organisation address and overcome those challenges?

Automated Intelligence was able to help the Cabinet Office perform analysis of its digital files. In a previous project, the solution saved £0.5m per year in storage costs as the DKIM (Department of Knowledge Information Management) team developed its digital appraisal methodology against a dataset of 11 million files.

As part of this process, a time and motion study revealed that, without Datalift, it would have taken 59 reviewers a year to review these files for disposal, at a cost of £2.2m to the public purse. In addition, the process ensured compliance with the Public Records Act while also enabling the Cabinet Office to explore the use of in-house subject matter expertise and data science skills and techniques, opening up faster innovation opportunities.

The Cabinet Office has recently signed a two-year contract with Automated Intelligence to support its annual ‘Spring Clean’ process, in which it analyses the department’s legacy information awaiting a disposal decision. Currently, the department’s DKIM team has 4.9 million files located in ‘holding pens’, where a decision needs to be made on whether to delete or retain the information.

What has been the result?

The Cabinet Office has successfully achieved a range of outcomes including:

  • Being able to efficiently manage and automate its information lifecycle, with data more easily identifiable and more readily accessible. The solution provided enhanced visualisation of data through a real-time dashboard.
  • The department saved time and effort by automating processes, and could easily decide which data to delete and which to preserve.
  • Risks were efficiently uncovered and reduced through enhanced visibility and defensible audit trails.
  • It led to a total cost avoidance of £3 million (£0.5 million per year in infrastructure costs and £2.5 million in labour costs), money which would otherwise have come from the public purse.
  • Increased staff productivity and capacity, reducing the need to manually trawl through data and to deploy large numbers of people to review documents.

Think Data for Government returns with a stellar guest list of speakers who will address some of the hottest topics affecting both local and central government. It’s an in-person-only event, so delegates can maximise their face-to-face networking. Register your place today.

