Gds.blog.gov.uk

The characteristics of a register

2015-10-13

We’ve talked about registers as authoritative lists you can trust, but what do we mean when we say “register”?

Across government we manage and hold data that we need to deliver services to users and to inform policymaking. We make that data available in a variety of ways, from bespoke online tools and dumps of databases through to published lists. A question we’re often asked is:

What is a register, and how is it more than just a database, a statistical report, or a simple list?

To try to answer this question we’ve started to collect a list of characteristics based on the things we learned during our early discovery and alpha work.

Some of this gets a bit technical, but we think that’s a good thing. Getting the technical stuff right at the start is an important first step.

These characteristics will be refined in the coming months as we learn more by working with people to build beta registers, but here is our first attempt to list them.

1. Registers are canonical and have a clear reason for their existence

A register is the only authoritative list of a specific type of thing. It is the source of that information, kept accurate and up-to-date. For example, the company register administered by Companies House should be the single, authoritative place to go to find data directly related to a limited company, such as the date it was formed, the date it was dissolved, and a link to the registered office.

The purpose of a register should fall within the bounds of a registrar’s public task — its core role or function.

2. Registers represent a ‘minimum viable dataset’

A register only holds the data it was created to record, and nothing else. It never duplicates data held in other registers. Registers link to data in other registers to avoid the need for any duplication.

To make those links work, each record in a register must have a stable, unique identifier. For example, registers should use the ISO-3166-alpha-2 country code to unambiguously reference a country, relying upon the country register to hold the country’s official-name, local-name and other information for the code.
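
As a rough illustration of linking rather than duplicating, the sketch below has an entry in one register hold only a country code and look the rest up in a country register. The in-memory dictionaries, the sample company entry and the `resolve_country` helper are all assumptions made for the example; a real register would be read through its API.

```python
# Hypothetical sample of a country register, keyed by ISO 3166-1 alpha-2 code.
COUNTRY_REGISTER = {
    "GB": {"country": "GB", "official-name": "The United Kingdom of Great Britain and Northern Ireland"},
    "FR": {"country": "FR", "official-name": "The French Republic"},
}

# A hypothetical entry in another register. It stores only the country code,
# not a copy of the country's names.
COMPANY_ENTRY = {"company": "9876", "name": "Example Ltd", "country": "GB"}


def resolve_country(entry, country_register):
    """Follow the link from an entry to the country record it references."""
    code = entry["country"]
    if code not in country_register:
        raise KeyError(f"unknown country code: {code}")
    return country_register[code]


print(resolve_country(COMPANY_ENTRY, COUNTRY_REGISTER)["official-name"])
```

If the country's official name changes, only the country register needs a new entry; the entry that links to it stays the same.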

Registers are long-lived because services and other registers depend on them. A register is just the data. It is the role of services to present data in a variety of different ways which make sense to users.

3. Registers are live lists, not simply published data

Registers are digital and may be accessed or searched by humans or machines using an API. The same data may already be periodically published as a document on a website, but that is not the same as operating a register.

For example, it would be difficult for a developer to use the PDF of sports governing bodies as a selection on a visa application form. They would have to notice when the document is republished and repeat the same work of downloading and processing the document whenever it is updated.

Making changes to a register shouldn’t take long: at most a matter of hours, which gives custodians the opportunity to check a new entry and guard the register against fraud and error.

Registers should have a standard interface for reading and querying their contents, which follows the API principles set out in the service manual.

There should be a clear process for challenging data held in a register, with high standards for transparency, adjudication, and the handling of other issues that users discover in register data.

Register data should be available in a variety of different standard representations, including JSON for Web developers, comma-separated values (CSV) for people working with tabular data tools like spreadsheets, and RDF for those with needs for linked-data.
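
To make the idea of several representations concrete, here is a minimal sketch that serialises the same records as JSON and as CSV using Python's standard library. The sample records and the function names `to_json` and `to_csv` are invented for the example and do not describe any particular register's API. RDF is left out to keep the sketch short.

```python
import csv
import io
import json

# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"school": "1234", "name": "Example Primary School"},
    {"school": "5678", "name": "Example High School"},
]


def to_json(records):
    """Serialise the records as a JSON array with a stable key order."""
    return json.dumps(records, sort_keys=True)


def to_csv(records, fieldnames):
    """Serialise the records as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


print(to_json(RECORDS))
print(to_csv(RECORDS, ["school", "name"]))
```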

A register API should be highly available. Public register data should be cacheable by intermediaries and web clients to enable the incorporation of the register directly in live services, as well as being easily downloaded in bulk for offline applications, and updated using a streaming API.

4. Registers use standard names consistently with other registers

Wherever possible a register reuses standard names for fields to enable discovery — find all registers containing a “company” field, and search — find all the records in all public registers containing “school:1234” or “company:9876”.
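
The search described above could be sketched as follows, assuming a reference such as “school:1234” is a register name and a key separated by a colon, and that other registers use the register name as the field name when they link to it. The `parse_reference` and `find_records` helpers and the sample data are hypothetical.

```python
def parse_reference(reference):
    """Split a reference such as 'school:1234' into ('school', '1234')."""
    register, separator, key = reference.partition(":")
    if not separator or not register or not key:
        raise ValueError(f"not a register reference: {reference!r}")
    return register, key


def find_records(registers, reference):
    """Return (register name, record) pairs that contain the referenced value.

    Relies on every register using the same field name for the same thing.
    """
    field, key = parse_reference(reference)
    return [
        (name, record)
        for name, records in registers.items()
        for record in records
        if record.get(field) == key
    ]


# Hypothetical sample data: two registers that both use a "school" field.
REGISTERS = {
    "school": [{"school": "1234", "name": "Example School"}],
    "inspection": [
        {"inspection": "77", "school": "1234"},
        {"inspection": "78", "school": "4321"},
    ],
}

for name, record in find_records(REGISTERS, "school:1234"):
    print(name, record)
```

The search only works because both registers call the field “school”; if one of them used a different name for the same thing, its records would be missed.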

The data held in a register may evolve over time: new fields may be added to new entries in a register so long as they have a sensible default value for existing entries, and existing field names are not used for a new, different purpose.
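
One way a consumer might cope with a field added later is to fill in the default when reading older entries. This is only a sketch: the “end-date” field, its empty-string default and the `read_entry` helper are assumptions made for the example.

```python
# Hypothetical field added after the register was first published,
# with the default value to assume for entries that do not have it.
FIELD_DEFAULTS = {"end-date": ""}


def read_entry(entry, defaults=FIELD_DEFAULTS):
    """Return a copy of the entry with defaults filled in for missing fields."""
    return {**defaults, **entry}


old_entry = {"school": "1234", "name": "Example School"}
new_entry = {"school": "5678", "name": "Closed School", "end-date": "2015-08-31"}

print(read_entry(old_entry))
print(read_entry(new_entry))
```

The stored entries are not modified; the default is applied only when they are read.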

5. Registers are able to prove integrity of record

Each individual entry in a register is immutable, addressable using a ‘fingerprint’ which may be used by a user as a digital proof of record.
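
A fingerprint of this kind could, for instance, be a cryptographic hash of the entry's contents. The sketch below assumes SHA-256 over a canonical JSON form with sorted keys; both the choice of hash and the “sha-256:” prefix are illustrative assumptions rather than a specification.

```python
import hashlib
import json


def fingerprint(entry):
    """Return a content-based fingerprint for an entry.

    Keys are sorted and whitespace removed so that the same content
    always produces the same fingerprint.
    """
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha-256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


entry = {"school": "1234", "name": "Example School"}
print(fingerprint(entry))
```

Because the fingerprint is derived from the content, any change to an entry produces a different fingerprint, so a user holding a fingerprint can check that the entry they are shown is the one they were given.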

A record in a register is a series of entries sharing the same identifier. The latest entry is the current value for a record. Older entries for a record must remain addressable, but their contents may be removed if instructed by law.
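
The relationship between entries and records can be sketched with an append-only list. The sample entries, the “entry-number” field and the `history` and `current` helpers are hypothetical, and the list is assumed to be in the order the entries were added.

```python
# Hypothetical entries, in the order they were appended to the register.
ENTRIES = [
    {"entry-number": 1, "school": "1234", "name": "Example School"},
    {"entry-number": 2, "school": "5678", "name": "Another School"},
    {"entry-number": 3, "school": "1234", "name": "Example Academy"},
]


def history(entries, field, key):
    """Return every entry for one record, oldest first."""
    return [entry for entry in entries if entry.get(field) == key]


def current(entries, field, key):
    """Return the latest entry for a record, or None if there is none."""
    matching = history(entries, field, key)
    return matching[-1] if matching else None


print(current(ENTRIES, "school", "1234"))
print(history(ENTRIES, "school", "1234"))
```

A change to a record is made by appending a new entry, never by editing an old one.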

The record of changes made to a register is transparent and independently verifiable.
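
One possible way to make a record of changes independently verifiable is a hash chain, where each entry's hash also covers the hash before it. This is a swapped-in illustrative technique, not something the characteristics above prescribe (a Merkle tree is another option), and the `GENESIS` value and function names are assumptions.

```python
import hashlib
import json

# Assumed starting value for an empty log.
GENESIS = "0" * 64


def entry_hash(previous_hash, entry):
    """Hash an entry together with the hash of everything before it."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + canonical).encode("utf-8")).hexdigest()


def chain_head(entries):
    """Fold the whole log into a single hash."""
    head = GENESIS
    for entry in entries:
        head = entry_hash(head, entry)
    return head


def verify(entries, published_head):
    """Check a downloaded copy of the log against a published head hash."""
    return chain_head(entries) == published_head


log = [
    {"school": "1234", "name": "Example School"},
    {"school": "1234", "name": "Example Academy"},
]
head = chain_head(log)
print(verify(log, head))
```

Anyone with a copy of the log can recompute the head; altering, removing or reordering an entry changes it.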

6. Registers are clearly categorised as open, shared or private

The privacy of a register should be clear, and either open, shared or private:

open registers are public. The data may be accessed, copied and derived freely, by anyone, either as single register entries or as a complete register, with clear licensing terms designed for reuse

shared registers allow access to a single register entry. There will be some form of access control, such as having an access token, paying a small fee, or signing in with GOV.UK Verify

private registers contain sensitive information which cannot be accessed directly by services. Subject to access control, they may be able to provide answers to simple questions such as “Is the registered keeper of this boat over 21 years of age?” without revealing further details about the individual

a closed register contains data private to a single organisation, is locked away, and not connected directly to a digital service
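
The private register case, answering a simple question without revealing the underlying data, might look something like the sketch below. Everything in it is hypothetical: the boat identifier, the token check and the function names. It also assumes “over 21” means aged 21 or older.

```python
from datetime import date

# Hypothetical private data; it is never returned to the caller.
_KEEPERS = {
    "BOAT-001": {"date-of-birth": date(1980, 5, 17)},
    "BOAT-002": {"date-of-birth": date(2000, 1, 1)},
}

# Hypothetical access control: a set of tokens issued to approved services.
_VALID_TOKENS = {"secret-token"}


def age_on(date_of_birth, today):
    """Return the age in whole years on the given day."""
    years = today.year - date_of_birth.year
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def keeper_is_over(boat_id, minimum_age, access_token, today):
    """Answer yes or no only; the date of birth itself is not disclosed.

    Assumption: "over 21" is treated as aged 21 or older.
    """
    if access_token not in _VALID_TOKENS:
        raise PermissionError("access token not recognised")
    keeper = _KEEPERS[boat_id]
    return age_on(keeper["date-of-birth"], today) >= minimum_age


print(keeper_is_over("BOAT-001", 21, "secret-token", date(2015, 10, 13)))
```

The calling service learns a single true or false value and nothing else about the keeper.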

Following the Identity Assurance Principles means we don’t anticipate a single register of people, but registers may list people against specific roles. For example, DVLA should continue to maintain a register of drivers and a register of keepers of a vehicle.

Public registers should not reference private registers. For example, whilst the headteacher of a school may expect to appear in a public register of educational establishments, and have their name appear on a sign outside the school, they wouldn’t expect their passport, driving licence, tax reference codes or National Insurance number to be made public.

7. Registers contain raw not derived data

Data held in a register should be factual raw data, not informational content, or counts, statistics, and other forms of derived data.

8. Registers must have a custodian

A register should directly meet a user-need or legal obligation.

Someone is responsible for each register, as with The Public Guardian, The Chief Land Registrar and The Registrar General.

We'll be refining these characteristics as we continue our work on registers and we'll keep you updated on our findings.

Follow Paul on Twitter and don't forget to sign up for email alerts.