The National Archives

Saturday 7 November

   
 
 NDAD: The National Digital Archive of Datasets
 
 

Dataset details: CRDA/20/DS/1

1901-1992 dataset

 
 

Context

Historic Mortality Data Files

Identity statement

Title 1901-1992 dataset
NDAD reference CRDA/20/DS/1
Dates of creation of datasets c.1979 - c.1994
Dates of contents of datasets 1901-1992
Date of last input to datasets 1994?
Date of last access to datasets 1994?
Extent of datasets 1 dataset: 16.6 MB after processing by NDAD; 10 tables comprising 978,370 records
ISAD(G) level of description File

Administrative context

Aim and purpose
Statement of responsibility

Source of acquisition

Source of acquisition

This dataset was transferred from the Office for National Statistics on a CD-ROM which was received by NDAD on 1 October 1998. For the purposes of the transfer, ONS copied the data from four floppy disks on which it had been held. This was the format in which the dataset had been issued to users.[1]


Nature and content

Scope and content

This dataset includes population and mortality data covering the years 1901-1992. There is some uncertainty over when this edition of the Historic Mortality Data Files database was issued for sale to the public. It is thought that the 1901-1992 data was first issued by OPCS in 1994; it would not have been sold any later than 1997, when a redesigned version of the database (known as "Twentieth Century Mortality Files") was introduced. However, it is possible that versions of the database containing extra years of data were also produced during this period.[2] Further information about the Historic Mortality Data Files database is provided in the Series Catalogue.

Digital processing and conversion

The tables in CRDA/20/DS/1 were transferred in a fixed length format. A file was transferred for each table. The files contained an "end of file" marker which was deleted from the converted data files. As fixed length format is one of the standard formats in which NDAD preserves data, no other conversion was required.
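
The processing described above can be illustrated with a minimal sketch. It assumes that the "end of file" marker was a single trailing DOS-style Ctrl-Z byte (0x1A) and that records carry no line terminators; neither detail is stated in this catalogue entry. The function names and the record length used in the example are hypothetical.

```python
# Assumption: the marker is the DOS-style Ctrl-Z byte (0x1A). The catalogue
# entry does not say which byte or sequence was actually used.
EOF_MARKER = b"\x1a"


def strip_eof_marker(data: bytes, marker: bytes = EOF_MARKER) -> bytes:
    """Return data without one trailing end-of-file marker, if present."""
    if marker and data.endswith(marker):
        return data[: -len(marker)]
    return data


def split_fixed_length(data: bytes, record_length: int) -> list[bytes]:
    """Split data into records of record_length bytes each.

    Raises ValueError if the data does not divide evenly, which would
    suggest a leftover marker or a wrong record length.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    if len(data) % record_length != 0:
        raise ValueError("data length is not a multiple of the record length")
    return [data[i : i + record_length] for i in range(0, len(data), record_length)]


# Hypothetical two-record file with a trailing marker.
raw = b"AAAA" + b"BBBB" + EOF_MARKER
records = split_fixed_length(strip_eof_marker(raw), record_length=4)
```

Checking that the cleaned file length is an exact multiple of the record length is one simple way to confirm that nothing other than the marker was removed.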


Conditions of access and use

Access conditions

Allied materials

Related units of description

Explanatory notes relating to the dataset have been transferred to NDAD and can be consulted via the Dataset Documentation Catalogue, reference CRDA/20/DD/1/1.

Associated material

A copy of the dataset is held in the UK Data Archive, where it is known by the title "Historic Mortality and Population Data, 1901-1992" (dataset reference number 2902).

Publications produced by the originating department
Publications produced by researchers working on the datasets

Structure

Logical structure and schema

The dataset comprises 10 tables: a Population table (hspops92) covering the period 1901-1992, and nine Historic Deaths tables (rev01-rev09) which cover the period 1901-1910 and the periods corresponding to the different revisions of the ICD which have been implemented in England and Wales down to 1992.

The dataset comprises the following table(s):

Table number  NDAD reference    Name      Title
1             CRDA/20/DS/1/1    rev01     Historic Deaths, 1901-1910
2             CRDA/20/DS/1/2    rev02     Historic Deaths, 1911-1920
3             CRDA/20/DS/1/3    rev03     Historic Deaths, 1921-1930
4             CRDA/20/DS/1/4    rev04     Historic Deaths, 1931-1939
5             CRDA/20/DS/1/5    rev05     Historic Deaths, 1940-1949
6             CRDA/20/DS/1/6    rev06     Historic Deaths, 1950-1957
7             CRDA/20/DS/1/7    rev07     Historic Deaths, 1958-1967
8             CRDA/20/DS/1/8    rev08     Historic Deaths, 1968-1978
9             CRDA/20/DS/1/9    rev09     Historic Deaths, 1979-1992
10            CRDA/20/DS/1/10   hspops92  Population, 1901-1992

How data was originally captured and validated

Details of the sources which were used to produce Historic Mortality Data Files and how the data was checked by OPCS are given in the Series Catalogue. This section outlines certain aspects of the computer coding of cause of death codes which are particular to the 1901-1992 dataset.

The Historic Deaths tables in the dataset record the cause of death in the form of four-digit numeric codes. In many cases the computer codes correspond to the codes used for causes of death in contemporary revisions of the ICD. There are, however, instances where the codes used in the dataset differ from those used in the ICD:

  • In the period 1901-1910 an unnumbered list of causes of death was used in England and Wales. In the dataset codes ranging from 010 to 191 have been assigned to causes in this list. The fourth digit is represented by "0" except for the category of "other specified diseases", which has been assigned the code 1741.
  • The second through to the fifth revisions of the ICD employ a numeric code of between one and three digits for the "major cause grouping" of cause of death. These codes are sometimes combined with subdivision codes for more specific causes of death, which take the form of a number, letter of the alphabet or combination of both (e.g. in the third revision of the ICD, "arterio-sclerosis with cerebral vascular lesion" is represented by the code 91b(1)). In the Historic Deaths tables the first three digits of the computer code represent the "major cause grouping" as used in the ICD (the code of "1" in the ICD will thus be represented as "001" in the dataset). ICD subdivision codes are represented by a fourth digit of 1-9, with "0" being used as the fourth digit where no ICD subdivision existed. The main exception to this rule is in the Historic Deaths table for the fifth ICD revision, where 1571-1585 (excluding 1580) are used to represent the 14 subdivisions within the major cause grouping of "congenital malformations" (ICD code 157).
  • The sixth through to the ninth revisions of the ICD employ four digit numeric codes, with the first three digits representing the major cause grouping and the fourth digit being used for any subdivisions. These are normally reproduced exactly in the Historic Deaths tables, with "0" being used as a fourth digit where the ICD has no subdivisions (but see below regarding "fourth digit cause code discrepancies"). Codes in the range 8000-9999 refer to causes in the ICD corresponding to "external causes of injury". ICD codes for the nature of injury are not included in the Historic Deaths tables for these revisions, to avoid the possible double counting of deaths.
  • For the period 1901-1958 deaths data was derived from published sources. In some cases these sources did not include data for particular ICD sub-divisions (i.e. deaths were assigned to a major cause grouping but not to a sub-division), or sub-divisions were only partially utilised. In such instances deaths falling within the major cause grouping have been arbitrarily assigned to one of the unused sub-divisions. Similar "fourth digit cause code discrepancies" affect some of the computer codes for the period 1959-1978. These appear to have resulted partly from changes to OPCS's implementation of the 8th revision of the ICD in the 1970s.[3]

The explanatory notes which accompany the 1901-1992 dataset (see the Dataset Documentation Catalogue, reference CRDA/20/DD/1/1) contain tables which explain the computer codes used for unnumbered causes of death in the period 1901-1910, the computer codes used for ICD codes in the second through to the fifth revisions of the ICD, and the "fourth digit cause code discrepancies" for 1901-1958 and 1959-1978.
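
The general shape of the four-digit computer codes described in this section can be sketched as follows. This is an illustration only: the function names are hypothetical, the fourth digit must be looked up in the explanatory notes (no mapping from ICD subdivision letters to digits is assumed here), and the fifth revision exception for codes 1571-1585 is not handled.

```python
def computer_code(major_group: int, fourth_digit: int = 0) -> str:
    """Build a four-digit code: zero-padded major cause grouping plus one digit.

    fourth_digit is 0 where no ICD subdivision exists, otherwise the digit
    (1-9) assigned to the subdivision in the explanatory notes.
    """
    if not 1 <= major_group <= 999:
        raise ValueError("major_group must be between 1 and 999")
    if not 0 <= fourth_digit <= 9:
        raise ValueError("fourth_digit must be between 0 and 9")
    return f"{major_group:03d}{fourth_digit}"


def split_code(code: str) -> tuple[int, int]:
    """Split a four-digit code into (major cause grouping, fourth digit)."""
    if len(code) != 4 or not (code.isascii() and code.isdigit()):
        raise ValueError("code must be exactly four ASCII digits")
    return int(code[:3]), int(code[3])


def is_external_cause(code: str) -> bool:
    """True for codes 8000-9999, which in the tables for the sixth to ninth
    ICD revisions correspond to external causes of injury."""
    major_group, _ = split_code(code)
    return 800 <= major_group <= 999


# The ICD code "1" with no subdivision is held as "0010"; the 1901-1910
# category "other specified diseases" is held as "1741".
print(computer_code(1))      # 0010
print(split_code("1741"))    # (174, 1)
```

Codes are kept as strings rather than integers so that leading zeros, which are significant in the dataset, are not lost.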

Further details of the codes used in the dataset (e.g. for age groups) are given in the catalogues of the individual Historic Deaths tables (see Logical structure and schema).

Constraints on the reliability of the data

Validation

Content validation

A number of computer codes in the Historic Deaths tables for the second and fourth ICD revisions are not explained in the notes accompanying the dataset which relate computer codes to ICD codes. Explanations of these codes were supplied by ONS after clarification was sought by NDAD. The explanations are incorporated in the appropriate field descriptions in the table catalogues (see Logical structure and schema). The codes affected are 0631 to 0633, 1041 to 1048, 1051 to 1054 and 1056 to 1058 (in rev02, the table for the second ICD revision), and 0441 to 0447 and 0723 (in rev04, the table for the fourth ICD revision).[4]
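
A user working with rev02 or rev04 may wish to flag the codes listed above, since their explanations are found in the table catalogues rather than in the accompanying notes. The sketch below is one way of doing so. It assumes that each stated range is inclusive and that every integer within a range is a valid code; the names used are hypothetical.

```python
# Inclusive (low, high) ranges of computer codes clarified by ONS, per table.
# Assumption: every integer inside a range is a code in use.
CLARIFIED_RANGES = {
    "rev02": [(631, 633), (1041, 1048), (1051, 1054), (1056, 1058)],
    "rev04": [(441, 447), (723, 723)],
}


def was_clarified(table: str, code: str) -> bool:
    """True if the four-digit code in the given table is one of those whose
    explanation was supplied separately by ONS."""
    value = int(code)
    return any(low <= value <= high for low, high in CLARIFIED_RANGES.get(table, []))


def clarified_count(table: str) -> int:
    """Number of codes covered by the ranges recorded for a table."""
    return sum(high - low + 1 for low, high in CLARIFIED_RANGES.get(table, []))


print(was_clarified("rev02", "0632"))  # True
print(was_clarified("rev02", "1055"))  # False
```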

Transformation validation

As no conversion was required, other than the deletion of the "end of file" marker, no validation of the converted files was performed.


Links to related datasets

Related datasets
NDAD reference  Title
CRDA/20/DS/2    1901-1995 dataset


Notes

 

1. Dataset transfer form for CRDA/20/DS/1 (completed by ONS), held in NDAD accession file for CRDA/20/DS/1.

2. Dataset transfer form for CRDA/20/DS/1; UK Data Archive, catalogue entry for "Historic Mortality and Population Data, 1901-1992" (dataset reference number 2902), consulted on 29 July 1998 (http://biron.essex.ac.uk/cgi-bin/biron/); note of telephone conversation between NDAD and ONS on 11 March 1999.

3. Dataset Documentation Catalogue, reference CRDA/20/DD/1/1, pp. 6-9, 21-74.

4. Note of telephone conversation between NDAD and ONS on 16 February 1999; email from NDAD to ONS on 24 February 1999; email from ONS to NDAD on 4 March 1999.


Last updated 2003-04-15 11:09:32

 
 

NDAD v3.0