
Thursday 20 November

 

 NDAD: The National Digital Archive of Datasets

Sub-series details: CRDA/24/DS/1997

1997

 
 

Context

Public Health Common Dataset

Identity statement

Title: 1997
NDAD reference: CRDA/24/DS/1997
Dates of creation of datasets: 1997
Dates of contents of datasets: 1996
Date of last input to datasets: [1997]
Date of last access to datasets: [1997]
Extent of datasets: 6 datasets
ISAD(G) level of description: Subseries

Administrative context

Aim and purpose
Statement of responsibility
Custodial history

Source of acquisition

Source of acquisition

This dataset was transferred from the Department of Health on a CD-ROM which was received by NDAD on 1 October 1999.


Nature and content

Scope and content

This dataset holds public health data issued in 1997. Further information about the Public Health Common Data Set (PHCDS) is provided in the Series Catalogue and the Dataset Documentation Catalogue. The sub-series consists of 6 datasets comprising a total of 336 tables, which contain various indicators for health regions in England and Wales. The 1997 PHCDS provides, where possible, data for:

  • Health Authorities (HAs) on the basis of boundaries in April 1996;
  • Local Authorities (LAs) on the basis of boundaries in April 1996 (not available for Census supplement indicators);
  • Regional Offices;
  • Government Office Regions;
  • ONS area groups.

The eight Regional Offices and their constituent Health Authorities and codes (as at April 1996) are listed here.

There are 11 ONS Area Classification Groups; a list of these, together with lists of the Local Authorities in each group, is provided in Appendix 3 of the 'Data definitions and user guide' (see the Dataset Documentation Catalogue, reference CRDA/24/DD/1/8/1). The 10 Government Office Regions are: 'North East', 'North West', 'Merseyside', 'Yorkshire And Humberside', 'East Midlands', 'West Midlands', 'South West', 'Eastern', 'London', 'South East'.

As in the 1996 PHCDS, the HA data relate to boundaries as at April 1996. However, there has been a change to the LA data, which were based on December 1995 boundaries in the 1996 PHCDS and are based on April 1996 boundaries in the 1997 PHCDS. The 1997 'Data definitions and user guide' (see the Dataset Documentation Catalogue, reference CRDA/24/DD/1/8/1) says: "The LA data for 1996 incorporate these boundary changes. Retrospective data have, as far as is possible, been adjusted to reflect the boundary changes. However, it is not possible to recode retrospective data for the boundary changes to some areas, namely Harrogate, Selby, Ryedale and York Unitary Authority. Pooled (1994-96) and trend (1986-96) data for these areas should therefore be interpreted with caution, since 1996 data relate to the new boundaries and pre-1996 data relate to the 1995 boundaries. In the case of Harrogate, Selby and Ryedale, the boundary changes are minor and unlikely to have a significant impact on the data for these areas. However, York Unitary Authority has undergone major boundary changes, hence pre-1996 data will not be comparable with data for 1996.

Another caveat to the LA data is that, because of the complex boundary changes affecting the Unitary Authorities of East Riding of Yorkshire and North Lincolnshire, for some indicators data are not available for these areas."

Several features of the 1996 PHCDS supplementary release (held in NDAD as CRDA/24/DS/1996/7) have been incorporated into the 1997 PHCDS, namely:

  • trend data for cancer registration (indicators HON-B2, HON-B3 and CDS-D1);
  • current data for expanded list of causes of death (indicators CDS-C3A, C3B and C4);
  • trend data for expanded list of causes of death (indicator CDS-C3A).

The 1997 PHCDS contains information from the 1991 Census for HAs with boundaries as at April 1996, Regional Offices and ONS area classification groups. (This is held in NDAD as CRDA/24/DS/1997/6.)

Digital processing and conversion

The tables in CRDA/24/DS/1997 were transferred to NDAD in the form of Lotus 1-2-3 (WK3) spreadsheets, with file names that have meaningful prefixes to reflect the type of data. See CRDA/24/DD/1/8/1 for a more detailed explanation of the prefixes. These files were opened in Microsoft Excel 97 (under Windows 95/98). Visual Basic macros were written to process them so that the headings, subheadings, other metadata, and blank columns or rows between the data were removed, and the format of the cells was set to Microsoft Excel General format, before the data were saved in CSV (comma-separated values) format. In a number of the original files (within CRDA/24/DS/1997/2 and CRDA/24/DS/1997/4), the data for each area is split over two lines. A Visual Basic module was written to move the figures so that there is one line per area. For an example of this type of table, see CRDA/24/DS/1997/4/2/1 (cdc301t).

(Image: cdc301t screenshot)
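
The original Visual Basic module is not reproduced in this catalogue. As an illustration only, the following Python sketch shows one way the "one line per area" step could work. It assumes a hypothetical layout in which a continuation line has blank code and name cells; the function name and the sample figures are invented for the example ('QXX' is not a real code).

```python
def merge_split_rows(rows):
    """Join each continuation row onto the area row that precedes it.

    Assumption (not confirmed by the catalogue): a row whose first two
    cells (area code and area name) are blank continues the previous area.
    """
    merged = []
    for row in rows:
        is_continuation = (
            bool(merged) and not row[0].strip() and not row[1].strip()
        )
        if is_continuation:
            # Move the figures up onto the end of the area's first line.
            merged[-1].extend(row[2:])
        else:
            merged.append(list(row))
    return merged


# Invented sample figures; 'QXX' is a placeholder code.
sample = [
    ["QDH", "Leeds", "1.0", "2.0"],
    ["", "", "3.0", "4.0"],
    ["QXX", "Example HA", "5.0", "6.0"],
    ["", "", "7.0", "8.0"],
]
for record in merge_split_rows(sample):
    print(record)
```

After this step each area occupies a single record, which is the shape the other tables in the sub-series already have.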

Some additional processing was carried out on cda5 (CRDA/24/DS/1997/3/5), which lists the English Health Authorities classified hierarchically into 'families' and 'groups'; in order to retain the classification across the records, the 'families' and 'groups' were copied into the appropriate blank cells. In cda5l (CRDA/24/DS/1997/3/38), the Local Authorities within each of the ONS Area Classification Groups are listed in 3 columns; the entries were re-formatted so that there is one record per area.
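
Copying a classification into the blank cells beneath it is often called a fill-down. The sketch below is a minimal Python illustration, not the procedure NDAD used; the column positions and the family and group labels are hypothetical.

```python
def fill_down(rows, columns):
    """Copy the last non-blank value in each given column into blank cells."""
    last_seen = {}
    filled = []
    for row in rows:
        new_row = list(row)
        for col in columns:
            if new_row[col].strip():
                last_seen[col] = new_row[col]
            else:
                new_row[col] = last_seen.get(col, "")
        filled.append(new_row)
    return filled


# Hypothetical labels: column 0 = family, column 1 = group, column 2 = area.
sample = [
    ["Family 1", "Group A", "Area one"],
    ["", "", "Area two"],
    ["", "Group B", "Area three"],
    ["Family 2", "Group C", "Area four"],
]
for record in fill_down(sample, columns=[0, 1]):
    print(record)
```

Each record then carries its own family and group, so the classification survives if the records are sorted or filtered.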

In CRDA/24/DS/1997/5/26 (hob4) the display of data in the original spreadsheet differs from the standard format. The tables in CRDA/24 normally have a single 'set' of rows of data (one row per area). This table displays 7 columns of data on the top half of the worksheet with 7 columns below (i.e. each area appears twice). In order to create the same structure as in the other tables, the second set of columns (excluding the area code and name) was moved to the top half of the table, so that they became additional columns 8 to 12. hob4 is one of the tables in which, as mentioned in the section "Constraints on the reliability of the data" in the Series Catalogue, figures displayed as percentages in the spreadsheet are, following the conversion to CSV, held in the archive as figures between 0 and 1.
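
The restructuring of hob4 can be pictured as joining the lower block onto the upper block by area code. The Python sketch below is illustrative only: it assumes each block has the area code in its first cell and the area name in its second, and the figures are invented. A small helper also shows how a stored fraction can be turned back into the percentage that the spreadsheet displayed.

```python
def append_lower_block(upper, lower):
    """Append the data columns of the lower block to the matching upper rows.

    Assumes (hypothetically) that cell 0 is the area code and cell 1 the
    area name in both blocks, and that every area appears in both.
    """
    lower_by_code = {row[0]: row[2:] for row in lower}
    return [list(row) + lower_by_code[row[0]] for row in upper]


def as_percentage(fraction):
    """Convert a stored value between 0 and 1 back to a percentage."""
    return fraction * 100


# Invented figures: 7 columns per block, as described for hob4.
upper = [["QDH", "Leeds", "1", "2", "3", "4", "5"]]
lower = [["QDH", "Leeds", "6", "7", "8", "9", "10"]]
combined = append_lower_block(upper, lower)
print(len(combined[0]))      # 7 original columns plus 5 moved columns
print(as_percentage(0.125))
```

Joining on the code rather than on row position avoids relying on both halves listing the areas in the same order.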

Some cells in the original spreadsheets contain real numbers that are formatted to display as integers, and others, which have many figures after the decimal point, are displayed with just 2 decimal places. In order to preserve the actual data, this formatting has been removed. However, it must be assumed that original users of the spreadsheets would have seen the data as it was originally formatted. Although some files contain fields that have many decimal places, NDAD recommends that no data in the PHCDS be quoted to more than two decimal places. This is because fields are not formatted with more than two decimal places in the original spreadsheet files. NDAD assumes that this is because the data creators considered the data to be accurate to, at most, two decimal places. The field formats set by NDAD (DOUBLE or INTEGER) have been specified according to the data the cells contain, rather than how they are displayed within the original spreadsheet.
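
The catalogue does not state the exact rule NDAD applied when choosing between DOUBLE and INTEGER. The Python sketch below assumes a simple rule (a field is INTEGER only if every non-blank value is a whole number) and shows how a user might quote a stored value to two decimal places, in line with the recommendation above. The function names and sample values are hypothetical.

```python
def infer_field_type(values):
    """Return 'INTEGER' if every non-blank value is whole, else 'DOUBLE'.

    This rule is an assumption for illustration, not NDAD's documented one.
    """
    for value in values:
        if value.strip() and not float(value).is_integer():
            return "DOUBLE"
    return "INTEGER"


def quote_value(value):
    """Format a stored value to two decimal places for quotation."""
    return format(float(value), ".2f")


print(infer_field_type(["12", "7", ""]))    # whole numbers only
print(infer_field_type(["12", "7.25"]))     # contains a fraction
print(quote_value("3.14159265"))
```

The stored value is left untouched; rounding is applied only at the point of quotation.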

The PHCDS in its original format does not use specific field names as such: generally, spreadsheet packages do not require data to be held within named fields. (The indicators within PHCDS do have original names, which have been preserved, but these equate to the table name within NDAD.) To identify fields within a table, the first field is named either H_CODE (if the data relates to Health Authorities) or L_CODE (if it relates to Local Authorities) and the second AREA (for the name of the area). The rest of the fields have been named sequentially, starting with the third field as F3, the fourth as F4, etc. These names are not from the original data files but have been allocated by NDAD during the data conversion process. The column headings in the spreadsheet, supplemented at times by information from the 'Data definitions and user guide', form the basis of the field descriptions. Where a spreadsheet contains one or more footnotes, the text was automatically extracted for inclusion in the relevant Table catalogue and is provided at the end of that catalogue under the heading 'Other information'.
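
The naming convention described above can be expressed compactly. This Python sketch is an illustration of the convention only; the function name and the 'HA'/'LA' argument values are hypothetical.

```python
def field_names(field_count, area_type):
    """Build the field names for a table with the given number of fields.

    area_type is 'HA' for Health Authority tables or 'LA' for Local
    Authority tables (argument values chosen for this example).
    """
    code_field = {"HA": "H_CODE", "LA": "L_CODE"}[area_type]
    # Fields from the third onwards are numbered by position: F3, F4, ...
    numbered = [f"F{position}" for position in range(3, field_count + 1)]
    return [code_field, "AREA"] + numbered


print(field_names(5, "HA"))
print(field_names(4, "LA"))
```

Because the number in each name is the field's position in the table, F3 is always the first data column after the area code and name.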

Accruals

Conditions of access and use

Legal status
Access conditions

No access conditions apply.

Copyright requirements
Data Protection Act requirements
Language

The language of the materials is English.


Allied materials

Related units of description

The Public Health Common Data Set data definitions and user guide for the computer files relating to the dataset have been transferred to NDAD and can be consulted via the Dataset Documentation Catalogue.

Associated material
Publications produced by the originating department
Publications produced by researchers working on the datasets

Original system attributes

Hardware
Operating system
Application software
User interface

Structure

Logical structure and schema

The data has been divided into six datasets by topic. For access to these datasets, see Links to dataset catalogues.

Where data availability allows, for each indicator there are two tables: one each for HAs and LAs. The HA files also contain data for Regional Offices, computed on the basis of HA boundaries in April 1996. The LA files contain data for Government Office Regions instead of Regional Office data. Both sets of files also contain data for ONS area classification groups.

The data for the ONS area classification groups are computed on the basis of data for the LAs in each group (not the HAs). The data are based on LA boundaries as at April 1996, whereas for the 1996 PHCDS they are based on LA boundaries as at December 1995. The Census indicators are an exception: these are only available for HAs, so for Census indicators the data for the ONS area groups are based on the HAs in each group. A list of HAs in each ONS area group is in indicator CDS-A5, held in NDAD as 'cda5', reference CRDA/24/DS/1997/3/5.

Dynamic or closed
How data was originally captured and validated

Details of the sources which were used to produce the Public Health Common Data Set files, and of how the data was checked by ONS, are given in the Series Catalogue. For further information about the presentation of the data, see the Dataset Documentation Catalogue, reference CRDA/24/DD/1/8/1.

Constraints on the reliability of the data

Validation

Content validation

No discrepancies were noted between the original spreadsheet files and the expected contents as described in the User Guide for the 1997 PHCDS (see the Dataset Documentation Catalogue, reference CRDA/24/DD/1/8/1). However, during processing and checking of the data, the following points were noted:

In CRDA/24/DS/1997/1, the data in the tables covering indicators A8 and A9 are only available at regional level, and the regional data relate to the Standard Regions used by ONS for statistical purposes. These Standard Regions (based on County boundaries, which are not in general co-terminous with Health Authority areas; see the Dataset Documentation Catalogue, reference CRDA/24/DD/1/8/1) have not been encoded at all in the tables. Tables CRDA/24/DS/1997/1/8 and CRDA/24/DS/1997/1/9 have therefore not been included in the relationships set up by NDAD between the tables (which allow users to link data in similar tables within a dataset via the code for the area). Tables hna5b6 and hna10 (CRDA/24/DS/1997/1/5 and CRDA/24/DS/1997/1/10 respectively, which provide data for Regional Health Authorities with boundaries as at April 1995), and hna6 and hna7 (CRDA/24/DS/1997/1/6 and CRDA/24/DS/1997/1/7 respectively, which provide data for Regional Offices), also contain no codes for the area and therefore, again, have not been included in the relationships between the tables.

The codes for the areas in the two tables in dataset CRDA/24/DS/1997/5 which relate to boundaries at April 1995 (hoa2-3 and hoa2-4) differ from those in the other tables in that dataset. For instance, in these 2 tables the code for 'NORTHERN AND YORKSHIRE' is A00+B00 and the code for 'Leeds' is B61, whereas all other tables in this dataset have code Y01 for RO 'NORTHERN AND YORKSHIRE' and code QDH for HA 'Leeds'. This affects linking of data between tables, because records for the same area cannot be matched on the area code.
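
To see why differing codes prevent linking, consider a user who tries to match two tables on the area code. The Python sketch below is illustrative; the function name is hypothetical, and only the four codes quoted above are taken from the catalogue.

```python
def unmatched_codes(table_a, table_b):
    """Return the area codes found in only one of the two tables.

    Each table is a list of records whose first cell is the area code.
    """
    codes_a = {record[0] for record in table_a}
    codes_b = {record[0] for record in table_b}
    return sorted(codes_a - codes_b), sorted(codes_b - codes_a)


# Codes as quoted in the catalogue for the same two areas.
april_1996_table = [["Y01", "NORTHERN AND YORKSHIRE"], ["QDH", "Leeds"]]
april_1995_table = [["A00+B00", "NORTHERN AND YORKSHIRE"], ["B61", "Leeds"]]

only_in_1996, only_in_1995 = unmatched_codes(april_1996_table, april_1995_table)
print(only_in_1996)
print(only_in_1995)
```

Every code is reported as unmatched even though both tables describe the same areas, so a link on the code field returns no rows. A user who needs to combine these tables would have to match on some other basis, such as the area name, and check the results by hand.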

Transformation validation

Spot checks were carried out to compare the transformed data against the data in the original Lotus 1-2-3 files. These included comparing the values of specific fields and checking that the totals of numeric fields were the same. In addition, each table was checked to ensure that the overall number of records and fields remained the same. No discrepancies were detected between the original and transformed data. The only differences found resulted from rounding and/or floating point representation, for example where the original numbers had 12 figures after the decimal point. The transformed data is restricted to the level of accuracy provided by the General format in Excel (generally 8 figures after the decimal point).
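
Checks of this kind can be sketched in Python as follows. This is not the procedure NDAD ran: the function name, the tolerance of 1e-8 (chosen here to reflect the 8 figures after the decimal point mentioned above) and the sample figures are all assumptions for illustration.

```python
import math


def compare_tables(original, transformed, abs_tol=1e-8):
    """Compare record counts, field counts and numeric column totals.

    Both tables are lists of records holding floats. Returns a list of
    problem descriptions, which is empty when the tables agree.
    """
    if len(original) != len(transformed):
        return [f"record count differs: {len(original)} vs {len(transformed)}"]
    problems = []
    for index, (before, after) in enumerate(zip(original, transformed)):
        if len(before) != len(after):
            problems.append(f"field count differs in record {index}")
    if problems:
        return problems
    # Compare the total of each column, allowing for rounding differences.
    for column, (before, after) in enumerate(zip(zip(*original), zip(*transformed))):
        if not math.isclose(math.fsum(before), math.fsum(after), abs_tol=abs_tol):
            problems.append(f"total differs in column {column}")
    return problems


# Invented figures: one value loses precision after 8 decimal places.
original = [[1.123456789012, 2.0], [3.0, 4.0]]
transformed = [[1.12345679, 2.0], [3.0, 4.0]]
print(compare_tables(original, transformed))
print(compare_tables(original, transformed[:1]))
```

The tolerance lets differences caused purely by rounding pass, while a missing record or a changed figure is still reported.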


Links to dataset catalogues

Links to dataset catalogues

Dataset catalogues provide more detailed information about individual datasets, and are currently available for the following dataset(s):

NDAD reference      Title
CRDA/24/DS/1997/1   1997 - Health of the Nation Indicators: Monitoring Data
CRDA/24/DS/1997/2   1997 - Health of the Nation Indicators: Trend Data
CRDA/24/DS/1997/3   1997 - Public Health Common Data Set Indicators: Current Data
CRDA/24/DS/1997/4   1997 - Public Health Common Data Set Indicators: Trend Data
CRDA/24/DS/1997/5   1997 - Population Health Outcome Indicators
CRDA/24/DS/1997/6   1997 - Census Indicators

Last updated 2005-05-16 12:25:51

 
 
