The National Archives

 NDAD: The National Digital Archive of Datasets

Dataset details: CRDA/34/DS/1

1984 Survey

 
 

Context

Elderly and their Medicines

Identity statement

Title: 1984 Survey
NDAD reference: CRDA/34/DS/1
Dates of creation of datasets: 1984
Dates of contents of datasets: 1984
Date of last input to datasets:
Date of last access to datasets:
Extent of datasets: 1 dataset (0.53 MB); 3 tables comprising 'Elderly_People' (805 records), 'General_Practitioners' (401 records) and 'Medicines' (2554 records).
ISAD(G) level of description: File

Administrative context

Aim and purpose
Statement of responsibility

Source of acquisition

Source of acquisition

This dataset was transferred from the Essex Data Archive on a CD-ROM which was received by NDAD on 29 February 2000.


Nature and content

Scope and content

The dataset contains data collected from two questionnaires completed by interviewers representing the Institute for Social Studies in Medical Care. These were the main Elderly and their Medicines survey and the follow-up survey of general practitioners. Examples of both questionnaires may be seen by consulting the Dataset Documentation Catalogue, references CRDA/34/DD/2/1 and CRDA/34/DD/2/4. The data was collected between May 1984 and November 1984. For background information on the Elderly and their Medicines survey, see the Series Catalogue.

Digital processing and conversion

The dataset was transferred to NDAD as three converted data files, named f174a.dat, f174b.dat and f174c.dat, and a WinZip file 2174img.zip which contained the digital versions of the paper documentation. The data files were already in fixed-length record format, and so required no further transformation by NDAD.

The names of the data files transferred by the Essex Data Archive were not the original names, but were derived from the Essex reference number. It was decided, therefore, to use the table names stated in the documentation, which reflect the content of the tables. The Essex file f174a.dat therefore became 'Elderly_People', the file f174b.dat became 'General_Practitioners', and the file f174c.dat became 'Medicines'.

The only conversion that was necessary was to ensure that each record of data occupied one line only in the data files. The data files had been created with 80 columns of data per line, each line corresponding to one of the cards upon which the data had originally been collected. In Table 1, which contained information from four separate cards, each record therefore occupied four lines of the file and needed to be converted onto one line. Table 2 contained information from two cards, and so initially occupied two lines per record until converted. Table 3 contained information from one card only, and so required no further conversion.
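The card-to-record conversion described above can be sketched as follows. This is a minimal illustration in Python, not the procedure NDAD actually used: the function name `join_cards` and the sample lines are hypothetical, and the sketch assumes that every record occupies the same number of consecutive 80-column lines.

```python
def join_cards(lines, cards_per_record, width=80):
    """Join each group of `cards_per_record` card lines into one record line."""
    if len(lines) % cards_per_record != 0:
        raise ValueError("line count is not a multiple of cards per record")
    records = []
    for start in range(0, len(lines), cards_per_record):
        group = lines[start:start + cards_per_record]
        # Pad each card image to the full card width so field offsets stay fixed.
        records.append("".join(card.rstrip("\n").ljust(width) for card in group))
    return records


# Hypothetical sample: two records of two cards each (the Table 2 layout).
sample = ["5118A", "5118B", "6131A", "6131B"]
joined = join_cards(sample, cards_per_record=2)
```

With `cards_per_record=1` the function returns each line padded to the card width, which corresponds to the Table 3 case where no joining was needed.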

In each table the key, or serial number of interview, was a concatenation of the first four fields, AREA, 2ndDigit, 3rdDigit, 4thDigit.
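As a rough illustration of how such a key might be built, the sketch below concatenates the four fields into one string. The function name `serial_number` is hypothetical, and the sketch assumes the four fields have already been read as separate values; it is not taken from NDAD's own processing.

```python
def serial_number(area, second_digit, third_digit, fourth_digit):
    """Build the interview serial number by concatenating the four key fields."""
    parts = (area, second_digit, third_digit, fourth_digit)
    # Convert to text and trim padding so numeric and string inputs behave alike.
    return "".join(str(part).strip() for part in parts)


# Illustrative values for AREA, 2ndDigit, 3rdDigit, 4thDigit.
key = serial_number(5, 1, 1, 9)
```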

Further details of the processing of each of the three data files are provided in the Table Catalogues (see Logical structure and schema).


Conditions of access and use

Access conditions

Allied materials

Related units of description

Documents relating to the Elderly and their Medicines dataset have been transferred to NDAD. See the Dataset Documentation Catalogue for further details. In particular, users are advised to consult the survey questionnaires (references CRDA/34/DD/2/1-4).

Associated material
Publications produced by the originating department
Publications produced by researchers working on the datasets

Structure

Logical structure and schema

The data is contained in three tables. Tables 1 and 3 are linked by a unique key, or serial number, given to each interview subject. This serial number is a concatenation of four fields: AREA, 2ndDigit, 3rdDigit and 4thDigit. Although Table 2 also contains this serial number, it does not link directly with the other two tables.

The dataset comprises the following table(s):

Table number  NDAD reference  Name                   Title
1             CRDA/34/DS/1/1  Elderly_People         Questions Relating to Elderly People
2             CRDA/34/DS/1/2  General_Practitioners  Questions Relating to General Practitioners
3             CRDA/34/DS/1/3  Medicines              Questions Relating to Medicines

How data was originally captured and validated
Constraints on the reliability of the data

Validation

Content validation

The data was checked for discrepancies and inconsistencies.

In the Elderly_People table (Table 1, reference CRDA/34/DS/1/1), the following discrepancies were found:

Field         Error
ConsultInter  There are two code values, 3 and 4, whose meaning is not known. These have been described as 'Unknown'.

It was noticed that in several questions the coding described in the original questionnaire differed slightly from the coding used in the data files. This occurred in several fields, for example M1YearofBirth, M52/53TimesinHosp (where two questions were merged together), M56/57AttOPOrDayH, M55/58 and IP27. Other fields contained coding which was not clear, owing to a lack of explanation in the documentation or to abbreviated codes, for example M65RegHelp3, M83ConfusionScore and IP3HomeVisits.

In the second part of this table (the fields described as belonging to Card 2), there are several fields which do not fit any questions in the existing questionnaires. Although field names and codes are available for these fields, it is not possible to discover what the data itself relates to, nor even what the codes signify (for example the fields called 'Move', 'Pain', 'Sleep', etc. all contain a code 'NHP not comp', which has no explanation in the documentation, and so cannot be fully understood). It is possible that these fields relate to the Health Profile (Dataset Documentation Catalogue reference CRDA/34/DD/2/2), but there is no documentation explaining this.

At the end of this table (in the section described as Card 6, from variable 29 onwards), there are several fields for which nothing is known. There is no documentation which provides field names, or details of coding, or a context for the data. These fields contain mostly blank records, and only a few coded responses.

In the General_Practitioners table (Table 2, reference CRDA/34/DS/1/2), there were no problems with the data itself. However, the meaning of the second half of the table, which consists of fields from Card 8, is not clear. There is no documentation available which adequately explains these fields, and although the details of coding and field names are available, it is not known to which questionnaire they refer. It is probable that the questions relate to general practitioners, and that they are perhaps interpretations from other questions in the interviews, rather than questions which were actually asked of the GP directly.

There are also some unclear abbreviations in the coding for several fields in this table, namely fields VP30ImpElderHelp2, VP30ImpElderHelp3, and VP30ImpElderHelp4.

In the Medicines table (Table 3, reference CRDA/34/DS/1/3), the following discrepancies were found:

Field            Error
GM17HowLongTake  The code for the response 'Uncertain' is described in the documentation as having a value '7'. However, in the data file, the value is '9'.
ISSMCClassCode   There are 33 occurrences of ISSMC code '090', for which there is no documentation, or corresponding BNF code. This has therefore been described as 'Unknown code'.

The field ISSMCClassCode, at the beginning of this table, has a length of 3 digits. It contains the Institute for Social Studies in Medical Care (ISSMC) classification codes. These codes are different from, but relate to, the standard drug codes published twice yearly by the British National Formulary (BNF). In the documentation accompanying the dataset, both the BNF and ISSMC codes have been listed, and so both sets of codes are provided for this field.

It is still not clear which volume of the BNF is being consulted for the data on the code sheets. Since these codes are updated every six months, the information is of limited value unless it is known which volume of the BNF they are from. This information does not appear in the documentation.

Several fields in this table were found to contain slightly different coding to the originals in the questionnaires. These fields included: PM4Produced/5Container, which consists of two questions merged together, MCategory, PM20TypeofGP, PM24HowOftenTake. There are also several fields, such as PM8QuantityonLabel, PM9Directions, PM11AvoidAlcohol, PM12DateonLabel, M59/60WhyStoppedTaking, PM31ImpTakeAdvised, for which the codes appear to have been calculated at a later date.

In this table, as in Table 1, there are several fields for which there is little or no documentation. These include the fields prefixed PH, which are questions asked of pharmacists about medicines taken by elderly people. The fields prefixed HD, which are questions asked of hospital doctors about medicines taken by elderly people, also have no documentation. No questionnaires are available which contain these questions.

There are other parts of the survey which are not clearly linked to any part of the data files transferred to NDAD. For example, it is not clear where the information on Consultants' Views and Practices is recorded. This was one of the questionnaires included in the survey, as is stated in the documentation. The questionnaire itself, however, and any relevant data are missing.

Similarly, the Helpers Questionnaire, which is mentioned in the documentation, seems to be missing. There is no separate data for questions answered by helpers rather than by the subjects themselves, and it appears that this data has been simply integrated into the main data files at the initial data-recording stage.

Tables 1 and 3 are related to each other via the key fields AREA, 2ndDigit, 3rdDigit and 4thDigit, which make up the serial number used to identify each person. Table 3 records details of medicines for each elderly person interviewed, and so each serial number in Table 3 refers back to a record in Table 1. There are, however, three anomalous rows in Table 3 in which the serial number has no corresponding record in Table 1. They are as follows:

Row #  AREA  2ndDigit  3rdDigit  4thDigit
1539   5     1         1         9
1540   5     1         1         9
1708   6     1         3         0

It is probable, in these cases, that the anomalies are due to a simple error in the field 4thDigit, as these records are similar in other respects to previous or subsequent records.
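A check of this kind, finding rows in Table 3 whose serial number has no match in Table 1, can be sketched as below. The function name `orphan_rows`, the contents of `elderly_keys`, and rows 1538 and 1709 are hypothetical values made up for illustration; only the three anomalous rows are taken from the list above.

```python
def orphan_rows(child_rows, parent_keys):
    """Return (row_number, key) for child rows whose key is absent from parent_keys."""
    return [(row_no, key) for row_no, key in child_rows if key not in parent_keys]


# Keys are (AREA, 2ndDigit, 3rdDigit, 4thDigit) tuples.
# Hypothetical stand-in for the serial numbers present in Table 1.
elderly_keys = {(5, 1, 1, 8), (6, 1, 3, 1)}

# Rows 1538 and 1709 are invented neighbours; the other three are the listed anomalies.
medicine_rows = [
    (1538, (5, 1, 1, 8)),
    (1539, (5, 1, 1, 9)),
    (1540, (5, 1, 1, 9)),
    (1708, (6, 1, 3, 0)),
    (1709, (6, 1, 3, 1)),
]

orphans = orphan_rows(medicine_rows, elderly_keys)
```

Holding the parent keys in a set keeps each lookup cheap, which matters little at this size (2554 medicine rows against 805 people) but scales to larger tables.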

Although Table 2 contains the same four key fields as Tables 1 and 3, it does not appear to have a direct relationship with either of the other two tables, and so no link has been created. The data in this table relates not to the elderly people, but to the doctors themselves.

Transformation validation

The number of records in each file of the original transferred dataset was compared with the number of records in each file created after conversion. These corresponded exactly. Similarly, the number of fields in each file of the original data was compared with the number of fields in each file of the transformed data, and the two were found to correspond exactly. A number of checks were carried out on the transformed data, and no discrepancies were found (except those already listed under Content validation, which are anomalies occurring in the original data). The original data files were opened in SPSS format in order to carry out some tests: simple queries were performed on a random selection of fields in the three tables of the original data files. These queries were repeated on the transformed data and in all cases gave results consistent with the original queries.
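The record-count comparison might look something like the following sketch. It is an assumption-laden illustration rather than NDAD's actual test: the helper `records_in_original` is hypothetical, and the original line counts are derived from the record counts stated in this catalogue entry (for example 805 records at 4 cards each), not measured from the files themselves.

```python
# Cards (80-column lines) per record in each original file, as described
# under Digital processing and conversion.
CARDS_PER_RECORD = {"Elderly_People": 4, "General_Practitioners": 2, "Medicines": 1}


def records_in_original(line_count, cards_per_record):
    """Number of logical records in a card-image file with a fixed layout."""
    records, remainder = divmod(line_count, cards_per_record)
    if remainder:
        raise ValueError("incomplete record: line count is not a multiple of cards per record")
    return records


# Assumed line counts, derived from the stated record counts (not measured).
original_lines = {"Elderly_People": 3220, "General_Practitioners": 802, "Medicines": 2554}
converted_records = {"Elderly_People": 805, "General_Practitioners": 401, "Medicines": 2554}

# Collect any table whose original and converted record counts disagree.
mismatches = {}
for name, cards in CARDS_PER_RECORD.items():
    before = records_in_original(original_lines[name], cards)
    after = converted_records[name]
    if before != after:
        mismatches[name] = (before, after)
```

An empty `mismatches` dictionary corresponds to the statement that the counts corresponded exactly.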

The document 'Full report of all variables' (Dataset Documentation Catalogue, reference CRDA/34/DD/4/1) contains, for most fields in Table 1, handwritten totals for occurrences of each code value in each field. As an extra check, these totals were compared with the values calculated in SPSS and, with a few exceptions, were found to match exactly. Where the two values do not match, this is thought likely to be the result of human error when adding up or writing down the totals on the document. The anomalies are as follows:

Field             Code value  Total in documentation  Total in SPSS
M78MaritalStatus  1           6                       66
M81AgreementFrom  1           604                     594
M81AgreementFrom  2           180                     182
M81AgreementFrom  x           21                      29
M44DrugRecord     4           0                       1
M44DrugRecord     5           111                     110
Patient-GPTie     2           29                      28
Patient-GPTie     3           16                      15
Patient-GPTie     5           17                      18
Patient-GPTie     8           75                      76
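
A comparison of documented totals against counts computed from the data can be sketched as below. The original check was carried out in SPSS; this Python version is only an illustration, the function name `compare_totals` is hypothetical, and the column contents are constructed so that the counts echo the two M44DrugRecord rows above rather than being read from the real data.

```python
from collections import Counter


def compare_totals(values, documented):
    """Return {code: (documented_total, computed_total)} for codes whose totals differ."""
    computed = Counter(values)
    return {
        code: (total, computed.get(code, 0))
        for code, total in documented.items()
        if computed.get(code, 0) != total
    }


# Hypothetical column contents: 110 occurrences of code '5' and 1 of code '4'.
column = ["5"] * 110 + ["4"] * 1
# Handwritten totals as listed for M44DrugRecord.
documented = {"4": 0, "5": 111}

differences = compare_totals(column, documented)
```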

The document 'Full report of all variables' (Dataset Documentation Catalogue, reference CRDA/34/DD/4/1) also contains handwritten totals for the data in Table 2 (Cards 7 and 8). It is not possible, however, to compare these figures with the results of our own tests, because the handwritten totals appear to add up to 805, whereas Table 2 contains only 401 records. It is not clear from the documentation exactly what the totals represent.

Data from Table 3 is not described in 'Full report of all variables', and so does not have handwritten totals for each field. The documents 'Codes for pharmacist about each drug', and 'Medicines reported by doctors' do contain handwritten numbers, but it is not clear what these represent. They do not appear to match totals found in the data.


Links to related datasets

Related datasets

There are no related datasets in this series.


Last updated 2003-04-15 09:08:17

 
 

NDAD v3.0