| Identity statement |
|---|
| Title | 1975-1976 |
|---|
| NDAD reference | CRDA/33/DS/1 |
|---|
| Dates of creation of datasets | [1975-1976] |
|---|
| Dates of contents of datasets | January 1975-September 1976 |
|---|
| Date of last input to datasets | 1976 |
|---|
| Date of last access to datasets | |
|---|
| Extent of datasets | 1 dataset: 30.3 KB; 3 tables comprising: 'January_1975' (101 records), 'September_1975' (159 records), 'Follow-up' (114 records). |
|---|
| ISAD(G) level of description | File |
|---|
| Administrative context |
|---|
| Aim and purpose | |
|---|
| Statement of responsibility | |
|---|
| Source of acquisition |
|---|
| Source of acquisition | The Children's Difficulties on Starting Infant School dataset was
transferred by the Essex Data Archive on a single CD, as files
containing fixed-length records, and was received by NDAD on
29 February 2000. |
|---|
| Nature and content |
|---|
| Scope and content | This dataset contains data from a study conducted by the Thomas
Coram Research Unit on behalf of the Department of Health and
Social Security. The study provides information relating to
difficulties experienced by children on starting infant school, and
factors affecting these difficulties. It includes a follow-up study
of some of the children, to assess the extent to which initial
difficulties persisted. The assessments were carried out by the
children's teachers, who selected appropriate responses from a set
list provided. In most cases the response was a
choice of one of four options (ranging from the child coping well with the
activity in question, to the child experiencing difficulty).
Several questions simply required a number within a defined range
(for example, 'Age in Months').
The main issues dealt with in the dataset concern the following
aspects of the child's behaviour: settling in, co-operation with
others, relationship with the teacher, concentration, use of play
materials, self-reliance, verbalisation, ability to follow
instructions, ability to cope with personal needs, sociability,
physical co-ordination, fine motor control, and general
difficulties. Further details of the administrative background of
this dataset are provided in the
Series Catalogue. |
|---|
| Digital processing and conversion | The dataset was transferred to NDAD as three converted data files,
three SPSS syntax files, and a WinZip file, 1514img.zip, which
contained digital versions of the paper documentation. The data
files were named e514a.dat, e514b.dat, and e514c.dat, and were
already in fixed-length record format and so required no further
conversion by NDAD. The data files contain nearly the same number
of fields: each has 20 fields containing usable data, plus either 9 or 10
extra fields containing unknown data or blanks. File e514a.dat therefore
contains 29 fields in total, e514b.dat contains 30, and
e514c.dat contains 29. In each file the key is a
concatenation of two fields, V1 and V2 (AREA and CHILD NUMBER).
The three SPSS syntax files were named e514a.sps, e514b.sps, and
e514c.sps, and contained information on length of fields, 'variable
labels' and 'value labels' for each field, and on 'missing values'
where appropriate. This information has been used to catalogue the
fields within each table. The 'variable labels' provided in SPSS
have been used as descriptions within the table catalogues.
Similarly, the 'value labels' have been used in the encoding section for
each field, and the 'missing values' in the missing value(s) section.
The names of the data files transferred by the Essex Data
Archive were not the original names used within the Government
Department, but relate to the Data Archive reference number. It was
decided, therefore, to use alternative table names that
reflect the content. The Essex file named e514a.dat
became 'January_1975', the file e514b.dat became 'September_1975',
and the file e514c.dat became 'Follow-up'. |
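
The file naming and key construction described above can be sketched as follows. This is a minimal illustration only: the file-to-table mapping is taken from this record, but the column positions given for V1 and V2 are hypothetical (the real field lengths are defined in the SPSS syntax files, which are not reproduced here), and the sample record is made up.

```python
# Mapping taken from the catalogue text: Essex file name -> NDAD table name.
TABLE_NAMES = {
    "e514a.dat": "January_1975",
    "e514b.dat": "September_1975",
    "e514c.dat": "Follow-up",
}

# HYPOTHETICAL layout: 1-based, inclusive column ranges for the two key fields.
# The actual widths are not stated in this catalogue record.
KEY_LAYOUT = {"V1": (1, 1), "V2": (2, 4)}


def slice_field(record: str, start: int, end: int) -> str:
    """Return columns start..end (1-based, inclusive) of a fixed-length record."""
    return record[start - 1:end]


def record_key(record: str) -> str:
    """Concatenate V1 (AREA) and V2 (CHILD NUMBER) to form the record key."""
    return "".join(slice_field(record, *KEY_LAYOUT[name]) for name in ("V1", "V2"))


# Made-up record for illustration; it is not taken from the dataset.
sample = "20170" + " " * 73
print(TABLE_NAMES["e514b.dat"], record_key(sample))
```

Because the records are fixed-length, each field is recovered purely by position, which is why every column in a record has to be accounted for by some field definition.
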
|---|
| Conditions of access and use |
|---|
| Access conditions | |
|---|
| Allied materials |
|---|
| Related units of description | |
|---|
| Associated material | |
|---|
| Publications produced by the originating department | |
|---|
| Publications produced by researchers working on the datasets | |
|---|
| Structure |
|---|
| Logical structure and schema | The dataset consists of three tables. Table 1 consists of records of 101 children, Table 2 has details
of 159 children, and Table 3 has records of 114 children. The key
fields 'V1' and 'V2' are the first two fields in each table. These
fields uniquely identify individual children, as is clear from
comparisons of Tables 2 and 3, which show that the same individuals
are being studied. The follow-up data (Table 3) consists only of
records of children mentioned previously in Table 2. Children
mentioned in Table 1 have no follow-up data. Not every child from
Table 2 is mentioned in Table 3, however (159 records in Table 2,
as compared with 114 records in Table 3). Therefore, although the
data in Tables 2 and 3 are considered to have a 1:1 relationship,
not every record in Table 2 will have a corresponding record in
Table 3. The dataset comprises the following table(s): |

| Table number | NDAD reference | Name | Title |
|---|---|---|---|
| 1 | CRDA/33/DS/1/1 | January_1975 | Initial data, January 1975 intake |
| 2 | CRDA/33/DS/1/2 | September_1975 | Initial data, September 1975 intake |
| 3 | CRDA/33/DS/1/3 | Follow-up | Follow-up data, September 1975 intake |

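
The relationship described above (every follow-up record matches a September 1975 record, but not the reverse) could be checked along the following lines. This is a sketch under stated assumptions: the function name `check_follow_up` is hypothetical, and the keys shown are made up rather than taken from the dataset.

```python
def check_follow_up(september_keys, follow_up_keys):
    """Summarise how follow-up keys relate to September 1975 keys."""
    sept, follow = set(september_keys), set(follow_up_keys)
    return {
        # Keys must be unique within each table for the relationship to be 1:1.
        "keys_unique": len(sept) == len(september_keys)
        and len(follow) == len(follow_up_keys),
        # Follow-up keys with no matching September record (expected: none).
        "orphans": sorted(follow - sept),
        # Children with initial data but no follow-up record.
        "without_follow_up": sorted(sept - follow),
    }


# Made-up keys (V1 + V2 concatenated) for illustration only.
september = ["1001", "1002", "1003", "2001"]
follow_up = ["1001", "1003"]
result = check_follow_up(september, follow_up)
print(result)
```

Applied to the real tables, and assuming every follow-up key is present in Table 2, the `without_follow_up` list would be expected to hold 159 - 114 = 45 keys.
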
|---|
| How data was originally captured and validated | |
|---|
| Constraints on the reliability of the data | There are no known constraints on the data. |
|---|
| Validation |
|---|
| Content validation | In each table, there are several fields that are not defined in the
original SPSS syntax file. For example the field Column5 contains
records with the value '0'. It is not listed in the original syntax
file, but the dataset documentation lists it as '0 -Missing data'.
Fields like this were not defined as fields in the original data,
and so did not have field names. However, because the data is in
fixed-length record format, every position in a record, including any
blanks or missing values, needs to be defined. It was therefore necessary
to define a new field for every gap between the existing defined fields. The field
names (eg Column18-21) indicate where in the data the gap occurred,
and a length and data type are specified for each. See Table Catalogue for
further descriptions of each field.
The following fields correspond with spaces in the data files
which are not explained in the syntax files. These fields are
explained in the dataset documentation (see Dataset Documentation Catalogue,
reference CRDA/33/DD/2):
Table 1
Column5, Column11-15, Column18-21, Column23, Column26,
Column28-34, Column36, Column39-42, Column47-78
Table 2
Column5, Column11-15, Column18-21, Column23, Column26,
Column28-34, Column36, Column39-43, Column48-49, Column50-78
Table 3
Column5, Column11-15, Column18-21, Column23, Column26,
Column28-34, Column36, Column39-44, Column49-78
Several checks were carried out on the data (see Transformation Validation). These checks revealed no
inconsistencies in the data, and only one very small inconsistency
in the documentation. This inconsistency concerns the code for
missing values, which was designated as '9'. The 'Notes for
disentangling the data' document (see Dataset Documentation Catalogue,
reference CRDA/33/DD/3/1) states that, in each table, every question
from Item1 to Item12 uses the missing value code. The SPSS
data dictionary file, however, states that only a few particular
questions require the missing value code. The data itself
corresponds most closely with the data dictionary file, and so this
is the version that was used when the missing values were
defined. |
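
The way the gap fields partition each record can be illustrated with the Table 1 list above. The sketch below derives the column ranges left over for the defined fields. It assumes a record length of 78 columns, which is only inferred from the last gap field ending at column 78; the function names are hypothetical.

```python
def parse_gap(name: str) -> tuple[int, int]:
    """Turn a gap field name such as 'Column18-21' or 'Column5' into (start, end)."""
    span = name.replace("Column", "").strip()
    start, _, end = span.partition("-")
    return int(start), int(end or start)


def occupied_ranges(gaps, record_length):
    """Return the 1-based, inclusive column ranges not covered by any gap field."""
    ranges, next_col = [], 1
    for start, end in sorted(parse_gap(g) for g in gaps):
        if start > next_col:
            ranges.append((next_col, start - 1))
        next_col = end + 1
    if next_col <= record_length:
        ranges.append((next_col, record_length))
    return ranges


# Gap fields for Table 1, as listed in this record.
TABLE_1_GAPS = [
    "Column5", "Column11-15", "Column18-21", "Column23", "Column26",
    "Column28-34", "Column36", "Column39-42", "Column47-78",
]

# ASSUMPTION: records are 78 columns long (inferred, not stated in the source).
print(len(TABLE_1_GAPS))
print(occupied_ranges(TABLE_1_GAPS, 78))
```

The nine gap fields, added to the 20 fields defined in the syntax file, give the 29 fields recorded for 'January_1975'.
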
|---|
| Transformation validation | Although no transformations were carried out on the data (it was
transferred to NDAD in a format which required no alteration),
several simple validation checks were carried out to ensure that
the data was unchanged. No discrepancies were found, although
several points concerning the original data arose during this
process, which are discussed in section 7.1 Content Validation. The
checks included counts of the number of records and the number of
fields in each table. Several queries were performed on a selection of fields,
checking aspects of the data such as area, gender, and the happiness of the
child at the start. The results were checked against the original data and no
inconsistencies were found. A further check totalled the number of
responses for each choice given in a
particular question. These totals were checked against the marginals
information provided in the documentation (see
Dataset Documentation Catalogue,
references CRDA/33/DD/2/1, CRDA/33/DD/2/2, and CRDA/33/DD/2/3), and
no discrepancies were found. |
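
A check of response totals against documented marginals, as described above, might look like the following. This is an illustrative sketch rather than the procedure NDAD actually used: the function names are hypothetical, and both the responses and the documented totals are made up.

```python
from collections import Counter


def marginals(values):
    """Count how many records carry each response code."""
    return dict(Counter(values))


def compare(observed, documented):
    """Return {code: (observed, documented)} for every code whose counts differ."""
    codes = set(observed) | set(documented)
    return {
        code: (observed.get(code, 0), documented.get(code, 0))
        for code in sorted(codes)
        if observed.get(code, 0) != documented.get(code, 0)
    }


# Made-up responses for one question; '9' stands for the missing value code.
responses = ["1", "1", "2", "4", "9", "2", "1"]
# Made-up documented totals for the same question.
documented = {"1": 3, "2": 2, "4": 1, "9": 1}

differences = compare(marginals(responses), documented)
print(differences)
```

An empty result means the observed totals agree with the documented marginals for every code.
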
|---|
| Links to related datasets |
|---|
| Related datasets | There are no related datasets in this series. |
|---|
Last updated 2003-04-10 16:56:40