The National Archives NDAD
 

Glossary

 

The NDAD glossary offers succinct explanations of the many technical terms used on the web site. It is intended to help users understand (1) the terminology of computing and (2) the terminology of archival practices used in The National Archives and central government record-keeping.

30-year rule

This was the rule relating to the standard period for the closure of public records under the Public Records Acts 1958 and 1967. Its effect was that public records were open to public access after 30 years unless steps were taken to open them earlier, or to close them for longer periods. Since the Freedom of Information Act came into force on 01 January 2005, the 30-year rule is now redundant.

Accelerated opening

This was the process whereby public records were made available for public access in advance of the usual 30-year closure period by means of a Lord Chancellor's Instrument. Since the Freedom of Information Act came into force on 01 January 2005, accelerated opening is now redundant.

Accession

A collection of records constituting a whole dataset or part of a dataset transferred to NDAD at any one time.

Accompanying documentation

Documentation supplied by the transferring government department with the dataset to assist NDAD specialists in understanding and documenting the system/dataset; or documentation of secondary importance to the dataset. Accompanying documentation is non-archive material and therefore is neither catalogued nor made available to the public. It may not be preserved permanently within NDAD. See also Dataset Documentation, finding aid.

Administrative history

An account of the origin, progress, development and work of an organisation, very often a Government Department or agency. See also finding aid.

Aggregation

One of the processes which may be carried out to prevent viewing of data which the transferring government department and/or TNA has designated as in any way sensitive and/or confidential. Summary forms of the data are produced which contain fewer records and/or less detail than the raw data. The data may, for instance, be averaged over a geographical area (such as a parish or census district) or over a time period. From 01 January 2005, this is done by invoking relevant FOI Exemptions.
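
The averaging step described above can be sketched as follows. This is only an illustration, not NDAD's actual procedure: the parish names, income figures and the function name aggregate_by_area are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw data: one (parish, household income) pair per record.
raw_records = [
    ("Ashby", 21000),
    ("Ashby", 25000),
    ("Barton", 18000),
    ("Barton", 22000),
    ("Barton", 20000),
]

def aggregate_by_area(records):
    """Replace individual records with one average value per area."""
    groups = defaultdict(list)
    for area, value in records:
        groups[area].append(value)
    return {area: mean(values) for area, values in groups.items()}

summary = aggregate_by_area(raw_records)
print(summary)  # two summary rows instead of five raw records
```

The summary holds fewer records and less detail than the raw data, which is the point of the process.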

American Standard Code for Information Interchange (ASCII)

An internationally-agreed character set, widely used in the computer industry. ASCII is a 7-bit code; the Standard ASCII Character Set consists of 128 decimal numbers ranging from zero through 127 assigned to letters, numbers, punctuation marks, and the most common special characters. A computer stores each character in a single byte, using the 7-bit code assigned to that character by the ASCII standard. See also binary encoded data, extended ASCII, EBCDIC.
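
As a small illustration of the character-to-number mapping, the following sketch looks up the ASCII codes for a sample string (the string itself is an arbitrary choice) and confirms that each code fits in 7 bits and is stored in one byte.

```python
# Illustrative only: look up the ASCII code of each character in a string.
text = "NDAD, 2005"
codes = [ord(ch) for ch in text]

# Every standard ASCII code lies in the range 0 to 127 (7 bits).
all_seven_bit = all(0 <= code <= 127 for code in codes)

# Encoding as ASCII stores one byte per character.
encoded = text.encode("ascii")
decoded = encoded.decode("ascii")

print(codes)
print(len(text), len(encoded), decoded)
```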

Anonymisation

Process carried out to prevent viewing of any data which the transferring government department and/or TNA has designated as in any way sensitive and/or confidential. Anonymisation is carried out either by blocking display of certain fields (such as names, addresses or telephone numbers), or by producing summary forms of data. A summary form will contain fewer records and less detail than the raw data. The data may have been averaged over a small geographical area (such as a parish or census district) or over a small time period. From 01 January 2005, this is done by invoking relevant FOI Exemptions. See also redaction.

Application software

Software to perform functions for the users (such as Word Processing or a Payroll System) as distinct from systems software (such as the Operating System).

Binary encoded data

Numeric data held in binary format (as opposed to, for example, ASCII format). See also ASCII, extended ASCII, EBCDIC.

Binary Large Object (BLOB)

Binary large object. A term used in more modern databases which are able to store information such as images, sound or video as well as simple values which consist of numbers or short character strings. The data in the image, sound or video clip is referred to as a BLOB.

Bit

Binary digit (bit): the smallest unit of data recognisable by a computer. It is either a 0 or a 1. Eight bits equal one byte (or one character).

Boolean searching

A method of searching text based on boolean operators: an "and" operator between two words or other values (for example, "pear AND apple") means one is searching for documents containing both of the words or values. An "or" operator between two words or other values (eg "pear OR apple") means one is searching for documents containing either of the words.
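
A minimal sketch of the two operators, assuming a tiny hypothetical collection of documents and a simple whole-word match. Real search systems are considerably more elaborate.

```python
# Hypothetical document collection used only for illustration.
documents = {
    "doc1": "apple and pear tart",
    "doc2": "apple crumble",
    "doc3": "pear compote",
    "doc4": "plum jam",
}

def words(text):
    """Split a document into a set of lower-case words."""
    return set(text.lower().split())

def search_and(term_a, term_b):
    """Documents containing both terms."""
    return sorted(
        name for name, text in documents.items()
        if term_a in words(text) and term_b in words(text)
    )

def search_or(term_a, term_b):
    """Documents containing either term (or both)."""
    return sorted(
        name for name, text in documents.items()
        if term_a in words(text) or term_b in words(text)
    )

and_hits = search_and("pear", "apple")
or_hits = search_or("pear", "apple")
print(and_hits, or_hits)
```

The AND search narrows the result to documents holding both words, while the OR search widens it to documents holding at least one.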

Byte

A computing storage unit. It consists of eight bits; a byte is the amount of storage space required to hold one alphanumeric character such as the letter A, the numerical digit 9, or a single standard punctuation mark eg a comma.
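
To make the relationship between bits, bytes and characters concrete, this short sketch shows the eight bits that make up the byte holding the letter A (assuming ASCII encoding).

```python
# The letter A stored as a single byte, shown as its eight bits.
letter = "A"
byte_value = letter.encode("ascii")[0]   # numeric value of the byte
bit_pattern = format(byte_value, "08b")  # the same value written as 8 bits

print(byte_value, bit_pattern, len(bit_pattern))
```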

Catalogue descriptions

The description of record series, datasets and related documentation in Finding Aids prepared by NDAD.

CD-ROM

Compact Disk, Read-Only Memory. A form of data storage that uses laser optics for reading data. Write-once CD-ROM is one of the formats on which data can be supplied to NDAD users.

Coded data

Data which are held in code form, where short sequences of letters or numbers are used to represent information in a database. An example is the use of single-letter codes to represent different types or classes of vehicle. NDAD needs explanations of the codes either in a separate computer file or on paper.

COM/Fiche Output to fiche (COM)

COM is the direct recording of computer output on to microfilm or microfiche.

Checksum

A computed value (of a file). If re-calculated after the data have, for instance, been transferred from one computer to another, and the two values are the same, there is a high degree of confidence that the data were transferred correctly.
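
The comparison described above might look like the sketch below. SHA-256 is used here purely as an example algorithm; the glossary does not say which checksum NDAD uses, and the sample data are invented.

```python
import hashlib

def checksum(data: bytes) -> str:
    """Compute a checksum for a block of data (SHA-256 chosen as an example)."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical file contents before transfer.
original = b"year,region,count\n1998,North,42\n"

# One copy arrives intact, another is corrupted in transit.
received_ok = bytes(original)
received_bad = original.replace(b"42", b"43")

transfer_ok = checksum(original) == checksum(received_ok)
transfer_bad = checksum(original) == checksum(received_bad)
print(transfer_ok, transfer_bad)
```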

Client Manager

A Client Manager is employed by The National Archives to liaise with central government departments concerning all matters to do with record-keeping and archives. A Client Manager supervises and guides the records work done by several government departments, and ensures that they keep documents worthy of permanent preservation and dispose of all other documents once they are no longer of current use. The Client Manager role is outward-looking and involves engaging with government departments to advise them on their current electronic and paper records management.

Command Line Interface (CLI)

A form of user interface where the user types in commands, usually one line at a time. In many modern computing applications, command line interfaces have generally been superseded by graphical user interfaces (GUIs).

Comma Separated Variables (CSV) file

A format of file used to facilitate transfer of data between applications. The data are in the form of a table, with each field separated by a comma; text may be enclosed in double quotes.
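
A brief sketch of writing and reading such a file with Python's standard csv module. The field names and values are hypothetical. Note how the value containing a comma is enclosed in double quotes so that it is not split into two fields.

```python
import csv
import io

# Hypothetical table: a header row followed by two records.
rows = [
    ["Member_Name", "Town"],
    ["Smith, Jane", "Kew"],
    ["Patel", "Leeds"],
]

# Write the table as CSV text (an in-memory buffer stands in for a file).
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()

# Read it back into a table.
parsed = list(csv.reader(io.StringIO(csv_text)))

print(csv_text)
print(parsed == rows)
```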

Computer Readable Data Archive (CRDA)

A working title for the project which became the UK National Digital Archive (Datasets). Much of the early publicity material produced both by the Public Record Office (now The National Archives) and the University of London refers to the project by this name. See also NDAD.

Database

A collection of information, usually covering subject areas which are related in some way, structured to enable effective retrieval of the information. Databases are organised into a hierarchy of files, records, and fields. A file is a group of related information, such as names and addresses of members of a sports club. All the information about a particular member (name, address, etc.) is stored in a record. A record is a collection of related data items called fields (an example of a field would be a member's name).

Data dictionary

Details (often held on-line as part of, for instance, a database management system) of components of a system, including file and field names, characteristics, relationships and structure.

Data model/structure

A data model is a graphical representation of the structure and relationships of the items of data (files, fields etc) within a system.

Dataset

A computer file or related set of computer files, and where applicable associated metadata (e.g. data dictionary) in digital form, which are organised under a single descriptive title and are capable of being described as a coherent unit in the Archive's finding aids. A dataset may comprise one or more accessions and may form part of a series of related datasets transferred to NDAD over time. As an example, in the case of an annual survey which is transferred to NDAD annually, a dataset would comprise all the data for one year's survey. For the purposes of transfer to NDAD, datasets are distinguished from digital documents which are provided by Departments to assist in interpreting, or providing context to, a dataset.

Dataset Catalogue

A finding aid listing the contents of an assembly of documents or electronically held information usually including a brief list of the organisation and functions of the organising body or individual. See also finding aid.

Dataset Documentation

That part of the Archive which consists of documents supplied with datasets (e.g. printouts, survey forms, user manuals, reports produced using the data), which are deemed to be worthy of permanent preservation as archives. These documents could originate on paper and/or in electronic form. See also Accompanying Documentation, finding aid.

Departmental Record Officer (DRO)

The officer in a government department charged with responsibility for the care of all public records of that department while they are in its custody, and for the transfer to The National Archives of those selected for permanent retention. See also Client Manager, the TNA representative who assists them.

Digital documents

Digital documents are documents that are stored on a computer. The documents may have been created on a computer, as with word-processing files and spreadsheets, or they may have been converted into digital documents by means of document imaging. Digital documents are also referred to (somewhat inaccurately) as electronic documents. The term 'Electronic documents', however, is widely used by TNA and others, and is now the preferred term for describing them in Dataset Documentation Catalogues. Within the NDAD Transfer process, references to digital documents cover material (which could just as easily be in paper form) sent with the dataset to assist in interpreting, or providing context to, the dataset; i.e. the term excludes the dataset itself and any accompanying metadata in digital form which was an integral part of the original system (such as a data dictionary). Digital documents would not have been part of (i.e. held as data within) the original computer system, although they could include the system documentation (e.g. system specification, user manual etc). [In the case of a document management system, digital documents do constitute the data within the system, but such systems are not normally transferred to NDAD; they fall within the remit of The National Archives' electronic records management programme].

Disk Operating System (DOS)

The part of the Operating System which deals with access to and management of files and programs stored on disk. Also the name of the operating system used on IBM-compatible PCs; DOS translates the user's commands, allows application programs to interact with the computer's hardware, and supplies the file management system for disk input and output. In the past (and before the invention of the PC), almost every computer supplier produced an operating system called DOS, or some derivation of that name. Examples include IBM's DOS (which ran on IBM 360 computers in the 1960s), Data General's RDOS (which ran on their Nova minicomputers) and Digital's DOS-11 and DOS-8, designed for their PDP-11 and PDP-8 computers. The different DOS systems bore no relation to each other.

Electronic documents

See Digital documents.

Encryption

Encryption is the manipulation of data in order to prevent any but the intended recipient from reading that data. The inverse of encryption is decryption.

Extended ASCII

A somewhat imprecise term, also referred to as 8-bit ASCII. The Extended ASCII Character Set consists of 128 decimal numbers and ranges from 128 through 255 representing additional (ie over standard ASCII) special, mathematical, graphic, and foreign characters. Different computer suppliers have at different times used the phrase 'extended ASCII' to denote different, and incompatible, extended character sets. In the NDAD, the only 8-bit character set used is the International Standard ISO Latin-1 character set. This is supported by all web browsers. See also binary encoded data, ASCII, EBCDIC.
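
The sketch below encodes a short sample string in ISO Latin-1 and lists the byte values that fall in the extended range 128 to 255. The sample text is an arbitrary choice, written with escape sequences so the example does not depend on how this page itself is encoded.

```python
# "cafe" with an acute accent on the e, then a pound sign and the digit 5.
text = "caf\u00e9 \u00a35"

# ISO Latin-1 stores each of these characters in a single byte.
latin1_bytes = text.encode("latin-1")

# Byte values above 127 lie outside standard 7-bit ASCII.
high_codes = [b for b in latin1_bytes if b >= 128]

# The same text cannot be stored as standard ASCII.
try:
    text.encode("ascii")
    fits_in_ascii = True
except UnicodeEncodeError:
    fits_in_ascii = False

print(len(latin1_bytes), high_codes, fits_in_ascii)
```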

Extended Binary Coded Decimal Interchange Code (EBCDIC)

A character-to-number encoding invented by IBM and used primarily by their large computer systems, eg IBM mainframes. It was also adopted by some other manufacturers, such as ICL, at various times in their history. EBCDIC never became a formal international or national standard, and suffers from the problem that many variants of the character set were in use at different times and in different countries. See also binary encoded data, extended ASCII, ASCII.
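
To show how differently the same text is stored, this sketch encodes a sample string in both ASCII and one EBCDIC variant. Python's cp037 codec (EBCDIC as used in the US and Canada) is chosen here only as an example; as noted above, many other variants existed.

```python
# The same six characters in ASCII and in one EBCDIC variant (code page 037).
text = "NDAD 1"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")

print(ascii_bytes.hex())   # ASCII byte values in hexadecimal
print(ebcdic_bytes.hex())  # EBCDIC byte values in hexadecimal

# Decoding with the matching code page recovers the original text.
round_trip = ebcdic_bytes.decode("cp037")
```

Data received in EBCDIC therefore has to be decoded with the correct variant before it can be read on an ASCII-based system.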

Extended closure

The extension of the closure period of a public record beyond 30 years, in accordance with a Lord Chancellor's Instrument. Since the Freedom of Information Act came into force on 01 January 2005, the extended closure process no longer exists.

Field

Holds a single data item of a specified type. It is part of a record. A field has a field name which identifies the field and should give some idea of the data it will hold eg a field containing the name of a member of a sports club may be called Member_Name. See also database.

File

In computer terms, a file is a collection of information treated as a unit by the computer. A file will usually contain a related collection of records (eg customer file would contain information on all your customers. Each record, which would hold data about a particular customer, would consist of fields for individual data items, such as customer name, customer number, customer address). See also database.

In archival terms, a file is a level of description used by NDAD in cataloguing a dataset, or an item of dataset documentation. See also ISAD(G).

File Transfer Protocol (FTP)

A protocol which allows a user on one computer to access and transfer files to and from another computer over a network. FTP is the specific standard for file transmission between computers using a TCP connection. Programs which carry out the transfer are called FTP programs.

Finding aid

Information (including but not limited to guides, catalogues and indexes) about the contents, context and structure of archives in conjunction with the means of retrieving this information. The elements of description in many of NDAD's finding aids conform to the International Standard for Archival Description (General) or ISAD(G), published by the International Council on Archives. See also Administrative History, Dataset Documentation, Dataset Catalogue, ISAD(G).

Floating-point

A form of notation and data storage in which numbers are expressed as a fractional value together with an integer exponent, eg 123.45 would be expressed as 1.2345 x 10^2. Floating-point numbers are used to store numeric values which cannot be represented as Integers (whole numbers).
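
The value 123.45 from the entry above can be split into its fractional part and exponent as in this sketch. The first form uses a power of ten, as in the entry. The second shows the power-of-two form that most computers use internally, which is an added detail not stated in the entry.

```python
import math

value = 123.45

# Decimal scientific notation: a fractional value and a power of ten.
scientific = f"{value:.4e}"

# Binary form: value == mantissa * 2 ** exponent, with 0.5 <= mantissa < 1.
mantissa, exponent = math.frexp(value)
rebuilt = math.ldexp(mantissa, exponent)

print(scientific)
print(mantissa, exponent, rebuilt == value)
```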

Fonds

An archival term referring to the whole of the documents, regardless of form or medium, organically created and/or accumulated and used by a particular person, family, or corporate body in the course of that creator's activities and functions.

Freedom of Information Act

After 01 January 2005, all datasets transferred to NDAD are assumed to be open to the public when transferred. Closure or redaction of a dataset will only take place if relevant FOI Exemptions are invoked. This replaces the 30 year closure rule, and the system of accelerated opening and/or extended closure by means of Lord Chancellor's Instruments.

Freedom of Information (FOI) Exemptions

See Freedom of Information Act.

Gigabyte

Giga signifies one thousand million. A gigabyte is properly 2 to the 30th power = 1,073,741,824 bytes. However, when used by most computer disk and tape suppliers it denotes the slightly smaller value of 10^9 bytes = 1,000,000,000 bytes.
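
The gap between the two meanings can be worked out directly, as in this short calculation (the variable names are illustrative).

```python
# The two values described above.
BINARY_GIGABYTE = 2 ** 30    # 1,073,741,824 bytes
DECIMAL_GIGABYTE = 10 ** 9   # 1,000,000,000 bytes

difference = BINARY_GIGABYTE - DECIMAL_GIGABYTE
percent_larger = round(100 * difference / DECIMAL_GIGABYTE, 1)

print(f"{difference:,} bytes, about {percent_larger}% larger")
```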

Graphical User Interface (GUI)

A GUI allows the user to point at a list of command options or click on an icon instead of typing a character-based command. An example of a GUI is Microsoft Windows.

Graphics Interchange Format (GIF)

One of a number of standard formats for display of images on the World Wide Web. This format is used as a standard for exchanging graphical raster-based images between computers. GIF can handle up to 256 simultaneous colours, and uses a data compression mechanism to reduce the file size, thus saving download time. The compression mechanism GIF employs (Lempel-Ziv-Welch, or LZW) is protected by a patent held by Unisys. For this reason, use of GIF is often deprecated in open systems in favour of other image encoding schemes such as PNG which are not subject to patent protection or proprietary licences.

Hierarchical Storage Management (HSM)

A means of storing data in a computer system in which frequently used data is stored on more expensive, faster disks and less frequently used data migrates to slower but cheaper forms of storage such as tape.

Hypertext

Text that contains links to other documents. HTML documents are examples of hypertext.

HyperText Markup Language (HTML)

The set of markup symbols or codes inserted in a file intended for display on a World Wide Web browser. The markup tells the Web browser how to display a Web page's words and images for the user. HTML is the usual language for documents that are 'published' on the Web. HTML is an application of SGML.

HyperText Transfer Protocol (HTTP)

The protocol describing how a web browser requests documents from a web server and how documents (in any format) are to be transmitted from a Web server to a web browser.

Image scan

Scanned image produced by NDAD, usually derived from a paper document. A scanned (or digitised) image is only a picture, and although it contains characters, they cannot be recognised by a computer. Conversely, the text in OCR documents can be recognised by a computer, and can be read into a word processing package. See also OCR.

International Standard of Archival Description (General) (ISAD(G))

An agreed set of general rules for archival description. These rules ensure the creation of consistent, appropriate, and self explanatory descriptions; and facilitate the retrieval and exchange of information about archival material. ISAD(G) comprises 26 descriptive elements, arranged in a hierarchical structure. It is now common practice among archivists to use subsets of these elements (rather than all 26 of them) in preparing catalogues.

Internet Protocol (IP) address

The standard way of identifying a computer that is connected to the Internet, similar to the way a telephone number identifies a telephone on a telephone network. It may be expressed either as a 4-part number (e.g., 123.124.12.13) or in words: ndad.ulcc.ac.uk.

JPEG

A graphic image format (for still picture compression) defined by the Joint Photographic Experts Group and commonly supported by Web browsers. JPEG is designed to provide a compact means of storing photographic images. It is not as well suited to representing graphical images (i.e. those drawn by hand) or images with a very small number of colours.

Key field

The field within a record which uniquely identifies a record eg a national insurance number could be the unique key for a file of social security claimants. Also referred to as primary key.

Kilobyte (KB)

A unit of measure for computer memory or storage equivalent to approximately one thousand (1,024) bytes.

Local Area Network (LAN)

A data network (ie a network connecting a number of computers together allowing them to share information and/or peripheral devices) covering a restricted area (usually a few square miles or less).

Logical field

A field that can have only two values - true or false (although these may be held as 1,0 and represented as Yes, No).

Logical operator

An operator, such as AND, that combines logical values (true, false) to produce a logical result.
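
As an illustration, the following sketch builds the full truth table for AND and OR over every combination of two logical values.

```python
from itertools import product

# Each row holds: first value, second value, result of AND, result of OR.
truth_table = [
    (a, b, a and b, a or b)
    for a, b in product([False, True], repeat=2)
]

for row in truth_table:
    print(row)
```

AND gives true only when both inputs are true, while OR gives true when at least one input is true.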

Lord Chancellor's Instrument

A legal instrument whereby the Lord Chancellor (exercising powers under the Public Records Acts 1958 and 1967) reduces or extends the statutory 30-year closure period of a public record. Since the Freedom of Information Act came into force on 01 January 2005, this is now redundant.

Mark sense forms/reader

Optical mark recognition: method of data capture used where a user can choose from a finite and predictable set of responses, for instance multiple choice examination questions, by making a mark in a particular position on the form. A current application is the National Lottery tickets. An OM (Optical Mark) reader reads and interprets the marks on the page.

Megabyte (MB)

A unit of measure for computer memory or storage equivalent to approximately one million (1,048,576) bytes.

Multipurpose Internet Mail Extensions (MIME)

An extension to Internet email which provides the ability to transfer non-textual data, such as graphics, audio, video and fax ie an encoding scheme for allowing non-ASCII data to be included in an e-mail message. It is also used by web browsers and web servers to describe data being transferred between them - this is how a browser knows to display one file as an image, and another as text, sound or video.
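
The labels that tell a browser how to treat a file can be seen with Python's standard mimetypes module, which guesses a MIME type from a file name. The file names below are hypothetical.

```python
import mimetypes

# Hypothetical file names; only the extension matters for the guess.
filenames = ["survey.html", "map.gif", "notes.txt"]

# guess_type returns a (type, encoding) pair; keep only the type.
content_types = {name: mimetypes.guess_type(name)[0] for name in filenames}

for name, content_type in content_types.items():
    print(name, "->", content_type)
```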

Multimedia

The presentation of information by a computer system using a combination of still graphics, animation, sound and text.

National Archives, The (TNA)

The National Archives, which covers England, Wales and the United Kingdom, was formed in April 2003 by bringing together the Public Record Office and the Historical Manuscripts Commission. It is responsible for looking after the records of central government and the courts of law, and making sure everyone can look at them.

National Digital Archive of Datasets (NDAD)

The National Digital Archive of Datasets - a TNA-sponsored initiative to conserve and where possible provide access to many computer datasets from central government departments and agencies. The data will remain in the legal custody of The National Archives, but will be managed by ULCC and the University of London Library (ULL). See also CRDA.

NDAD reference

As part of the finding aid, NDAD allocates a unique alphanumeric code to every single dataset, and to every single dataset document. Series Catalogues and Administrative Histories are also identified by unique NDAD references. Do not confuse this element with the TNA Series Number.

Operating system

The set of programs which tell the machine how to perform actions, enabling it to run applications and to interface with peripherals and users. Examples of operating systems are DOS, Windows, UNIX, VMS, and VME.

Optical Character Recognition (OCR)

A process which takes an image and turns it into editable text. OCR scanning differs from image scanning in that although both accept a printed document as input, OCR identifies each character and creates an output file which can be used by, for instance, a word processing package. Image scanning results only in a 'picture' of the document.

Pen plots [or] Plotters

Graphical output of data items using computer-controlled pens. Output of this form is typically limited to line drawings, graphs, maps etc.

Piece number

The reference assigned by The National Archives to a document within a TNA Series.

Portable Network Graphics (PNG)

A file format for compressed graphic images. It provides a number of improvements over the GIF format and, unlike GIF, is patent-free.

PostScript

A page description language used to pass instructions to printers for setting up the page to be printed ie describing to the printer the appearance of the whole page, including graphics.

Public Record Office (PRO)

The former name of The National Archives.

Protocol

Agreed-upon standard. A communications protocol is a set of rules describing the transfer of data between devices or programs.

Punched card

A cardboard rectangle used for entering data into a computer or other machine and for storing data. A standard Punched Card held rows of 80 characters of data coded as a series of holes punched in columns on the card. These holes were read by a card reader which sensed which holes had been punched out in each column and translated the column dots into machine-readable character codes.

Random Access Memory (RAM)

Refers to types of memory devices whereby any location in memory can be found, on average, as quickly as any other location. Computer internal memories and disk memories are random access memories.

Record

(1) In archive terms, a record is a document, regardless of form or medium, organically created and/or accumulated and used by a particular person, family, or corporate body in the course of that creator's activities and functions.

(2) In computer terminology, a record is a collection of data items (fields), for example the various items of information about a customer. Multiple computer records can be contained in a computer file.

Redaction

Process which may be carried out to prevent viewing of data/parts of documentation which the transferring Government Department and/or TNA has designated as in any way sensitive and/or confidential. It includes anonymisation, i.e. blocking the display of certain fields such as names, addresses or telephone numbers. From 01 January 2005, this is done by invoking relevant FOI Exemptions.

Registered file

A collection of documents, relating to a particular subject or having some other common characteristic, which are created or stored by a government department in the course of its business and are therefore public records. The file is controlled by the Registry responsible for the papers produced by the department; the DRO has overall responsibility for registered files from the time they are created through review until their destruction or transfer to TNA.

Relational database

A database where the data are structured as a number of tables and the database management system allows the tables to be linked together for data to be searched, displayed etc. A relational database management system allows there to be a number of views of the data.
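
A minimal sketch of linked tables and a view, using Python's built-in sqlite3 module with an in-memory database. The table names, column names and values are hypothetical, loosely following the sports club example used elsewhere in this glossary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (
        member_id   INTEGER PRIMARY KEY,
        member_name TEXT
    );
    CREATE TABLE payment (
        payment_id INTEGER PRIMARY KEY,
        member_id  INTEGER REFERENCES member(member_id),
        amount     REAL
    );
    INSERT INTO member VALUES (1, 'Jones'), (2, 'Khan');
    INSERT INTO payment VALUES (10, 1, 25.0), (11, 1, 30.0), (12, 2, 25.0);

    -- A view: one way of looking at the data held in the two tables.
    CREATE VIEW member_totals AS
        SELECT m.member_name, SUM(p.amount) AS total
        FROM member m
        JOIN payment p ON p.member_id = m.member_id
        GROUP BY m.member_id, m.member_name;
""")

totals = conn.execute(
    "SELECT member_name, total FROM member_totals ORDER BY member_name"
).fetchall()
print(totals)
conn.close()
```

The two tables are linked through the shared member_id column, and the view presents a summary without changing the stored data.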

Rich Text Format (RTF)

Format of a file used to store data from a word processor, including information on fonts, styles etc. It is most often used as a platform-independent format for sharing documents among different word processing packages, though word processors differ in their levels of support for the RTF standard.

Scanned images

The output from a scanner which is a device for capturing graphic images from a page and converting the data into a binary code. The image can then be displayed, edited with a painting program, or pasted into another document. Unlike an OCR'd document it cannot be read as text into a word processor.

Server

A provider of resources. The term Server is used to refer both to a computer program that provides services to other computer programs in the same or other computers and to the computer that a server program runs on (although it may contain a number of server and client programs). Specific to the Web, a Web server is the computer program (housed in a computer) that serves requested HTML pages or files.

Software

A general term for all types of programs which can be run on a computer system.

Source code

The code written in a high-level computer language by programmers or code generators, to be subsequently translated by the computer.

Standard Generalized Markup Language (SGML)

An international standard for describing the markup of structured documents. The basic idea behind SGML is that information can be made independent of particular hardware and software, but more particularly that markup allows one to describe the structure of a document (such as where chapter headings, footnotes, etc occur) without saying exactly how that structure should be represented on the printed page or screen.

Table

A file in a relational database is often referred to as a table (because the data is held in the form of a table - records being the rows and fields the columns).

Tagged Image File Format (TIFF)

An image file. TIFF provides a way of storing and exchanging digital image data.

Terminal emulator

A program that allows one computer workstation to act as a terminal for accessing a remote computer over a network.

Thesaurus

A dictionary arranged by meaning rather than spelling and including pointers to wider, narrower and related terms.

TNA

See National Archives.

TNA series number

A reference assigned by The National Archives to a series of records. Series numbers are assigned by TNA to datasets and related documentation which are transferred to NDAD. TNA Series numbers are included in NDAD catalogues but are not used by NDAD for reference purposes; NDAD has a distinct system of references for datasets and related documentation in its holdings (see NDAD reference).

Transfer

This term is used to cover the various steps involved in transferring a dataset to NDAD.

Transfer form

A brief inventory and description of materials (data, documents, etc.) being transferred in a batch to NDAD. This would usually include the title of the material and inclusive dates of materials held therein.

Twos-complement

A method of storing integers (whole numbers) in a computer system. Every computer in common use today uses twos-complement to store whole numbers. The name refers to the means by which negative numbers (such as -3) are distinguished from positive numbers.
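
The sketch below shows how -3 and 3 would be stored in 8 bits. The width of 8 bits and the function names are assumptions made for the example; real systems commonly use 16, 32 or 64 bits.

```python
def to_twos_complement(value, bits=8):
    """Return the bit pattern used to store a whole number in the given width."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} does not fit in {bits} bits")
    # Masking maps a negative value to 2**bits + value.
    return format(value & ((1 << bits) - 1), f"0{bits}b")

def from_twos_complement(pattern):
    """Recover the whole number from its bit pattern."""
    raw = int(pattern, 2)
    # A leading 1 marks a negative number.
    return raw - (1 << len(pattern)) if pattern[0] == "1" else raw

positive = to_twos_complement(3)
negative = to_twos_complement(-3)
print(positive, negative, from_twos_complement(negative))
```

The leading bit distinguishes the two cases: it is 0 for the positive number and 1 for the negative one.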

Uniform Resource Locator (URL)

The unique address of a single HTML page or file on the Web. The address includes a unique Internet server address and a hierarchical description of a file location on the server. The address of the file is in a format that can be interpreted by a Web server, which then retrieves the file.
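
The two parts of the address mentioned above can be separated with Python's standard urllib.parse module. The URL used here is a made-up example, not a real NDAD address.

```python
from urllib.parse import urlparse

# Hypothetical URL used only for illustration.
url = "http://www.example.org/catalogue/series/index.html"
parts = urlparse(url)

scheme = parts.scheme    # the protocol used to fetch the file
server = parts.netloc    # the Internet server address
path_segments = [segment for segment in parts.path.split("/") if segment]

print(scheme, server, path_segments)
```

The path segments form the hierarchical description of where the file sits on the server.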

University of London Computer Centre (ULCC)

The University of London Computer Centre has been providing IT services for nearly 30 years. ULCC was established as a computing centre for the University of London and now provides national networking and data archiving services, including the National Digital Archive of Datasets (NDAD).

UNIX

An operating system typically used on workstations and computers. Some Internet servers run on UNIX systems.

User interface

Term used to describe the part of a computer system that enables humans to issue commands to the computer, and see the results. The two most common types of user interface are the command line interface (CLI) and the graphical user interface (GUI), which use a screen and a keyboard and/or mouse. However, other user interfaces exist for specialised tasks, or for people with disabilities.

Web Server

A networked program that responds to requests from web browsers for documents available via the World Wide Web.

Wide Area Network (WAN)

A network, usually constructed with serial lines, which covers a large geographic area.

Write Once, Read Many (WORM)

This term refers to a type of computer data storage which can be written to only once, but which can then be read many times. Optical disks (also known at some times as laser discs) are an example of WORM storage.

eXtensible Markup Language (XML)

XML is the eXtensible Markup Language, endorsed and developed by the World-Wide Web Consortium (W3C). Like SGML, XML is a meta-markup language, defining rules and structures for use in creating new markup applications.

 
 

NDAD v3.0

 
 