A-Level Computer Science · Databases · 16 min read · Updated 2026-05-06

Databases — A-Level Computer Science Study Guide

For: A-Level Computer Science candidates sitting A-Level Computer Science.

Covers: Relational databases and tables, primary/foreign keys, 1NF/2NF/3NF normalisation, core SQL commands, data dictionaries, ER diagrams, transactions and concurrency rules for the A-Level Computer Science exam.

You should already know: Basic programming concepts; one of Python / Java / VB.

A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the A-Level Computer Science style for educational use. They are not reproductions of past Cambridge International examination papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official Cambridge mark schemes for grading conventions.

1. What Is Databases?

A database is an organised, structured collection of data stored electronically, designed to enable efficient retrieval, modification, and management of information at scale. Unlike flat file stores (e.g. spreadsheets), databases eliminate redundant data, enforce consistency rules, and support access for hundreds of concurrent users without data corruption. This topic accounts for ~15% of marks on A-Level Computer Science Paper 4 (Advanced Theory), with question types ranging from short-answer normalisation tasks to 8-mark SQL query writing and ER diagram interpretation. Examiners regularly reward application of rules to real-world scenarios, so memorisation alone is not sufficient to score full marks.

2. Relational databases and tables

The relational database model, developed by Edgar Codd in 1970, is the standard for all modern enterprise databases used in the A-Level Computer Science syllabus. It organises data into relations (commonly called tables) linked by defined relationships. Key terms you must define correctly in exams:

Table (Relation): A collection of related entries for a single type of entity (e.g. students, courses, library books), with no duplicate rows
Row (Tuple): A single entry in a table representing one instance of the entity
Column (Attribute): A single data field that stores the same type of information for all rows in a table
Domain: The set of allowed values for a given attribute (e.g. the YearGroup domain for a secondary school is 7 to 13)

Worked example

A small school’s Students table stores data for all enrolled learners:

StudentID	FirstName	LastName	DOB	YearGroup
101	Amara	Khan	2008-03-12	12
102	Ben	Carter	2007-09-21	13
103	Chloe	Davis	2008-06-04	12

Each row represents one unique student, each column stores a single piece of data about the student, and the table only contains data related to students (no course or enrolment data is included here, to avoid duplication).

3. Primary and foreign keys; normalisation (1NF, 2NF, 3NF)

Keys define relationships between tables and ensure unique identification of rows:

Primary Key (PK): A unique, non-null attribute (or combination of attributes) that identifies each row in a table. No two rows can have the same PK value, and PK values cannot be modified after creation.
Foreign Key (FK): An attribute in one table that references the PK of another table, creating a link between the two tables. FK values must match an existing PK value in the referenced table, or be null if allowed.

Normalisation is the process of restructuring a database to reduce redundancy and eliminate data modification anomalies (insert, update, delete errors). It follows three progressive stages required for the A-Level Computer Science syllabus, using functional dependencies written as $A \to B$ , meaning attribute $A$ uniquely determines attribute $B$ :

1NF (First Normal Form): All attributes have atomic (indivisible) values, no repeating groups in a single cell, all columns have unique names, and the order of rows/columns has no impact on data meaning. Violation example: A Subjects column with the value Maths, Physics, Chemistry for one student. Fix: Split into multiple rows or create a separate enrolment table.
2NF (Second Normal Form): Already in 1NF, and no partial dependencies: all non-key attributes depend on the entire primary key (only applies to tables with composite PKs made of 2+ attributes). Violation example: An Enrolments table with composite PK (StudentID, CourseID) and attribute CourseName, which only depends on CourseID (not the full PK). Fix: Move CourseName to a Courses table, only keep CourseID as FK in Enrolments.
3NF (Third Normal Form): Already in 2NF, and no transitive dependencies: no non-key attribute depends on another non-key attribute, all non-key attributes only depend on the PK. Violation example: A Students table with attribute FormTutorName that depends on YearGroup (not directly on StudentID). Fix: Create a YearGroups table with YearGroup as PK and FormTutorName as an attribute, reference YearGroup as FK in Students.

4. SQL — SELECT, JOIN, WHERE, GROUP BY

SQL (Structured Query Language) is the standard language for interacting with relational databases. You are required to write and interpret queries using the four core clauses below for A-Level Computer Science exams:

SELECT: Specifies which columns to return from a table. Use * to return all columns, or list individual column names. Syntax:

SELECT [columns] FROM [table];

Example: Return first and last names of all Year 12 students:

SELECT FirstName, LastName FROM Students WHERE YearGroup = 12;

WHERE: Filters rows to return only those that meet a specified condition, before any grouping or aggregation. Supports operators =, <, >, <=, >=, !=, LIKE (pattern matching) and IN (list matching).
JOIN: Combines rows from two or more tables using a matching PK-FK pair. The two joins required for A-Level Computer Science are INNER JOIN (returns only rows with matching values in both tables) and LEFT JOIN (returns all rows from the left table, plus matching rows from the right table). Example: Return student names and the courses they are enrolled in:

SELECT s.FirstName, s.LastName, c.CourseName
FROM Students s
INNER JOIN Enrolments e ON s.StudentID = e.StudentID
INNER JOIN Courses c ON e.CourseID = c.CourseID;

GROUP BY: Groups rows with identical values in specified columns, used with aggregate functions COUNT(), SUM(), AVG(), MAX(), MIN(). Example: Return the number of students in each year group:

SELECT YearGroup, COUNT(StudentID) AS StudentCount
FROM Students
GROUP BY YearGroup;

Exam tip: Always follow the strict clause order: SELECT → FROM → JOIN → WHERE → GROUP BY → HAVING → ORDER BY. Use HAVING to filter aggregate results, not WHERE.

5. Data dictionaries and ER diagrams

These two tools are used to design, document and maintain relational databases:

Data dictionary

A centralised repository of metadata (data about data) that defines every component of the database. It includes for each table: attribute names, data types, constraints (non-null, unique, PK/FK), allowed value ranges, and relationships to other tables. Example entry for the Students table:

Attribute	Data type	Constraints
StudentID	Integer	PK, auto-increment, non-null
FirstName	Varchar(50)	Non-null
DOB	Date	Non-null, <= current date
Data dictionaries ensure consistency across development teams, simplify database maintenance, and are used to validate that all schema changes follow design rules.

ER (Entity-Relationship) diagrams

A visual representation of database structure used during the design phase. Core components:

Entities: Rectangles representing real-world objects (e.g. Student, Course)
Attributes: Ovals connected to entities, representing data fields for the entity (e.g. StudentID, CourseName)
Relationships: Diamonds connecting related entities (e.g. Enrols In between Student and Course)
Cardinality: Numbers on relationship lines indicating how many of each entity are related: 1:1 (one-to-one), 1:M (one-to-many), M:N (many-to-many). M:N relationships cannot be stored directly in relational databases, so you must add a junction table to split them into two 1:M relationships.

6. Transactions and concurrency overview

Transactions

A transaction is a sequence of database operations treated as a single atomic unit of work: either all operations complete successfully (and changes are committed to the database), or if any operation fails, all changes are rolled back (undone) to maintain data integrity. Transactions follow the ACID properties, which you must memorise for exams:

Atomicity: All operations in the transaction complete, or none do
Consistency: The transaction moves the database from one valid state to another, following all integrity rules
Isolation: Concurrent transactions do not interfere with each other
Durability: Once committed, changes are permanent even if the system fails immediately after

Concurrency

Concurrency refers to multiple users accessing or modifying the same database at the same time. Without controls, concurrency leads to three common errors:

Dirty read: One transaction reads uncommitted changes from another transaction that is later rolled back
Non-repeatable read: One transaction reads the same row twice, getting different values because another transaction updated it between reads
Phantom read: One transaction runs the same query twice, getting a different number of rows because another transaction added/removed rows between runs

Concurrency is controlled using locking (temporarily restricting access to a record while a transaction modifies it) or timestamp ordering (processing transactions in the order they are initiated).

7. Common Pitfalls (and how to avoid them)

Wrong move: Storing multi-value attributes (e.g. multiple phone numbers in one cell) and claiming the table meets 1NF. Why students do it: They confuse composite values that are never split (e.g. full name, if you never separate first and last names) with non-atomic values that require individual access. Correct move: If you ever need to access parts of a value separately, split it into multiple attributes or rows to meet 1NF requirements.
Wrong move: Using descriptive attributes (e.g. FirstName + LastName) as a primary key. Why students do it: They forget that multiple people can have the same name, or names can change after creation. Correct move: Always use a unique, immutable surrogate key (e.g. StudentID) as the PK, never use modifiable descriptive attributes.
Wrong move: Using WHERE to filter aggregate results, or writing GROUP BY before WHERE in SQL queries. Why students do it: They mix up the order of SQL clauses. Correct move: Follow the mandatory clause order, and use HAVING to filter aggregate results, not WHERE.
Wrong move: Leaving M:N relationships unresolved in ER diagrams. Why students do it: They don’t remember that relational databases cannot directly store many-to-many links. Correct move: Always add a junction table with FKs referencing the PKs of both entities to split the M:N relationship into two 1:M relationships.
Wrong move: Defining foreign keys with a different data type than the primary key they reference. Why students do it: They overlook metadata consistency when designing tables. Correct move: Ensure FKs and their referenced PKs have identical data types (e.g. both INT, not INT and VARCHAR) to avoid reference errors.

8. Practice Questions (A-Level Computer Science Style)

Question 1 (7 marks)

A college keeps the following unnormalised table for student equipment loans:

StudentID	StudentName	Course	EquipmentID	EquipmentType	LoanDate
201	Raj Patel	CS	L10	Laptop	2024-09-01
201	Raj Patel	CS	T05	Tablet	2024-09-15
202	Mia Smith	Physics	L12	Laptop	2024-09-03

a) Explain two violations of normalisation rules in this table (3 marks) b) Restructure the table to 3NF, identifying all primary and foreign keys (4 marks)

Solution

a) i) 2NF violation: The composite PK is (StudentID, EquipmentID). Non-key attributes StudentName and Course only depend on StudentID (part of the PK), and EquipmentType only depends on EquipmentID (part of the PK), so partial dependencies exist. ii) 3NF violation: No additional transitive dependencies exist once partial dependencies are resolved, so the table only violates 2NF as written. b) 3NF tables:

Students (StudentID [PK], StudentName, Course)
Equipment (EquipmentID [PK], EquipmentType)
Loans (StudentID [FK to Students.StudentID], EquipmentID [FK to Equipment.EquipmentID], LoanDate, [PK = (StudentID, EquipmentID, LoanDate)])

Question 2 (5 marks)

Use the 3NF tables from Question 1. a) Write an SQL query to return the full name of all students who have borrowed a laptop (3 marks) b) Write an SQL query to count how many items of equipment Raj Patel has borrowed (2 marks)

Solution

SELECT s.StudentName
FROM Students s
INNER JOIN Loans l ON s.StudentID = l.StudentID
INNER JOIN Equipment e ON l.EquipmentID = e.EquipmentID
WHERE e.EquipmentType = 'Laptop';

SELECT COUNT(l.EquipmentID) AS LoanCount
FROM Students s
INNER JOIN Loans l ON s.StudentID = l.StudentID
WHERE s.StudentName = 'Raj Patel';

Question 3 (3 marks)

State three ACID properties of database transactions, and explain one in context of a bank transfer transaction (sending $100 from Account A to Account B).

Solution

Three properties: Atomicity, Consistency, Isolation, Durability (any three). Explanation of Atomicity: If the withdrawal from Account A succeeds but the deposit to Account B fails (e.g. due to a system crash), the entire transaction is rolled back, so Account A is not debited $100 without Account B receiving it. This ensures no money is lost.

9. Quick Reference Cheatsheet

Concept	Key Rules / Syntax
1NF	All attributes atomic, no repeating groups, unique column names
2NF	1NF + no partial dependencies (all non-key attributes depend on full PK)
3NF	2NF + no transitive dependencies (non-key attributes only depend on PK)
Primary Key	Unique, non-null, immutable identifier for table rows
Foreign Key	References PK of another table, must match PK data type
SQL Clause Order	`SELECT` → `FROM` → `JOIN` → `WHERE` → `GROUP BY` → `HAVING` → `ORDER BY`
ER Diagram Components	Entities (rectangles), attributes (ovals), relationships (diamonds), cardinality (1:1, 1:M, M:N)
ACID Properties	Atomicity, Consistency, Isolation, Durability
Concurrency Risks	Dirty read, non-repeatable read, phantom read

10. What's Next

Mastery of database fundamentals is a prerequisite for two higher-weighted A-Level Computer Science syllabus topics: DBMS architecture, and database application development, where you will learn to connect a front-end program (written in your chosen programming language) to a back-end relational database using embedded SQL queries. This content is also required if you choose to build a data-driven application for your A-Level coursework project, which accounts for 20% of your final grade. If you struggle with any of the concepts in this guide, or want more practice questions tailored to your weak areas, ask Ollie for personalised explanations, extra normalisation or SQL practice tasks, or step-by-step walkthroughs of past paper questions on databases. You can also browse more A-Level Computer Science study resources on the homepage.

Aligned with the Cambridge International AS & A Level Computer Science 9618 syllabus. OwlsAi is not affiliated with Cambridge Assessment International Education.

← Back to topic

Stuck on a specific question?
Snap a photo or paste your problem — Ollie (our AI tutor) walks through it step-by-step with diagrams.
Try Ollie free →

Databases — A-Level Computer Science Study Guide

1. What Is Databases?

2. Relational databases and tables

Worked example

3. Primary and foreign keys; normalisation (1NF, 2NF, 3NF)

4. SQL — SELECT, JOIN, WHERE, GROUP BY

5. Data dictionaries and ER diagrams

Data dictionary

ER (Entity-Relationship) diagrams

6. Transactions and concurrency overview

Transactions

Concurrency

7. Common Pitfalls (and how to avoid them)

8. Practice Questions (A-Level Computer Science Style)

Question 1 (7 marks)

Solution

Question 2 (5 marks)

Solution

Question 3 (3 marks)

Solution

9. Quick Reference Cheatsheet

10. What's Next

More study guides