A-Level · cie-9618 · A-Level Computer Science · Data Representation · 18 min read · Updated 2026-05-06

Data Representation — A-Level Computer Science Study Guide

For: Candidates sitting Cambridge International AS & A Level Computer Science (9618).

Covers: Number bases (binary, decimal, hexadecimal), two's complement for signed integers, floating-point representation, character encoding (ASCII, Unicode), and bitmap/vector image + sound encoding rules and calculations.

You should already know: Basic programming concepts; one of Python / Java / VB.

A note on the practice questions: All worked questions in the "Practice Questions" section below are original problems written by us in the A-Level Computer Science style for educational use. They are not reproductions of past Cambridge International examination papers and may differ in wording, numerical values, or context. Use them to practise the technique; cross-check with official Cambridge mark schemes for grading conventions.


1. What Is Data Representation?

Data Representation is the set of standardised, hardware-compatible formats used to store all types of information (numbers, text, images, audio) as binary digits (bits, 0s and 1s) in computer memory. Since all digital processors only recognise binary signals, every piece of input to a computer must be converted to this standard format before processing, and converted back to human-readable form for output. This is a core Paper 1 topic for A-Level Computer Science, tested in both multiple choice and structured questions, and accounts for 5-8% of total exam marks on average.

2. Number bases — binary, decimal, hex

All number systems use place values equal to powers of their base, and only use digits smaller than the base value:

  • Decimal (base 10): The human-readable standard, uses digits 0-9, place values are powers of 10.
  • Binary (base 2): The native format for computers, uses only digits 0 and 1, place values are powers of 2.
  • Hexadecimal (base 16): A shorthand for binary used for memory addresses, colour codes, and debugging, uses digits 0-9 and A-F (where A=10, B=11, ..., F=15), place values are powers of 16. Each hex digit maps directly to 4 binary bits, making it far more compact than binary for human use.

Worked Example

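  1. Convert decimal 142 to 8-bit binary: Divide by 2 repeatedly and collect remainders from last to first: $142_{10} = 10001110_2$ (this is already 8 bits long, so no leading zeros are needed).
  2. Convert the 8-bit binary to hex: Group bits in sets of 4 from the right: $1000_2 = 8_{16}$, $1110_2 = \text{E}_{16}$, so total $8\text{E}_{16}$.
  3. Convert hex to decimal: $8 \times 16^1 + 14 \times 16^0 = 128 + 14 = 142$.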

Exam tip: Examiners often specify a fixed bit length for binary answers, so always pad leading zeros to meet the required length before converting to hex or performing two's complement calculations.
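
The three conversions above can be sketched in Python. This is a minimal illustration rather than a required exam method; the function names `to_binary`, `binary_to_hex` and `hex_to_decimal` are our own, chosen for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"


def to_binary(value: int, bits: int) -> str:
    """Convert a non-negative integer to a fixed-width binary string
    using repeated division by 2."""
    if value < 0 or value >= 2 ** bits:
        raise ValueError("value does not fit in the requested number of bits")
    remainders = []
    while value > 0:
        remainders.append(str(value % 2))  # each remainder is the next bit, LSB first
        value //= 2
    # Read the remainders from last to first, then pad with leading zeros.
    return "".join(reversed(remainders)).rjust(bits, "0")


def binary_to_hex(bit_string: str) -> str:
    """Convert a binary string to hex by grouping bits in fours from the right."""
    padded_length = -(-len(bit_string) // 4) * 4  # round up to a multiple of 4
    padded = bit_string.rjust(padded_length, "0")
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


def hex_to_decimal(hex_string: str) -> int:
    """Convert a hex string to decimal using place values (powers of 16)."""
    total = 0
    for digit in hex_string.upper():
        total = total * 16 + HEX_DIGITS.index(digit)
    return total


print(to_binary(142, 8))          # 10001110
print(binary_to_hex("10001110"))  # 8E
print(hex_to_decimal("8E"))       # 142
```

Padding on the left before grouping is what keeps the 4-bit groups aligned with the least significant bit.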

3. Two's complement for signed integers

Standard unsigned binary only represents non-negative integers (zero and positive values). Two's complement is the global standard for storing signed (positive and negative) integers because subtraction can be carried out by adding a negative value, so the processor does not need a separate subtraction circuit. Key rules for n-bit two's complement:

  1. The leftmost bit is the sign bit: 0 = positive number, 1 = negative number.
  2. The range of values is $-2^{n-1}$ to $+2^{n-1} - 1$ (for 8-bit, this is -128 to +127).
  3. To get the two's complement of a negative number: Write the positive equivalent as n-bit binary, flip all bits, then add 1.

Worked Example

Find the 8-bit two's complement representation of -47:

  1. Write +47 as 8-bit binary: $00101111_2$
  2. Flip all bits: $11010000_2$
  3. Add 1: $11010001_2$
  4. Verify: The unsigned value of $11010001_2$ is 209, so $209 - 256 = -47$, which is correct.

You can also perform arithmetic directly on two's complement values: Adding $00100000_2$ (+32) to $11010001_2$ (-47) gives $11110001_2$, which converts to $241 - 256 = -15$, the correct result of 32 - 47.
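
As a rough sketch, the flip-and-add-1 method and the "subtract $2^n$ if the sign bit is 1" check can be written in Python as follows. The helper names are hypothetical, and `add_twos_complement` simply discards any carry out of the top bit; it does not detect overflow.

```python
def to_twos_complement(value: int, bits: int = 8) -> str:
    """Encode a signed integer as an n-bit two's complement string."""
    low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the {bits}-bit range {low}..{high}")
    if value >= 0:
        return format(value, f"0{bits}b")
    positive = format(-value, f"0{bits}b")                          # positive equivalent
    flipped = "".join("1" if b == "0" else "0" for b in positive)  # flip all bits
    return format(int(flipped, 2) + 1, f"0{bits}b")                # add 1


def from_twos_complement(bit_string: str) -> int:
    """Decode an n-bit two's complement string back to a signed integer."""
    unsigned = int(bit_string, 2)
    if bit_string[0] == "1":                 # sign bit set, so the value is negative
        return unsigned - 2 ** len(bit_string)
    return unsigned


def add_twos_complement(a: str, b: str) -> str:
    """Add two equal-length two's complement strings, discarding the carry out."""
    bits = len(a)
    total = (int(a, 2) + int(b, 2)) % (2 ** bits)
    return format(total, f"0{bits}b")


print(to_twos_complement(-47))                        # 11010001
print(from_twos_complement("11010001"))               # -47
print(add_twos_complement("00100000", "11010001"))    # 11110001
print(from_twos_complement("11110001"))               # -15
```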

4. Floating-point representation

Integers cannot represent fractions, very large numbers, or very small numbers efficiently. Floating-point representation is analogous to scientific notation, and stores values as a combination of a mantissa (significand) and exponent, both stored as two's complement values. Key rules for A-Level Computer Science floating-point systems:

  1. Value = $\text{mantissa} \times 2^{\text{exponent}}$
  2. Normalisation is required to maximise precision and ensure every value has a unique representation: For positive mantissas, the first two bits must be 01; for negative mantissas, the first two bits must be 10.
  3. When shifting the mantissa to normalise, adjust the exponent to compensate: A left shift of k bits reduces the exponent by k, a right shift of k bits increases the exponent by k.

Worked Example

A 16-bit floating point system uses a 10-bit two's complement mantissa and 6-bit two's complement exponent, normalised.

  1. Calculate the value of mantissa = , exponent = :
      • The mantissa is a fractional value:
      • The exponent is
      • Total value =
  2. Normalise an unnormalised mantissa with exponent :
      • Shift the mantissa left 2 times to get (meets positive normalisation rule)
      • Subtract 2 from the exponent:
5. Character encoding — ASCII, Unicode

Characters (letters, numbers, symbols, emojis) are stored as binary values via standardised encoding mappings that assign a unique binary code to each character.

  • ASCII (American Standard Code for Information Interchange): The original character encoding standard, uses 7 bits to store 128 characters, including upper and lowercase English letters, digits, punctuation, and control characters (e.g. line break). Extended ASCII uses 8 bits to store 256 characters, adding accented letters for Western European languages. Limitation: Cannot support non-Latin scripts like Chinese, Arabic, or Cyrillic, or emojis.
  • Unicode: A universal encoding standard designed to represent every written language in the world, plus symbols and emojis. Common Unicode encodings include:
      • UTF-8: Uses 1-4 bytes per character, backward compatible with ASCII, and is the dominant encoding for the web.
      • UTF-16: Uses 2 or 4 bytes per character, common for operating system internal text storage.

Worked Example

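  • The ASCII code for uppercase 'A' is 65 ($1000001_2$), and lowercase 'a' is 97 ($1100001_2$). The two codes differ only in the bit with place value 32 (the 6th bit from the right), which is flipped between cases.
  • The Unicode code point for the euro symbol € is U+20AC, represented as the three bytes E2 82 AC in UTF-8, and as the single 16-bit unit 20AC in UTF-16.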

Exam tip: Examiners frequently ask for comparisons of ASCII and Unicode. Always reference use cases: ASCII is sufficient for simple English-only text and uses less storage, while Unicode is required for multilingual applications.
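
Python's built-in `ord` and `str.encode` make it easy to inspect these encodings yourself. The sketch below is only an illustration; `ascii_bits` and `describe` are helper names invented for this example, and the escape sequences are used so the code does not depend on how the source file is saved.

```python
def ascii_bits(ch: str) -> str:
    """7-bit ASCII pattern for a single character."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not an ASCII character")
    return format(code, "07b")


def describe(ch: str) -> dict:
    """Code point plus the UTF-8 and UTF-16 (big-endian) bytes for one character."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "utf8": ch.encode("utf-8").hex(" ").upper(),
        "utf16": ch.encode("utf-16-be").hex(" ").upper(),
    }


print(ord("A"), ascii_bits("A"))   # 65 1000001
print(ord("a"), ascii_bits("a"))   # 97 1100001
print(ord("A") ^ ord("a"))         # 32: only one bit differs between the cases

print(describe("\u20ac"))          # euro sign: 3 bytes in UTF-8, 2 bytes in UTF-16
print(len("A".encode("utf-8")))            # 1 byte: ASCII characters stay 1 byte in UTF-8
print(len("\U0001F600".encode("utf-8")))   # 4 bytes: an emoji outside the 16-bit range
```

The last two lines show why "Unicode uses 4 bytes per character" is wrong as a blanket statement: the size depends on both the character and the encoding.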

6. Bitmap vs vector images, sound encoding

Multimedia content (images and audio) follows specific encoding rules to balance quality and file size.

Image Encoding

  • Bitmap (raster) images: Stored as a grid of individual pixels, each with a colour value defined by the colour depth (number of bits per pixel, e.g. 24-bit = 16.7 million colours). Image resolution is the number of pixels that make up the image (width × height in pixels); pixels per inch describes how densely those pixels are displayed or printed.
      • File size (bytes) = $\dfrac{\text{width in pixels} \times \text{height in pixels} \times \text{colour depth in bits}}{8}$
      • Pros: Supports photorealistic detail, standard for photos.
      • Cons: Pixelates when scaled up, large file sizes for high-resolution content.
  • Vector images: Stored as mathematical definitions of shapes, lines, curves, and fill colours, with coordinate references.
      • Pros: Infinitely scalable without quality loss, very small file sizes for simple graphics.
      • Cons: Cannot replicate photorealistic detail, requires specialised software to edit.
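
A bitmap file size calculation can be sketched as a small Python function. This is an estimate of the raw pixel data only; it assumes no compression and ignores any file header, and the function names are our own.

```python
def bitmap_size_bytes(width_px: int, height_px: int, colour_depth_bits: int) -> int:
    """Raw bitmap size in bytes: pixels x bits per pixel, divided by 8.
    Assumption: uncompressed data, no file header."""
    total_bits = width_px * height_px * colour_depth_bits
    return (total_bits + 7) // 8   # round up to a whole number of bytes


def colours_available(colour_depth_bits: int) -> int:
    """Number of distinct colours a given colour depth can represent."""
    return 2 ** colour_depth_bits


print(bitmap_size_bytes(1920, 1080, 24))   # 6220800
print(colours_available(24))               # 16777216
```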

Sound Encoding

Analog sound waves are converted to digital format via sampling:

  • Sample rate: Number of samples of the wave taken per second, measured in Hz (standard CD quality is 44.1 kHz = 44100 samples per second).
  • Bit depth: Number of bits used to store each sample (standard CD quality is 16-bit).
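  • File size (bytes) = $\dfrac{\text{sample rate (Hz)} \times \text{bit depth} \times \text{number of channels} \times \text{duration (s)}}{8}$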
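
The same kind of sketch works for uncompressed audio. Naming each parameter explicitly, as below, is one way to avoid mixing up sample rate and bit depth; the function name is hypothetical, and the result ignores any file header or compression.

```python
def audio_size_bytes(sample_rate_hz: int, bit_depth: int,
                     channels: int, duration_s: int) -> int:
    """Raw audio size in bytes: samples per second x bits per sample
    x channels x seconds, divided by 8 bits per byte.
    Assumption: uncompressed data, no file header."""
    total_bits = sample_rate_hz * bit_depth * channels * duration_s
    return (total_bits + 7) // 8   # round up to a whole number of bytes


# 3 minutes of stereo audio at CD quality
size = audio_size_bytes(sample_rate_hz=44_100, bit_depth=16, channels=2, duration_s=180)
print(size)                # 31752000
print(size / 1_000_000)    # 31.752 (MB, taking 1 MB = 1,000,000 bytes)
```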

Worked Example

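  1. File size of a 1920×1080 24-bit bitmap: $1920 \times 1080 \times 24 \div 8 = 6{,}220{,}800$ bytes = ~6.2 MB.
  2. File size of a 3-minute stereo (2-channel) audio track with 44.1 kHz sample rate and 16-bit depth: $44100 \times 16 \times 2 \times 180 \div 8 = 31{,}752{,}000$ bytes = ~31.75 MB.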

7. Common Pitfalls (and how to avoid them)

  • Wrong move: Forgetting to pad leading zeros when converting binary to hex, leading to incorrect grouping of 4 bits. Why: Students often group bits from the left instead of the right. Correct move: Always group bits from the least significant (rightmost) bit first, pad leading zeros to the left to make the total number of bits a multiple of 4.
  • Wrong move: Calculating two's complement of a negative number by only flipping bits and forgetting to add 1. Why: Confusion between one's complement and two's complement rules. Correct move: After flipping all bits of the positive equivalent, always add 1, then verify by converting back to decimal to check your result.
  • Wrong move: Shifting the mantissa during normalisation but forgetting to adjust the exponent. Why: Students focus only on meeting the normalisation format for the mantissa and ignore the scaling factor. Correct move: Every left shift of the mantissa subtracts 1 from the exponent, every right shift adds 1 to the exponent.
  • Wrong move: Confusing sample rate and bit depth when calculating audio file size. Why: Both are given as plain numerical values and are often listed close together in question text. Correct move: Label each value in your working explicitly, then apply the formula step by step to avoid mixing up variables.
  • Wrong move: Stating that Unicode uses 4 bytes for all characters. Why: Misunderstanding of variable-length Unicode encodings. Correct move: Specify that UTF-8 uses 1 byte for ASCII characters and 2-4 bytes for other characters. The number of bytes per character depends on the encoding and the character; it is not fixed at 4 bytes.

8. Practice Questions (A-Level Computer Science Style)

Question 1

(a) Convert the decimal number 217 to 8-bit binary, then to hexadecimal. (2 marks)
(b) Represent the decimal number -72 as an 8-bit two's complement integer. (2 marks)

Solution

(a) Divide 217 by 2 repeatedly, collect remainders from last to first: $217_{10} = 11011001_2$. Group into 4 bits: $1101_2 = \text{D}_{16}$, $1001_2 = 9_{16}$, so hex value $\text{D9}_{16}$. (1 mark for correct binary, 1 mark for correct hex)
(b) Write +72 as 8-bit binary: $01001000_2$. Flip all bits: $10110111_2$. Add 1: $10111000_2$. Verify: $184 - 256 = -72$, correct. (2 marks for correct final value, 1 mark for valid working if final answer is wrong)


Question 2

A 16-bit floating point system uses a 10-bit two's complement mantissa and 6-bit two's complement exponent, normalised.
(a) Calculate the decimal value of the following floating point number: Mantissa = , Exponent = . (3 marks)
(b) Normalise the following unnormalised floating point number: Mantissa = , Exponent = . Give the new mantissa and exponent. (2 marks)

Solution

(a) Mantissa value = . Exponent value = . Total value = . (1 mark for correct mantissa value, 1 mark for correct exponent value, 1 mark for final answer)
(b) Shift mantissa left 3 times to meet normalisation rule: . Subtract 3 from exponent: . New mantissa: , new exponent: . (1 mark for correct mantissa, 1 mark for correct exponent)


Question 3

(a) Calculate the file size of a 1280×720 16-bit colour bitmap image, give your answer in megabytes (to 2 decimal places, use 1 MB = $10^6$ bytes). (3 marks)
(b) State two advantages of vector images over bitmap images for logo design. (2 marks)

Solution

(a) File size in bytes = $1280 \times 720 \times 16 \div 8 = 1{,}843{,}200$ bytes. Convert to MB: $1{,}843{,}200 \div 10^6 \approx 1.84$ MB. (1 mark for correct formula application, 1 mark for correct byte value, 1 mark for correct MB conversion)
(b) Any two valid advantages: 1) Vector logos can be scaled to any size (e.g. billboards, business cards) without pixelation or quality loss. 2) Vector logos have smaller file sizes for simple designs, making them faster to load on websites. 3) Vector logos are easier to edit (e.g. change colour, adjust shape) without quality loss. (1 mark per valid advantage, max 2 marks)

9. Quick Reference Cheatsheet

| Category | Key Rules & Formulas |
|---|---|
| Number Bases | 1. Decimal → Binary: Divide by 2, collect remainders from last to first. 2. Binary ↔ Hex: Group 4 bits from right, map each group to 0-F. 3. Hex → Decimal: $\sum (\text{digit} \times 16^{\text{position}})$, position starts at 0 from the right. |
| Two's Complement | 1. n-bit range: $-2^{n-1}$ to $+2^{n-1} - 1$. 2. Negative value: Flip all bits of positive equivalent, add 1. 3. Value of n-bit signed number: If sign bit = 1, subtract $2^n$ from unsigned value of bits. |
| Floating Point | 1. Value = $\text{mantissa} \times 2^{\text{exponent}}$. 2. Normalisation rule: Positive mantissa starts with 01, negative mantissa starts with 10. 3. Left shift mantissa by k: subtract k from exponent; right shift by k: add k to exponent. |
| Character Encoding | 1. ASCII: 7-bit (128 chars), 8-bit extended (256 chars), only Latin script. 2. Unicode: Supports all global languages; UTF-8 (1-4 bytes, web standard, ASCII compatible), UTF-16 (2/4 bytes). |
| Media Representation | 1. Bitmap file size (bytes): $\text{width} \times \text{height} \times \text{colour depth} \div 8$. 2. Audio file size (bytes): $\text{sample rate} \times \text{bit depth} \times \text{channels} \times \text{duration (s)} \div 8$. 3. Vector: Math-based, scalable, small for simple graphics; Bitmap: Pixel-based, photorealistic, scales poorly. |

10. What's Next

Data Representation is a foundational topic that underpins almost every other unit in the A-Level Computer Science syllabus. You will use number base and two's complement knowledge when learning processor architecture and low-level assembly programming, floating point concepts when studying data types in high-level programming and algorithm efficiency, media encoding when working with file handling and compression algorithms, and character encoding when building web or database applications that handle multilingual text. A strong grasp of this topic will also make it much easier to answer practical programming questions related to data manipulation and binary file operations, which are common in Paper 2 practical assessments.

If you have any questions about specific conversion steps, normalisation rules, or exam mark scheme conventions, you can ask Ollie, our AI tutor, at any time for personalised explanations and extra practice questions tailored to your weak spots. You can also find more A-Level Computer Science study materials and past paper practice on the homepage to test your understanding ahead of your exam.

Aligned with the Cambridge International AS & A Level Computer Science 9618 syllabus. OwlsAi is not affiliated with Cambridge Assessment International Education.
