QR codes - a visual explainer

A visual & interactive explainer for QR codes.

May 28, 2024

QR codes are everywhere - I don't think there would be any reader of this post who hasn't come across one yet. You can read more about them on Wikipedia or the main ISO 18004:2015 spec that defines it. This post is about explaining how they work, interactively.

Let's say you have a QR code. You can use the one below, or click the QR code to upload your own (after all, this is an interactive explainer!)

QR code for Wikipedia (click to upload your own)
QR code for Wikipedia (click to upload your own)

Anatomy of a QR code

Looking at this QR code, we can see it as a grid of bright and dark areas. Each such area is called a module, and is binary in nature - dark implying a value of 1, while light is 0. In some cases, the QR code might be a white-on-black, which is called as reflectance reversal, and the bit values are flipped.

The modules in this grid above, belong to two major buckets - Structural & Informational. Modules that form the structural patterns help in detecting and aligning the QR code grid correctly. The 3 structural patterns are Finder Patterns, Alignment Patterns and Timing Patterns. The modules in them do not carry any information.

The information carrying modules are split into the Format Segment, the Version Segment, and the Data Segment. Each such segment carries information bits along with error correction bits for error recovery.

To be able to read any of this information, the QR code should first be aligned into a grid correctly, so let's look at how that's done.

Detecting a QR code

To read a QR code, we need to first detect its presence & position, and correctly aligned so that it can be read as a grid of binary modules. The first step involves using Finder Patterns.

Finder Patterns are the 3 square patterns at the top-left, top-right and bottom-left corners. Each of these is a 7x7 module concentric square pattern, with a 3x3 center. These help us detect the presence and orientation of a QR code. But how? Note that no matter at what angle the pattern is, the widths of alternating colours will be in the ratio of 1:1:3:1:1. This allows us to keep the detection algorithm something like the following:

  1. Switch the image to grayscale, and identify min and max brightness values. Their average becomes our threshold to determine if a pixel is black or white.
  2. Scan every few rows for a 1:1:3:1:1 ratio of alternating black and white.
  3. If such a segment is found, from the mid point of such a segment, scan vertically to see if we again find a 1:1:3:1:1 ratio. If yes, we have a Finder Pattern. If no, ignore and continue.
  4. Validate that there are 3 such Finder Patterns and their center points connect in a way that two such connections are perpendicular to each other.

Hover over the picture below to animate what the above process looks like.

Can you see why the separator pattern modules (the empty ones around the finder patterns) are required? If those didn't exist, it's possible that our measurement of 1:1:3:1:1 ratios might go wrong.

Once we have the broad position & orientation information about the QR code, we use the Timing Patterns (show below) to figure out exact placement of the QR code inside a grid. Timing Patterns always alternate dark and bright, and there's one each on the 7th row and the 7th column, so we get both the horizontal & vertical parts of the grid. So now, not only do we have the grid, but also the number of modules on each side (and thus the version) of the QR code.

For large QR codes however, this may not yield the sensitivity that's needed. That's where Alignment Patterns (also shown above) come to the rescue. Each such pattern is a 5x5 module grid of concentric squares, with a 1x1 module at the center. Based on the version (or the number of modules on each side) of the QR code, there's a predefined number & placement of alignment patterns, e.g. in the current QR code (version 0), there are 0 alignment patterns. Knowing the locations of these helps us make tiny adjustments that might be needed to get the proper alignment of the QR code inside a grid.

Format Segment

This segment specifies the Error Correction (EC) level and the Mask pattern used in the Data Segment. The 2-bit EC level (allowing for 4 levels of EC) dictate how many error correction bits (and thus protection against errors) are built into the data segment. The 3-bit Mask pattern (allowing for 8 different data mask patterns) dictates which bits in the data segment are inverted - more about this when we discuss the Data Segment.

The Format segment is encoded as follows:

  1. 2-bits of EC level + 3-bits of Data Mask pattern constitute 5-bits of format payload.
  2. 10-bits of error correction are calculated for this payload using a BCH(15,5)1 code, and appended to the format payload as the lower 10-bits.
  3. The resulting 15-bits are XORed with 101010000010010 (0x5412) to ensure that the bit-pattern of this is never all zeroes.
  4. Because this is such a crucial piece of metadata, directly affecting how data bits are interpreted, two copies are maintained - one horizontal, another vertical.

Naturally, the decoding process looks something like the following:

  1. Lay out each copy as a 15-bit number, and XOR it with the mask 101010000010010 (0x5412) to get unmasked value.
  2. The top 5-bits of this are information (EC level and Mask Pattern), and the remaining 10-bits indicate error correction bits.
  3. Validate the correctness of the value by checking the error correction bits. So long as 3 or less bits have an error, we can recover from it. Use the copy that has a valid value.
  4. Use the top 2-bits of the correct unmasked value to determine the EC level, and the next 3-bits to determine Mask Pattern.

Version Segment

This segment specifies the version of the symbol, but only if the version of the QR symbol is ≥ 7 (i.e. when it's ≥ 45x45 modules in size), otherwise this space is taken up by the Data Segment. In the current case, we have no version segment, so we'll skip the demo of this part, and just focus on the explanation.

Version segments (when present) are maintained in 2 separate copies for redundancy. Each copy has a 6-bit version info, along with 12-bits of error correction bits for additional redundancy, making each copy 18-bits long. The error correction bits are calculated using a similar but different2 code than the one in format info. Because this segment is present only for versions ≥ 7, there's no risk of an all-zeroes, so (unlike format segment) no masking is used.

Data Segment

Data in a QR code is protected via Reed-Solomon error correction. While we won't go into the mathematics of Reed-Solomon in this post, it's worthwhile to still understand the basic structure of it.

A Reed-Solomon block (n,k,t) is a sequence of n bytes, out of which the first k bytes carry data, and the remaining n-k bytes are error-correction (EC) bytes. These EC bytes provide the ability to correct up to t errors. To correct, t errors, we need a min of 2t number3 of EC bytes, which are determined in a way that satisfies a specific mathematical criterion. During the decoding process, by looking at both the data and the EC bytes, it's possible to figure out where the errors are located and correct them (so long as number of errors is not more than t).

With that understanding, let's look at the process of encoding data in a QR code:

  1. Depending upon the (version, ecLevel) of the QR code, the data is split into n number of Reed-Solomon blocks. e.g. in the current QR there are 0 of these blocks . Remember that each block carries some k number of data bytes followed by n-k number of EC bytes.
  2. Spatially, each of these blocks start from the first available slot closest to the bottom-right corner of the QR code, and move in a snake-like fashion with the data carrying bytes first, followed by the EC bytes.
  3. The Reed-Solomon blocks are laid out in an interleaved fashion (e.g. block0-byte0 -> block1-byte0 -> block0-byte1 -> block1 -> byte1 ...).
  4. Finally, a Masking Pattern is applied, which inverts the modules where the mask is active. This helps ensure a kind of spreading out of black & white in the QR code. During the decoding process, we need to first unmask the modules before reading their values.

If you hover over the figure below, you can see byte-by-byte process of how these blocks are laid out in an interleaved fashion. While hovering, press Shift to see the mask for this QR, or press Space to unmask the modules.

Error Correction

This specific QR has 0 errors, but if you would've uploaded a QR code with a few errors (say, a QR code with a logo embellished), you would've seen them marked as red-crossed modules in the picture above, thanks to the error recovery made possible by Reed-Solomon EC bytes appended to the data bytes. As mentioned before, the mathematics of Reed-Solomon is best covered in a separate post, but the ability to visualise the error correction is provided here to give you a flavour of it.

It is this ability of a QR code to handle error correction, that allows for logos to be embellished in it, without affecting the ability to read them.

Interpretation of Data

The sequence of bytes encoded in the data segment (concatenation of data bytes from each Reed-Solomon block) is interpreted as a bitstream consisting of a sequence of (mode,count,data) tuples. The mode is a 4-bit value indicating how symbols are encoded (e.g. Numeric, Alphanumeric, Byte, etc4), count indicates how many symbols are encoded in data bitstream - together they act as a header of sorts. This scheme allows us to maximise information density by choosing appropriate mode for a particular part of data, e.g. if a part of data contains 12 decimal digits, we could encode that part with Numeric mode (3 symbols per 10-bits) which is more efficient than Byte mode (1 symbol per 8-bits).

In the current QR code, if you manually trace the (unmasked & error-corrected) data bitstream, you'll find the following structure (hover above each term for description). Note that the values show below are in the interpreted format - hover over any value to see the exact bit-sequence used to determine it.

0

Concluding Remarks

As you can see, a QR code is designed for redundancies - both by keeping multiple copies of the data (e.g. in case of format and version information), as well as adding error correction modules to format/version/data segments. We haven't discuss the mathematics behind the error correction mechanisms, and that's where the fun part lies. Hopefully we get to explore it in a later post.