Avoid Interleaved 2 of 5 Code

I was asked to recommend a scanner by a customer who sent me samples of their bar codes. They were all Interleaved 2 of 5 code, which is a numeric only code whose only saving grace is that you can print a lot of digits in a small space. Here are three symbologies encoding the numbers 1 to 8:

You can see that Interleaved 2 of 5 takes up the least space, but Code 128 is pretty close. Interleaved 2 of 5 has a built-in defect: the start/stop patterns are not unique, so if the scanner enters or leaves the code in a spot that resembles a start or stop, the code can be short scanned.

Code 128 is always printed with a check digit anyway and has a unique start/stop pattern, making it a superior code to I 2 of 5.
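The check character mentioned above is a position-weighted sum taken modulo 103. Here is a minimal sketch, assuming the data stays entirely in code set B (Start B has the value 104) with no subset switches; the function name is made up for illustration:

```python
START_B = 104  # symbol value of the Start B character


def code128b_check_value(data):
    """Return the Code 128 check symbol value for data in code set B."""
    total = START_B
    for position, char in enumerate(data, start=1):
        # Code set B maps ASCII 32..127 to symbol values 0..95.
        value = ord(char) - 32
        if not 0 <= value <= 95:
            raise ValueError(f"{char!r} is not in code set B")
        # Each symbol is weighted by its position, starting at 1.
        total += position * value
    return total % 103
```

The scanner recomputes this value and rejects the read if it does not match the printed check character, so a short scan is very unlikely to get through.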

My customer’s bar code looked like this:

This is a picture from our microscope. Notice that the narrow bar measures .1 mm, or 3 mils. This is a 300 dpi printer. The wide-to-narrow ratio of this code should be 3 to 1; this is printed at 4 to 1. Lastly, the narrow bar under the red arrow should be one element wide, but this one is two.

Their printer is doing a bad job printing this code, but fortunately for them, modern scanners are pretty forgiving and this code can be read reliably with a Xenon with high density optics.


GS1 Application Identifiers

The stated goal of GS1 (formerly the Uniform Code Council) is to “develop and maintain global standards for business communication”. Their most widely known standard is for barcode labels. Here’s a sample GS1 barcode:

A scanner reading this barcode will output “]C101189010720001501719083110LM123”. The symbology above is Code 128, which has 106 different symbol patterns in its character set, three different start characters, three subsets (A, B, and C), and four non-printable function characters (FNC1 to FNC4) that are used for special functions.

GS1 barcodes in Code 128 start with a start symbol (Start C in this example) followed by a Function 1 character; scanners are supposed to interpret this as “]C1”, which indicates to the receiving software that the code follows GS1 rules.

The three pairs of characters enclosed in parentheses are Application Identifiers (AIs); they tell the receiving software what type of data follows. Note that the parentheses appear in the human-readable text only; they are not in the bar code itself.

A full list of AIs can be found here, but in the above example:

01 – identifies the following data as a GTIN (Global Trade Item Number)

17 – indicates an expiration date (YYMMDD)

10 – is a lot number, which is a variable-length field.

Note that if a variable-length field (or more precisely, a field for which the GS1 tables require FNC1) is followed by another field, a separator character must be used to signal the end of the field. This character can be an FNC1 or an ASCII group separator character (hex 1D).

GS1 barcodes can also be Datamatrix codes; they follow the same rules as Code 128 except that the first three characters output are “]d2” instead of the “]C1” for Code 128.
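To make the field rules concrete, here is a rough sketch of how receiving software might split the example into its AIs. It assumes the three-character symbology identifier has already been stripped, and it only knows the three AIs used above; the table and function names are hypothetical, and a real parser would need the full GS1 AI table.

```python
GS = "\x1d"  # ASCII group separator, ends a variable-length field

# Hypothetical mini-table covering only the three AIs in the example.
# The value is the fixed data length, or None for a variable-length field.
AI_LENGTHS = {"01": 14, "17": 6, "10": None}


def parse_gs1(data):
    """Split GS1 data (identifier already removed) into {AI: value} pairs."""
    fields = {}
    pos = 0
    while pos < len(data):
        ai = data[pos:pos + 2]
        if ai not in AI_LENGTHS:
            raise ValueError(f"AI {ai!r} is not in this sketch's table")
        pos += 2
        length = AI_LENGTHS[ai]
        if length is None:
            # Variable length: read up to the separator or the end of the data.
            end = data.find(GS, pos)
            if end == -1:
                end = len(data)
            fields[ai] = data[pos:end]
            pos = end + 1  # skip over the separator
        else:
            fields[ai] = data[pos:pos + length]
            pos += length
    return fields


print(parse_gs1("01189010720001501719083110LM123"))
```

In the sample the lot number comes last, so no separator is needed. If the lot number were printed first, the data would have to read "10LM123" + GS + "17190831" for the parser to know where the lot number ends.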

2D Codes vs Stacked Linear Codes

Here are two barcode symbols that both encode the string “12345678”:

Which of these codes is a 2D code? The one on the right is a Datamatrix code, a true two-dimensional code. The one on the left is PDF417, a stacked linear code; it looks like a two-dimensional barcode, but it isn’t.

2D codes store data in both the X and Y coordinates. Linear codes only contain data in one dimension. This is easy to see in a normal linear code.

It doesn’t matter where the scanner goes across the code; data is only encoded in the widths of the bars and spaces. Datamatrix data is encoded in a grid of square cells, with each codeword occupying a small block of eight cells, and has to be read by a camera.

Stacked linear codes are really a bunch of small linear barcodes stacked on top of one another. Each row has a row indicator or number, so a 1D scanner such as a laser is capable of reading these codes by sweeping across the code while the decoder keeps track of the row numbers and puts together the final output. Check the specs of your scanner; not all 1D scanners will read PDF417 symbols.

Other stacked symbologies besides PDF417 are Code 49, Code 16K, and GS1 Databar Stacked.

Note the size difference between the PDF417 and the Datamatrix symbol above. Not many new applications use PDF417 because of the size and density advantage of Datamatrix.

Other 2D codes are MaxiCode, Aztec Code, and the ubiquitous QR Code.


How to identify a bar code symbology Part 2: Industrial 1D codes

There are over a hundred types of 1D barcodes, but only a few are commonly used today. These are Code 128, Code 39, and Interleaved 2 of 5, with Code 128 being the most common and Interleaved 2 of 5 (I 2 of 5) the least.

Less common 1D codes still used today are Codabar, Code 93, Code 11, Two of Five, and MSI code.

2D barcodes are becoming more popular; I’ll write about them at a later date. 1D codes only contain data in one dimension, in the widths of the bars and spaces.

The first step to identifying a code is to note how many different bar and space widths the code uses:

Code 39 and I 2 of 5 only have two different widths of bars and spaces. If it has more than two, it’s usually Code 128, which uses four different widths. UPC uses four bar widths too, but you can usually recognize UPC from the guard bar patterns.

If the code only has two widths the next thing to look at is the start/stop patterns. The first (and last) five bars in a Code 39 symbol are narrow, narrow, wide, wide, narrow. I 2 of 5 starts with two narrow bars and ends with a wide and narrow bar. I 2 of 5 is numeric only, so if the code has two bar widths and alpha characters, it’s probably Code 39.

Note that the start or stop pattern in I 2 of 5 is not unique and can easily be found in the symbol itself, making this code vulnerable to short scans. The red line below represents a laser beam from a scanner going across an I 2 of 5 symbol:

The laser exits on a wide and narrow bar pattern that could be interpreted as a stop code, resulting in a short scan. You’ll often see I 2 of 5 printed with bars above or below the code, called bearer bars, to prevent short scans.

Most scanners can be set to read I 2 of 5 as fixed length codes, preventing the short scan issue. Here’s a tip: To find out how many characters are in an I 2 of 5 symbol, count the number of bars, subtract 4 (for start/stop) and divide by 2.5. For example, the symbol with the red line through it above has 24 bars, so 24 – 4 = 20, divided by 2.5 gives you 8.
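The bar-counting tip works because I 2 of 5 encodes digits in pairs: one digit of each pair goes in five bars and the other in the five spaces between them, so every 5 bars carry 2 digits. A small sketch of the arithmetic (the function name is just for illustration):

```python
def i2of5_digit_count(bars):
    """Number of digits in an I 2 of 5 symbol, given its total bar count."""
    data_bars = bars - 4  # the start and stop patterns use two bars each
    if data_bars < 0 or data_bars % 5 != 0:
        raise ValueError("bar count does not fit an I 2 of 5 symbol")
    # Every 5 bars (plus their 5 interleaved spaces) hold 2 digits.
    return data_bars * 2 // 5


print(i2of5_digit_count(24))  # the 24-bar symbol from the example
```

The result is the length you would program into the scanner as the fixed length.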

Another method of eliminating short scans is to enable a check digit in I 2 of 5. Always enable a check digit if you are going to read variable length I 2 of 5 symbols.

I’ll cover Code 39 and Code 128 in more detail later.


How to identify a barcode symbology Part 1: UPC codes

UPC is an abbreviation for Universal Product Code. It uses four different bar and space widths and encodes each number using two bars and two spaces.

We can all identify UPC-A (at least in the States) by its telltale guard bars and the 12 numeric characters printed in groups of 1, 5, 5, and 1. These numbers are the system digit, manufacturer’s code, item ID, and check digit respectively. The guard bars are the two lines that are longer than the rest at the beginning, middle, and end of the symbol.

UPCA symbol

The guard bars can be considered start and stop codes and don’t encode any data. There was some talk about UPC being the “Mark of the Beast” mentioned in the Book of Revelation: the number six, when printed on the right side of the symbol, is two narrow bars, just like the guard bars, so conspiracy theorists thought UPC secretly contained “666”. I occasionally got questioned about this at trade shows.
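The 1, 5, 5, 1 grouping and the check digit can be sketched in a few lines. The weighting rule used here is the standard UPC mod 10 scheme, which isn't spelled out above, and 036000291452 is a commonly cited sample number rather than one from this post; the function names are made up.

```python
def split_upc_a(code):
    """Break a 12-digit UPC-A number into its four printed groups."""
    if len(code) != 12 or not code.isdigit():
        raise ValueError("UPC-A needs exactly 12 digits")
    return {
        "system_digit": code[0],
        "manufacturer": code[1:6],
        "item": code[6:11],
        "check_digit": code[11],
    }


def upc_a_check_digit(first_eleven):
    """Compute the check digit from the first 11 digits."""
    digits = [int(c) for c in first_eleven]
    odd = sum(digits[0::2])   # 1st, 3rd, ... 11th digits, weighted by 3
    even = sum(digits[1::2])  # 2nd, 4th, ... 10th digits, weighted by 1
    return (10 - (3 * odd + even) % 10) % 10


print(split_upc_a("036000291452"))
print(upc_a_check_digit("03600029145"))
```

A scanner runs the same calculation on every read and throws the scan away if the computed digit doesn't match the printed one.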

There are a number of variations of UPC. There’s UPC-E, or zero-suppressed code, which is usually used on small items:

There’s UPC with 2- or 5-digit supplemental codes, used on magazines and periodicals, with the supplemental number indicating the issue:

In Europe, it’s EAN, or the European Article Numbering code:

The first three digits in an EAN code indicate the country code and, unlike UPC, the manufacturer number and item number are variable length. Notice that there are 13 numbers in an EAN code even though it has the same number of bars and spaces as a UPC-A code. UPC numbers have left and right parity, so a digit printed on the left side has a different pattern than when it is printed on the right side. The extra number is encoded by varying the parity pattern on the left part of the EAN symbol.
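Here is a sketch of that parity trick, using the standard EAN-13 parity table. "L" stands for the odd-parity pattern that UPC-A uses on its left side and "G" for the even-parity alternative; the variable and function names are only for illustration.

```python
# Parity pattern of the six left-hand digits, indexed by the leading digit.
EAN13_LEFT_PARITY = [
    "LLLLLL",  # 0 (identical to UPC-A)
    "LLGLGG",  # 1
    "LLGGLG",  # 2
    "LLGGGL",  # 3
    "LGLLGG",  # 4
    "LGGLLG",  # 5
    "LGGGLL",  # 6
    "LGLGLG",  # 7
    "LGLGGL",  # 8
    "LGGLGL",  # 9
]


def leading_digit_from_parity(pattern):
    """Recover the unprinted 13th digit from the left-half parity pattern."""
    # list.index raises ValueError if the pattern is not a valid one.
    return EAN13_LEFT_PARITY.index(pattern)


print(leading_digit_from_parity("LGLGLG"))
```

A leading digit of 0 uses all "L" patterns, which is why a UPC-A symbol reads as an EAN-13 with a zero in front.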

There are also EAN-8 and EAN with supplemental codes, which are similar to UPC-E and UPC with supplemental codes.

One special version of EAN worth mentioning is Bookland code. The country code of 978 has been assigned to a fictitious country, “Bookland”, and is used to mark books. Bookland code uses the remaining 10 EAN digits to encode the ISBN (the first nine ISBN digits plus a recalculated EAN check digit) and uses a 5-digit supplemental to encode the suggested price:

The first digit in the supplemental code indicates the currency type. Check this out next time you buy a book. There are other versions of UPC, but they are pretty obscure. RSS (Reduced Space Symbology) is also being used in retail applications.

For more on UPC check out ADAMS1 and the GS1 organization.
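The Bookland conversion can be sketched as follows, assuming a 10-digit ISBN as input. The old ISBN check character is dropped and an EAN check digit is calculated over the new 12 digits; the function name is hypothetical and the sample ISBN is a commonly cited one, not a book mentioned here.

```python
def isbn10_to_bookland(isbn10):
    """Convert a 10-character ISBN into a 13-digit Bookland EAN number."""
    chars = isbn10.replace("-", "")
    if len(chars) != 10 or not chars[:9].isdigit():
        raise ValueError("expected a 10-character ISBN")
    body = "978" + chars[:9]  # the ISBN's own check character is discarded
    # EAN-13 weights alternate 1, 3, 1, 3, ... starting from the left.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(body))
    check = (10 - total % 10) % 10
    return body + str(check)


print(isbn10_to_bookland("0-306-40615-2"))
```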



What goes around comes around

These are a couple of old circular bar codes, dating back to the 1970s.

The first example was used to track totes filled with tape measures and divert them to the proper gate on a conveyor. The computer system that this symbol was used with was a Computer Identics laser scanner attached to a DEC PDP-8 with Plessey MOS memory and a ‘flip chip’ card decoder on a separate backplane. The scanning software loaded via a paper tape reader.

This is a binary encoded symbol with a value of ‘72’. The laser scanner only read half of the label, and after it was decoded, the computer diverted the tote to the gate associated with the value 72. This was one of the oldest bar code systems that I have worked with.


The next example is called ‘Split Circle Code’. It was developed in 1974 by Bendix Recognition Systems.

The circle was split in half, with each half encoding part of the symbol. This type of symbol required that both halves of the circle be read, so there were orientation issues that had to be dealt with in order to get good reads.

Bendix encoded these symbols as BCD (binary coded decimal) values and they were printed by Bendix printers.

This example was used in a baggage handling system at Eastern Airlines which used a Bendix scanner to read the labels at a rate of 70 bags per minute.

You can still see the texture of the luggage that the label was applied to.

Apparently, many customers had complaints about the adhesive residue left behind when the label was removed and this ultimately led to the demise of these scanning systems.

How does UPC work?

Here’s a typical UPC symbol from a box of Hefty trash bags:

UPC-A symbol

You can see that UPC is made up of 12 numbers. We’ll ignore the first and last numbers for now and just pay attention to the middle 10 digits.

The first five digits are assigned to one manufacturer. These manufacturer numbers are centrally managed, assigned, and sold by Global Standard One, or GS1, a non-profit organization. GS1 was formerly known as the Uniform Code Council.

Once a company is assigned a UPC code, it’s up to them to assign the last five digits to their products as they choose. The company then informs GS1 of these product code assignments and GS1 adds them to its master database, which is made available to third parties, like your local grocery chain, to do lookups at their cash registers.

A UPC code is really a pointer to a record in the GS1 database. The description and price are returned from the database lookup.

One interesting thing about UPC is that there are two different symbol patterns that encode each number, depending on whether it’s on the left or right side of the symbol. Look at how the number three is encoded differently on the two sides of this symbol:

This was done to allow omni-directional scanning with early supermarket scanners. These were often just a couple of laser lines that intersected at 90 degrees, like a plus (+) sign. Because the numbers were encoded differently on the left and right it allowed scanners to read the symbol a half at a time and put it together before transmitting. Each half of the symbol is taller than it is wide (oversquare) so it’s guaranteed to completely pass through one of the laser lines in a single pass.
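Here is a sketch of the left and right patterns, using the standard UPC-A seven-module patterns (1 is a dark module, 0 is a light one). Each right-hand pattern is the left-hand pattern with bars and spaces swapped. Left-hand patterns always have an odd number of dark modules and right-hand ones an even number, which is one way a decoder can tell which half it has just read. The names below are only for illustration.

```python
from itertools import groupby

# Left-hand patterns for digits 0 to 9; each is seven modules wide.
LEFT_PATTERNS = [
    "0001101", "0011001", "0010011", "0111101", "0100011",
    "0110001", "0101111", "0111011", "0110111", "0001011",
]


def right_pattern(digit):
    """Right-hand pattern: the left-hand pattern with every module inverted."""
    return "".join("1" if m == "0" else "0" for m in LEFT_PATTERNS[digit])


def element_widths(pattern):
    """Widths of the alternating bars and spaces, in modules."""
    return [len(list(run)) for _, run in groupby(pattern)]


def which_half(pattern):
    """Odd count of dark modules means left half, even means right half."""
    return "left" if pattern.count("1") % 2 else "right"


print(LEFT_PATTERNS[3], right_pattern(3))
print(element_widths(right_pattern(6)))
```

The number three comes out as 0111101 on the left and 1000010 on the right: the same four element widths, but with the bars and spaces swapped.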