2011-08-15



(This article by Marsha A. Harmon was originally published in 2011, and remains one of our most popular articles)

Black and White

Have you ever looked at the black and white symbol on your grocery product packaging, on the cover of a book that you just bought, or even on a department store receipt, and wondered how the information is encoded in those bars and spaces? Understanding how a bar code symbol is constructed might even show you how to read a symbol without the aid of a scanner.

We're talking about linear bar codes here and not about two-dimensional (2D) symbols. Although the PDF417 symbol on the left below is constructed of rows of linear symbols stacked on top of one another and could technically be called a bar code, it is not possible to interpret what is encoded without the aid of a scanner. And even though the information from the QR Code symbol on the right can be easily decoded using a common cell phone application, it is not possible to visually interpret what is encoded.



2D symbols, whether stacked like the PDF417 symbol or matrix like the QR Code, encode information in two dimensions -- horizontally and vertically, which allows a large amount of data to be encoded in a small space. The symbols above contain the same 60 characters of information about the author of this article.

Code 39

A linear bar code, on the other hand, encodes information only horizontally and fewer characters can be encoded in the same amount of space. The Code 39 symbol below contains six characters of data, the first name of the author, and is about the same size as the PDF417 symbol above encoding 60 characters.

Code 39 is a common bar code used for various label applications such as on name badges and for inventory and industrial applications. It's easy to use and there is technically no limit to the number of characters that can be encoded in a symbol, although there are practical limits to consider regarding the usable length and how it will be read.

Code 39 was so named because the original character set consisted of 39 alphanumeric characters. Additional punctuation characters have since been added, expanding the character set to 43. It is also sometimes called Code 3 of 9 because each character encoded in the symbol consists of nine elements, five bars and four spaces, and three of the nine elements are wide.

The example below illustrates the encoded asterisk (*) sign, used as the start and stop characters in a Code 39 symbol. Start and stop characters tell the scanner when the bar code begins and ends. The asterisks are sometimes printed in the human readable interpretation, but not always. Conventionally, the character on the left is considered the "start," and the one on the right, the "stop." In a Code 39 symbol, the start and stop characters are the same, but that is not the case for all bar code symbologies.

Counting the bars and spaces in the example will give you a total of nine elements. Two of the bars and one space are wide and the remaining three bars and three spaces are narrow. This pattern is found at the beginning and end of every Code 39 symbol.
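The count can also be checked mechanically. The sketch below is only an illustration: it writes the asterisk pattern as a string of n (narrow) and w (wide) elements, a notation assumed here rather than taken from the article, and tallies the bars and spaces.

```python
# Code 39 start/stop (*) pattern, read left to right.
# n = narrow element, w = wide element (notation assumed for this sketch).
# Elements alternate bar, space, bar, ... and always begin with a bar.
ASTERISK = "nwnnwnwnn"


def describe(pattern):
    """Count wide and narrow bars and spaces in a nine-element pattern."""
    bars = pattern[0::2]    # elements 1, 3, 5, 7, 9 are bars
    spaces = pattern[1::2]  # elements 2, 4, 6, 8 are spaces
    return {
        "elements": len(pattern),
        "wide_bars": bars.count("w"),
        "wide_spaces": spaces.count("w"),
        "narrow_bars": bars.count("n"),
        "narrow_spaces": spaces.count("n"),
    }


summary = describe(ASTERISK)
print(summary)
```

Three of the nine elements are wide, which is where the "3 of 9" name comes from.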

*

Another purpose of the encoded start and stop characters in a Code 39 symbol is to tell the scanner whether it is reading left-to-right or right-to-left. If the pattern begins with a narrow bar followed by a wide space and then a narrow bar, the scanner is reading left to right. If the pattern begins with a narrow bar followed by a narrow space and then a wide bar, it is reading right to left.
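A rough sketch of that direction test follows. It assumes the scanner has already measured the first nine elements and classified each one as narrow (n) or wide (w); the function name and the string notation are illustrative, not part of any scanner's real interface. A right-to-left scan simply sees the same nine elements in reverse order.

```python
# Code 39 start/stop (*) pattern as printed, left to right.
# n = narrow, w = wide; elements alternate bar, space, bar, ...
START_STOP = "nwnnwnwnn"


def scan_direction(first_nine_elements):
    """Guess the scan direction from the first nine elements seen."""
    if first_nine_elements == START_STOP:
        return "left-to-right"
    if first_nine_elements == START_STOP[::-1]:
        # The same pattern read backwards means the beam crossed
        # the stop character first.
        return "right-to-left"
    return "unknown"


print(scan_direction("nwnnwnwnn"))
print(scan_direction("nnwnwnnwn"))
```

Because the asterisk pattern is not a palindrome, the two directions can never be confused with each other.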

An important rule for all machine-readable symbols is the Quiet Zone, a blank area that tells the bar code reader or scanner where the data begins and ends and prevents the scanner from picking up information that is not part of the bar code. A quiet zone must completely surround a two-dimensional symbol, and must appear at the beginning and end of a linear bar code. For a Code 39 symbol, the quiet zone precedes the start character and follows the stop character, and should be ten times the width of the narrow element. The narrow element is also known as the "X" dimension. Understanding this concept becomes more important when discussing other symbologies.

The "M" character below follows the same encoding rule as that of the asterisk character, but in a different pattern: wide bar, narrow space, wide bar, narrow space, narrow bar, narrow space, narrow bar, wide space, narrow bar.

M

Following are the remainder of the characters encoded in the symbol above.

A

R

S

H
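Without the pictures, the letters can still be drawn in text. The sketch below uses the standard Code 39 patterns for the characters in this symbol and prints each one with "#" for bar and blank for space. The 3:1 wide-to-narrow ratio is an assumption made for the drawing; Code 39 symbols are printed at other ratios as well.

```python
# Standard Code 39 patterns for the characters used in this article.
# n = narrow, w = wide; elements alternate bar, space, bar, ...
CODE39 = {
    "*": "nwnnwnwnn",
    "M": "wnwnnnnwn",
    "A": "wnnnnwnnw",
    "R": "wnnnnnwwn",
    "S": "nnwnnnwwn",
    "H": "wnnnnwwnn",
}


def render(pattern, wide=3):
    """Draw one character: '#' for bar modules, ' ' for space modules."""
    pieces = []
    for index, element in enumerate(pattern):
        mark = "#" if index % 2 == 0 else " "  # even positions are bars
        width = wide if element == "w" else 1
        pieces.append(mark * width)
    return "".join(pieces)


for letter in "MARSHA":
    print(letter, render(CODE39[letter]))
```

Every character comes out the same length, because each one has exactly three wide and six narrow elements.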

Because Code 39 is a "discrete" symbology, it is possible, by looking closely, to identify the encoded asterisks at the beginning and end, as well as the individual letters in the Code 39 symbol above. In a discrete symbology, each character stands alone, like the characters on a typewriter. As shown in the examples above, each character is encoded to begin and end with a bar. This design requires a small space between the characters that serves no purpose other than to separate them.
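The extra gaps and quiet zones add up. As a rough worked example, the sketch below estimates the total width of a Code 39 symbol in multiples of the X dimension. The 3:1 wide-to-narrow ratio and the one-X gap between characters are assumptions chosen for illustration; the ten-X quiet zone comes from the rule described earlier.

```python
def code39_width_in_x(data_chars, ratio=3, gap=1, quiet_zone=10):
    """Estimate the total symbol width in multiples of X.

    ratio      -- wide element width in X (assumed value)
    gap        -- inter-character gap in X (assumed value)
    quiet_zone -- blank margin on each side in X
    """
    symbol_chars = data_chars + 2        # data plus start and stop
    char_width = 6 + 3 * ratio           # six narrow and three wide elements
    gaps = (symbol_chars - 1) * gap      # one gap between neighbouring characters
    return 2 * quiet_zone + symbol_chars * char_width + gaps


print(code39_width_in_x(6))
```

Under these assumptions a six-character name needs 147 X of width, which is why a linear symbol holds far fewer characters than a 2D symbol of similar size.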

U.P.C.

The bar code most of us see every day is the symbol marked on grocery product packages, the Universal Product Code, commonly called U.P.C. As a bit of trivia, the official abbreviation for this symbology contains a period after each letter so that there is no confusion with the Uniform Plumbing Code, which has a trademark for the designation "UPC" without the periods.

The design of U.P.C. is much different from the design of Code 39, most notably because a U.P.C. symbol encodes only numeric characters and there is no inter-character space.

As with Code 39, a U.P.C. symbol begins and ends with a Quiet Zone. In U.P.C., the first and last human-readable numbers are sometimes placed outside the symbol to help protect the quiet zones.

Unique to U.P.C. are the three sets of "guard bars" at the start, middle and end of the symbol. These bars separate the encoded numbers on the right and left sides of the symbol and they establish timing for the scanner that reads the symbol.

The six digits encoded on the left side of the symbol identify the manufacturer of the item – 098756 in the example above. The first five digits on the right side, 50001 in the example symbol, form the item code.

Every item sold must have an item code, and because they're priced differently, each method of packaging for every item must have a separate item code. For example, a 12-ounce can of Coke requires a different item number from a 16-ounce bottle of Coke, a 6-pack of 12-ounce cans requires a different number than a 24-pack, etc. Surprising to many customers, the price of the item is not encoded in the item code. Instead, the cash register system retrieves the price from the store's database when the item is scanned at the checkout. This means, if there is a difference between the shelf price of the item and the price that rings up at checkout, it's an issue with the database and not a failure of the symbol.
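The lookup can be pictured with a very small sketch. Everything in it is hypothetical: the dictionary stands in for the store's database, and the twelve-digit key and the prices are illustrative values only.

```python
# Hypothetical price table: scanned number -> price in cents.
PRICE_DB_CENTS = {"098756500011": 199}


def ring_up(scanned_number, price_db):
    """Return the price in cents for a scanned number, or None if unknown."""
    return price_db.get(scanned_number)


before = ring_up("098756500011", PRICE_DB_CENTS)

# The store changes the price; the printed symbol stays exactly the same.
PRICE_DB_CENTS["098756500011"] = 249
after = ring_up("098756500011", PRICE_DB_CENTS)

print(before, after)
```

The symbol carries only the identifying number, so a price change never requires reprinting the package.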

There is an exception to the practice of not encoding the price in the symbol, and that is the Bookland EAN symbol (a version of U.P.C.) on books. The first digit of the five-digit add-on code to the right of the symbol designates the currency (e.g., "5" equals U.S. dollars), and the remaining four digits give the price without the decimal point.
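As a minimal sketch of how such an add-on might be read, the function below splits the five digits into a currency digit and a price. Only the "5" for U.S. dollars comes from the text; the sample value 51499 is made up, and other currency digits are simply reported as unknown.

```python
from decimal import Decimal

# Only the mapping mentioned in the article is included here.
CURRENCY_DIGITS = {"5": "USD"}


def parse_price_addon(addon):
    """Split a five-digit add-on into (currency, price)."""
    if len(addon) != 5 or not addon.isdigit():
        raise ValueError("add-on must be exactly five digits")
    currency = CURRENCY_DIGITS.get(addon[0], "unknown")
    # The last four digits are the price with the decimal point removed.
    price = Decimal(addon[1:]) / 100
    return currency, price


currency, price = parse_price_addon("51499")
print(currency, price)
```

Four digits leave room for prices only up to 99.99 in this layout.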

The final digit on the right side of every U.P.C. symbol is the check digit, a number produced by a mathematical calculation on the other digits that allows the scanner to determine whether the number was correctly scanned. Most bar code symbologies incorporate a check digit, and some methods are more complicated than others.

The printer software performs the initial check digit calculation from the information in the symbol and the checkout scanner performs the calculation each time it scans an item. If the check digit calculated at the point of sale is different from the check digit on the symbol, the scanner knows that something went wrong and the item needs to be rescanned.
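The article does not spell out the calculation. The sketch below uses the standard rule for a twelve-digit U.P.C. symbol: add the digits in the odd positions, multiply by three, add the digits in the even positions, and choose the check digit that brings the total up to a multiple of ten. The function names are illustrative.

```python
def upc_check_digit(first_eleven):
    """Compute the check digit for the first eleven digits of a U.P.C. number."""
    if len(first_eleven) != 11 or not first_eleven.isdigit():
        raise ValueError("expected exactly eleven digits")
    digits = [int(ch) for ch in first_eleven]
    odd_positions = sum(digits[0::2])   # 1st, 3rd, 5th, ... digit
    even_positions = sum(digits[1::2])  # 2nd, 4th, 6th, ... digit
    total = 3 * odd_positions + even_positions
    return (10 - total % 10) % 10


def scanned_correctly(twelve_digits):
    """Repeat the calculation at the checkout and compare with the last digit."""
    return upc_check_digit(twelve_digits[:11]) == int(twelve_digits[11])


# Manufacturer 098756 and item code 50001 from the example in the text.
print(upc_check_digit("09875650001"))
```

For the eleven digits quoted in the text this rule gives a check digit of 1, and changing any single digit makes `scanned_correctly` return False.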

Encoding a U.P.C. symbol

The "X" dimension in a bar code symbol refers to the width of the narrowest element. As is apparent in the example above, the bars and the spaces that make up a U.P.C. symbol are of various widths: they can be 1, 2, 3 or 4 times the X dimension.

Each encoded number in a U.P.C. symbol has four elements assigned to it, 2 black and 2 white. The numbers on the left side of the symbol always begin with a space and have bars and spaces in a white-black-white-black order. The numbers on the right always begin with a bar and have the correlating bars and spaces in the opposite black-white-black-white order. Each number is allotted a total of 7 widths.

It's possible, by using the table below and with a bit of patience, to find the encoded numbers in the sample U.P.C. symbol above.
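For readers who would rather let a program do the patient part, here is a sketch that uses the standard U.P.C. digit patterns, written as seven modules each with 0 for white and 1 for black. The right-side patterns are the left-side patterns with black and white swapped. The function names are illustrative, and this is not a reproduction of the article's original table.

```python
from itertools import groupby

# Left-side patterns: seven modules per digit, 0 = white, 1 = black.
LEFT = {
    "0": "0001101", "1": "0011001", "2": "0010011", "3": "0111101",
    "4": "0100011", "5": "0110001", "6": "0101111", "7": "0111011",
    "8": "0110111", "9": "0001011",
}
# Right-side patterns swap black and white.
RIGHT = {d: "".join("1" if m == "0" else "0" for m in p) for d, p in LEFT.items()}


def element_widths(modules):
    """Collapse a module string into the widths of its bars and spaces."""
    return [len(list(run)) for _, run in groupby(modules)]


def encode_upc(twelve_digits):
    """Build the full module string, including the three sets of guard bars."""
    if len(twelve_digits) != 12 or not twelve_digits.isdigit():
        raise ValueError("expected exactly twelve digits")
    left = "".join(LEFT[d] for d in twelve_digits[:6])
    right = "".join(RIGHT[d] for d in twelve_digits[6:])
    return "101" + left + "01010" + right + "101"


print(element_widths(LEFT["0"]), element_widths(RIGHT["0"]))
print(encode_upc("098756500011"))
```

Each digit collapses to four elements totalling seven widths, and a complete symbol is 95 modules wide before the quiet zones are added.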

Code 128

Of the many bar code symbologies left to choose from, the last one for review here is Code 128, so named because it encodes the complete American Standard Code for Information Interchange (ASCII) character set. The ASCII Code Chart below includes 128 characters, some of which are non-printable. The space character is considered a non-printing graphic.

As with Code 39, this symbology is alphanumeric. Like U.P.C., it uses bars and spaces of multiple widths to encode the data characters.

A Code 128 symbol is made up of six sections:

• Quiet Zone
• Start Character
• Encoded Data
• Check Character
• Stop Character
• Quiet Zone

But this is a complicated symbology. In Code 128, each encoded character is composed of three bars and three spaces, and like in U.P.C., a bar or space can be 1, 2, 3 or 4 units wide. In U.P.C., the total of the widths of the bars and spaces must equal 7 units. But in Code 128 the sum of the widths of the spaces must be odd, the sum of the widths of the bars must be even, and the total must equal 11 units per character. Whew!
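Those rules are easy to state as a test. The sketch below checks a list of six widths, in bar, space, bar, space, bar, space order, against the constraints just described. The function name is illustrative, and the sketch covers ordinary six-element characters only, not the longer stop pattern.

```python
def is_valid_code128_widths(widths):
    """Check six alternating bar/space widths against the Code 128 rules."""
    if len(widths) != 6 or any(w not in (1, 2, 3, 4) for w in widths):
        return False
    bars = sum(widths[0::2])    # 1st, 3rd and 5th elements are bars
    spaces = sum(widths[1::2])  # 2nd, 4th and 6th elements are spaces
    return bars % 2 == 0 and spaces % 2 == 1 and bars + spaces == 11


# 2-1-2-2-2-2 is the standard pattern for Code 128 symbol value 0.
print(is_valid_code128_widths([2, 1, 2, 2, 2, 2]))
# Same total of 11 units, but the bar widths sum to an odd number.
print(is_valid_code128_widths([1, 2, 2, 2, 2, 2]))
```

The parity rule gives a scanner a quick way to reject a misread character even before the check character is examined.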

And that's not all! Code 128 provides a choice of three code sets, to encode various combinations of upper or lower case alphabetic, numeric and special characters. One of the code sets packs pairs of numeric digits, so that two digits are encoded in the space of one character, which makes all-numeric symbols very space efficient. In addition, the code sets can be mixed in a single symbol, making it useful for a variety of applications.
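The space saving comes from the numeric code set, usually called code set C, in which each symbol character stands for a pair of digits from 00 to 99. A minimal sketch of the pairing step, with an illustrative function name, might look like this:

```python
def pack_code_set_c(digits):
    """Group a digit string into two-digit values, one per symbol character."""
    if not digits.isdigit() or len(digits) % 2 != 0:
        raise ValueError("code set C needs an even number of digits")
    return [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]


values = pack_code_set_c("123456")
print(values, len(values))
```

Six digits need only three data characters this way, against six in the other two code sets. An odd number of digits would need one digit to be encoded in another code set, which this sketch does not attempt.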

"Red" all over?

For years industry experts have predicted the demise of the bar code in favor of new and exciting technologies like Radio Frequency Identification (RFID). Today, however, printing a bar code on a product package costs essentially nothing because the symbol is already a part of the package design. In addition, point-of-sale systems around the world have been successfully using the technology for decades, and they work well! The philosophy, "If it ain't broke, don't fix it" seems prevalent with both manufacturers and retailers. So the replacement of bar codes will be seriously considered only when RFID tags are as cost-free to the manufacturer as a printed bar code, and only when a tag can be read without investing in reading technology that costs more than what is currently in use. Realistically, however, there will likely always be applications more appropriate for bar codes than any other technology.

There's a riddle that was well known decades ago and it still seems relevant today: "What's black and white and 'red' all over?"

The answer could easily be "bar codes."

Author

Marsha A. Harmon is Vice President and Chief Operating Officer of Q.E.D. Systems, an organization providing standards development, as well as educational, advisory, and systems design services, focusing primarily on electronic commerce/business technologies, including the disciplines of bar code technology, two-dimensional symbols, radio frequency communications, Radio Frequency Identification (RFID) and Real Time Locating Systems (RTLS).

