An algorithm for distinguishing apples from bananas
|[chongo's home] [Astronomy] [Mathematics] [Prime Numbers] [Programming] [Technology] [contacting Landon]|
AbstractThe number of set pixels approximates the size of the object when the objects contrasts well with its background. An algorithm is presented that can reliably distinguish between an apple and a banana is presented. This algorithm, which runs in linear or O(n) time with respect to image size, counts the number of set pixels in a low grade B&W image. Using a wide collection of apple and banana varieties and sizes, the algorithm was able to successfully distinguish between apples and bananas.
BackgroundImage recognition is an on-going area of research in Artificial Intelligence. Systems for general image recognition have been published in the Journal of Artificial Intelligence Research as well as the MIT Artificial Intelligence Lab is being made. These recognition systems tend of be both complex and computationally intensive.
When the recognition problem is simplified down to distinguishing between just a few objects under controlled conditions, simple but effective algorithms can be used. A simplified image recognition problem allows one to use algorithms that are neither complex nor computationally intensive.
The problemHow can one, by computer, distinguish between an apple and a banana given limited computational resources and limited decision time?
For this paper we will consider the situation where one has a conveyer belt (see figure 1) on which both apples and bananas are placed.
Fig 1. Conveyer Belt setup
In order to comply with the problem condition of ``limited computational resources'' we will assume the use of a low end computer or PC. For example, one may have only a $1000 PC that is typically found in retail personal computer stores in the later 1990's.
In keeping with the low cost constrains, we will assume that the computer has connected to it a low end black and white digital camera that produces an uncompressed TIFF image. Our solution below does not depend on the camera producing TIFF format images. Any format in which the values of individual pixels can be quickly determined is sufficient. Non-compressive cameras and no significant image processing software is in keeping with the same ``limited computational resources'' problem theme.
We assume that fruit is placed on the conveyer belt so that there is some space in-between each individual fruit. We will assume that there is some mechanism, perhaps a light beam & photocell device, to detect when fruit is under the camera. We will assume that the camera is positioned such that when the fruit detector detects a price of fruit, only that fruit is inside the view of the camera.
We will further assume that the conveyer belt is a low gloss flat black color.
It should be noted that due to the speed of the belt, one has only a small amount of time to made the apple / banana decision. This constraint is in keeping with the problem condition of ``limited decision time''.
It also be noted that we do not assume any particular orientation of the fruit other than it is flat on the conveyer belt. For apples this can mean almost any position where as a banana's orientation is more restricted by gravity.
The solutionGiven a B&W TIFF image of either an apple or a banana, we will count the number of set bits (white pixels) in the image. Apples will have a lower set bit than bananas.
In the above problem scenario, when the fruit detector determines
that a piece of fruit is under the camera, a picture is
taken and a TIFF image is transfered to the computer's memory.
The transfer of the TIFF image to the computer is detected by
The program in turn counts the number of set bits and based on
a pre-determined threshold value, determines if the fruit is
an apple or a banana.
The image transfer and bit count process can be done fairly quickly. Low end camera images are typically small. Thus the transfer from the camera to computer memory as well as bit counting will not take very long.
An as example, a 413 by 413 pixel TIFF B&W image is typically less than 22k bytes in size. A low end camera takes an image in 1/60th of a second. Given the speed of most video signals, the download time is usually much less than 1/60th of a second. Thus the total camera, download and memory scan time would be no more than 1/30th of a second.
Checking the solutionWe will check the correctness by the examining images of a wide variety of apples and bananas. We will use the disting program to simulate steps 5 & 6 in the distinguishing algorithm (see figure 2).
In our test, we used a wide variety of apples and
bananas that are commonly found in major produce stores
of the the United States.
in US markets come in multiple varieties,
bananas in US markets are often of a
The quality of the fruit selected was typical of a major US produce section of a supermarket. Apples were reasonably ripe and free of spots. Bananas were somewhat green (as is the unfortunate case of most bananas in the US) with few spots.
Produce quality grade apples and bananas vary in terms of color, shape and size. To account for these differences, we selected 2 of each type of apple and banana. With the kind assistance of a local produce vendor, we selected the both largest and the smallest (by weight) of each variety used (see table 1).
Given that we cannot assume any particular orientation of the fruit other than it is flat on on the conveyer belt, we took 2 images of each fruit. In the case of apples (which are reasonably symmetric around the axis of the stem) we imaged them both on their sides as well as on their ends. In the case of the bananas, which lying flat are restricted their sizes, we imaged both sides.
The original images taken were from an 8 bit color camera,
413 by 413 pixels in size.
At the distance taken, the resolution was approximately
20 pixels per centimeter (50 pixels per inch).
Lighting was from behind the camera.
We converted the original color images (see table 1)
into B&W images by two means.
One method was by a dithering process, the other
was by a saturation process.
Color to B&W dithering was perform in a standard
fashion by the XV program using its
TIFF dithering method.
The saturation conversion process used a luminosity
The step was set to the highest luminosity level such
that very few pixels off of the flat back background
(simulated conveyer belt) registered as white pixels.
The luminosity level was set by experimentation on
the reference frame images (pictures without fruit).
The resultsThe white pixel counts as determined by the disting program are detailed in table 5 below.
The large Golden Delicious apple produced the most
number of white pixels in saturated images.
However that saturated apple image had only 78.2% of the white pixels of
the smallest banana.
The large Golden Delicious also produced the most
number of white pixels in dithered images.
However that dithered apple image had only 64.5% of the white pixels of
the smallest banana.
However as one can see from table 6, the
typical apple imaged was even more distinct.
bananas have between
3 and 4 times the number of white pixels
Dithering of images produced the sharpest distinction between apples and bananas. Presumably dithered images would result in fewer mis-identifications errors. But in the worst case that we observed, the apple and banana while pixel count populations were very distinct.
The time to both read the image and count the pixels was so insignificant as to not be measurable, often taking less than 10 milliseconds on a 200 Mhz Pentium or SGI R5k O2.
ConclusionsOne can distinguish between apples and bananas by scanning their black and white image against a flat black background and counting the number of white pixels. This was accomplished over a wide range of sizes and varieties. No special orientation other than the fruit lying flat was required. B&W image production by dithering was shown to produce the best results. A single image scan was required resulting in an algorithm that runs on linear or O(n) time with respect to the image size.
Landon Curt Noll
chongo (was here) /\oo/\
$Revision: 7.3 $ $Date: 2014/02/10 03:06:12 $