1 Introduction

Oracle8i Visual Information Retrieval is an extension to Oracle8i Enterprise Edition that provides image storage, content-based retrieval, and format conversion capabilities through an object type. The capabilities of this product encompass the storage, retrieval, and manipulation of image data managed by the Oracle8i Enterprise Edition database server. This product supports image storage using binary large objects (BLOBs) and references to image data residing externally in BFILEs or URLs.

Visual Information Retrieval is a building block for imaging applications rather than an end-user application in itself. It consists of an object type along with related methods for managing and processing image data. Some example applications are:

Online digital art galleries or museums
Real estate marketing
Document imaging
Stock photograph collections (for example, for fashion designers or architects)

These applications have certain distinct requirements and some degree of commonality. The image object type accommodates the commonality and supports extensions that address application-specific requirements. With Visual Information Retrieval, images can be managed as easily as standard attribute data.

Visual Information Retrieval supports static, two-dimensional images in Oracle databases. The images may be bitonal (black and white) document images, grayscale photographs, or color photographic images. The product provides the means to add image columns/objects to existing tables, insert and retrieve images, and convert between popular application formats. This enables database designers to extend existing application databases with images or to build new end-user image database applications. Software developers can use the basic functions provided here to build specialized image applications.

Visual Information Retrieval can also be used for face recognition in conjunction with software from Viisage Technology, Inc. Face recognition has applications ranging from security ("is this the owner of this ATM card?") to movie casting ("find the actor who looks most like this historical figure").

1.1 If You Already Understand Oracle8i interMedia Image

If you are already familiar with Oracle8i interMedia Image service, either as a component of Oracle8i interMedia or in a previous release as Oracle8 Image Cartridge, you can skim much of the conceptual information in this chapter. The base Image component and Visual Information Retrieval both let you store an image as an object in the database or as a reference to an external file or URL. Both products let you store and query on the following attributes:

Image height
Image width
Image size
File type or format (such as TIFF)
Compression type or format (such as JPEG)
Image type (such as monochrome)
MIME type

The Visual Information Retrieval object type is defined as the Image object, plus a signature attribute.

The main differences between the Image component and the Visual Information Retrieval product are that Visual Information Retrieval lets you create and use indexes and perform content-based retrieval. Content-based retrieval lets you perform queries based on intrinsic visual attributes of the image (color, structure, texture), rather than being limited to keyword searches in textual annotations or descriptions. The underlying technology was developed by Virage, Inc., a leader in content-based retrieval.

For an example of a query using content-based retrieval, consider a database containing images of many automobiles. If you want to retrieve information on all the red automobiles, you would specify an image of a red automobile for comparison and request all records where the image looks like your picture. To increase the accuracy of the query (because all the images are of automobiles and you are interested only in red ones), you specify that the greatest relative weight is to be given to the global color attribute, with no weight given to the structure and texture attributes.
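The effect of relative weights can be sketched in a few lines of Python. This is a conceptual illustration only, not the product's scoring algorithm: the `weighted_score` function, the per-attribute distance values, and the normalization by the sum of the weights are all assumptions made for the example.

```python
# Illustrative only: combine hypothetical per-attribute distances
# (0.0 = identical, larger = more different) using relative weights.
ATTRIBUTES = ("globalcolor", "localcolor", "texture", "structure")

def weighted_score(distances, weights):
    """Return the weight-normalized sum of the per-attribute distances."""
    total = sum(weights.get(name, 0.0) for name in ATTRIBUTES)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    weighted = sum(weights.get(name, 0.0) * distances[name] for name in ATTRIBUTES)
    return weighted / total

# Hypothetical distances from a red comparison automobile.
red_truck = {"globalcolor": 5.0, "localcolor": 40.0, "texture": 30.0, "structure": 60.0}
blue_sedan = {"globalcolor": 80.0, "localcolor": 70.0, "texture": 10.0, "structure": 5.0}

# All of the weight goes to global color; the other attributes are ignored.
color_only = {"globalcolor": 1.0, "localcolor": 0.0, "texture": 0.0, "structure": 0.0}

print(weighted_score(red_truck, color_only))
print(weighted_score(blue_sedan, color_only))
```

With only global color weighted, the red truck scores closer to the comparison image than the blue sedan does, even though the sedan has the more similar shape.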

For further information on content-based retrieval, including how and why to specify relative weights (importance) for different visual attributes, see Chapter 2.

1.2 Image Concepts

This section contains conceptual material about digital images. Chapter 2 contains conceptual information about content-based retrieval and using Visual Information Retrieval to build image applications or specialized image services.

1.2.1 Digital Images

Visual Information Retrieval supports two-dimensional, static, digital images stored as binary representations of real-world objects or scenes. Images may be produced by a document or photograph scanner, a video source such as a camera or video tape recorder connected to a video digitizer or frame grabber, other specialized image capture devices, or even by program algorithms. Capture devices take an analog or continuous signal, such as the light that falls onto the film in a camera, and convert it into digital values on a two-dimensional grid of data points known as pixels.
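As a rough sketch of what a capture device does, the following Python samples a continuous brightness function on a grid and quantizes each sample to an integer pixel value. The `digitize` function, the sampling at cell centers, and the 256 gray levels are assumptions chosen for illustration; real devices differ.

```python
def digitize(signal, width, height, levels=256):
    """Sample signal(x, y), with values in [0, 1], on a width x height grid
    and quantize each sample to an integer pixel value in [0, levels - 1]."""
    pixels = []
    for row in range(height):
        y = (row + 0.5) / height          # sample at the center of each cell
        line = []
        for col in range(width):
            x = (col + 0.5) / width
            value = min(max(signal(x, y), 0.0), 1.0)   # clamp to [0, 1]
            line.append(min(int(value * levels), levels - 1))
        pixels.append(line)
    return pixels

# Hypothetical continuous scene: brightness rises smoothly from left to right.
def gradient(x, y):
    return x

image = digitize(gradient, width=4, height=2)
print(image)
```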

Visual Information Retrieval provides the mechanism to integrate the storage and retrieval of images in Oracle databases using the Oracle8i Enterprise Edition database server.

1.2.2 Image Components

A digital image can be thought of as consisting of the image data (digitized bits) and attributes that describe the characteristics of the image. Image applications sometimes associate application-specific information, such as the name of the person whose image a photograph represents, with an image by storing descriptive text in an attribute or column in the database table.

The minimal attributes carried along with an image may include such things as its size (height in scan lines and width in pixels), the resolution at which it was sampled, and the number of bits per pixel in each of the colors that were sampled. The data attributes describe the image as it was produced by the capture device.

The image data (pixels) can have varying depths (bits per pixel) depending upon how the image was captured, and the image data can be organized in various ways. The organization of the image data, known as the data format, is crucial to accessing and accurately representing the image.
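The relationship between dimensions, depth, and storage can be worked through with simple arithmetic. The sketch below assumes uncompressed data with each scan line padded to a whole number of bytes; actual data formats use a variety of padding and packing rules, so treat the function as an estimate.

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Estimate the bytes needed for uncompressed pixel data,
    assuming each scan line is padded to a whole byte."""
    bits_per_line = width * bits_per_pixel
    bytes_per_line = (bits_per_line + 7) // 8   # round up to a whole byte
    return bytes_per_line * height

# A 640 x 480 color image with 24 bits per pixel.
print(raw_image_bytes(640, 480, 24))
# A bitonal (1 bit per pixel) letter-size page scanned at 300 dots per inch.
print(raw_image_bytes(2550, 3300, 1))
```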

The size of digital images (number of bytes) tends to be large compared to traditional computer objects such as numbers and text. Therefore, many compression schemes are in use that squeeze an image into fewer bytes, thus putting a smaller load on storage devices and networks. Lossless compression schemes squeeze an image in such a fashion that when it is decompressed, the resulting image is bit-for-bit identical to the original. Lossy compression schemes do not result in a bit-identical image when decompressed, but the differences may be imperceptible to the human eye, or at worst, tolerable.
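The distinction can be demonstrated with Python's standard zlib module. The lossless half uses zlib as written; the lossy half is a toy stand-in (it simply discards the low four bits of every byte before compressing) and is not how JPEG or any other real lossy scheme works.

```python
import zlib

# Stand-in for 8-bit grayscale pixel data.
original = bytes([10, 10, 10, 11, 200, 201, 201, 203] * 64)

# Lossless: decompressing returns exactly the original bytes.
packed = zlib.compress(original)
unpacked = zlib.decompress(packed)

# Toy lossy scheme: throw away the low 4 bits of each pixel, then compress.
quantized = bytes(value & 0xF0 for value in original)
lossy_packed = zlib.compress(quantized)
restored = zlib.decompress(lossy_packed)

# The restored data is close to the original but not bit-identical.
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(unpacked == original, restored == original, max_error)
```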

Visual Information Retrieval also provides a signature attribute in the image type that permits content-based retrieval. The signature is a vector containing detailed information about the visual attributes of the image. The signature is created when the image is processed by Visual Information Retrieval, and is used in all content-based queries. For more information about the signature attribute, see Section 2.2.
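To make the idea of a signature vector concrete, here is a deliberately simple stand-in: a normalized histogram of grayscale values. The actual signature format is not described in this chapter, so the `toy_signature` function and its four-bin layout are assumptions for illustration only.

```python
def toy_signature(pixels, bins=4):
    """Return a normalized histogram of 8-bit grayscale values as a vector."""
    counts = [0] * bins
    total = 0
    for row in pixels:
        for value in row:
            counts[value * bins // 256] += 1   # map 0..255 onto 0..bins-1
            total += 1
    return [count / total for count in counts]

dark_image = [[0, 10], [20, 30]]
mixed_image = [[0, 64], [128, 255]]
print(toy_signature(dark_image))
print(toy_signature(mixed_image))
```

Two images with similar overall brightness produce similar vectors, which is what makes vector comparison usable as a basis for retrieval.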

1.2.3 Interchange Formats

An image interchange format describes a well-defined organization and use of image attributes, data, and often a compression scheme, allowing different applications to create, interchange, and use images. Interchange formats are often stored in or as disk files, but may also be exchanged in a sequential fashion over a network and be referred to as a protocol. There are many application subdomains within the digital imaging world and many applications that create or use digital images within these. To assist application developers, Visual Information Retrieval supports many popular interchange formats. (See Appendix A, "File and Compression Formats".)
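Because an interchange format has a well-defined layout, many formats can be recognized from their first few bytes. The sketch below checks the widely documented leading bytes of four common formats; the `sniff_format` function is hypothetical, and the list is not a statement of which formats the product supports (see Appendix A for that).

```python
# Leading bytes that identify some common image interchange formats.
MAGIC_NUMBERS = [
    (b"II*\x00", "TIFF"),      # TIFF, little-endian byte order
    (b"MM\x00*", "TIFF"),      # TIFF, big-endian byte order
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"BM", "BMP"),
]

def sniff_format(data):
    """Return a format name based on the leading bytes, or 'unknown'."""
    for prefix, name in MAGIC_NUMBERS:
        if data.startswith(prefix):
            return name
    return "unknown"

print(sniff_format(b"II*\x00\x08\x00\x00\x00"))
print(sniff_format(b"plain text"))
```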

1.3 Object Relational Technology

The Oracle8i Enterprise Edition database server is an object relational database management system. That means that in addition to its traditional role in the safe and efficient management of relational data, it now provides support for the definition of object types including the data involved in an object and the operations that can be performed on it (methods). This powerful mechanism, well established in the object-oriented world, includes integral support for binary large objects (BLOBs) to provide the basis for adding complex objects, such as digital images, to Oracle databases.

See the following for extensive information on using BLOBs and BFILEs:

Oracle8i Application Developer's Guide - Large Objects (LOBs)
Oracle8i Concepts

1.3.1 Storing Images

Visual Information Retrieval can store digital images within the Oracle database under transactional control through the BLOB mechanism. It can also externally reference digital images stored in flat files through the BFILE mechanism, or through an HTTP server-based URL. Although external references are particularly convenient for integrating pre-existing sets of flat-file images with an Oracle database, these images will not be under transactional control.
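The practical meaning of transactional control can be shown with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a database; it does not use Oracle, BLOB locators, or BFILEs, and the table, column names, and file path are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE photos (id INTEGER PRIMARY KEY, image BLOB, external_path TEXT)"
)
conn.commit()

# Image bytes stored in the database take part in the transaction:
# rolling back removes them along with the rest of the row.
conn.execute("INSERT INTO photos (id, image) VALUES (1, ?)", (b"\x00\x01\x02",))
conn.rollback()
rows_after_rollback = conn.execute("SELECT COUNT(*) FROM photos").fetchone()[0]

# With an external reference, only the reference is stored and protected.
# The file itself can change or disappear without the database knowing.
conn.execute(
    "INSERT INTO photos (id, external_path) VALUES (2, ?)", ("/images/house.tif",)
)
conn.commit()
rows_after_commit = conn.execute("SELECT COUNT(*) FROM photos").fetchone()[0]
print(rows_after_rollback, rows_after_commit)
```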

The object relational type is known as ORDVir, and is based on the ORDImage object type, which in turn is based on the ORDSource object type. See the Oracle8i interMedia Audio, Image, and Video User's Guide and Reference for more details on ORDSource and ORDImage.

1.3.2 Querying Images

Once stored within an Oracle database, an image can be found with a traditional query that uses the alphanumeric columns (attributes) of the table to locate the row containing the image. For example, select a photograph from the Employee table where the employee name is Jane Doe. An example of content-based retrieval might be as follows: compare a picture of the person trying to get in the front door to the stored photographs in the Employee table, retrieve the most similar image, and check the access hours to determine if the door should unlock.
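The door-access scenario can be sketched as a nearest-match search followed by an ordinary attribute check. Everything here is hypothetical: the in-memory employee list, the three-element signatures, the Euclidean `distance` function, and the `max_distance` cutoff stand in for the database table and the product's own comparison.

```python
from datetime import time

# Hypothetical stored rows: a name, a toy signature, and permitted access hours.
employees = [
    {"name": "Jane Doe", "signature": [0.9, 0.1, 0.3],
     "access": (time(8, 0), time(18, 0))},
    {"name": "John Roe", "signature": [0.2, 0.8, 0.5],
     "access": (time(22, 0), time(23, 59))},
]

def distance(a, b):
    """Euclidean distance between two signatures of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def should_unlock(visitor_signature, now, max_distance=0.25):
    """Find the most similar stored photograph, then check the access hours."""
    best = min(employees, key=lambda e: distance(e["signature"], visitor_signature))
    if distance(best["signature"], visitor_signature) > max_distance:
        return False                      # nobody on file looks similar enough
    start, end = best["access"]
    return start <= now <= end

print(should_unlock([0.88, 0.12, 0.31], time(9, 30)))
print(should_unlock([0.88, 0.12, 0.31], time(20, 0)))
```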

The meaning of "similar" can be refined in nonfacial image comparisons. You can experiment with different weight values for the visual attributes of global color, local color, texture, and structure.

The collection of digital images in the database can be related to some set of attributes or keywords that describe the associated content. The image content can be described with text and numbers such as dates and identification numbers. For Oracle8i, image attributes can reside in the same table as the image object type. Alternatively, an application designer could define a composite object type that contains the Visual Information Retrieval object type along with other attributes.

1.3.3 Accessing Images

Applications access and manipulate images using SQL, PL/SQL, or Java through the ORDVir image object type. The object syntax for accessing attributes within a complex object, such as an image, is the dot notation:

variable.data_attribute

The syntax for invoking methods of a complex object, such as an image, is also the dot notation:

variable.function(parameter1, parameter2, ...)

See Oracle8i Concepts for information on this and other SQL syntax.

See Oracle8i Visual Information Retrieval Java Client User's Guide and Reference for information on the Visual Information Retrieval Java interface.

1.4 Visual Information Retrieval Methods and Operations

Visual Information Retrieval provides several functions for performing format conversion, compression, and data manipulation operations on image data. It also provides the ability to extract image properties and to compare images based on their content. This section presents a conceptual overview of the Visual Information Retrieval methods and operations.

1.4.1 Analyzing and Comparing Images

The Analyze( ) operator and analyze method examine an image and create a signature based on the global and local colors, texture, and structure of the image content. Two additional operators, VIRScore( ) and VIRSimilar( ), compare the signatures of two images to determine if the images match based on a set of user-supplied criteria.
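The difference between a scoring comparison and a yes-or-no comparison can be sketched as follows. This is not the implementation of VIRScore( ) or VIRSimilar( ): the `score` and `similar` functions, the 0 to 100 scale on which lower means more alike, and the toy signatures are assumptions made for the example.

```python
def score(sig_a, sig_b):
    """Toy score for signatures with components in [0, 1]:
    0.0 for identical signatures, up to 100.0 for maximally different ones."""
    difference = sum(abs(a - b) for a, b in zip(sig_a, sig_b))
    return 100.0 * difference / len(sig_a)

def similar(sig_a, sig_b, threshold):
    """Return 1 if the score does not exceed the threshold, otherwise 0."""
    return 1 if score(sig_a, sig_b) <= threshold else 0

catalog = {
    "sunset": [0.9, 0.4, 0.1],
    "forest": [0.1, 0.8, 0.2],
    "dusk": [0.8, 0.4, 0.2],
}
query = [0.9, 0.4, 0.1]

# The yes-or-no form works as a filter, much like a condition in a WHERE clause.
matches = [name for name, sig in catalog.items() if similar(sig, query, 10.0) == 1]
print(matches)
```

A scoring form suits ranking results, while the thresholded form suits filtering them.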

For facial recognition, use software from Viisage Technology, Inc. to generate a facial signature. Then use the Visual Information Retrieval Convert( ) operator to convert the signature to a format usable by the VIRScore( ) and VIRSimilar( ) operators.

1.4.2 Extracting Properties from Images

The setProperties( ) method is used to extract important properties from natively supported image data, including the following:

Height and width in pixels
Total size in bytes
File format (such as TIFF or BMP)
Compression format (such as JPEG or LZW)
Content format (such as monochrome or 8-bit grayscale)
MIME type

The size of the image data may be machine dependent. Importing and exporting images may require a recalculation using the setProperties( ) method to determine the current properties.

A set of functions prefaced with "get" can be used to retrieve individually stored attributes of natively supported images. For example, getHeight( ) and getWidth( ) return the image height and width in pixels. See Chapter 4 for a complete list of these functions and their syntax.
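Property extraction of this kind relies on the image header. As an illustration, the following Python reads the width, height, and bit depth from a BMP header using the standard struct module. It is a sketch of the general technique, not of setProperties( ) itself; the `bmp_properties` function and the dictionary keys are hypothetical.

```python
import struct

def bmp_properties(data):
    """Read basic properties from a BMP file header and info header."""
    if data[:2] != b"BM":
        raise ValueError("not a BMP file")
    # Width and height are signed 32-bit little-endian values at offsets 18 and 22.
    width, height = struct.unpack_from("<ii", data, 18)
    # Bits per pixel is an unsigned 16-bit value at offset 28.
    (bits_per_pixel,) = struct.unpack_from("<H", data, 28)
    return {
        "fileFormat": "BMP",
        "width": width,
        "height": abs(height),        # a negative height marks a top-down bitmap
        "bitsPerPixel": bits_per_pixel,
        "contentLength": len(data),
    }

# Build a 54-byte header for a 640 x 480, 24-bit image (no pixel data attached).
file_header = struct.pack("<2sIHHI", b"BM", 54, 0, 0, 54)
info_header = struct.pack("<IiiHHIIiiII", 40, 640, 480, 1, 24, 0, 0, 2835, 2835, 0, 0)
properties = bmp_properties(file_header + info_header)
print(properties)
```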

1.4.3 Verifying Image Properties

The checkProperties( ) method is used to verify that the properties stored in attributes of an image object match the actual image properties. This method operates on any natively supported image format.
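Conceptually, verification is a comparison between the stored attribute values and the values read from the image itself. A minimal sketch, in which the `mismatched_properties` function and the attribute names are hypothetical:

```python
def mismatched_properties(stored, actual):
    """Return the sorted names of stored attributes that differ from
    the values actually read from the image."""
    return sorted(
        name for name, value in stored.items() if actual.get(name) != value
    )

# The stored width is stale, for example after the image data was replaced.
stored = {"height": 480, "width": 640, "fileFormat": "TIFF"}
actual = {"height": 480, "width": 600, "fileFormat": "TIFF"}

problems = mismatched_properties(stored, actual)
properties_ok = not problems
print(problems, properties_ok)
```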

1.4.4 Modifying Images

The process( ) and processCopy( ) methods are used for image format conversion, compression, and basic manipulation functions including scaling and cutting. The image may be compressed using an algorithm from the set of supported image formats and compression schemes. For example, image data in the TIFF format may be compressed using Packbits, Huffman, JPEG, LZW, or one of the other supported schemes. Appendix A, "File and Compression Formats" lists the supported image formats and related compression schemes. Appendix C, "Process and ProcessCopy Operators" describes the characteristics that can be modified.
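Scaling and cutting are simple to picture on a small grid of pixel values. The sketch below is not the process( ) implementation: the `cut` and `scale` functions are hypothetical, and nearest-neighbor sampling is only one of several possible scaling techniques.

```python
def cut(pixels, left, top, width, height):
    """Return the rectangular region that starts at (left, top)."""
    return [row[left:left + width] for row in pixels[top:top + height]]

def scale(pixels, factor):
    """Scale an image by a positive factor using nearest-neighbor sampling."""
    src_height, src_width = len(pixels), len(pixels[0])
    dst_height = max(1, round(src_height * factor))
    dst_width = max(1, round(src_width * factor))
    return [
        [
            pixels[min(int(r / factor), src_height - 1)][min(int(c / factor), src_width - 1)]
            for c in range(dst_width)
        ]
        for r in range(dst_height)
    ]

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
center = cut(image, left=1, top=1, width=2, height=2)
print(center)
print(scale(center, 2))
print(scale(image, 0.5))
```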

For this release, the output of any image manipulation function must be directed to a BLOB. In-place modification is supported for BLOBs, not for BFILEs or URLs.

1.4.5 Moving Images

There are several ways to move images between systems, databases, or records:

The copy( ) and processCopy( ) methods copy an image into another ORDVir object.
The export( ), import( ), and importFrom( ) methods are useful in moving images between databases. The Convert( ) operator can be used to convert the image signature if you are moving images between different hardware platforms.
The setSource( ) method sets or alters information about an image stored externally.
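One reason binary data may need conversion between hardware platforms is byte order. The sketch below re-encodes a vector of 32-bit floating-point values from one byte order to another with the standard struct module. It assumes, purely for illustration, that a signature is a packed array of such values; the actual signature layout and the behavior of Convert( ) are not described in this chapter.

```python
import struct

def convert_signature(raw, source_order, target_order):
    """Re-encode packed 32-bit floats from one byte order to another.
    Use '<' for little-endian and '>' for big-endian."""
    if len(raw) % 4 != 0:
        raise ValueError("length must be a multiple of 4 bytes")
    count = len(raw) // 4
    values = struct.unpack(f"{source_order}{count}f", raw)
    return struct.pack(f"{target_order}{count}f", *values)

# A toy signature as written on a big-endian platform.
big_endian = struct.pack(">3f", 0.5, 0.25, 1.0)
little_endian = convert_signature(big_endian, ">", "<")
print(struct.unpack("<3f", little_endian))
```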

1.4.6 Deleting Images

The deleteContent( ) method allows you to remove the contents of the image BLOB.

Because external files are read-only, this method works only with images stored in BLOBs.

1.4.7 Setting Image Characteristics Manually

Foreign images are images not natively supported by Visual Information Retrieval. The characteristics of these files cannot be read from the image header automatically. A special version of the setProperties( ) method exists to let you set these characteristics explicitly. Note that there is no verification that you have set the characteristics appropriately.

Two methods, setMimeType( ) and setUpdateTime( ), are called automatically whenever native images are modified. You must call these methods explicitly for foreign images.
