Multimedia Databases

CSCI 5817 – Spring 1999
Eric Scharff

Abstract

Applications that require multimedia information tend to use extremely large quantities of data. Traditional databases, which are designed for queries over vast amounts of information, would seem to be ideal for managing multimedia resources. However, because of the radically different ways in which multimedia information must be processed, simply using standard relational systems provides insufficient support. Although Object-Oriented and Object-Relational systems are superior in their ability to handle multimedia content, additional support will be necessary to address the unique needs of multimedia applications. This paper discusses some of the motivating needs of Multimedia Databases, the strengths and weaknesses of current products, and why new products will need to support more than current object-based technologies.

Introduction

As computers have become faster and their ability to store and manipulate information has grown exponentially, they continue to be taxed by information of increasing size and complexity. Multimedia applications were once available only on special-purpose or high-end computers because of their resource-intensive nature. Today, even entry-level computers can adequately handle multimedia content. Just as the ability to process media has improved dramatically, so too has the amount of available multimedia content. The ability to handle multimedia content and the belief that multimedia will enrich computer applications have created a tremendous demand for tools to help manage multimedia information.

When we think of multimedia, we often think of graphic and sound-intensive computer programs, especially educational software and video games. Electronic encyclopedias on CD-ROM often claim to be multimedia databases. Although entertainment, education, and home applications are important, they represent only one possible use in a large field of applications that could take advantage of multimedia information. Corporations are already beginning to offer interactive product catalogs, and these catalogs might feature product images, demonstrations of product use, interactive instruction manuals, and so on. Large databases of images have already proved to be useful in digital libraries for architects, graphic designers, and scholars. Law enforcement agencies apply sophisticated processing to large databases of fingerprints and pictures. News agencies (and most companies) are bombarded with information in many different forms, and keeping track of this information is difficult. In short, there is great potential for applications that can gracefully integrate information beyond traditional textual sources.

Multimedia applications require exceptionally large amounts of information. Seventy-four minutes of raw compact-disc quality sound takes up roughly 780MB. Images, sounds, animations, and movies all use significant compression to keep data sizes manageable. Because of this need for huge information spaces, one might think that traditional databases would be the perfect tools for managing multimedia information. After all, databases were designed for extremely large amounts of data and provide optimizers that work best with huge numbers of pieces of information. Therefore, it would seem quite natural to use standard database management systems to help store, retrieve, and manipulate multimedia content.

Unfortunately, multimedia information is quite different from the relational tables managed in traditional systems. Relational technology was designed to help capture the relations found in standard business data-processing applications. The model relies on complex relations that can be composed of a relatively small set of simple data types. In contrast, multimedia information is based on large complex objects with varying degrees of structure, is often heavily dependent on time, and is processed using a wide set of queries appropriate for the media. Since multimedia information is fundamentally different from relational data, many of the assumptions made by relational systems are inappropriate for multimedia applications. Although we can hopefully use our many years of expertise with relational systems in the design of multimedia databases, systems that provide effective multimedia support will likely be radically different from today’s relational systems. In order to support multimedia content effectively, new systems must provide support for the unique structure, queries, storage, and application demands of multimedia systems.

In this paper, we will first look at the current generation of multimedia databases. We will see how ad-hoc, relational, object-oriented, and object-relational systems are being used to address multimedia content. Based on the strengths and weaknesses of these tools, we will explore what the future of multimedia databases should be.

What is a Multimedia Database?

When we think of “multimedia content” in computer applications, we usually think of images, sounds, and movies. Many multimedia database products rely on these assumptions in what they support or claim to support. However, multimedia content is not just about an application that puts a movie on a computer screen. Instead, multimedia applications need to know how to store, utilize, integrate, and present diverse kinds of information.

One of the critical aspects of tasks that truly support multimedia is the use of different kinds of information. This diverse information is associated with one or more modalities in which it can be presented. For example, text may be presented visually or spoken. Typical flavors of information include text, sound, images, animations, movies, and even temporally sequenced computer programs. Multimedia applications deal with more than one of these kinds of information. In this way, “multimedia” can be understood as “multiple media,” needing to deal with diverse kinds of information.

In addition to representing and handling multiple kinds of information, it is often important to integrate diverse kinds of media. For example, a newspaper story might include text, images, and sound clips. The pieces of information are conceptually related and need to exist in conjunction with each other. Therefore, pieces of data that exist in isolation make sense in some contexts and not in others. Synthesizing the forms of information provided by the multiple media based on the needs of the user is an important part of the multimedia application.

A full-fledged multimedia database helps with both of these major goals. It must be able to handle diverse kinds of data, meaning that support for only one kind of media is probably insufficient. It should also be able to store sufficient information to help integrate the media stored within. Furthermore, to be a useful database and not simply a static repository of multimedia content, a multimedia database must also provide sophisticated mechanisms for querying, retrieving, adding, and updating data. Although some databases may deal with largely static data, the ability to reformulate information (especially for queries and presentation) is extremely important. Of course, like any other database, a multimedia database needs a mechanism to communicate with a controlling user or application. For all of these requirements, the database must provide support that is appropriate for the multimedia content. Queries, output forms, and views for multimedia content will probably be quite different from their text-based relational counterparts.

Today’s Multimedia Databases

The history of multimedia databases seems to parallel that of traditional database products. The first multimedia databases weren’t really database systems at all, but instead relied on operating system files and queries over these files. These were ad-hoc systems that served mostly as repositories. Later, hand-built systems for dealing with specific multimedia data were created, often providing a minimal level of abstraction away from the actual storage, but little else. Today, there are products that provide reusable engines (often built on top of existing database engines) that provide sophisticated management of multimedia objects. As we briefly look at multimedia products, we will see some of the unique needs that multimedia databases face.

As stated previously, many CD-ROMs of content, such as image catalogs, clip art libraries, map collections, and encyclopedias, call themselves multimedia databases. Although they probably should not be considered databases, it is important to look at these applications because they were some of the first large collections of multimedia content. In these applications, there is rarely a clear separation between the database and the presenting application. Often, the “database” is merely a collection of files on the CD-ROM, named and organized in a manner useful for the application. Sometimes there are special files used for indexing and searching, but these file formats are only understood by the application. Even if there is an internal separation in the software architecture between the user presentation and the database storage, this is rarely reflected in the packaged application, nor are there additional tools for using the information. Querying the multimedia information is usually done with a very special-purpose search facility. For example, an image repository might provide the ability to search the captions of figures. However, there is rarely an extensible query language (capable of producing queries more complicated than what the user interface provides) or even a notion of what data exists and is queryable.

Nevertheless, these early static multimedia data storage systems provide important clues about what multimedia databases should support. Even with limited querying capability, the systems provide query types appropriate for various media. For example, music databases might provide the ability to query by the kind of instrument (for a symphonic piece) or to skip to various movements in a large piece of music. Furthermore, the systems are designed to integrate information to present a coherent wealth of information to the user, largely because the pieces of information were tightly coupled (although making new associations between pieces of information might be difficult). Finally, multiple representations of data are often supported when appropriate. For example, under some circumstances a movie might be displayed in an interactive window with playback controls, and in other situations the user might prefer a filmstrip-style sequence of still frames.

One of the earliest multimedia databases that began in this style and still exists today is ImageAXS [1, 2]. ImageAXS is primarily a database for image data. It began as a personal database for organizing an individual user’s potentially large collection of images. In recent years, ImageAXS and related products have added support for additional kinds of information, large-scale servers for museums and libraries, and interactive tools that integrate with the Web. One of the important things added by ImageAXS is a flexible concept of metadata. In general, the actual raw media information often contains little information that is useful for queries. For example, with images, it is difficult to write queries that operate over raw pixel data, and it is even harder to query compressed data (such as GIF or JPEG images). Therefore, information about the media (data about the data, or metadata) is frequently associated with pieces of media. This metadata is useful for querying and organizing the data. As in earlier systems, the database does not store the media content directly, but instead relies on the file system for the actual storage of the media content.

MediaWay [3, 4] (originally called MediaDB) represents a transition from ad-hoc (or hand-crafted special-purpose) media databases to general-purpose multimedia database engines. MediaWay was one of the first systems to provide very specific support for a wide variety of different media types, specifically different media file formats such as GIF, QuickTime, BMP, and PowerPoint documents. It also provides primitives for things commonly needed in commercial content databases, such as resource cost and a complex notion of “assets.” Utilizing knowledge specific to these file formats, it can provide complex querying based on the kinds of document. As in earlier systems, users can add arbitrary kinds of metadata, but unlike other systems, users can create structured metadata for creating arbitrarily complex queries. MediaWay also provides a large number of common database features, such as check-in and check-out, concurrency, archiving, and access control. MediaWay provides an interesting mix of standard database technology and functionality appropriate for media databases.

The final class of applications that count as multimedia databases are commercial databases that have had multimedia functionality added to them. One prime example is UniSQL [5, 6], an object-relational database by Cincom. Strictly speaking, UniSQL is not a multimedia database as much as it is a standard object-relational database. However, UniSQL has tried to position itself as a multimedia database. As with all relational databases, UniSQL tries to cast complex objects into a relational schema. UniSQL handles multimedia content by providing complex object types for various kinds of media, and in the object-oriented style provides the facility to define new data types and operators appropriate for the new kinds of media. However, the actual media are stored as non-semantic large objects in the database, relying on metadata created by hand and associated with the object. UniSQL provides its own version of Object SQL for querying and returning objects, and has facilities to return multimedia objects in special ways and a visual query language to help present and formulate multimedia queries.

In a similar effort, Oracle has recently been pushing their Oracle Video Server [7] and Intermedia Server [8] as add-in packages to their main product, the Oracle server. Unlike UniSQL’s object-relational view, Oracle provides a fairly classic implementation of relational SQL that understands that certain kinds of objects are multimedia objects. Therefore, instead of focusing on the storage of objects, Oracle has focused on the scalable deployment of resource-intensive media. Their server provides streaming protocols so that potentially large media can be sent incrementally to clients. They also try to reduce network utilization and client-side resources by selectively sending pieces of content and providing broadcast facilities. Thus, Oracle provides all of the power (and limitations) of standard SQL augmented with some awareness of media. Presumably, users can create arbitrary relations about their media content and then supplement these relations with the actual media data.

Where should Multimedia Databases Go?

As we have seen, there is a wide variety of multimedia database tools available. From commercial SQL engines retrofitted for multimedia content to small ad-hoc databases, systems are trying to fill the needs of the creators of large media databases. Indeed, with the tremendous emphasis on multimedia Web content and the attention that large players like Oracle have placed on multimedia content, it should be obvious that databases that support multimedia are going to be an extremely important part of the future. However, what form these new databases will take is still an open question. Many of today’s multimedia databases do not address the needs of multimedia applications. We will explore what is unique about multimedia data and how future databases will need to support it.

Why Multimedia Content is Different

As we have already seen, multimedia content is unlike the integers, character arrays, dates, and other primitive types stored in traditional relational databases. The formats used for multimedia data are designed for efficient storage and presentation to users. For example, a sound file might be a series of amplitudes or compressed frequencies that can be used for audio playback. To a database that just stores the multimedia data as a binary large object, the bit stream of a symphony has the same form as the bit stream representing a radio talk show.

Even if the kinds of information are stored differently, there is a limit to the kinds of processing that can reasonably be done on a piece of media. For example, determining the contents of an image given an array of pixels representing the image can be an extremely complex machine vision task. The identification of objects in a scene might not be the kind of processing that a database should do when processing a query.

In short, multimedia content may hold a great deal of information (and may require a lot of storage), but there is important information that should be associated with the content rather than extracted from it at query time. Therefore, the maintenance of metadata, information about the media objects, is extremely important. One effective way to build a good multimedia database today is to create a rich relational schema that captures information about the data. Queries over this schema would then let the application identify the right pieces of media. Although SQL might support this, keeping an explicit link between the media content and the metadata about the media is extremely important. Helping to automate the creation and maintenance of metadata is critical.
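The metadata approach described above can be made concrete with a small sketch. It assumes a hypothetical two-table schema (`media` and `metadata`) and made-up file paths, and it uses Python's built-in `sqlite3` module purely for illustration. None of these names come from the products discussed in this paper. The media bytes stay outside the database, and the `path` column is the explicit link between content and metadata.

```python
import sqlite3

# Hypothetical schema: media content lives in files; the database holds
# descriptive metadata plus an explicit link (the path) to each object.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE media (
    media_id   INTEGER PRIMARY KEY,
    path       TEXT NOT NULL UNIQUE,  -- explicit link to the stored content
    media_type TEXT NOT NULL          -- e.g. 'audio' or 'image'
);
CREATE TABLE metadata (
    media_id INTEGER NOT NULL REFERENCES media(media_id),
    key      TEXT NOT NULL,
    value    TEXT NOT NULL
);
""")

def add_media(path, media_type, **attrs):
    """Register a media file and attach arbitrary key/value metadata to it."""
    cur = conn.execute(
        "INSERT INTO media (path, media_type) VALUES (?, ?)", (path, media_type)
    )
    media_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO metadata (media_id, key, value) VALUES (?, ?, ?)",
        [(media_id, k, str(v)) for k, v in attrs.items()],
    )
    return media_id

def find_media(media_type, key, value):
    """Return paths of media of the given type whose metadata matches key=value."""
    rows = conn.execute(
        "SELECT m.path FROM media m JOIN metadata d ON d.media_id = m.media_id "
        "WHERE m.media_type = ? AND d.key = ? AND d.value = ? ORDER BY m.path",
        (media_type, key, value),
    )
    return [row[0] for row in rows]

# Two audio files that look alike as raw bit streams but differ in metadata.
add_media("audio/symphony5.wav", "audio", genre="symphony", composer="Beethoven")
add_media("audio/talkshow.wav", "audio", genre="talk show")
print(find_media("audio", "genre", "symphony"))
```

The query never touches the audio data itself. It distinguishes the symphony from the talk show only because someone (or some tool) recorded that distinction as metadata.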


The complexity and richness of multimedia data make object-oriented and object-relational approaches very appealing. Many O-O and O/R vendors point to multimedia as a potentially rewarding application [9]. In fact, a good object-oriented framework for dealing with multimedia applications might be a sufficiently strong application to motivate future widespread adoption of object database technologies. Although objects look like the best way to store multimedia content today, providing a rich hierarchy and language for describing media will be an essential part of any successful multimedia database.
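As a rough illustration of what a type hierarchy for media might look like, the sketch below defines a base class with free-form metadata and two subtypes that add their own attributes and operations. The class names, fields, and methods are all hypothetical and are not taken from any of the systems cited here.

```python
from dataclasses import dataclass, field

@dataclass
class MediaObject:
    """Base type: every media object has a name and free-form metadata."""
    name: str
    metadata: dict = field(default_factory=dict)

    def presentations(self):
        """Presentation forms this kind of media supports."""
        return ["raw"]

@dataclass
class Image(MediaObject):
    width: int = 0
    height: int = 0

    def presentations(self):
        return ["full image", "thumbnail"]

    def thumbnail_size(self, max_side):
        """Scale the dimensions so the longer side equals max_side."""
        scale = max_side / max(self.width, self.height)
        return (round(self.width * scale), round(self.height * scale))

@dataclass
class Audio(MediaObject):
    duration_s: float = 0.0

    def presentations(self):
        return ["playback"]

picture = Image("sunset", {"caption": "Sunset over the bay"}, width=640, height=480)
clip = Audio("interview", duration_s=312.0)
for obj in (picture, clip):
    print(obj.name, obj.presentations())
print(picture.thumbnail_size(128))
```

Each subtype carries operations that only make sense for that medium, which is the kind of type-specific behavior a flat binary large object cannot express.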

Why Multimedia Queries are Different

Just as the contents of a multimedia database are radically different from those of a standard relational system, so too are the ways of requesting and formulating information. Merely providing the ability to query binary large objects would not be a sufficient way of supporting multimedia queries, because, as stated before, extracting information directly from raw multimedia data is not easy. This is not to say that the ability to extract useful features from multimedia data is not available. Many schemes are possible for extracting features such as the objects that compose an image, the transitions in a video, and the instrumentation and melody in audio. These features are likely to be stored as metadata, but in a flexible way so that new kinds of features can be added when necessary.

Nevertheless, querying multimedia features is very different from querying features about employee salaries or insurance tables. Many multimedia queries are based on similarities of arbitrary sets of features. Rarely are two pieces of media alike, but they might be similar. A successful multimedia database must provide the facility for encoding these similarity-based queries and for creating new query types. The ability to query by example or to query over portions of a piece of media must be supported in similar ways.

In addition to similarity and incremental queries, multimedia queries are often inherently temporal. Media such as animations, movies, and sounds involve time at the most basic level. Furthermore, multimedia information might be integrated based on time, such as linking a video with the associated closed caption text. Traditional databases (including more recent object databases) have major limitations in their ability to represent and reason about time. Since temporal representations and temporal queries are such an important part of the querying task, any future work on representing time (such as TSQL) will be very important in multimedia databases.
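The two query styles discussed in this section, similarity-based and temporal, can be sketched in a few lines. The sketch assumes that each image has already been reduced to a small numeric feature vector (the three-number vectors here are invented) and that captions are stored as `(start, end, text)` tuples in seconds. The function names are hypothetical, and Euclidean distance is only one of many possible similarity measures.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def query_by_example(example, catalog, k=2):
    """Return the names of the k catalog items most similar to `example`."""
    ranked = sorted(catalog, key=lambda name: distance(example, catalog[name]))
    return ranked[:k]

def captions_during(captions, start, end):
    """Return texts of captions whose time interval overlaps [start, end)."""
    return [
        text
        for (c_start, c_end, text) in captions
        if c_start < end and start < c_end
    ]

# Invented feature vectors standing in for extracted image features.
catalog = {
    "sunset.jpg": (0.9, 0.4, 0.1),
    "forest.jpg": (0.1, 0.8, 0.2),
    "dusk.jpg": (0.8, 0.3, 0.2),
}
print(query_by_example((0.9, 0.35, 0.1), catalog))

# Closed captions keyed by time, linked to a video by overlapping intervals.
captions = [
    (0.0, 2.5, "Good evening."),
    (2.5, 6.0, "Our top story tonight."),
    (6.0, 9.0, "Weather is next."),
]
print(captions_during(captions, 2.0, 5.0))
```

Neither query asks for an exact match. The first ranks items by closeness to an example, and the second selects by overlap in time, which is why both fit poorly into equality-based relational predicates.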

Why Multimedia Data Use is Different

Even after finding the right storage and query mechanisms, the presentation of multimedia data is also unlike that of traditional information. Traditional data is returned in the form of sets of relations that are then further processed to provide results to the users. In multimedia databases, returning such tabular data is inappropriate. The multimedia data must be presented through the output mechanism appropriate for the kind of data stored. For some kinds of objects (like images), it might make sense to have only one kind of output (display of the image). For others (like text) there might be many kinds of output (visual display, speech, and so on). The transformation of multimedia data into a form of presentation appropriate for the user is a major goal of the multimedia database. It is also the area in which all of today’s databases provide very little support. ImageAXS provides full images and sets of thumbnail images, but database support for flexibly transforming results into a usable presentation form would be a major breakthrough.

The resource requirements of multimedia content are also radically different from those of largely textual data. Multimedia storage and transmission require a great deal of communication bandwidth and processing power. The cost of playing back or transmitting an hour of video over a network is significant. To address these issues, multimedia content providers have utilized new schemes such as streaming media servers, providing various levels of quality, utilizing broadcast and multicast mechanisms, and so on. Since returning multimedia results is such a critical part of any database, multimedia databases must have an awareness of the various schemes for manipulating materials and must be able to evolve as these technologies change over time.
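Incremental delivery, one of the schemes mentioned above, amounts to handing a large media object to the client in pieces rather than as a single result. The following is a minimal sketch of that idea using a Python generator over a file-like object. The function name and chunk size are arbitrary choices for illustration, and a real streaming server would add pacing, buffering, and a network protocol on top.

```python
import io

def stream_chunks(source, chunk_size=4096):
    """Yield successive chunks from a binary file-like object until exhausted."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Stand-in for a stored media object: 10,000 zero bytes held in memory.
media = io.BytesIO(bytes(10_000))
sizes = [len(chunk) for chunk in stream_chunks(media, chunk_size=4096)]
print(sizes)
```

Because the consumer pulls one chunk at a time, playback can begin before the whole object has been read, and the memory needed at any moment is bounded by the chunk size.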

Summary

Multimedia has the potential to be an important new application for database technology. Its reliance on large complex pieces of information makes multimedia systems excellent candidates for databases that are optimized for such huge information resources. Although traditional databases provide some support for multimedia content, effectively handling multimedia remains a major challenge. Because of the fundamentally different nature of multimedia content, future databases will need to come up with more effective ways to store, query, and present multimedia data. As we explore these problems, we are likely to find new techniques helpful not only for multimedia databases but for all large dynamic information repositories.

References

[1] “ImageAXS Product Information Page,” 1999.
[2] C. Follmann, “Multimedia database to put on a Mac face,” MacWEEK, 12(26), July 13, 1998, 15.
[3] “MediaWay,” 1999.
[4] B. Phillips, “MediaWay presses access to multimedia database,” PC Week, 13(7), February 19, 1996, 39-40.
[5] “UniSQL Product Information Page,” 1999.
[6] “UniSQL announces UniSQL for Windows,” C/C++ Users Journal, 14(12), December 1996, 78-79.
[7] J. Senna, “Oracle Video Server Delivers,” InfoWorld, 20(25), June 22, 1998.
[8] “Oracle Intermedia Server,” 1999.
[9] J. Celko and J. Celko, “Debunking object-database myths,” Byte, 21(6), October 1997, 101-106.
