What is multimedia?
Chapter 1: Overview
Multimedia = too many cables?
Multimedia combines audio and visual materials to provide computerized interaction of text, sound, graphics, images, animation and video to enhance communication and to enrich its presentation.
Multimedia systems handle at least one type of “continuous media” as well as “static media”.
Multimedia applications
- Multimedia mail systems
- Teleconferencing
  - A computer equipped with a microphone, speakers, and a video camera, and placed on a multimedia network, can establish audio and video connections with other, similarly equipped machines.
  - Multi-user tools, such as group editors. A group editor allows conference participants to share documents and to edit them simultaneously.
- Videodisc applications
  - A DVD can hold 2-8 hrs of high-quality video.
- Electronic games
- Web browsers
- Multimedia presentation systems
  - An “engine” that displays, synchronizes, provides interaction with, and generally manipulates multimedia material (e.g., Macromedia Flash).
- Multimedia services
  - Interactive TV
  - Interactive shopping
  - Education
  - Medical services (telemedicine)
  - Video-on-demand
Analog vs. Digital
Two ways to process information: analog and digital.
Examples:
- vinyl LP vs. CD
- conventional radio vs. web radio
- slide rule vs. digital calculator
Analog recording
Sound is caused by a variation of air pressure: a sound wave.
An audio tape records an analog signal that represents the sound wave. An analog signal is continuous: there are an infinite number of points (values), and each value has infinite precision.
Digital recording
To record an analog signal by a computer (which has a finite amount of storage), we need to perform analog-to-digital conversion (ADC). ADC consists of three steps:
- Sampling
- Quantization
- Coding
Consider an analog signal.
[Figure: an analog signal plotted against time]
Sampling
Convert a continuous signal into discrete values by taking samples.
The original signal can be approximated by interpolation using the sampled values.
More samples ⇒ more accurate approximation.
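The sampling and interpolation steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 1 Hz sine `tone`, the choice of linear interpolation, and all function names are hypothetical and do not come from the slides.

```python
import math

def sample(signal, duration, rate):
    """Take `rate` samples per second of signal(t) over [0, duration)."""
    count = int(duration * rate)
    return [signal(i / rate) for i in range(count)]

def interpolate(samples, rate, t):
    """Approximate signal(t) by linear interpolation between neighboring samples."""
    position = t * rate
    left = int(position)
    if left >= len(samples) - 1:
        return samples[-1]
    fraction = position - left
    return samples[left] + fraction * (samples[left + 1] - samples[left])

def tone(t):
    """Hypothetical test signal: a 1 Hz sine wave."""
    return math.sin(2 * math.pi * t)

def max_interpolation_error(rate, probes=1000):
    """Largest gap between the signal and its reconstruction from samples."""
    samples = sample(tone, 1.0, rate)
    last_time = (len(samples) - 1) / rate
    errors = []
    for k in range(probes + 1):
        t = last_time * k / probes
        errors.append(abs(tone(t) - interpolate(samples, rate, t)))
    return max(errors)
```

Comparing `max_interpolation_error(10)` with `max_interpolation_error(100)` shows the error shrinking as the number of samples grows.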
Quantization
A computer cannot record the values with infinite precision. A value has to be quantized.
The signal range is divided into quantization levels (intervals). Each sample value is replaced by the nominal value of its quantization level.
The difference between a sample value and its quantization value is called the quantization error.
[Figure: a sample value, the nominal value of its quantization level, and the quantization error; a second panel shows a coarser quantization]
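A minimal sketch of uniform quantization follows. It assumes equal-width levels over a known range and takes the midpoint of each level as its nominal value; the slides do not say which point is used, so the midpoint and the function names are assumptions.

```python
def quantize(value, low, high, levels):
    """Return (level index, nominal value) for a sample in [low, high]."""
    step = (high - low) / levels
    index = int((value - low) / step)
    index = max(0, min(index, levels - 1))  # clamp samples on the range boundary
    nominal = low + (index + 0.5) * step    # assumption: midpoint of the level
    return index, nominal

def quantization_error(value, low, high, levels):
    """Difference between the sample value and its quantization value."""
    _, nominal = quantize(value, low, high, levels)
    return value - nominal
```

With the midpoint convention, the magnitude of the error never exceeds half a quantization step for any sample inside the range.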
More quantization levels give a more accurate representation.
Note that quantization does not necessarily have to be uniform or linear: some quantization levels can be wider and others narrower.
Coding
Assign a codeword to each quantization level.
An analog signal can then be represented digitally by a string of 0’s and 1’s.
Codewords of the quantization levels (figure labels): 0101 0100 0011 0010 0001 0000 1000 1001 1010 1011 1100 1101
Coded signal: 0010 0011 0011 0000 1010 1011 … …
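The codewords listed above look like a sign bit followed by a 3-bit magnitude. That reading is an assumption, not something the slides state. Under it, coding can be sketched as follows; `codeword` and `encode` are hypothetical names.

```python
def codeword(negative, magnitude, bits=4):
    """Assumed sign-magnitude code: 1 sign bit plus (bits - 1) magnitude bits."""
    if not 0 <= magnitude < 2 ** (bits - 1):
        raise ValueError("magnitude does not fit in the codeword")
    sign = "1" if negative else "0"
    return sign + format(magnitude, "0{}b".format(bits - 1))

def encode(levels, bits=4):
    """Encode a sequence of (negative, magnitude) levels as a bit string."""
    return " ".join(codeword(negative, magnitude, bits)
                    for negative, magnitude in levels)
```

For example, the levels `(False, 2), (False, 3), (False, 3), (False, 0), (True, 2), (True, 3)` encode to the same six codewords that start the coded signal on the slide.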
Why digital?
ADC introduces error:
- The waveform constructed by interpolating samples is not a verbatim copy of the original.
- Quantization introduces quantization error.
So …
- Why digital representation?
- What should the sampling frequency be?
- How many quantization levels (or bits per sample) shall we use?
Example: for CD-DA (music CD):
- sampling frequency: 44,100 samples per second
- 16 bits per sample (i.e., 65,536 quantization levels)
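The CD-DA figures can be checked with a little arithmetic. The sketch below uses hypothetical function names and assumes two channels for stereo.

```python
def quantization_levels(bits_per_sample):
    """Number of distinct levels that b bits can label."""
    return 2 ** bits_per_sample

def bit_rate(sampling_frequency, bits_per_sample, channels):
    """Raw (uncompressed) audio bit rate in bits per second."""
    return sampling_frequency * bits_per_sample * channels

cd_levels = quantization_levels(16)
cd_bit_rate = bit_rate(44_100, 16, 2)  # assumption: 2 channels (stereo)
```

Here `cd_levels` is 65,536, and `cd_bit_rate` is 1,411,200 bits per second, or about 1.4 Mbps.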
Why digital?
- easy integration and sharing of resources (storage, transmission network).
- can be processed by a computer, e.g., encryption, watermarking, compression, digital effects, etc.
- more “reliable” storage and transmission (through error correction and replication).
- can be used and copied many times without losing quality.
[Figure: analog signal transmission through a channel vs. digital signal transmission through a channel, after which the signal is reconstructed]
Media types
- Non-temporal (discrete): do not have a time dimension, and their contents and meanings do not depend on the presentation time.
  - Text
  - Image
  - Graphics
- Temporal (continuous, isochronous): have a time dimension. They convey meanings only if “displayed” at a specific rate.
  - Video
  - Digital audio
  - Music
Text
- not visually exciting
- conveys essential and precise information
- text representation: e.g., ASCII, BIG5, GB
- storage “friendly”
- sometimes, certain information is too abstract to be captured by words
Digital images
To digitize a photo, we again perform ADC.
Sampling: pick a sample color from each “box” to record.
Finer sampling gives a clearer image.
Each sample is called a pixel (picture element).
[Figure: the same image with finer sampling yet]
The resolution of a digital camera refers to the number of samples that the camera takes. Some typical numbers:

No. of samples  Resolution    Print size
0.3 M           640 × 480     3 × 4 inches
1 M             1152 × 864    5 × 7 inches
1.2 M           1280 × 960    5 × 7 inches
2.1 M           1600 × 1200   8 × 10 inches
3.3 M           2048 × 1536   11 × 14 inches

You need at least 150 ppi (pixels per inch) for printing. For better quality, make it 300 ppi.
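The print sizes in the table follow from dividing the pixel counts by the print density. Here is a small sketch, with a hypothetical function name:

```python
def print_size_inches(width_px, height_px, ppi):
    """Largest print (width, height) in inches at the given pixels per inch."""
    return width_px / ppi, height_px / ppi

long_side, short_side = print_size_inches(1600, 1200, 150)
```

At 150 ppi a 1600 × 1200 image covers roughly 10.7 × 8 inches, which is consistent with the 8 × 10 inch entry. At 300 ppi the same image covers only half of each dimension.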
Digital cameras
A charge-coupled device (CCD) camera has a 2-dimensional array of photosites that convert light intensities into equivalent electrical charges.
Note: there are also less expensive digital cameras that use CMOS instead of CCD. CMOS sensors usually:
- are more susceptible to noise
- are less light-sensitive
- consume less power
Both technologies have advanced so much in recent years that they are now comparable.

Capturing color
There are a number of ways that a CCD camera captures color:
- Split a light beam into its red, green and blue components and use 3 separate CCD grids. The amount of charge at each photosite tells the intensity of a color component of light detected at that point.
- A more economical way is to use 1 CCD panel and cover each photosite with a different filter:
  R G B
  B R G
  G B R
  [Figure: the red, green and blue samples captured through this filter pattern]
- Bayer filter
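The single-panel approach can be sketched as follows. The code assumes that the 3 × 3 filter arrangement from the slide simply repeats across the sensor, which the slide does not state. The function names are hypothetical.

```python
# Assumption: the filter layout from the slide tiles the whole sensor.
PATTERN = ("RGB", "BRG", "GBR")
CHANNEL = {"R": 0, "G": 1, "B": 2}

def filter_color(row, col):
    """Color of the filter that covers the photosite at (row, col)."""
    return PATTERN[row % 3][col % 3]

def mosaic(image):
    """Keep, for each photosite, only the component its filter lets through.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    """
    return [[pixel[CHANNEL[filter_color(r, c)]]
             for c, pixel in enumerate(row)]
            for r, row in enumerate(image)]
```

Each photosite therefore records a single color component. The two missing components at that position have to be estimated later from neighboring photosites.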
Digital images
Two-dimensional arrays of pixels (picture elements) of varying color and intensity.
Color model: how to specify the color of a pixel (coding)?
- RGB (additive mixing): colors can be represented by numeric triplets specifying red (R), green (G), and blue (B) intensities.
- Y’CrCb (for digital images and video)
- CMY(K) (subtractive mixing, for printing)
  - Cyan (no red)
  - Magenta (no green)
  - Yellow (no blue)
  - K: black ink
Compression
- A page-sized 24-bit color image at 300 pixels per inch (ppi) takes up about 20 MB.
- Lossless and lossy compression.
- Many different standards: JPEG, GIF (Graphics Interchange Format), ...
- Example: run-length encoding.
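Run-length encoding replaces each run of identical values with the value and the length of the run. The sketch below is a generic illustration with hypothetical function names. It is not the variant used by any particular image format.

```python
def rle_encode(data):
    """Collapse runs of equal items into [(item, run length), ...]."""
    runs = []
    for item in data:
        if runs and runs[-1][0] == item:
            runs[-1] = (item, runs[-1][1] + 1)
        else:
            runs.append((item, 1))
    return runs

def rle_decode(runs):
    """Expand [(item, run length), ...] back into the original list."""
    data = []
    for item, length in runs:
        data.extend([item] * length)
    return data
```

A scan line such as `"WWWWBBBWW"` becomes three runs, and decoding restores it exactly, so the scheme is lossless. It only saves space when the data contains long runs.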
Image transformation and processing.
- E.g., morphing (one image transformed into another).
Graphics
Graphics data are represented by a geometric model + a set of graphics operations.
Geometric model
- A collection of 2D/3D geometric primitives (lines, circles, polygons, curves).
- Transformations: rotation, translation, scaling.
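The three transformations can be sketched for 2D points as follows. Rotation and scaling are taken about the origin, which is an assumption, and the function names are hypothetical.

```python
import math

def translate(point, dx, dy):
    """Move a 2D point by (dx, dy)."""
    x, y = point
    return x + dx, y + dy

def scale(point, sx, sy):
    """Scale a 2D point about the origin."""
    x, y = point
    return x * sx, y * sy

def rotate(point, degrees):
    """Rotate a 2D point counterclockwise about the origin."""
    x, y = point
    angle = math.radians(degrees)
    return (x * math.cos(angle) - y * math.sin(angle),
            x * math.sin(angle) + y * math.cos(angle))
```

Applying the same function to every vertex of a polygon transforms the whole primitive.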
3D models
- Wire frame
  [Figure: wire-frame model, taken from Planet Architecture]
Graphics operations are applied to make the scene more realistic:
- Coloring
- Shading
  - How a surface reflects light
- Lighting
  - Ambient light (from all directions)
  - Point light (inverse square law)
  - Spot light (a cone-shaped volume)
- Viewing
  - Where the camera is
- Texture mapping
  - applying an image onto a surface
- Rendering
  - converts a model + shading, lighting + viewing ... into an image.
  [Figure: rendered scene, taken from SIGGRAPH ’97]
Animation
- Eye-catching
- Good for demonstration
Analog video
A sequence of images called frames, relying on “persistence of vision”.
Originally, motion pictures were shown at an (insufficient) frame rate of 16 fps. It was found that for smooth motion, 24 fps is needed.
Attributes: frame rate, resolution, aspect ratio, interlacing, refresh rate.
Formats. Examples:
- NTSC (National Television System Committee)
- PAL (Phase Alternating Line)

Format        Frame rate  Scan lines  Aspect ratio
NTSC          30          525         4:3
PAL           25          625         4:3
HDTV (US)     30          1125        16:9
HDTV (Euro)   25          1250        16:9
MUSE (Japan)  30          1125        16:9

Aspect ratio: picture width to picture height.

Theoretically, most colors can be produced by mixing 3 primary colors (red, green, blue). An analog video camera produces 3 distinct continuous signals, one for each color component.
[Figure: the signal along each scan line, varying between the black and white levels, with horizontal blanking and vertical blanking intervals]
A video signal is applied to a TV (a CRT) to control the power of an electron beam that strikes the phosphors on the inside of the CRT surface.
With the 625/50 system (PAL), for example, a frame is displayed every 1/25 second. Unfortunately, the phosphors do not stay lit that long ⇒ flickering.
To prevent flicker, the picture has to be refreshed at least 50 times per second.
Interlacing: a frame is divided into 2 fields (odd lines and even lines); a field is displayed every 1/50 second.
[Figure: progressive vs. interlaced display of frame 1 and frame 2 along a time axis marked 0, 1/50 and 1/25]
Luminance/chrominance principle: the three primary colors can be converted into 2 parts:
- Luminance: information on the lightness of the image.
- Chrominance: information on the color of the image.
Because the human eye is not very sensitive to color information, the bandwidth of the color component is reduced before transmission.
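One common way to split RGB into luminance and chrominance is the weighting from ITU-R BT.601. The slides do not give coefficients, so the numbers below are an assumption, as are the function names. `subsample` is only a crude stand-in for bandwidth reduction: it keeps every second chrominance value.

```python
def rgb_to_ycbcr(r, g, b):
    """Split normalized RGB (0.0 to 1.0) into luminance and two chrominance parts.

    Assumption: BT.601 weights, which are not stated on the slides.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance (lightness)
    cb = 0.564 * (b - y)                    # blue-difference chrominance
    cr = 0.713 * (r - y)                    # red-difference chrominance
    return y, cb, cr

def subsample(values, factor=2):
    """Keep only every `factor`-th value (here: of a chrominance sequence)."""
    return values[::factor]
```

A gray pixel such as `(0.5, 0.5, 0.5)` has luminance 0.5 and chrominance of approximately zero, so only colored pixels carry chrominance information.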
Video storage
- video tape, e.g., VHS tape
  - about 240 scan lines of resolution
- problem: loss of quality when copying and during repeated playback, due to tape stretching and the magnetic material wearing off.

Digital video
A video can also be represented by a sequence of digital images.
- Broadcast-quality video (uncompressed): 1 sec ⇒ > 20 MB
- With lesser quality and a good compression technique, it is possible to achieve 1 sec = 1.2 Mb ⇒ the transfer rate of a (single-speed) CD-ROM ⇒ VCD
  (B: byte; b: bit)
- The data rate of a DVD movie is about 2 GB/hr.
- Compression: lossless and lossy.
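The “> 20 MB per second” figure can be reproduced with a rough calculation. It assumes a 720 × 576 frame at 25 frames per second (the 625/50 system) and an average of 16 bits per pixel, which is what 4:2:2 subsampling with 8-bit samples yields. Those parameters and the function names are assumptions made for this sketch.

```python
def uncompressed_bytes_per_second(width, height, fps, bits_per_pixel):
    """Raw video data rate in bytes per second."""
    return width * height * fps * bits_per_pixel // 8

def megabits_per_second(gigabytes_per_hour):
    """Convert GB/hr to Mbps, taking 1 GB as 10**9 bytes."""
    return gigabytes_per_hour * 1e9 * 8 / 3600 / 1e6

broadcast_rate = uncompressed_bytes_per_second(720, 576, 25, 16)
dvd_rate = megabits_per_second(2)
```

Under these assumptions `broadcast_rate` is 20,736,000 bytes per second, and a 2 GB/hr DVD movie averages about 4.4 Mbps.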
Digital video
For lossy compression, we can achieve a 50:1 or higher compression ratio.
MPEG: the Moving Picture Experts Group
- MPEG-1: 1.5 Mbps, VHS-quality video
- MPEG-2: 4-10 Mbps (digital TV)
- MPEG-4: a system which allows a scene structure to be composed of multiple different objects (video, audio, natural, synthetic)
- MPEG-7: multimedia information retrieval

Digital video formats
- CCIR 601 (4:2:2 chroma subsampling)
  - for video exchange
  - 525/60 system: 720 × 480 (take 720 samples from each active scan line)
  - 625/50 system: 720 × 576
- 4:2:0
  - for video broadcast
- SIF (Source Input Format)
  - for storage
  - 525/60 system: 352 × 240
  - 625/50 system: 352 × 288
  - 4:1:1 chroma subsampling
- CIF (Common Intermediate Format)
  - for video conferencing
  - 352 × 288 at 30 fps using 4:1:1 chroma subsampling

Digital audio
Digital audio representation
- produced by sampling a continuous signal generated by a sound source
- ADC (again)
Sampling frequency
- Nyquist’s theorem
  - Sampling rate ≥ 2 × highest signal frequency
- The human ear is sensitive to frequencies of up to about 20 kHz (cf. rat: 1k – 10k Hz; cat: 100 – 60k Hz)
- Sampling frequency ≥ 40 kHz
  - audio CD ⇒ 44.1 kHz, 16 bits per sample (⇒ 96 dB max. SNR)
  - DVD audio ⇒ 192 kHz (max), 24 bits per sample (max) (Q: what is the throughput requirement?)
  - For stereo, 2 channels.
Storage
- An hour of high-quality stereo digital audio requires > 500 MB of storage.
- A CD-ROM can store about 650 MB of data.
- A CD-DA can store about 74 minutes of audio data.
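Both the Nyquist bound and the storage estimate are simple arithmetic. A sketch with hypothetical function names, using the audio-CD parameters from the slide:

```python
def min_sampling_rate(highest_frequency_hz):
    """Nyquist bound: sample at least twice the highest signal frequency."""
    return 2 * highest_frequency_hz

def storage_bytes(sampling_rate, bits_per_sample, channels, seconds):
    """Uncompressed audio storage in bytes."""
    return sampling_rate * bits_per_sample * channels * seconds // 8

audible_bound = min_sampling_rate(20_000)
one_hour_cd_quality = storage_bytes(44_100, 16, 2, 3600)
```

For a 20 kHz limit the bound is 40 kHz, which the 44.1 kHz CD rate satisfies. One hour of stereo audio at CD quality takes 635,040,000 bytes, in line with the “> 500 MB” estimate.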
Why 16 bits per sample?
Let $b$ = number of bits per sample, $q$ = quantization step, $Q$ = number of quantization levels. We have $Q = 2^b$.
Max. signal amplitude $= (q \cdot 2^b)/2$; max. quantization noise $= q/2$.
$\mathrm{SNR} = 20\log_{10}\left(q \cdot 2^b / q\right)\ \mathrm{dB} = 20\,b\,\log_{10} 2\ \mathrm{dB} \approx 6b\ \mathrm{dB}$.
Threshold of pain / audibility threshold = 100 to 120 dB.
For the quantization noise to be inaudible, the SNR must be at least 100 dB: $6b = 100 \Rightarrow b \approx 16.7$.
[Figure: $2^b$ quantization levels of step $q$, giving a maximum amplitude of $(q \cdot 2^b)/2$]
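The SNR estimate can be evaluated numerically. This sketch uses the exact factor $20\log_{10} 2 \approx 6.02$ dB per bit rather than the rounded 6 dB, and the function names are hypothetical.

```python
import math

def max_snr_db(bits_per_sample):
    """Maximum signal-to-quantization-noise ratio: 20 * log10(2 ** b) dB."""
    return 20 * math.log10(2 ** bits_per_sample)

def min_bits_for_snr(target_db):
    """Smallest whole number of bits whose maximum SNR reaches target_db."""
    db_per_bit = 20 * math.log10(2)
    return math.ceil(target_db / db_per_bit)
```

`max_snr_db(16)` comes to about 96.3 dB, matching the 96 dB quoted for audio CDs. Reaching a full 100 dB would take 17 bits.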
Music
MIDI -- Musical Instrument Digital Interface
- Digital musical instruments send MIDI messages to a sequencer.
- The sequencer composes the music according to the messages received.
- The sequencer/synthesizer has a “palette” of sounds for each type of instrument.
[Figure: MIDI sequencer]

Challenges
Multimedia stresses all components of a computer system (data volume and time constraints).
- CPU processing power
  - fast speed for data capturing, CODEC, data enhancement
  - large amounts of data being processed in real time
- Storage and memory
  - high capacity, fast access time, high transfer rates
- System architecture
  - high bus bandwidth, efficient I/O
- Software
  - tools for retrieval and data management of continuous media data
- Operating systems
  - support for new data types, real-time scheduling, multimedia file systems, time-critical synchronization
- Networks
  - high bandwidth, low latency, low jitter

Research areas
1. fast processors
2. high-speed networks
3. large capacity storage devices
4. video & audio compression algorithms
5. graphics systems
6. human-computer interfaces
7. real-time operating systems
8. information storage and retrieval
9. hypertext & hypermedia
10. languages for scripting
11. parallel processing methods
Compression
MM systems require data compression for 3 reasons:
- the large storage requirements of MM data
- relatively slow storage devices that cannot play MM data in real time
- network bandwidth that does not allow real-time video data transmission

Some compression standards
- JPEG (Joint Photographic Experts Group)
  - Digital compression and coding of continuous-tone still images
  - Compression ratio: 15:1 (full-color still-frame applications)
- H.261 (Specialist Group on Coding for Visual Telephony)
  - Video coder/decoder for audio-visual services at p × 64 Kbps
  - Compression ratio: 100:1 to 2000:1 (video-based telecommunications)
- MPEG (Moving Picture Experts Group)
  - Coding of moving pictures and associated audio
  - Compression ratio: 200:1 (motion-intensive applications)
Multimedia networking
Many multimedia applications, such as video mail, video conferencing, and video-on-demand, require the support of a high-performance network system. In these applications, the multimedia objects are stored at a server and played back at the clients’ sites. Remote retrieval of multimedia objects has stringent time constraints.
- Delay: the amount of time it takes to transmit a data unit (e.g., a video frame) from a sender to a receiver.
- Jitter: delay variation.
[Figure: data units travelling from source to destination, illustrating delay and jitter]
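Delay and jitter can be computed from per-frame timestamps. The sketch below takes jitter to be the spread between the largest and smallest delay. That is one simple definition among several, and it is an assumption here, as are the function names and the sample timestamps.

```python
def delays(send_times, receive_times):
    """Per-unit delay: receive time minus send time."""
    return [received - sent
            for sent, received in zip(send_times, receive_times)]

def jitter(unit_delays):
    """Delay variation, taken here as the largest minus the smallest delay."""
    return max(unit_delays) - min(unit_delays)

# Hypothetical timestamps in milliseconds for four video frames.
sent = [0, 40, 80, 120]
received = [20, 65, 100, 150]
frame_delays = delays(sent, received)
```

For these timestamps the delays are 20, 25, 20 and 30 ms, so the jitter is 10 ms. A receiver typically buffers frames to absorb this variation before playback.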
Characteristics           Data transfer    Multimedia transfer
Data rate                 Low              High
Traffic pattern           Bursty           Stream-oriented
Reliability requirements  No loss          Some loss
Latency requirements      None             Low, e.g., 20 ms
Mode of communication     Point-to-point   Multipoint
Temporal relationship     None             Synchronized transmission

Multimedia information retrieval
- To retrieve a text document from the Web, we use keyword search via “Alta Vista”, for example.
- To retrieve a record from a relational database, such as Oracle, we use an SQL statement.
- How shall we formulate a query to retrieve pictures?
- What about audio? How do we describe a sound? How do we describe a song?
Multimedia software tools
Music sequencing
- Cakewalk
  - supports General MIDI
  - provides several editing views (staff, piano roll, event list) and Virtual Piano
Image and video editing
- Adobe Photoshop
  - allows layering of images, graphics and text
  - includes many graphics drawing and painting tools
  - sophisticated lighting effects, various image processing filters
- Adobe Premiere
  - provides many video and audio tracks, superimposition and virtual clips
  - supports various transitions, filters and motions for video clips
  - a reasonable desktop video editing tool
Multimedia authoring
- Microsoft PowerPoint
  - building slide show sequences
  - slides include objects of different media, e.g., sound and video
  - limited animation ability
- Macromedia Director, Flash
  - Movie metaphor (cast of bitmapped sprites, scripts, music, sound, and palettes)
  - Lingo scripting language allows more control of the presentation sequence.
  - Control of devices, e.g., VCRs and disk players.
  - ready for building interactivity using buttons, etc.