Chapter 47
Mode X: 256-Color VGA Magic


Introducing the VGA’s Undocumented “Animation-Optimal” Mode

At a book signing for my book Zen of Code Optimization, an attractive young woman came up to me, holding my book, and said, “You’re Michael Abrash, aren’t you?” I confessed that I was, prepared to respond in an appropriately modest yet proud way to the compliments I was sure would follow. (It was my own book signing, after all.) It didn’t work out that way, though. The first thing out of her mouth was:

“‘Mode X’ is a stupid name for a graphics mode.” As my jaw started to drop, she added, “And you didn’t invent the mode, either. My husband did it before you did.”

And they say there are no groupies in programming!

Well. I never claimed that I invented the mode (which is a 320x240 256-color mode with some very special properties, as we’ll see shortly). I did discover it independently, but so did other people in the game business, some of them no doubt before I did. The difference is that all those other people held onto this powerful mode as a trade secret, while I didn’t; instead, I spread the word as broadly as I could in my column in Dr. Dobb’s Journal, on the theory that the more people knew about this mode, the more valuable it would be. And I succeeded, as evidenced by the fact that this now widely-used mode is universally known by the name I gave it in DDJ, “Mode X.” Neither do I think that’s a bad name; it’s short, catchy, and easy to remember, and it befits the mystery status of this mode, which was omitted entirely from IBM’s documentation of the VGA.


In fact, when all is said and done, Mode X is one of my favorite accomplishments. I remember reading that Charles Schulz, creator of “Peanuts,” was particularly proud of having introduced the phrase “security blanket” to the English language. I feel much the same way about Mode X; it’s now a firmly entrenched part of the computer lexicon, and how often do any of us get a chance to do that? And that’s not to mention all the excellent games that would not have been as good without Mode X.

So, in the end, I’m thoroughly pleased with Mode X; the world is a better place for it, even if it did cost me my one potential female fan. (Contrary to popular belief, the lives of computer columnists and rock stars are not, repeat, not, all that similar.) This and the following two chapters are based on the DDJ columns that started it all back in 1991, three columns that generated a tremendous amount of interest and spawned a ton of games, and about which I still regularly get letters and e-mail. Ladies and gentlemen, I give you...Mode X.

What Makes Mode X Special?

Consider the strange case of the VGA’s 320x240 256-color mode, Mode X, which is undeniably complex to program and isn’t even documented by IBM, but which is, nonetheless, perhaps the single best mode the VGA has to offer, especially for animation. We’ve seen the VGA’s undocumented 256-color modes in Chapters 31 and 32, but now it’s time to delve into the wonders of Mode X itself. (Most of the performance tips I’ll discuss for this mode also apply to the other nonstandard 256-color modes, however.)

Five features set Mode X apart from other VGA modes. First, it has a 1:1 aspect ratio, resulting in equal pixel spacing horizontally and vertically (that is, square pixels). Square pixels make for the most attractive displays, and avoid considerable programming effort that would otherwise be necessary to adjust graphics primitives and images to match the screen’s pixel spacing. (For example, with square pixels, a circle can be drawn as a circle; otherwise, it must be drawn as an ellipse that corrects for the aspect ratio, a slower and considerably more complicated process.) In contrast, mode 13H, the only documented 256-color mode, provides a nonsquare 320x200 resolution.

Second, Mode X allows page flipping, a prerequisite for the smoothest possible animation. Mode 13H does not allow page flipping, nor does mode 12H, the VGA’s high-resolution 640x480 16-color mode.

Third, Mode X allows the VGA’s plane-oriented hardware to be used to process pixels in parallel, improving performance by up to four times over mode 13H.

Fourth, like mode 13H but unlike all other VGA modes, Mode X is a byte-per-pixel mode (each pixel is controlled by one byte in display memory), eliminating the slow read-before-write and bit-masking operations often required in 16-color modes, where each byte of display memory represents more than a single pixel. In addition to cutting the number of memory accesses in half, this is important because the 486/Pentium write FIFO and the memory caching schemes used by many VGA clones speed up writes more than reads.


Fifth, unlike mode 13H, Mode X has plenty of off-screen memory free for image storage. This is particularly effective in conjunction with the use of the VGA’s latches; together, the latches and the off-screen memory allow images to be copied to the screen four pixels at a time.

There’s a sixth feature of Mode X that’s not so terrific: It’s hard to program efficiently. As Chapters 23 through 30 of this book demonstrate, 16-color VGA programming can be demanding. Mode X is often as demanding as 16-color programming, and operates by a set of rules that turns everything you’ve learned in 16-color mode sideways. Programming Mode X is nothing like programming the nice, flat bitmap of mode 13H, or, for that matter, the flat, linear (albeit banked) bitmap used by 256-color SuperVGA modes. (It’s important to remember that Mode X works on all VGAs, not just SuperVGAs.) Many programmers I talk to love the flat bitmap model, and think that it’s the ideal organization for display memory because it’s so straightforward to program. Here, however, the complexity of Mode X is opportunity: opportunity for the best combination of performance and appearance the VGA has to offer. If you do 256-color programming, and especially if you use animation, you’re missing the boat if you’re not using Mode X.

Although some developers have taken advantage of Mode X, its use is certainly not universal, being entirely undocumented; only an experienced VGA programmer would have the slightest inkling that it even exists, and figuring out how to make it perform beyond the write pixel/read pixel level is no mean feat. Little other than my DDJ columns has been published about it, although John Bridges has widely distributed his code for a number of undocumented 256-color resolutions, and I’d like to acknowledge the influence of his code on the mode set routine presented in this chapter.

Given the tremendous advantages of Mode X over the documented mode 13H, I’d very much like to get it into the hands of as many developers as possible, so I’m going to spend the next few chapters exploring this odd but worthy mode. I’ll provide mode set code, delineate the bitmap organization, and show how the basic write pixel and read pixel operations work. Then, I’ll move on to the magic stuff: rectangle fills, screen clears, scrolls, image copies, pixel inversion, and, yes, polygon fills (just a different driver for the polygon code), all blurry fast; hardware raster ops; and page flipping. In the end, I’ll build a working animation program that shows many of the features of Mode X in action.

The mode set code is the logical place to begin.

Selecting 320x240 256-Color Mode

We could, if we wished, write our own mode set code for Mode X from scratch, but why bother? Instead, we’ll let the BIOS do most of the work by having it set up mode 13H, which we’ll then turn into Mode X by changing a few registers. Listing 47.1 does exactly that.

The code in Listing 47.1 has been around for some time, and the very first version had a bug that serves up an interesting lesson. The original DDJ version made images roll on IBM’s fixed-frequency VGA monitors, a problem that didn’t come to my attention until the code was in print and shipped to 100,000 readers.

The bug came about this way: The code I modified to make the Mode X mode set code used the VGA’s 28-MHz clock. Mode X should have used the 25-MHz clock, a simple matter of setting bit 2 of the Miscellaneous Output register (3C2H) to 0 instead of 1.

Alas, I neglected to change that single bit, so frames were drawn at a faster rate than they should have been; however, both of my monitors are multifrequency types, and they automatically compensated for the faster frame rate. Consequently, my clock-selection bug was invisible and innocuous, until it was distributed broadly and everybody started banging on it.

IBM makes only fixed-frequency VGA monitors, which require very specific frame rates; if they don’t get what you’ve told them to expect, the image rolls. The corrected version is the one shown here as Listing 47.1; it does select the 25-MHz clock, and works just fine on fixed-frequency monitors.

Why didn’t I catch this bug? Neither I nor a single one of my testers had a fixed-frequency monitor! This nicely illustrates how difficult it is these days to test code in all the PC-compatible environments in which it might run. The problem is particularly severe for small developers, who can’t afford to buy every model of every hardware component from every manufacturer; just imagine trying to test network-aware software in all possible configurations!

When people ask why software isn’t bulletproof; why it crashes or doesn’t coexist with certain programs; why PC clones aren’t always compatible; why, in short, the myriad irritations of using a PC exist, this is a big part of the reason. I guess that’s just the price we pay for the unfettered creativity and vast choice of the PC market.
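To make the one-bit fix concrete, here is a minimal C sketch of the clock-select field in the Miscellaneous Output register. It assumes the usual VGA layout, in which bits 3-2 select the dot clock (00b for the 25-MHz clock, 01b for the 28-MHz clock); the helper names are hypothetical and are not part of Listing 47.1.

```c
#include <stdint.h>

/* Miscellaneous Output register (written at port 3C2h): bits 3-2 select
   the dot clock. 00b is the 25-MHz clock, 01b is the 28-MHz clock. */
#define MISC_CLOCK_SELECT_MASK 0x0Cu

/* Hypothetical helper: force the 25-MHz clock, leaving other bits alone. */
static uint8_t misc_select_25mhz(uint8_t misc_value)
{
    return (uint8_t)(misc_value & (uint8_t)~MISC_CLOCK_SELECT_MASK);
}

/* Hypothetical helper: nonzero if the value selects the 28-MHz clock. */
static int misc_selects_28mhz(uint8_t misc_value)
{
    return (misc_value & MISC_CLOCK_SELECT_MASK) == 0x04u;
}
```

Under that assumption, the value 0E3h that Listing 47.1 writes has both clock-select bits clear, while the same value with bit 2 set (0E7h) would select the 28-MHz clock described in the bug story.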

LISTING 47.1 L47-1.ASM

; Mode X (320x240, 256 colors) mode set routine. Works on all VGAs.
; ****************************************************************
; * Revised 6/19/91 to select correct clock; fixes vertical roll *
; * problems on fixed-frequency (IBM 851X-type) monitors.        *
; ****************************************************************
; C near-callable as:
;       void Set320x240Mode(void);
; Tested with TASM
; Modified from public-domain mode set code by John Bridges.

SC_INDEX        equ     03c4h   ;Sequence Controller Index
CRTC_INDEX      equ     03d4h   ;CRT Controller Index
MISC_OUTPUT     equ     03c2h   ;Miscellaneous Output register
SCREEN_SEG      equ     0a000h  ;segment of display memory in mode X

        .model  small
        .data
; Index/data pairs for CRT Controller registers that differ between
; mode 13h and mode X.
CRTParms label  word
        dw      00d06h  ;vertical total
        dw      03e07h  ;overflow (bit 8 of vertical counts)
        dw      04109h  ;cell height (2 to double-scan)
        dw      0ea10h  ;v sync start
        dw      0ac11h  ;v sync end and protect cr0-cr7
        dw      0df12h  ;vertical displayed
        dw      00014h  ;turn off dword mode
        dw      0e715h  ;v blank start
        dw      00616h  ;v blank end
        dw      0e317h  ;turn on byte mode
CRT_PARM_LENGTH equ     (($-CRTParms)/2)

        .code
        public  _Set320x240Mode
_Set320x240Mode proc    near
        push    bp      ;preserve caller's stack frame
        push    si      ;preserve C register vars
        push    di      ; (don't count on BIOS preserving anything)

        mov     ax,13h  ;let the BIOS set standard 256-color
        int     10h     ; mode (320x200 linear)

        mov     dx,SC_INDEX
        mov     ax,0604h
        out     dx,ax   ;disable chain4 mode
        mov     ax,0100h
        out     dx,ax   ;synchronous reset while setting Misc Output
                        ; for safety, even though clock unchanged
        mov     dx,MISC_OUTPUT
        mov     al,0e3h
        out     dx,al   ;select 25 MHz dot clock & 60 Hz scanning rate

        mov     dx,SC_INDEX
        mov     ax,0300h
        out     dx,ax   ;undo reset (restart sequencer)

        mov     dx,CRTC_INDEX ;reprogram the CRT Controller
        mov     al,11h  ;VSync End reg contains register write
        out     dx,al   ; protect bit
        inc     dx      ;CRT Controller Data register
        in      al,dx   ;get current VSync End register setting
        and     al,7fh  ;remove write protect on various
        out     dx,al   ; CRTC registers
        dec     dx      ;CRT Controller Index
        cld
        mov     si,offset CRTParms ;point to CRT parameter table
        mov     cx,CRT_PARM_LENGTH ;# of table entries
SetCRTParmsLoop:
        lodsw           ;get the next CRT Index/Data pair
        out     dx,ax   ;set the next CRT Index/Data pair
        loop    SetCRTParmsLoop

        mov     dx,SC_INDEX
        mov     ax,0f02h
        out     dx,ax   ;enable writes to all four planes
        mov     ax,SCREEN_SEG ;now clear all display memory, 8 pixels
        mov     es,ax         ; at a time
        sub     di,di   ;point ES:DI to display memory
        sub     ax,ax   ;clear to zero-value pixels
        mov     cx,8000h ;# of words in display memory
        rep     stosw   ;clear all of display memory

        pop     di      ;restore C register vars
        pop     si
        pop     bp      ;restore caller's stack frame
        ret
_Set320x240Mode endp
        end
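As a cross-check on the CRTParms table, the following C sketch decodes the index/data pairs and derives the vertical resolution from them. It is an illustration only, not part of the original listing: the function names are hypothetical, and it assumes the standard VGA CRTC layout in which the Overflow register (index 07h) carries bits 8 and 9 of Vertical Display End in its bits 1 and 6, and the low five bits of the Maximum Scan Line register (index 09h) hold the scan-line repeat count minus one.

```c
#include <stddef.h>
#include <stdint.h>

/* Each entry is an index/data pair: low byte = CRTC register index,
   high byte = value, matching a 16-bit OUT to port 3D4h. */
static const uint16_t crt_parms[] = {
    0x0D06, 0x3E07, 0x4109, 0xEA10, 0xAC11,
    0xDF12, 0x0014, 0xE715, 0x0616, 0xE317
};

/* Return the value the table assigns to CRTC register `index`, or -1. */
static int crtc_value(uint8_t index)
{
    size_t i;
    for (i = 0; i < sizeof crt_parms / sizeof crt_parms[0]; i++) {
        if ((crt_parms[i] & 0xFF) == index)
            return crt_parms[i] >> 8;
    }
    return -1;
}

/* Scan lines displayed: Vertical Display End (12h) plus bits 8 and 9
   taken from the Overflow register (07h, bits 1 and 6), plus one. */
static int displayed_scan_lines(void)
{
    int vde = crtc_value(0x12);
    int ovf = crtc_value(0x07);
    vde |= ((ovf >> 1) & 1) << 8;
    vde |= ((ovf >> 6) & 1) << 9;
    return vde + 1;
}

/* Number of times each line is scanned: low five bits of Maximum
   Scan Line (09h) hold the count minus one. */
static int scans_per_row(void)
{
    return (crtc_value(0x09) & 0x1F) + 1;
}
```

With the table values above, the displayed scan-line count divided by the per-row scan count gives the effective vertical resolution of the mode.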

After setting up mode 13H, Listing 47.1 alters the vertical counts and timings to select 480 visible scan lines. (There’s no need to alter any horizontal values, because mode 13H and Mode X both have 320-pixel horizontal resolutions.) The Maximum Scan Line register is programmed to double scan each line (that is, repeat each scan line twice), however, so we get an effective vertical resolution of 240 scan lines. It is, in fact, possible to get 400 or 480 independent scan lines in 256-color mode, as discussed in Chapters 31 and 32; however, 400-scan-line modes lack square pixels and can’t support simultaneous off-screen memory and page flipping. Furthermore, 480-scan-line modes lack page flipping altogether, due to memory constraints.

At the same time, Listing 47.1 programs the VGA’s bitmap to a planar organization that is similar to that used by the 16-color modes, and utterly different from the linear bitmap of mode 13H. The bizarre bitmap organization of Mode X is shown in Figure 47.1. The first pixel (the pixel at the upper left corner of the screen) is controlled by the byte at offset 0 in plane 0. (The one thing that Mode X blessedly has in common with mode 13H is that each pixel is controlled by a single byte, eliminating the need to mask out individual bits of display memory.) The second pixel, immediately to the right of the first pixel, is controlled by the byte at offset 0 in plane 1. The third pixel comes from offset 0 in plane 2, and the fourth pixel from offset 0 in plane 3. Then, the fifth pixel is controlled by the byte at offset 1 in plane 0, and that cycle continues, with each group of four pixels spread across the four planes at the same address. The offset M of pixel N in display memory is M = N/4 (integer division, discarding the remainder), and the plane P of pixel N is P = N mod 4. For display memory writes, the plane is selected by setting bit P of the Map Mask register (Sequence Controller register 2) to 1 and all other bits to 0; for display memory reads, the plane is selected by setting the Read Map register (Graphics Controller register 4) to P.

It goes without saying that this is one ugly bitmap organization, requiring a lot of overhead to manipulate a single pixel. The write pixel code shown in Listing 47.2 must determine the appropriate plane and perform a 16-bit OUT to select that plane for each pixel written, and likewise for the read pixel code shown in Listing 47.3. Calculating and mapping in a plane once for each pixel written is scarcely a recipe for performance.

That’s all right, though, because most graphics software spends little time drawing individual pixels. I’ve provided the write and read pixel routines as basic primitives, and so you’ll understand how the bitmap is organized, but the building blocks of high-performance graphics software are fills, copies, and bitblts, and it’s there that Mode X shines.
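The address arithmetic described above can be sketched in a few lines of C. This is only an illustration of the formulas (offset = N/4, plane = N mod 4, Map Mask = 1 shifted left by the plane number), extended to X/Y coordinates with 80 bytes per scan line; the struct and function names are hypothetical and do not appear in the listings.

```c
#include <stdint.h>

#define SCREEN_WIDTH_BYTES 80u   /* 320 pixels / 4 pixels per address */

typedef struct {
    uint16_t offset;    /* byte offset within segment A000h */
    uint8_t  plane;     /* plane number, 0..3; also the Read Map value */
    uint8_t  map_mask;  /* value for Sequence Controller register 2 */
} ModeXPixelAddr;

/* Map (x, y) in the page starting at page_base to its Mode X location. */
static ModeXPixelAddr modex_pixel_addr(unsigned x, unsigned y,
                                       uint16_t page_base)
{
    ModeXPixelAddr a;
    a.offset   = (uint16_t)(page_base + y * SCREEN_WIDTH_BYTES + x / 4u);
    a.plane    = (uint8_t)(x & 3u);
    a.map_mask = (uint8_t)(1u << a.plane);
    return a;
}
```

For example, the fifth pixel on the top line (x = 4) lands at offset 1 in plane 0, as the text describes, and the lower right pixel (319, 239) lands at offset 19199 in plane 3.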

LISTING 47.2 L47-2.ASM

; Mode X (320x240, 256 colors) write pixel routine. Works on all VGAs.
; No clipping is performed.
; C near-callable as:
;
;    void WritePixelX(int X, int Y, unsigned int PageBase, int Color);

SC_INDEX      equ    03c4h    ;Sequence Controller Index
MAP_MASK      equ    02h      ;index in SC of Map Mask register
SCREEN_SEG    equ    0a000h   ;segment of display memory in mode X
SCREEN_WIDTH  equ    80       ;width of screen in bytes from one scan line
                              ; to the next
parms   struc
        dw      2 dup (?)     ;pushed BP and return address
X       dw      ?             ;X coordinate of pixel to draw
Y       dw      ?             ;Y coordinate of pixel to draw
PageBase dw     ?             ;base offset in display memory of page in
                              ; which to draw pixel
Color   dw      ?             ;color in which to draw pixel
parms   ends

        .model  small
        .code
        public  _WritePixelX
_WritePixelX    proc    near
        push    bp                  ;preserve caller's stack frame
        mov     bp,sp               ;point to local stack frame

        mov     ax,SCREEN_WIDTH
        mul     [bp+Y]              ;offset of pixel's scan line in page
        mov     bx,[bp+X]
        shr     bx,1
        shr     bx,1                ;X/4 = offset of pixel in scan line
        add     bx,ax               ;offset of pixel in page
        add     bx,[bp+PageBase]    ;offset of pixel in display memory
        mov     ax,SCREEN_SEG
        mov     es,ax               ;point ES:BX to the pixel's address

        mov     cl,byte ptr [bp+X]
        and     cl,011b             ;CL = pixel's plane
        mov     ax,0100h + MAP_MASK ;AL = index in SC of Map Mask reg
        shl     ah,cl               ;set only the bit for the pixel's plane to 1
        mov     dx,SC_INDEX         ;set the Map Mask to enable only the
        out     dx,ax               ; pixel's plane

        mov     al,byte ptr [bp+Color]
        mov     es:[bx],al          ;draw the pixel in the desired color

        pop     bp                  ;restore caller's stack frame
        ret
_WritePixelX    endp
        end

LISTING 47.3 L47-3.ASM

; Mode X (320x240, 256 colors) read pixel routine. Works on all VGAs.
; No clipping is performed.
; C near-callable as:
;
;    unsigned int ReadPixelX(int X, int Y, unsigned int PageBase);

GC_INDEX      equ    03ceh    ;Graphics Controller Index
READ_MAP      equ    04h      ;index in GC of the Read Map register
SCREEN_SEG    equ    0a000h   ;segment of display memory in mode X
SCREEN_WIDTH  equ    80       ;width of screen in bytes from one scan line
                              ; to the next
parms   struc
        dw      2 dup (?)     ;pushed BP and return address
X       dw      ?             ;X coordinate of pixel to read
Y       dw      ?             ;Y coordinate of pixel to read
PageBase dw     ?             ;base offset in display memory of page from
                              ; which to read pixel
parms   ends

        .model  small
        .code
        public  _ReadPixelX
_ReadPixelX     proc    near
        push    bp                  ;preserve caller's stack frame
        mov     bp,sp               ;point to local stack frame

        mov     ax,SCREEN_WIDTH
        mul     [bp+Y]              ;offset of pixel's scan line in page
        mov     bx,[bp+X]
        shr     bx,1
        shr     bx,1                ;X/4 = offset of pixel in scan line
        add     bx,ax               ;offset of pixel in page
        add     bx,[bp+PageBase]    ;offset of pixel in display memory
        mov     ax,SCREEN_SEG
        mov     es,ax               ;point ES:BX to the pixel's address

        mov     ah,byte ptr [bp+X]
        and     ah,011b             ;AH = pixel's plane
        mov     al,READ_MAP         ;AL = index in GC of the Read Map reg
        mov     dx,GC_INDEX         ;set the Read Map to read the pixel's
        out     dx,ax               ; plane

        mov     al,es:[bx]          ;read the pixel's color
        sub     ah,ah               ;convert it to an unsigned int

        pop     bp                  ;restore caller's stack frame
        ret
_ReadPixelX     endp
        end
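For readers who want to experiment without VGA hardware, the logic of Listings 47.2 and 47.3 can be modeled in portable C. The sketch below is a simulation under stated assumptions, not a port of the listings: the SimVga structure, sim_vga instance, and function names are all hypothetical, the four planes are ordinary arrays, and the Map Mask and Read Map registers are plain fields rather than I/O ports.

```c
#include <stdint.h>

#define PLANE_SIZE   0x10000u   /* 64K addresses per plane */
#define SCREEN_WIDTH 80u        /* bytes from one scan line to the next */

typedef struct {
    uint8_t plane[4][PLANE_SIZE]; /* stand-in for the four VGA planes */
    uint8_t map_mask;             /* models Sequence Controller register 2 */
    uint8_t read_map;             /* models Graphics Controller register 4 */
} SimVga;

static SimVga sim_vga;            /* zero-initialized simulated adapter */

/* A CPU write goes to every plane whose Map Mask bit is 1. */
static void sim_write_byte(SimVga *v, uint16_t offset, uint8_t value)
{
    int p;
    for (p = 0; p < 4; p++)
        if (v->map_mask & (1u << p))
            v->plane[p][offset] = value;
}

/* A CPU read comes from the single plane named by the Read Map. */
static uint8_t sim_read_byte(const SimVga *v, uint16_t offset)
{
    return v->plane[v->read_map & 3u][offset];
}

static void write_pixel_x(SimVga *v, unsigned x, unsigned y,
                          uint16_t page_base, uint8_t color)
{
    uint16_t offset = (uint16_t)(page_base + y * SCREEN_WIDTH + x / 4u);
    v->map_mask = (uint8_t)(1u << (x & 3u)); /* enable only this plane */
    sim_write_byte(v, offset, color);
}

static unsigned read_pixel_x(SimVga *v, unsigned x, unsigned y,
                             uint16_t page_base)
{
    uint16_t offset = (uint16_t)(page_base + y * SCREEN_WIDTH + x / 4u);
    v->read_map = (uint8_t)(x & 3u);         /* select this pixel's plane */
    return sim_read_byte(v, offset);
}
```

Writing a pixel at (6, 2) in this model touches only plane 2 at offset 161; its neighbors at the same address in the other planes are left alone, which is exactly the per-pixel plane selection that makes these routines slow on real hardware.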

Designing from a Mode X Perspective

Listing 47.4 shows Mode X rectangle fill code. The plane is selected for each pixel in turn, with drawing cycling from plane 0 to plane 3, then wrapping back to plane 0. This is the sort of code that stems from a write-pixel line of thinking; it reflects not a whit of the unique perspective that Mode X demands, and although it looks reasonably efficient, it is in fact some of the slowest graphics code you will ever see. I’ve provided Listing 47.4 partly for illustrative purposes, but mostly so we’ll have a point of reference for the substantial speed-up that’s possible with code that’s designed from a Mode X perspective.

LISTING 47.4 L47-4.ASM

; Mode X (320x240, 256 colors) rectangle fill routine. Works on all
; VGAs. Uses slow approach that selects the plane explicitly for each
; pixel. Fills up to but not including the column at EndX and the row
; at EndY. No clipping is performed.
; C near-callable as:
;
;    void FillRectangleX(int StartX, int StartY, int EndX, int EndY,
;       unsigned int PageBase, int Color);

SC_INDEX      equ    03c4h    ;Sequence Controller Index
MAP_MASK      equ    02h      ;index in SC of Map Mask register
SCREEN_SEG    equ    0a000h   ;segment of display memory in mode X
SCREEN_WIDTH  equ    80       ;width of screen in bytes from one scan line
                              ; to the next
parms   struc
        dw      2 dup (?)     ;pushed BP and return address
StartX  dw      ?             ;X coordinate of upper left corner of rect
StartY  dw      ?             ;Y coordinate of upper left corner of rect
EndX    dw      ?             ;X coordinate of lower right corner of rect
                              ; (the column at EndX is not filled)
EndY    dw      ?             ;Y coordinate of lower right corner of rect
                              ; (the row at EndY is not filled)
PageBase dw     ?             ;base offset in display memory of page in
                              ; which to fill rectangle
Color   dw      ?             ;color in which to draw pixel
parms   ends

        .model  small
        .code
        public  _FillRectangleX
_FillRectangleX proc    near
        push    bp              ;preserve caller's stack frame
        mov     bp,sp           ;point to local stack frame
        push    si              ;preserve caller's register variables
        push    di

        mov     ax,SCREEN_WIDTH
        mul     [bp+StartY]     ;offset in page of top rectangle scan line
        mov     di,[bp+StartX]
        shr     di,1
        shr     di,1            ;X/4 = offset of first rectangle pixel in scan
                                ; line
        add     di,ax           ;offset of first rectangle pixel in page
        add     di,[bp+PageBase] ;offset of first rectangle pixel in
                                ; display memory
        mov     ax,SCREEN_SEG
        mov     es,ax           ;point ES:DI to the first rectangle pixel's
                                ; address
        mov     dx,SC_INDEX     ;set the Sequence Controller Index to
        mov     al,MAP_MASK     ; point to the Map Mask register
        out     dx,al
        inc     dx              ;point DX to the SC Data register
        mov     cl,byte ptr [bp+StartX]
        and     cl,011b         ;CL = first rectangle pixel's plane
        mov     al,01h
        shl     al,cl           ;set only the bit for the pixel's plane to 1
        mov     ah,byte ptr [bp+Color] ;color with which to fill
        mov     bx,[bp+EndY]
        sub     bx,[bp+StartY]  ;BX = height of rectangle
        jle     FillDone        ;skip if 0 or negative height
        mov     si,[bp+EndX]
        sub     si,[bp+StartX]  ;SI = width of rectangle
        jle     FillDone        ;skip if 0 or negative width
FillRowsLoop:
        push    ax              ;remember the plane mask for the left edge
        push    di              ;remember the start offset of the scan line
        mov     cx,si           ;set count of pixels in this scan line
FillScanLineLoop:
        out     dx,al           ;set the plane for this pixel
        mov     es:[di],ah      ;draw the pixel
        shl     al,1            ;adjust the plane mask for the next pixel's
        and     al,01111b       ; bit, modulo 4
        jnz     AddressSet      ;advance address if we turned over from
        inc     di              ; plane 3 to plane 0
        mov     al,00001b       ;set plane mask bit for plane 0
AddressSet:
        loop    FillScanLineLoop
        pop     di              ;retrieve the start offset of the scan line
        add     di,SCREEN_WIDTH ;point to the start of the next scan
                                ; line of the rectangle
        pop     ax              ;retrieve the plane mask for the left edge
        dec     bx              ;count down scan lines
        jnz     FillRowsLoop
FillDone:
        pop     di              ;restore caller's register variables
        pop     si
        pop     bp              ;restore caller's stack frame
        ret
_FillRectangleX endp
        end

The two major weaknesses of Listing 47.4 both result from selecting the plane on a pixel-by-pixel basis. First, endless OUTs (which are particularly slow on 386s, 486s, and Pentiums, much slower than accesses to display memory) must be performed, and, second, REP STOS can’t be used. Listing 47.5 overcomes both these problems by tailoring the fill technique to the organization of display memory. Each plane is filled in its entirety in one burst before the next plane is processed, so only five OUTs are required in all, and REP STOS can indeed be used; I’ve used REP STOSB in Listings 47.5 and 47.6. REP STOSW could be used and would improve performance on most VGAs; however, REP STOSW requires extra overhead to set up, so it can be slower for small rectangles, especially on 8-bit VGAs. Note that doing an entire plane at a time can produce a “fading-in” effect for large images, because all columns for one plane are drawn before any columns for the next. If this is a problem, the four planes can be cycled through once for each scan line, rather than once for the entire rectangle.

Listing 47.5 is 2.5 times faster than Listing 47.4 at clearing the screen on a 20-MHz cached 386 with a Paradise VGA. Although Listing 47.5 is slightly slower than an equivalent mode 13H fill routine would be, it’s not grievously so.

In general, performing plane-at-a-time operations can make almost any Mode X operation, at the worst, nearly as fast as the same operation in mode 13H (although this sort of Mode X programming is admittedly fairly complex). In this pursuit, it can help to organize data structures with Mode X in mind. For example, icons could be prearranged in system memory with the pixels organized into four plane-oriented sets (or, again, in four sets per scan line to avoid a fading-in effect) to facilitate copying to the screen a plane at a time with REP MOVS.
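The icon-prearrangement idea can be sketched in C as follows. This is a hedged illustration only: the function name is hypothetical, and it assumes the image will always be drawn with its left edge at an X coordinate that is a multiple of four, so that source column x always belongs to plane x mod 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Rearrange a linear, byte-per-pixel image into four plane-ordered
   sets: every plane-0 pixel (columns 0, 4, 8, ...) for all scan lines,
   then every plane-1 pixel, and so on. dst must hold width * height
   bytes. Each set can then be copied to the screen with one plane
   selected for the whole copy. */
static void planarize_image(const uint8_t *src, uint8_t *dst,
                            size_t width, size_t height)
{
    size_t plane, x, y;
    size_t out = 0;

    for (plane = 0; plane < 4; plane++)
        for (y = 0; y < height; y++)
            for (x = plane; x < width; x += 4)
                dst[out++] = src[y * width + x];
}
```

To get the per-scan-line variant mentioned above (which avoids the fading-in effect), the plane and scan-line loops would simply be swapped so that the four sets are emitted once per line.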

LISTING 47.5 L47-5.ASM

; Mode X (320x240, 256 colors) rectangle fill routine. Works on all
; VGAs. Uses medium-speed approach that selects each plane only once
; per rectangle; this results in a fade-in effect for large
; rectangles. Fills up to but not including the column at EndX and the
; row at EndY. No clipping is performed.
; C near-callable as:
;
;    void FillRectangleX(int StartX, int StartY, int EndX, int EndY,
;       unsigned int PageBase, int Color);

SC_INDEX      equ    03c4h    ;Sequence Controller Index
MAP_MASK      equ    02h      ;index in SC of Map Mask register
SCREEN_SEG    equ    0a000h   ;segment of display memory in mode X
SCREEN_WIDTH  equ    80       ;width of screen in bytes from one scan line
                              ; to the next
parms   struc
        dw      2 dup (?)     ;pushed BP and return address
StartX  dw      ?             ;X coordinate of upper left corner of rect
StartY  dw      ?             ;Y coordinate of upper left corner of rect
EndX    dw      ?             ;X coordinate of lower right corner of rect
                              ; (the column at EndX is not filled)
EndY    dw      ?             ;Y coordinate of lower right corner of rect
                              ; (the row at EndY is not filled)
PageBase dw     ?             ;base offset in display memory of page in
                              ; which to fill rectangle
Color   dw      ?             ;color in which to draw pixel
parms   ends

StartOffset      equ  -2      ;local storage for start offset of rectangle
Width            equ  -4      ;local storage for address width of rectangle
Height           equ  -6      ;local storage for height of rectangle
PlaneInfo        equ  -8      ;local storage for plane # and plane mask
STACK_FRAME_SIZE equ  8

        .model  small
        .code
        public  _FillRectangleX
_FillRectangleX proc    near
        push    bp              ;preserve caller's stack frame
        mov     bp,sp           ;point to local stack frame
        sub     sp,STACK_FRAME_SIZE ;allocate space for local vars
        push    si              ;preserve caller's register variables
        push    di

        cld
        mov     ax,SCREEN_WIDTH
        mul     [bp+StartY]     ;offset in page of top rectangle scan line
        mov     di,[bp+StartX]
        shr     di,1
        shr     di,1            ;X/4 = offset of first rectangle pixel in scan
                                ; line
        add     di,ax           ;offset of first rectangle pixel in page
        add     di,[bp+PageBase] ;offset of first rectangle pixel in
                                ; display memory
        mov     ax,SCREEN_SEG
        mov     es,ax           ;point ES:DI to the first rectangle pixel's
                                ; address
        mov     [bp+StartOffset],di
        mov     dx,SC_INDEX     ;set the Sequence Controller Index to
        mov     al,MAP_MASK     ; point to the Map Mask register
        out     dx,al
        mov     bx,[bp+EndY]
        sub     bx,[bp+StartY]  ;BX = height of rectangle
        jle     FillDone        ;skip if 0 or negative height
        mov     [bp+Height],bx
        mov     dx,[bp+EndX]
        mov     cx,[bp+StartX]
        cmp     dx,cx
        jle     FillDone        ;skip if 0 or negative width
        dec     dx
        and     cx,not 011b
        sub     dx,cx
        shr     dx,1
        shr     dx,1
        inc     dx              ;# of addresses across rectangle to fill
        mov     [bp+Width],dx
        mov     word ptr [bp+PlaneInfo],0001h
                                ;lower byte = plane mask for plane 0,
                                ; upper byte = plane # for plane 0
FillPlanesLoop:
        mov     ax,word ptr [bp+PlaneInfo]
        mov     dx,SC_INDEX+1   ;point DX to the SC Data register
        out     dx,al           ;set the plane for this pixel
        mov     di,[bp+StartOffset] ;point ES:DI to rectangle start
        mov     dx,[bp+Width]
        mov     cl,byte ptr [bp+StartX]
        and     cl,011b         ;plane # of first pixel in initial byte
        cmp     ah,cl           ;do we draw this plane in the initial byte?
        jae     InitAddrSet     ;yes
        dec     dx              ;no, so skip the initial byte
        jz      FillLoopBottom  ;skip this plane if no pixels in it
        inc     di
InitAddrSet:
        mov     cl,byte ptr [bp+EndX]
        dec     cl
        and     cl,011b         ;plane # of last pixel in final byte
        cmp     ah,cl           ;do we draw this plane in the final byte?
        jbe     WidthSet        ;yes
        dec     dx              ;no, so skip the final byte
        jz      FillLoopBottom  ;skip this plane if no pixels in it
WidthSet:
        mov     si,SCREEN_WIDTH
        sub     si,dx           ;distance from end of one scan line to start
                                ; of next
        mov     bx,[bp+Height]  ;# of lines to fill
        mov     al,byte ptr [bp+Color] ;color with which to fill
FillRowsLoop:
        mov     cx,dx           ;# of bytes across scan line
        rep     stosb           ;fill the scan line in this plane
        add     di,si           ;point to the start of the next scan
                                ; line of the rectangle
        dec     bx              ;count down scan lines
        jnz     FillRowsLoop
FillLoopBottom:
        mov     ax,word ptr [bp+PlaneInfo]
        shl     al,1            ;set the plane bit to the next plane
        inc     ah              ;increment the plane #
        mov     word ptr [bp+PlaneInfo],ax
        cmp     ah,4            ;have we done all planes?
        jnz     FillPlanesLoop  ;continue if any more planes
FillDone:
        pop     di              ;restore caller's register variables
        pop     si
        mov     sp,bp           ;discard storage for local variables
        pop     bp              ;restore caller's stack frame
        ret
_FillRectangleX endp
        end
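The plane-at-a-time structure of Listing 47.5 can be modeled in C to see why so few plane selections are needed. The sketch below is a simplified simulation, not a translation of the listing: the planes array, the map_mask_writes counter, and the function name are hypothetical, and instead of computing a per-plane byte count for REP STOSB it simply steps through each plane's columns four pixels at a time, which touches the same addresses.

```c
#include <stdint.h>

#define SCREEN_WIDTH 80u              /* bytes per scan line */

static uint8_t  planes[4][0x10000];   /* simulated display memory */
static unsigned map_mask_writes;      /* simulated Map Mask data OUTs */

/* Fill up to but not including the column at end_x and row at end_y,
   selecting each plane once for the whole rectangle. Coordinates are
   assumed to be nonnegative and on-screen; no clipping is performed. */
static void fill_rectangle_planes(int start_x, int start_y,
                                  int end_x, int end_y,
                                  uint16_t page_base, uint8_t color)
{
    int plane;

    if (end_y - start_y <= 0 || end_x - start_x <= 0)
        return;                       /* zero or negative size */

    for (plane = 0; plane < 4; plane++) {
        /* Leftmost X at or after start_x that lives in this plane. */
        int first_x = start_x + (int)((unsigned)(plane - start_x) & 3u);
        int x, y;

        map_mask_writes++;            /* one plane select per plane */
        if (first_x >= end_x)
            continue;                 /* rectangle has no pixels here */

        for (y = start_y; y < end_y; y++)
            for (x = first_x; x < end_x; x += 4)
                planes[plane][(uint16_t)(page_base +
                    (unsigned)y * SCREEN_WIDTH + (unsigned)x / 4u)] = color;
    }
}
```

However large the rectangle, this model performs four Map Mask data writes; together with the single write that points the Sequence Controller Index at the Map Mask, that matches the five OUTs the text attributes to Listing 47.5.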

Hardware Assist from an Unexpected Quarter

Listing 47.5 illustrates the benefits of designing code from a Mode X perspective; this is the software aspect of Mode X optimization, which suffices to make Mode X about as fast as mode 13H. That alone makes Mode X an attractive mode, given its square pixels, page flipping, and offscreen memory, but superior performance would nonetheless be a pleasant addition to that list.

Superior performance is indeed possible in Mode X, although, oddly enough, it comes courtesy of the VGA’s hardware, which was never designed to be used in 256-color modes. All of the VGA’s hardware assist features are available in Mode X, although some are not particularly useful. The VGA hardware feature that’s truly the key to Mode X performance is the ability to process four planes’ worth of data in parallel; this includes both the latches and the capability to fan data out to any or all planes. For rectangular fills, we’ll just need to fan the data out to various planes, so I’ll defer a discussion of other hardware features for now. (By the way, the ALUs, bit mask, and most other VGA hardware features are also available in mode 13H, but parallel data processing is not.)

In planar modes, such as Mode X, a byte written by the CPU to display memory may actually go to anywhere between zero and four planes, as shown in Figure 47.2. Each plane for which the setting of the corresponding bit in the Map Mask register is 1 receives the CPU data, and each plane for which the corresponding bit is 0 is not modified.

Figure 47.2 Selecting planes with the Map Mask register. (A CPU write of the value 41h to offset 0 in display memory is written to offset 0 in each of the two planes enabled by the Map Mask register, planes 0 and 2; planes 1 and 3 are not altered.)

In 16-color modes, each plane contains one-quarter of each of eight pixels, with the 4 bits of each pixel spanning all four planes. Not so in Mode X. Look at Figure 47.1 again; each plane contains one pixel in its entirety, with four pixels at any given address, one per plane. Still, the Map Mask register does the same job in Mode X as

in 16-color modes; set it to 0FH (all 1-bits), and all four planes will be written to by each CPU access. Thus, it would seem that up to four pixels could be set by a single Mode X byte-sized write to display memory, potentially speeding up operations like rectangle fills by four times.

And, as it turns out, four-plane parallelism works quite nicely indeed. Listing 47.6 is yet another rectangle-fill routine, this time using the Map Mask to set up to four pixels per STOS. The only trick to Listing 47.6 is that any left or right edge that isn’t aligned to a multiple-of-four pixel column (that is, a column at which one four-pixel set ends and the next begins) must be clipped via the Map Mask register, because not all pixels at the address containing the edge are modified. Performance is as expected; Listing 47.6 is nearly ten times faster at clearing the screen than Listing 47.4 and just about four times faster than Listing 47.5, and also about four times faster than the same rectangle fill in mode 13H. Understanding the bitmap organization and display hardware of Mode X does indeed pay.

Note that the return from Mode X’s parallelism is not always 4x; some adapters lack the underlying memory bandwidth to write data that fast. However, Mode X parallel access should always be faster than mode 13H access; the only question on any given adapter is how much faster.

LISTING 47.6 L47-6.ASM
; Mode X (320x240, 256 colors) rectangle

; fill routine. Works on all
; VGAs. Uses fast approach that fans data out to up to four planes at
; once to draw up to four pixels at once. Fills up to but not
; including the column at EndX and the row at EndY. No clipping is
; performed.
; C near-callable as:
;    void FillRectangleX(int StartX, int StartY, int EndX, int EndY,
;       unsigned int PageBase, int Color);

SC_INDEX        equ     03c4h   ;Sequence Controller Index
MAP_MASK        equ     02h     ;index in SC of Map Mask register
SCREEN_SEG      equ     0a000h  ;segment of display memory in mode X
SCREEN_WIDTH    equ     80      ;width of screen in bytes from one scan line
                                ; to the next
parms   struc
        dw      2 dup (?)       ;pushed BP and return address
StartX  dw      ?               ;X coordinate of upper left corner of rect
StartY  dw      ?               ;Y coordinate of upper left corner of rect
EndX    dw      ?               ;X coordinate of lower right corner of rect
                                ; (the column at EndX is not filled)
EndY    dw      ?               ;Y coordinate of lower right corner of rect
                                ; (the row at EndY is not filled)
PageBase dw     ?               ;base offset in display memory of page in
                                ; which to fill rectangle
Color   dw      ?               ;color in which to draw pixel
parms   ends

        .model  small
        .data
; Plane masks for clipping left and right edges of rectangle.
LeftClipPlaneMask       db      00fh,00eh,00ch,008h
RightClipPlaneMask      db      00fh,001h,003h,007h
        .code
        public  _FillRectangleX
_FillRectangleX proc    near
        push    bp              ;preserve caller's stack frame
        mov     bp,sp           ;point to local stack frame
        push    si              ;preserve caller's register variables
        push    di

        cld
        mov     ax,SCREEN_WIDTH
        mul     [bp+StartY]     ;offset in page of top rectangle scan line
        mov     di,[bp+StartX]
        shr     di,1            ;X/4 = offset of first rectangle pixel in scan
        shr     di,1            ; line
        add     di,ax           ;offset of first rectangle pixel in page
        add     di,[bp+PageBase] ;offset of first rectangle pixel in
                                ; display memory
        mov     ax,SCREEN_SEG   ;point ES:DI to the first rectangle
        mov     es,ax           ; pixel's address
        mov     dx,SC_INDEX     ;set the Sequence Controller Index to
        mov     al,MAP_MASK     ; point to the Map Mask register
        out     dx,al
        inc     dx              ;point DX to the SC Data register
        mov     si,[bp+StartX]
        and     si,0003h                  ;look up left edge plane mask
        mov     bh,LeftClipPlaneMask[si]  ; to clip & put in BH
        mov     si,[bp+EndX]
        and     si,0003h                  ;look up right edge plane
        mov     bl,RightClipPlaneMask[si] ; mask to clip & put in BL

        mov     cx,[bp+EndX]    ;calculate # of addresses across rect
        mov     si,[bp+StartX]
        cmp     cx,si
        jle     FillDone        ;skip if 0 or negative width
        dec     cx
        and     si,not 011b
        sub     cx,si
        shr     cx,1
        shr     cx,1            ;# of addresses across rectangle to fill - 1
        jnz     MasksSet        ;there's more than one byte to draw
        and     bh,bl           ;there's only one byte, so combine the left
                                ; and right-edge clip masks
MasksSet:
        mov     si,[bp+EndY]
        sub     si,[bp+StartY]  ;SI = height of rectangle
        jle     FillDone        ;skip if 0 or negative height
        mov     ah,byte ptr [bp+Color] ;color with which to fill
        mov     bp,SCREEN_WIDTH ;stack frame isn't needed any more
        sub     bp,cx           ;distance from end of one scan line to start
        dec     bp              ; of next
FillRowsLoop:
        push    cx              ;remember width in addresses - 1
        mov     al,bh           ;put left-edge clip mask in AL
        out     dx,al           ;set the left-edge plane (clip) mask
        mov     al,ah           ;put color in AL
        stosb                   ;draw the left edge
        dec     cx              ;count off left edge byte
        js      FillLoopBottom  ;that's the only byte
        jz      DoRightEdge     ;there are only two bytes
        mov     al,00fh         ;middle addresses are drawn 4 pixels at a pop
        out     dx,al           ;set the middle pixel mask to no clip
        mov     al,ah           ;put color in AL
        rep     stosb           ;draw the middle addresses four pixels apiece
DoRightEdge:
        mov     al,bl           ;put right-edge clip mask in AL
        out     dx,al           ;set the right-edge plane (clip) mask
        mov     al,ah           ;put color in AL
        stosb                   ;draw the right edge
FillLoopBottom:
        add     di,bp           ;point to the start of the next scan line of
                                ; the rectangle
        pop     cx              ;retrieve width in addresses - 1
        dec     si              ;count down scan lines
        jnz     FillRowsLoop
FillDone:
        pop     di              ;restore caller's register variables
        pop     si
        pop     bp              ;restore caller's stack frame
        ret
_FillRectangleX endp
        end
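To see why the edge clipping in Listing 47.6 works without needing a VGA, the following C sketch simulates four planes and a Map Mask register in ordinary memory. It is a model under stated assumptions, not the author's code. The names planes, map_mask, planar_write, fill_rect_sim, and get_pixel are hypothetical; PageBase is taken to be 0; and coordinates are assumed to lie within a 320x240 page. Only the two mask tables and the 80-byte scan line width come from the listing.

```c
#include <assert.h>
#include <stdint.h>

#define SCREEN_WIDTH  80    /* addresses per scan line, as in Listing 47.6 */
#define SCREEN_HEIGHT 240

/* Hypothetical stand-in for display memory: four byte planes. */
static uint8_t planes[4][SCREEN_WIDTH * SCREEN_HEIGHT];
static uint8_t map_mask = 0x0F;           /* simulated Map Mask register */

/* Simulated CPU byte write: only planes whose mask bit is 1 change. */
static void planar_write(unsigned offset, uint8_t value)
{
    for (int p = 0; p < 4; p++)
        if (map_mask & (1u << p))
            planes[p][offset] = value;
}

/* Clip masks copied from Listing 47.6. */
static const uint8_t left_clip[4]  = { 0x0F, 0x0E, 0x0C, 0x08 };
static const uint8_t right_clip[4] = { 0x0F, 0x01, 0x03, 0x07 };

/* Fill columns [start_x, end_x) and rows [start_y, end_y). */
static void fill_rect_sim(int start_x, int start_y, int end_x, int end_y,
                          uint8_t color)
{
    assert(start_x >= 0 && start_y >= 0);
    assert(end_x <= SCREEN_WIDTH * 4 && end_y <= SCREEN_HEIGHT);
    if (end_x <= start_x || end_y <= start_y)
        return;                           /* zero or negative size */

    int first = start_x / 4;              /* address of the left edge */
    int last  = (end_x - 1) / 4;          /* address of the right edge */
    uint8_t lmask = left_clip[start_x & 3];
    uint8_t rmask = right_clip[end_x & 3];
    if (first == last)
        lmask &= rmask;                   /* one address: combine masks */

    for (int y = start_y; y < end_y; y++) {
        unsigned row = (unsigned)y * SCREEN_WIDTH;
        map_mask = lmask;
        planar_write(row + (unsigned)first, color);   /* left edge */
        if (last > first) {
            map_mask = 0x0F;              /* middle: four pixels per write */
            for (int a = first + 1; a < last; a++)
                planar_write(row + (unsigned)a, color);
            map_mask = rmask;
            planar_write(row + (unsigned)last, color); /* right edge */
        }
    }
}

/* Pixel x lives in plane x mod 4 at address y*80 + x/4. */
static uint8_t get_pixel(int x, int y)
{
    return planes[x & 3][(unsigned)y * SCREEN_WIDTH + (unsigned)(x / 4)];
}
```

Filling columns 1 through 6 this way takes two writes per scan line: one with mask 0EH at the first address (pixels 1 to 3) and one with mask 07H at the second (pixels 4 to 6). Setting map_mask to 05H and calling planar_write with 41H reproduces the situation in Figure 47.2, where only planes 0 and 2 receive the byte.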

Just so you can see Mode X in action, Listing 47.7 is a sample program that selects Mode X and draws a number of rectangles. Listing 47.7 links to any of the rectangle fill routines I’ve presented.

And now, I hope, you’re beginning to see why I’m so fond of Mode X. In the next chapter, we’ll continue with Mode X by exploring the wonders that the latches and parallel plane hardware can work on scrolls, copies, blits, and pattern fills.

LISTING 47.7 L47-7.C
/* Program to demonstrate mode X (320x240, 256-colors) rectangle
   fill by drawing adjacent 20x20 rectangles in successive colors from
   0 on up across and down the screen */
#include <conio.h>
#include
