Saturday, January 14, 2012

Part Nine - Optimizations for the Princess

Let me tell you a thing about airplane coding. It's where I get things done. As soon as the plane has taken off and it's fine to use electronic devices again, I put on my headphones and start getting productive. Trying to cram as much work as possible into a few hours lets me work in a very focused manner.
That was also true on my flight to Barcelona. The stuff that I struggled with before was suddenly much clearer. I came up with a really good optimization for the redrawing of background graphics, which allowed the game to run at a much smoother frame rate.
Just before I went on the flight I added profiling code for the bitmap drawing routines. I needed to measure how much time is spent pushing pixels. Traditionally on the C64 this is achieved by changing the screen colors to visualize the raster time used by a function.

The initial measurement was quite sobering. It looked like this:

Drawing the flames, before optimization


The green color means it's drawing an opaque bitmap (i.e. black pixels overwrite the destination). Red means that the width of the bitmap is 2 bytes (16 pixels). These bars therefore represent the redrawing of the torches. You can see that it takes about 37 raster lines, which is about 2.3 ms.
After optimization it looked more like this:

Drawing the flames, after optimization

The time spent in redrawing the flame bitmaps is now only about 0.9ms. Naturally the same speedup also applied to all other redrawing of transitional objects, so jiggling floors, raising/lowering gates and spikes shooting out of the floor were all much smoother now.
With this thing off my mind, I was able to have a relaxed time in Barcelona. On my flight back I did add the little damage indicators that appear when a character gets hit.
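
For reference, the raster-line figures above can be converted to milliseconds with a few lines of arithmetic. This is a small Python sketch, not code from the game. It assumes a PAL C64 (63 CPU cycles per raster line at a 985,248 Hz clock), which the post does not state explicitly; the function names are made up for this example.

```python
# Assumed PAL C64 timing; an NTSC machine uses slightly different values.
PAL_CLOCK_HZ = 985_248        # CPU cycles per second
PAL_CYCLES_PER_LINE = 63      # CPU cycles per raster line


def raster_lines_to_ms(lines: float) -> float:
    """Convert a number of raster lines into milliseconds of CPU time."""
    return lines * PAL_CYCLES_PER_LINE / PAL_CLOCK_HZ * 1000.0


def ms_to_raster_lines(ms: float) -> float:
    """Convert milliseconds of CPU time into raster lines."""
    return ms / 1000.0 * PAL_CLOCK_HZ / PAL_CYCLES_PER_LINE


if __name__ == "__main__":
    print(f"37 raster lines = {raster_lines_to_ms(37):.2f} ms")
    print(f"0.9 ms = {ms_to_raster_lines(0.9):.1f} raster lines")
```

Under these assumptions, 37 raster lines come out a little under 2.4 ms, close to the rough figure quoted, and 0.9 ms corresponds to about 14 raster lines.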

I had tried to optimize this code before, but never made this much progress. The main idea this time was to specialize the code for every possible width (1, 2, 3, and 4 bytes), with each specialized routine looping over the rows. This meant that the code required a lot more memory than before, which was problematic because it needed to be in RAM. During drawing the ROM has to be turned off, as it covers the same memory area as the bitmap buffers. For non-opaque drawing, I have to read from the bitmap and then OR the new pixels on top of that.

Opaque drawing is easier; it basically boils down to this:

;----------------------------------
draw2ColumnsOpaque: 
{
        ldx #$00
        {
            lda BgImageBuffer+2,x   ; read byte from image
            ldy #$00
            sta (BitmapPtr),y       ; and store in bitmap
            inx

            lda BgImageBuffer+2,x   ; read byte from image
            ldy #$08
            sta (BitmapPtr),y       ; and store in bitmap
            inx
            
            lda BitmapPtr
            and #$07
            beq oneRowUp
            
            dec BitmapPtr
            dec BlockImageVisibleHeight
            bne _cont
            jmp endDraw

oneRowUp:               
            lda BitmapPtr               ; move one char row up
            sec
            sbc #$39
            sta BitmapPtr
            
            lda BitmapPtr+1
            sbc #$01
            sta BitmapPtr+1
        
            dec BlockImageVisibleHeight
            bne _cont
        }

        jmp endDraw
;----------------------------------
}
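
The pointer arithmetic at the end of the loop follows from the C64 bitmap layout: each 8x8 character cell occupies 8 consecutive bytes (one per pixel row), neighbouring cells are 8 bytes apart, and a full character row of 40 cells is 320 bytes. Here is a rough Python model of that stepping logic, meant only as an illustration. The function names are hypothetical, and it assumes the bitmap base is aligned to 8 bytes, as the `and #$07` test in the listing above implies.

```python
CELL_BYTES = 8                 # one byte per pixel row inside an 8x8 cell
ROW_BYTES = 40 * CELL_BYTES    # 320 bytes per character row


def step_one_pixel_row_up(ptr: int) -> int:
    """Model of the BitmapPtr update in the listing above."""
    if ptr & 0x07 == 0:
        # Top line of a cell: jump to the bottom line of the cell above,
        # i.e. subtract 320 - 7 = 313 = $0139 (the sbc #$39 / sbc #$01 pair).
        return ptr - (ROW_BYTES - (CELL_BYTES - 1))
    # Otherwise the previous pixel row is simply the previous byte.
    return ptr - 1


def column_offset(column: int) -> int:
    """Y-register offset for the n-th byte column: 0, 8, 16, 24."""
    return column * CELL_BYTES
```

This is why the listing loads Y with `#$00` and `#$08` for the two columns, and why the "one char row up" branch subtracts `$0139` rather than `$0140`.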

For transparent images (i.e. black pixels are not written) there's a lot more work necessary:

;----------------------------------
draw3ColumnsNormal:
{
        ldx #$00
        {
            ldy #$00
            lda (BitmapPtr),y       ; read from bitmap 
            ldy BgImageBuffer+2,x   ; read byte from image
            and MaskTable,y         ; get masking outline of image byte and clear covered bits in bitmap
            ora BgImageBuffer+2,x   ; now or the image bytes on top of the background
            ldy #$00
            sta (BitmapPtr),y       ; and store in bitmap
            inx

            ldy #$08
            lda (BitmapPtr),y       ; read from bitmap 
            ldy BgImageBuffer+2,x   ; read byte from image
            and MaskTable,y         ; get masking outline of image byte and clear covered bits in bitmap
            ora BgImageBuffer+2,x   ; now or the image bytes on top of the background
            ldy #$08
            sta (BitmapPtr),y       ; and store in bitmap
            inx

            ldy #$10
            lda (BitmapPtr),y       ; read from bitmap 
            ldy BgImageBuffer+2,x   ; read byte from image
            and MaskTable,y         ; get masking outline of image byte and clear covered bits in bitmap
            ora BgImageBuffer+2,x   ; now or the image bytes on top of the background
            ldy #$10
            sta (BitmapPtr),y       ; and store in bitmap
            inx

            lda BitmapPtr
            and #$07
            beq oneRowUp                

            dec BitmapPtr
            dec BlockImageVisibleHeight
            bne _cont
            jmp endDraw

oneRowUp:               
            lda BitmapPtr               ; move one char row up
            sec
            sbc #$39
            sta BitmapPtr
            
            lda BitmapPtr+1
            sbc #$01
            sta BitmapPtr+1
        
            dec BlockImageVisibleHeight
            bne _cont
        }

        jmp endDraw
;----------------------------------
}
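
The read, mask, OR sequence is easier to follow when modelled one byte at a time. The Python sketch below is an illustration, not the game's code. The contents of `MaskTable` are not shown in the post, so the table here is an assumption: it treats the data as multicolor pixels (2 bits each) and considers any non-zero pixel pair in the image byte as covering the background. `build_mask_table`, `blend_byte` and `blend_opaque` are hypothetical names.

```python
def build_mask_table() -> list[int]:
    """Assumed MaskTable: for each image byte, keep only the background
    bit pairs where the image pixel is 00 (transparent)."""
    table = []
    for image_byte in range(256):
        mask = 0
        for shift in (6, 4, 2, 0):
            if (image_byte >> shift) & 0b11 == 0:
                mask |= 0b11 << shift   # transparent pixel: keep background
        table.append(mask)
    return table


MASK_TABLE = build_mask_table()


def blend_byte(background: int, image_byte: int) -> int:
    """Equivalent of: lda bitmap / and MaskTable,y / ora image."""
    return (background & MASK_TABLE[image_byte]) | image_byte


def blend_opaque(background: int, image_byte: int) -> int:
    """Opaque drawing ignores the background entirely."""
    return image_byte
```

The AND step matters because a plain OR would mix colours: drawing pixel `01` over a background pixel `10` would give `11` instead of `01`.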

At this point (in early June 2011) I was hitting the most important milestone: Being able to play through the whole game. What's generally known as an "alpha" build. Some things didn't quite work 100% (end fight with Jaffar was messed up), some things were missing (no falling floors), and the graphics for the palace levels were mostly just dungeon tiles with different colors.

That meant that I had to tackle the next big unknown, the cutscene animation system. I had previously only found bits and pieces of this code and was unsure how much of it there was. I was hoping that it would fit into my current structure. I decided to concentrate first on the in-game cutscenes (between certain levels) and the final cutscene of the game, as those would not require the whole title screen logic, which I didn't have yet. I knew that those cutscenes must be triggered from the game code somewhere.
Luckily I had already received graphics for the princess room and all the animations of the princess and vizier from STE. But I first had to find out how to put them to use. So I was back into doing serious reverse engineering.

At the end of the mainLoop there's this end condition:

            lda NextLevel
            cmp CurrentLevel
            beq mainLoop         ; nope, level hasn't changed, keep playing
startNextLevel:
            jsr l27e5
            jmp activateNextLevelOrCutscene

The code at l27e5 is something I haven't spent much time looking at. My hunch is that it's showing the message prompting the user to insert the correct disk for the upcoming level. Not something I was interested in.
The jump to activateNextLevelOrCutscene is more interesting, not just because it's completely unnecessary, since activateNextLevelOrCutscene begins immediately after this instruction.
It calls checkForCutscene which looks at the NextLevel variable and branches to different pieces of code for levels 2, 4, 6, 8, 9 and 12. Turns out these are the levels that begin with a cutscene, showing the princess waiting for the kid in some form. For any other level the code branches immediately to initLevel.

            lda NextLevel
            sta CurrentLevel   ; the next level is now the current level
            cmp #$02
            beq l2225          ; branch to cutscene for level 2
l220e:
            cmp #$04
            beq l223a          ; branch to cutscene for level 4
l2212:
            cmp #$06
            beq l223e          ; branch to cutscene for level 6
l2216:
            cmp #$08
            beq l224a          ; branch to cutscene for level 8
l221a:
            cmp #$09
            beq l2242          ; branch to cutscene for level 9
l221e:
            cmp #$0c
            beq l2246          ; branch to cutscene for level 12
endCutscene:
            jmp initLevelImpl  ; next level has no cutscene, so just start it immediately
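
The chain of compares above amounts to a small lookup from level number to branch target. As a rough Python illustration (the labels are copied from the disassembly; the dictionary and function names are hypothetical):

```python
# Branch targets as named in the disassembly above.
CUTSCENE_ENTRY_BY_LEVEL = {
    2: "l2225",
    4: "l223a",
    6: "l223e",
    8: "l224a",
    9: "l2242",
    12: "l2246",
}


def next_level_target(next_level: int) -> str:
    """Return the label the original code would branch to for a level."""
    # Levels without an entry fall through to the plain level start.
    return CUTSCENE_ENTRY_BY_LEVEL.get(next_level, "initLevelImpl")
```

For example, `next_level_target(3)` gives `"initLevelImpl"`, matching the fall-through at `endCutscene`.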

Let's take the cutscene before level 2 as an example. This is just the generic "princess waits while standing" scene:

Just standing here, waiting for my prince to come...


l2225:
            lda #$01                     ; cutscene id
l2227:
            pha
retry:
            jsr loadCutsceneBackground   ; loads the bitmap for the princess room
            jsr l4c4b                    ; Apple II specific file error handling?
            jsr l4c4e                    ; Apple II specific file error handling?
            bne retry
l2233:
            pla
            jsr triggerCutscene          ; execute the cutscene
            jmp endCutscene              ; jumps to initLevel, continues game after cutscene

So far so good. I identified triggerCutscene to be the main entry point for all cutscenes, which gets the cutscene id passed in through the accumulator.
It looks like this:

;----------------------------------
triggerCutsceneImpl:
            pha
            jsr initCutscene
            pla
            tax
            lda CutsceneJumpTableLo,x
            sta le214
            lda CutsceneJumpTableHi,x
            sta le215
    le214 = * + 1
    le215 = * + 2
            jsr $ffff
            lda #$01
            sta CutsceneDelay
            rts
;----------------------------------
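
The routine above patches the low and high byte of the handler address into the operand of the `jsr $ffff` instruction, since the 6502 has no indirect `jsr`. The lookup itself can be modelled in Python as follows. This is a sketch only: the handler addresses are invented for illustration and the helper names are hypothetical.

```python
def make_split_table(addresses):
    """Split 16-bit addresses into separate low-byte and high-byte tables."""
    lo = [addr & 0xFF for addr in addresses]
    hi = [(addr >> 8) & 0xFF for addr in addresses]
    return lo, hi


def lookup_target(lo, hi, cutscene_id):
    """Rebuild the address that gets patched into the jsr operand."""
    return lo[cutscene_id] | (hi[cutscene_id] << 8)


# Hypothetical handler addresses, for illustration only.
HANDLERS = [0xE300, 0xE31A, 0xE340]
TABLE_LO, TABLE_HI = make_split_table(HANDLERS)
```

Keeping the low and high bytes in two separate tables lets the cutscene id be used directly as the X index, without doubling it first.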

Oh, this is as easy as it gets. A simple jump table for each cutscene. The entry for level 2 is also used for level 6 (since it's the same cutscene) and the function it points to is very short:

;----------------------------------
cutsceneBeforeLevel2And6:
            jsr getTimeLeftForCutscene ; check how much time is left in the game
            jsr initCutsceneTimers  ; initializes hour glass state and sand animation timer
            jsr triggerPrincessStandingFacingRight
            jsr saveShad
            lda #$02                ; run the cutscene loop for two frames
            jsr runCutsceneLoop
            ldx #$32                ; run for this many frames if music is disabled                
            lda #$0d                ; id of the background music, cutscene runs until music ends
            jmp cutsceneLoopWithOptionalCharacterAnim
;----------------------------------

The core of it is triggerPrincessStandingFacingRight:

;----------------------------------
triggerPrincessStandingFacingRight:
            jsr triggerPrincessStanding    ; spawn princess (looks left by default)
            lda #$00                       ; make her look right instead
            sta CharFace
            rts
;----------------------------------

;----------------------------------
triggerPrincessStanding:
            lda #$05                       ; princess uses CharID 5 (see below)
            sta CharID
            lda #$78                       ; X position
            sta CharX
            lda #$97                       ; Y position
            sta CharY
            lda #$ff                       ; looking left
            sta CharFace
            lda #$5e                       ; standing sequence
            jsr setSequence
            jmp animChar                   ; execute first frame of sequence
;----------------------------------

I'd seen this last bit of code before, when I was looking at all the places where setSequence is used. Back then I wasn't able to identify all of the sequences, because some of them just seemed broken. Turns out that those were cutscene animations, and they were in the normal sequence table already. So there's no animation sequence data loaded for the cutscene. They obviously looked weird when played with the Kid frame def list.

So basically the cutscene system is just using the normal game engine to draw the characters. They have special CharIDs as described in the PoP source code documentation.

5 = princess (in princess scenes)
6 = vizier (in princess scenes)

Since cutsceneBeforeLevel2And6 calls saveShad, the princess is actually drawn as a guard.
Just like for guards, reading from the frame def list is overridden for these characters by overrideFrameDefListForCharacter, so it's going through indexIntoCutsceneFrameDefList rather than the normal code.

After setting up the CharData for the princess and saving it to the Shad slot, the game runs runCutsceneLoop which is a little mini game loop just for the cutscenes. It handles keyboard and joystick (to be able to skip the scene), updates the character animations (much like the game itself does) and updates the screen:

;----------------------------------
updateCutsceneScreen:
            jsr drawCutsceneFrame
            jsr waitForVBlank
            jmp flipFrontAndBackBuffer
;----------------------------------

In drawCutsceneFrame we find code to draw the hour glass, the flames of the torches, the stars outside the window and the sand of the hour glass. This is all very specialized code, but it uses the existing drawing functions used by the game to blit images onto the screen.

It was relatively easy now to get this system up and running. First I replaced loadCutsceneBackground with my own code that copies a multicolor bitmap into both screen buffers and draws a rectangle for the right pillar into the mask bitmap.

The hourglass drawing I replaced with a very simple copy loop that stamps one of the hourglass images onto the normal princess room bitmap (and screen and color RAM). There are 9 different images used, depending on the current time left in the game. At the beginning the hourglass is still full and in the end it's empty. I also made two variants which are alternated between to get the sand animation going. The Apple II original had special sand animation images which were drawn on top of the hourglass, but I decided that this wasn't really necessary, so my code just redraws the whole hourglass every time. No problem.

The blinking stars outside the windows were also more easily solved on the C64. I just placed them nicely in the bitmap and then added some code to play around with the color RAM entries. Easy enough.

The flames are the same as the ones used in the level background graphics, and a lot of the drawing code is shared with the in-game code.

And after adding some special code to switch the lower half of the character sprites to use a different color I finally had the princess on the screen in her room, as expected. This made me really happy.

I now started to quickly decipher the other cutscene code, which was all very similar. I decided to do the final cutscene of the game next, because that one seemed the most complicated to me. From converting the animation images I knew that at some point it switches from using two characters (for princess and kid) to a single one for both, as they embrace.

;----------------------------------
cutsceneFinal:
            lda #$08
            sta CutsceneDelay
            lda #$01
            sta SoundEnabled
            sta MusicEnabled
            jsr triggerPrincessStandingFacingLeftBeforeTurn
            jsr saveShad
            lda #$08
            jsr runCutsceneLoop
            jsr triggerKidArriving  ; plays kid running sequence (facing left)
            jsr saveKid
            lda #$08
            jsr runCutsceneLoop
            lda #$6c
            jsr triggerShadSequence ; princess turning around and embracing kid
            lda #$05
            jsr runCutsceneLoop
            lda #$0d
            jsr triggerKidSequence  ; kid stops from run
            lda #$02
            jsr runCutsceneLoop
            lda #$00
            sta KidFrame            ; disable kid, the embrace has both characters in one image
            lda #$09
            jsr runCutsceneLoop
            lda #$0f               ; the music to play
            jsr le449              ; just animates cutscene background graphics
            jsr triggerMouseJoiningPrincessAndKid ; mouse comes in and joins the couple
            jsr saveKid
            lda #$0c
            jsr runCutsceneLoop
            lda #$65               ; stop mouse - I changed this to #$72 to make
            jsr triggerKidSequence ; the mouse rise like in the PC version
            lda #$1e
            jmp runCutsceneLoop
;----------------------------------
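
Read as data, the routine above is a short script of "trigger something, then run the loop for N frames" steps. The Python sketch below only tallies the frame counts that are visible in the listing. The step descriptions paraphrase the comments, the names are hypothetical, and the section driven by `le449` is left out because its length is not visible here.

```python
# Frame counts passed to runCutsceneLoop in cutsceneFinal, in order.
FINAL_CUTSCENE_STEPS = [
    ("princess stands, facing left", 0x08),
    ("kid arrives running", 0x08),
    ("princess turns and embraces (sequence $6c)", 0x05),
    ("kid stops from run (sequence $0d)", 0x02),
    ("kid disabled, embrace image shows both", 0x09),
    ("mouse joins the couple", 0x0C),
    ("mouse stops", 0x1E),
]


def total_fixed_frames(steps) -> int:
    """Sum the frame counts of all fixed-length steps."""
    return sum(frames for _, frames in steps)


if __name__ == "__main__":
    print(total_fixed_frames(FINAL_CUTSCENE_STEPS), "cutscene frames")
```

These are cutscene loop iterations rather than video frames; how long each one lasts on screen presumably depends on `CutsceneDelay`, which the routine sets to 8 at the start.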

As soon as I had this cutscene somewhat working (I tweaked the positioning later to make it look nicer) I captured a video and uploaded it on June 13th 2011:


I was quite satisfied after having understood how the cutscene system works. It felt like being on the home stretch. But there were still some things left to do. The intro cutscene was within my grasp now, but I also needed to come up with all the title screen code. Oh, and what about the music? We'll find out about that in the next part.

10 comments:

  1. Thanks a lot for the update, I knew that coming back here every day will pay off with an entertaining read ;-)

  2. Interesting! There's no obvious way to speed up the bitmap drawing routines any further, I'm afraid. In your source code examples, I cannot find the definition of the label "_cont". Did you omit it by chance, or is it some kind of implicit label? - Thanks to the RSS feed, I didn't have to check here every day ;)

    Replies
    1. @S.E.S.: The assembler I use has implicit labels _cont and _break for the beginning and the end of a scope (like continue and break in C/C++).

  3. Just lovely! Really enjoy reading your blogposts ;)

  4. Interesting blog. A lot of familiar decompilation tales. You can check out my work on OutRun here:
    http://reassembler.blogspot.com/

  5. Great work! I've taken a different approach to remaking NFS1 and Carmageddon - use the original assets but rewrite the code. Mostly for my own fun! http://blog.1amstudios.com/search/label/OpenC1

  6. You didn't mention the presence of TWO cutscenes for level 12, so I'm curious if you might have inadvertently omitted the rarely-seen "time is running out" cutscene that plays if you reach level 12 with less than 5 minutes left in the original game. It's a very chilling dramatic twist in the original game, yet most players seem to be totally unaware of its existence.

    In case you aren't familiar with the "time is running out" cutscene, there's a video of the PC Version of it in the forum thread at the following link, plus a short story about how it freaked us out when we first saw it during a late-night game in 1989:
    http://forum.princed.org/viewtopic.php?p=12132#p12132

    Replies
    1. Yes, of course that variant of the cutscene is there. It's actually the default cutscene for level 12. But there's a time check and if there's enough time left, then the code just branches to the normal cutscene for level 2 and 6.

    2. Ah then, good job!

      In that case, here are a few obscure Apple II features (bugs) that are so obscure that I hadn't found anyone else who knew about them until the discussion at http://forum.princed.org/viewtopic.php?p=12150#p12150

      - Level 5, if you lure the first guard down and kill him on a switch then *another* gate will open so you can grab the Life Potion before the shadow does! (See video at the linked forum thread)

      - Level 8, if you kill the first guard on the door switch then those three troublesome gates later in the level will open and stay open so you don't need to do a speed run. (Once I found it, I always used this trick because it saves a lot of hassle on level 8.)

      I used to think these were cleverly-hidden bonus features, but now that the source code has been published forum-user David discovered it's a bug...and it's very likely this bug made it into the C64 version unless you restructured the code that handles "stuck" switches.

  7. Hi MR. SID,

    This game conversion is great, thank you for your excellent work. In case you would like to optimize the game further, I can recommend a next target. I noticed that while a piece of the floor falls down and the prince moves (runs), refreshing of the screen content gets very slow, about 1-2 fps. There is a room at the start of the game where you can test it: http://kepfeltoltes.hu/view/130331/prince_slow_room_www.kepfeltoltes.hu_.png
