That was also true on my flight to Barcelona. The stuff that I struggled with before was suddenly much clearer. I came up with a really good optimization for the redrawing of background graphics, which allowed the game to run at a much smoother frame rate.
Just before I went on the flight I added profiling code for the bitmap drawing routines. I needed to measure how much time is spent pushing pixels. Traditionally on the C64 this is achieved by changing the screen colors to visualize the raster time used by a function.
The initial measurement was quite sobering. It looked like this:
Drawing the flames, before optimization
The green color means it's drawing an opaque bitmap (i.e. black pixels overwrite the destination). Red means that the width of the bitmap is 2 bytes (16 pixels). These bars therefore represent the redrawing of the torches. You can see that it takes about 37 raster lines, which is about 2.3ms.
After optimization it looked more like this:
Drawing the flames, after optimization
The time spent in redrawing the flame bitmaps is now only about 0.9ms. Naturally the same speedup also applied to all other redrawing of transitional objects, so jiggling floors, raising/lowering gates and spikes shooting out of the floor were all much smoother now.
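The conversion between raster lines and milliseconds can be checked with a few lines of Python. This is only a sketch: the post does not say which machine the measurement was taken on, so the PAL timing constants below (63 CPU cycles per raster line, 985,248 Hz clock) are an assumption, and the function names are made up for illustration.

```python
# Assumed PAL C64 timing; an NTSC machine would give slightly different numbers.
PAL_CYCLES_PER_LINE = 63
PAL_CLOCK_HZ = 985_248


def raster_lines_to_ms(lines):
    """Convert a number of raster lines into milliseconds of CPU time."""
    return lines * PAL_CYCLES_PER_LINE / PAL_CLOCK_HZ * 1000.0


def ms_to_raster_lines(ms):
    """Convert milliseconds of CPU time into raster lines."""
    return ms / 1000.0 * PAL_CLOCK_HZ / PAL_CYCLES_PER_LINE


before_ms = raster_lines_to_ms(37)      # the 37 lines measured before optimizing
after_lines = ms_to_raster_lines(0.9)   # the 0.9ms measured afterwards
speedup = before_ms / 0.9

print(f"before: {before_ms:.2f} ms, after: {after_lines:.1f} lines, speedup: {speedup:.1f}x")
```

Under these assumptions the 37 lines come out between 2.3 and 2.4ms, the optimized routine needs roughly 14 raster lines, and the flame redraw is a bit more than 2.5 times faster.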
With this thing off my mind, I was able to have a relaxed time in Barcelona. On my flight back I added the little damage indicators that appear when a character gets hit.
I had tried to optimize this code before, but never made this much progress. The main idea now was to specialize the code for every possible width (1, 2, 3, and 4 bytes) and loop over the rows. This meant that the code required a lot more memory than before, which was problematic because it needed to be in RAM. During drawing the ROM has to be turned off, as it covers the same memory area as the bitmap buffers. For non-opaque drawing, I have to read from the bitmap and then OR the new pixels on top of that.
Opaque drawing is easier; it basically boils down to this:
;----------------------------------
draw2ColumnsOpaque:
{
    ldx #$00
    {
        lda BgImageBuffer+2,x       ; read byte from image
        ldy #$00
        sta (BitmapPtr),y           ; and store in bitmap
        inx
        lda BgImageBuffer+2,x       ; read byte from image
        ldy #$08
        sta (BitmapPtr),y           ; and store in bitmap
        inx
        lda BitmapPtr
        and #$07
        beq oneRowUp
        dec BitmapPtr
        dec BlockImageVisibleHeight
        bne _cont
        jmp endDraw
    oneRowUp:
        lda BitmapPtr               ; move one char row up
        sec
        sbc #$39
        sta BitmapPtr
        lda BitmapPtr+1
        sbc #$01
        sta BitmapPtr+1
        dec BlockImageVisibleHeight
        bne _cont
    }
    jmp endDraw
;----------------------------------
}
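The pointer arithmetic in the routine above follows from the C64 bitmap layout, where each 8x8 character cell occupies 8 consecutive bytes and a full character row of 40 cells takes 320 bytes. The Python model below is only a sketch of that address calculation, not a translation of the routine; the function names are hypothetical.

```python
COLUMN_STRIDE = 8          # bytes from one character cell to its right neighbour
BYTES_PER_CHAR_ROW = 320   # 40 cells * 8 bytes


def one_row_up(ptr):
    """Return the bitmap address one pixel row above ptr.

    Inside a cell this is simply ptr - 1. On the top row of a cell
    (low three bits are zero) we jump to the bottom row of the cell
    above: ptr - 320 + 7 = ptr - $0139, which is the 16-bit subtraction
    (sbc #$39 / sbc #$01) in the assembly.
    """
    if ptr & 0x07:
        return ptr - 1
    return ptr - (BYTES_PER_CHAR_ROW - 7)


def column_offsets(width_in_bytes):
    """Offsets loaded into Y for each column of an image this many bytes wide."""
    return [column * COLUMN_STRIDE for column in range(width_in_bytes)]


print(hex(one_row_up(0x2007)), hex(one_row_up(0x2140)), column_offsets(3))
```

The drawing loop walks upward from the bottom row of the image, which is why only the "one row up" step is needed.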
For transparent images (i.e. black pixels are not written) there's a lot more work necessary:
;----------------------------------
draw3ColumnsNormal:
{
    ldx #$00
    {
        ldy #$00
        lda (BitmapPtr),y           ; read from bitmap
        ldy BgImageBuffer+2,x       ; read byte from image
        and MaskTable,y             ; get masking outline of image byte and clear covered bits in bitmap
        ora BgImageBuffer+2,x       ; now or the image bytes on top of the background
        ldy #$00
        sta (BitmapPtr),y           ; and store in bitmap
        inx
        ldy #$08
        lda (BitmapPtr),y           ; read from bitmap
        ldy BgImageBuffer+2,x       ; read byte from image
        and MaskTable,y             ; get masking outline of image byte and clear covered bits in bitmap
        ora BgImageBuffer+2,x       ; now or the image bytes on top of the background
        ldy #$08
        sta (BitmapPtr),y           ; and store in bitmap
        inx
        ldy #$10
        lda (BitmapPtr),y           ; read from bitmap
        ldy BgImageBuffer+2,x       ; read byte from image
        and MaskTable,y             ; get masking outline of image byte and clear covered bits in bitmap
        ora BgImageBuffer+2,x       ; now or the image bytes on top of the background
        ldy #$10
        sta (BitmapPtr),y           ; and store in bitmap
        inx
        lda BitmapPtr
        and #$07
        beq oneRowUp
        dec BitmapPtr
        dec BlockImageVisibleHeight
        bne _cont
        jmp endDraw
    oneRowUp:
        lda BitmapPtr               ; move one char row up
        sec
        sbc #$39
        sta BitmapPtr
        lda BitmapPtr+1
        sbc #$01
        sta BitmapPtr+1
        dec BlockImageVisibleHeight
        bne _cont
    }
    jmp endDraw
;----------------------------------
}
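The and/ora pair in the transparent routine is a table-driven masked blit. The post does not show how MaskTable is built, so the Python sketch below makes an assumption: pixels are multicolor bit pairs and a pair of 00 (black) counts as transparent. The names `build_mask_table` and `blit_byte` are hypothetical.

```python
def build_mask_table():
    """Build a 256-entry table mapping an image byte to its background mask.

    Assumption: four 2-bit pixels per byte, value 00 is transparent.
    For every non-zero pixel both mask bits are cleared, so the
    background only survives where the image is transparent.
    """
    table = []
    for value in range(256):
        mask = 0xFF
        for shift in (0, 2, 4, 6):
            if (value >> shift) & 0b11:
                mask &= ~(0b11 << shift) & 0xFF
        table.append(mask)
    return table


MASK_TABLE = build_mask_table()


def blit_byte(background, image):
    """Combine one background byte with one image byte (and MaskTable / ora)."""
    return (background & MASK_TABLE[image]) | image


# One opaque pixel (01) lands on a background of four 10 pixels.
print(f"{blit_byte(0b10101010, 0b00010000):08b}")
```

Precomputing the mask per image byte trades 256 bytes of table for not having to store a separate mask bitmap alongside every image.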
At this point (in early June 2011) I was hitting the most important milestone: being able to play through the whole game, which is generally known as an "alpha" build. Some things didn't quite work 100% (end fight with Jaffar was messed up), some things were missing (no falling floors), and the graphics for the palace levels were mostly just dungeon tiles with different colors.
That meant that I had to tackle the next big unknown, the cutscene animation system. I had previously only found bits and pieces of this code and was unsure how much there was. I was hoping that it would fit into my current structure. I decided to concentrate first on the in-game cutscenes (between certain levels) and the final cutscene of the game, as those would not require the whole title screen logic, which I didn't have yet. I knew that those cutscenes must be triggered from the game code somewhere.
Luckily I had already received graphics for the princess room and all the animations of the princess and vizier from STE. But I first had to find out how to put them to use. So I was back into doing serious reverse engineering.
At the end of the mainLoop there's this end condition:
    lda NextLevel
    cmp CurrentLevel
    beq mainLoop                    ; nope, level hasn't changed, keep playing
startNextLevel:
    jsr l27e5
    jmp activateNextLevelOrCutscene
The code at l27e5 is something I haven't spent much time looking at. My hunch is that it's showing the message prompting the user to insert the correct disk for the upcoming level. Not something I was interested in.
The jump to activateNextLevelOrCutscene is more interesting, and not just because it's completely unnecessary: activateNextLevelOrCutscene begins immediately after this instruction.
It calls checkForCutscene which looks at the NextLevel variable and branches to different pieces of code for levels 2, 4, 6, 8, 9 and 12. Turns out these are the levels that begin with a cutscene, showing the princess waiting for the kid in some form. For any other level the code branches immediately to initLevel.
    lda NextLevel
    sta CurrentLevel                ; the next level is now the current level
    cmp #$02
    beq l2225                       ; branch to cutscene for level 2
l220e:
    cmp #$04
    beq l223a                       ; branch to cutscene for level 4
l2212:
    cmp #$06
    beq l223e                       ; branch to cutscene for level 6
l2216:
    cmp #$08
    beq l224a                       ; branch to cutscene for level 8
l221a:
    cmp #$09
    beq l2242                       ; branch to cutscene for level 9
l221e:
    cmp #$0c
    beq l2246                       ; branch to cutscene for level 12
endCutscene:
    jmp initLevelImpl               ; next level has no cutscene, so just start it immediately
Let's take the cutscene before level 2 as an example. This is just the generic "princess waits while standing" scene:
Just standing here, waiting for my prince to come...
l2225:
    lda #$01                        ; cutscene id
l2227:
    pha
retry:
    jsr loadCutsceneBackground      ; loads the bitmap for the princess room
    jsr l4c4b                       ; Apple II specific file error handling?
    jsr l4c4e                       ; Apple II specific file error handling?
    bne retry
l2233:
    pla
    jsr triggerCutscene             ; execute the cutscene
    jmp endCutscene                 ; jumps to initLevel, continues game after cutscene
So far so good. I identified triggerCutscene to be the main entry point for all cutscenes, which gets the cutscene id passed in through the accumulator.
It looks like this:
;----------------------------------
triggerCutsceneImpl:
    pha
    jsr initCutscene
    pla
    tax
    lda CutsceneJumpTableLo,x
    sta le214
    lda CutsceneJumpTableHi,x
    sta le215
le214 = * + 1
le215 = * + 2
    jsr $ffff
    lda #$01
    sta CutsceneDelay
    rts
;----------------------------------
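The routine above keeps the low and high bytes of each handler address in two separate tables and patches them into the operand of the jsr $ffff instruction before executing it. The Python model below only illustrates that lookup. The addresses, the table contents and the handler return values are invented for the example; only the fact that id 1 is the scene before levels 2 and 6 comes from the code in this post.

```python
# Hypothetical handler addresses; the real table contents are not shown in the post.
ROUTINES = {
    0xE300: lambda: "princess standing",
    0xE340: lambda: "some other cutscene",
}

# Entry addresses indexed by cutscene id.
ENTRY_ADDRESSES = [0xE340, 0xE300, 0xE340]

# Split tables, like CutsceneJumpTableLo / CutsceneJumpTableHi.
JUMP_TABLE_LO = [address & 0xFF for address in ENTRY_ADDRESSES]
JUMP_TABLE_HI = [address >> 8 for address in ENTRY_ADDRESSES]


def trigger_cutscene(cutscene_id):
    """Look up the handler for cutscene_id and call it.

    The two table reads correspond to the bytes the assembly stores
    into the operand of the self-modified jsr.
    """
    target = JUMP_TABLE_LO[cutscene_id] | (JUMP_TABLE_HI[cutscene_id] << 8)
    return ROUTINES[target]()


print(trigger_cutscene(1))
```

Splitting the table by byte lets the 6502 index both halves with the same X register value, without multiplying the id by two.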
Oh, this is as easy as it gets. A simple jump table for each cutscene. The entry for level 2 is also used for level 6 (since it's the same cutscene) and the function it points to is very short:
;----------------------------------
cutsceneBeforeLevel2And6:
    jsr getTimeLeftForCutscene      ; check how much time is left in the game
    jsr initCutsceneTimers          ; initializes hour glass state and sand animation timer
    jsr triggerPrincessStandingFacingRight
    jsr saveShad
    lda #$02                        ; run the cutscene loop for two frames
    jsr runCutsceneLoop
    ldx #$32                        ; run for this many frames if music is disabled
    lda #$0d                        ; id of the background music, cutscene runs until music ends
    jmp cutsceneLoopWithOptionalCharacterAnim
;----------------------------------
The core of it is triggerPrincessStandingFacingRight:
;----------------------------------
triggerPrincessStandingFacingRight:
    jsr triggerPrincessStanding     ; spawn princess (looks left by default)
    lda #$00                        ; make her look right instead
    sta CharFace
    rts
;----------------------------------
;----------------------------------
triggerPrincessStanding:
    lda #$05                        ; princess uses CharID 5 (see below)
    sta CharID
    lda #$78                        ; X position
    sta CharX
    lda #$97                        ; Y position
    sta CharY
    lda #$ff                        ; looking left
    sta CharFace
    lda #$5e                        ; standing sequence
    jsr setSequence
    jmp animChar                    ; execute first frame of sequence
;----------------------------------
I'd seen this last bit of code before, when I was looking at all the places where setSequence is used. Back then I wasn't able to identify all of the sequences, because some of them just seemed broken. Turns out that those were cutscene animations, and they were in the normal sequence table already. So there's no animation sequence data loaded for the cutscene. They obviously looked weird when played with the Kid frame def list.
So basically the cutscene system is just using the normal game engine to draw the characters. They have special CharIDs as described in the PoP source code documentation.
5 = princess (in princess scenes)
6 = vizier (in princess scenes)
Since cutsceneBeforeLevel2And6 calls saveShad, the princess is actually drawn as a guard.
Just like for guards, reading from the frame def list is overridden for these characters by overrideFrameDefListForCharacter, so it's going through indexIntoCutsceneFrameDefList rather than the normal code.
After setting up the CharData for the princess and saving it to the Shad slot, the game runs runCutsceneLoop which is a little mini game loop just for the cutscenes. It handles keyboard and joystick (to be able to skip the scene), updates the character animations (much like the game itself does) and updates the screen:
;----------------------------------
updateCutsceneScreen:
    jsr drawCutsceneFrame
    jsr waitForVBlank
    jmp flipFrontAndBackBuffer
;----------------------------------
In drawCutsceneFrame we find code to draw the hourglass, the flames of the torches, the stars outside the window and the sand of the hourglass. This is all very specialized code, but it uses the existing drawing functions used by the game to blit images onto the screen.
It was relatively easy now to get this system up and running. First I replaced loadCutsceneBackground with my own code that copies a multicolor bitmap into both screen buffers and draws a rectangle for the right pillar into the mask bitmap.
The hourglass drawing I replaced with a very simple copy loop that stamps one of the hourglass images onto the normal princess room bitmap (and screen and color RAM). There are 9 different images used, depending on the current time left in the game. At the beginning the hourglass is still full and in the end it's empty. I also made two variants which are alternated between to get the sand animation going. The Apple II original had special sand animation images which were drawn on top of the hourglass, but I decided that this wasn't really necessary, so my code just redraws the whole hourglass every time. No problem.
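Picking the hourglass image could look roughly like the Python sketch below. The post only says that there are 9 images and two alternating variants. The 60-minute total and the even split of the time across the images are assumptions, and the function names are hypothetical.

```python
HOURGLASS_IMAGES = 9   # from the post: 9 images, from full to empty
TOTAL_MINUTES = 60     # assumed total game time


def hourglass_image(minutes_left, total=TOTAL_MINUTES):
    """Map the time left to an image index: 0 = full, 8 = empty.

    Assumes the images are spread evenly over the total time.
    """
    minutes_left = max(0, min(total, minutes_left))
    elapsed = total - minutes_left
    return min(HOURGLASS_IMAGES - 1, elapsed * HOURGLASS_IMAGES // total)


def sand_variant(frame_counter):
    """Alternate between the two variants of each image to animate the sand."""
    return frame_counter & 1


print(hourglass_image(60), hourglass_image(30), hourglass_image(0), sand_variant(3))
```

Because the whole hourglass is redrawn each time, the image index and the variant are all the state the copy loop needs.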
The blinking stars outside the windows were also more easily solved on the C64. I just placed them nicely in the bitmap and then added some code to play around with the color RAM entries. Easy enough.
The flames are the same as the ones used in the level background graphics, and a lot of the drawing code is shared with the in-game code.
And after adding some special code to switch the lower half of the character sprites to use a different color I finally had the princess on the screen in her room, as expected. This made me really happy.
I now started to quickly decipher the other cutscene code, which was all very similar. I decided to do the final cutscene of the game next, because that one seemed the most complicated to me. From converting the animation images I knew that at some point it switches from using two characters (for princess and kid) to a single one for both, as they embrace.
;----------------------------------
cutsceneFinal:
    lda #$08
    sta CutsceneDelay
    lda #$01
    sta SoundEnabled
    sta MusicEnabled
    jsr triggerPrincessStandingFacingLeftBeforeTurn
    jsr saveShad
    lda #$08
    jsr runCutsceneLoop
    jsr triggerKidArriving          ; plays kid running sequence (facing left)
    jsr saveKid
    lda #$08
    jsr runCutsceneLoop
    lda #$6c
    jsr triggerShadSequence         ; princess turning around and embracing kid
    lda #$05
    jsr runCutsceneLoop
    lda #$0d
    jsr triggerKidSequence          ; kid stops from run
    lda #$02
    jsr runCutsceneLoop
    lda #$00
    sta KidFrame                    ; disable kid, the embrace has both characters in one image
    lda #$09
    jsr runCutsceneLoop
    lda #$0f                        ; the music to play
    jsr le449                       ; just animates cutscene background graphics
    jsr triggerMouseJoiningPrincessAndKid ; mouse comes in and joins the couple
    jsr saveKid
    lda #$0c
    jsr runCutsceneLoop
    lda #$65                        ; stop mouse - I changed this to #$72 to make
    jsr triggerKidSequence          ; the mouse rise like in the PC version
    lda #$1e
    jmp runCutsceneLoop
;----------------------------------
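Read as data, the final cutscene is a short script of "trigger something, then run the loop for N iterations" steps. The Python sketch below lists those steps with the counts taken from the listing above. It assumes that the value passed to runCutsceneLoop is an iteration count, as the "two frames" comment in cutsceneBeforeLevel2And6 suggests. How CutsceneDelay scales each iteration is not established here, and the step descriptions are paraphrased.

```python
# (description, value loaded into A before calling runCutsceneLoop)
FINAL_CUTSCENE = [
    ("princess standing, facing left", 0x08),
    ("kid arrives running", 0x08),
    ("princess turns around and embraces (sequence $6c)", 0x05),
    ("kid stops from run (sequence $0d)", 0x02),
    ("kid frame disabled, embrace is a single image", 0x09),
    ("mouse joins the couple", 0x0C),
    ("mouse stops (sequence $65, $72 in the port)", 0x1E),
]


def total_iterations(script):
    """Sum the loop counts of all steps in a cutscene script."""
    return sum(count for _, count in script)


for description, count in FINAL_CUTSCENE:
    print(f"{count:3d}  {description}")
print("total loop iterations:", total_iterations(FINAL_CUTSCENE))
```

The separate le449 call that animates the background while the music plays is not a runCutsceneLoop step, so it is left out of the list.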
As soon as I had this cutscene somewhat working (I tweaked the positioning later to make it look nicer) I captured a video and uploaded it on June 13th 2011:
I was quite satisfied after having understood how the cutscene system works. It felt like being on the home stretch. But there were still some things left to do. The intro cutscene was within my grasp now, but I also needed to come up with all the title screen code. Oh, and what about the music? We'll find out about that in the next part.
Thanks a lot for the update, I knew that coming back here every day will pay off with an entertaining read ;-)
Interesting! There's no obvious way to speed up the bitmap drawing routines any further, I'm afraid. In your source code examples, I cannot find the definition of the label "_cont". Did you omit it by chance, or is it some kind of implicit label? - Thanks to the RSS feed, I didn't have to check here every day ;)
@S.E.S.: The assembler I use has implicit labels _cont and _break for the beginning and the end of a scope (like continue and break in C/C++).
Just lovely! Really enjoy reading your blogposts ;)
Interesting blog. A lot of familiar decompilation tales. You can check out my work on OutRun here:
http://reassembler.blogspot.com/
Great work! I've taken a different approach to remaking NFS1 and Carmageddon - use the original assets but rewrite the code. Mostly for my own fun! http://blog.1amstudios.com/search/label/OpenC1
You didn't mention the presence of TWO cutscenes for level 12, so I'm curious if you might have inadvertently omitted the rarely-seen "time is running out" cutscene that plays if you reach level 12 with less than 5 minutes left in the original game. It's a very chilling dramatic twist in the original game, yet most players seem to be totally unaware of its existence.
In case you aren't familiar with the "time is running out" cutscene, there's a video of the PC Version of it in the forum thread at the following link, plus a short story about how it freaked us out when we first saw it during a late-night game in 1989:
http://forum.princed.org/viewtopic.php?p=12132#p12132
Yes, of course that variant of the cutscene is there. It's actually the default cutscene for level 12. But there's a time check and if there's enough time left, then the code just branches to the normal cutscene for level 2 and 6.
Ah then, good job!
In that case, here are a few obscure Apple II features (bugs) that are so obscure that I hadn't found anyone else who knew about them until the discussion at http://forum.princed.org/viewtopic.php?p=12150#p12150
- Level 5, if you lure the first guard down and kill him on a switch then *another* gate will open so you can grab the Life Potion before the shadow does! (See video at the linked forum thread)
- Level 8, if you kill the first guard on the door switch then those three troublesome gates later in the level will open and stay open so you don't need to do a speed run. (Once I found it, I always used this trick because it saves a lot of hassle on level 8.)
I used to think these were cleverly-hidden bonus features, but now that the source code has been published forum-user David discovered it's a bug...and it's very likely this bug made it into the C64 version unless you restructured the code that handles "stuck" switches.
Hi MR. SID,
This game conversion is great, thank you for your excellent work. For the case if you would like to optimize the game further, I can recommend a next target. I noticed that while a piece of the floor falls down and the prince moves (run), refreshing of the screen content is getting very slow, about 1-2 fps. There is a room at the start of the game where you can test it: http://kepfeltoltes.hu/view/130331/prince_slow_room_www.kepfeltoltes.hu_.png