"Sins of the PS2: Dead or Alive 2" (PS2 emulation issue discussion)

grap3fruitman

Well-Known Member
Standard Donor


Sins of the PS2: Dead or Alive 2
Boasting a library of thousands of games, the PlayStation 2 can be taken to two extremes that cause trouble for emulators. The first end is one of optimization: using every trick in the book, and many more unwritten tricks, to bring about the best performance and graphical fidelity possible. Games like Jak and Daxter as well as Ratchet and Clank would fall in this category. The other end, however, has sloppy, inefficient, and overall buggy code written only to meet deadlines. Some of this code would break on PC, but the PS2 is far more forgiving. Since the code worked on PS2, no one cared, and it was shipped as-is. This article is for those games.
One might question why Dead or Alive 2 is mentioned in the title, given that it has worked fine in PCSX2 for many years. This is deceptive, however: PCSX2 ships a patch for DOA2, and without that patch the game fails to boot. But why? Enabling "EE data cache" does allow DOA2 to boot and go in-game, but it tanks performance to such a ridiculous extent that the game might as well not exist. Furthermore, this game won't work at all on any PS2 emulator without the patch, so it's not a problem specific to PCSX2. What is this data cache thing, why is it so slow, and why does DOA2 even need it?
The Dreaded Data Cache
On older systems, such as the NES, CPU speed was the main bottleneck. This meant that making the CPU faster would always make whatever program it was running faster as well. As the years went on, CPU speeds began outpacing memory speeds. Any time a memory access happened, the CPU would have to wait dozens of cycles! Increasing the clock rate would make no difference if programs were bottlenecked by memory speed, so a smarter solution was needed. Enter the data cache: CPU designers figured out that since the majority of time would be spent accessing a relatively small amount of memory, they could cache program data in a small buffer of memory embedded directly in the CPU. Modern caches are multi-level and rather complex, but the main idea is the same: keeping frequently accessed data in the cache lets the CPU maximize throughput instead of waiting on memory. The PS2's Emotion Engine (EE) is rather simple and only has a single 8 KB level of data cache.
By design, the cache is separate from main memory, and unless explicitly programmed to do otherwise, the CPU will read from and write to the cache. This is fine for most applications, but things get hairy when the CPU starts interacting with other hardware, as the cache and RAM may hold different contents. For example, the PS2's DMA engine only reads from main memory. PS2 DMA can transfer data to a variety of devices, including the GPU, sound chip, and controller. A mismatch between cache and main memory could result in anything from corrupted graphics to a hard crash, but there are two general ways to prevent this. The first is that software must manually flush the cache whenever it accesses a peripheral that reads main memory. A flush forces all of the cache's contents to be stored to memory, ensuring that the two match. The second solution is using "cache coherency" protocols that automatically ensure that cache and main memory are the same. The PS2 uses the first solution, given that the second is more expensive to implement.
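To make the mismatch concrete, here is a minimal sketch of a write-back cache sitting in front of memory that a DMA engine reads directly. It is a toy model, not PS2 code: the names TinyWriteBackCache and dma_read are made up for illustration, and a real cache works on fixed-size lines rather than single addresses.

```python
class TinyWriteBackCache:
    """Toy write-back cache: CPU writes stay in the cache until flushed."""

    def __init__(self, ram):
        self.ram = ram      # main memory, shared with the DMA engine
        self.lines = {}     # address -> value currently held only in the cache

    def write(self, addr, value):
        self.lines[addr] = value          # main memory is NOT updated here

    def read(self, addr):
        return self.lines.get(addr, self.ram[addr])

    def flush(self):
        for addr, value in self.lines.items():
            self.ram[addr] = value        # write cached data back to memory
        self.lines.clear()


def dma_read(ram, addr, length):
    """Hypothetical DMA transfer: it sees main memory only, never the cache."""
    return ram[addr:addr + length]


ram = [0] * 16
cache = TinyWriteBackCache(ram)
for i, byte in enumerate(b"DATA"):
    cache.write(i, byte)

stale = bytes(dma_read(ram, 0, 4))   # DMA before the flush: old memory contents
cache.flush()
fresh = bytes(dma_read(ram, 0, 4))   # DMA after the flush: what the CPU wrote
```

In this model, stale is four zero bytes while fresh is b"DATA", which is the difference a forgotten flush makes to whatever device is on the other end of the transfer.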
Sony's SDK is mindful of the cache and makes sure to always flush it whenever accessing other hardware. In theory, a game should never have to worry about the cache, as it is transparent to normal program operation. In practice, game developers found the SDK quite limiting and created their own functions to access the hardware, which couldn't possibly go wrong at all...
Sounds Like a Plan
Before it was patched, DOA2 had been broken on PCSX2 for many years. At some point it was discovered that the game relied on the data cache, but it seems no one bothered investigating what exactly was broken in the game. Ultimately a patch was backported from the PS2 emulator on the PS3, and that was the end of that. Sometime in the middle of 2019, I wanted to take a deeper look to see if I could get some insight out of this problem.
To start with, here's the game function (taken from Ghidra) that gets patched:
[Image: 1.png]

The patch changes the third parameter, a 1, to 0. What's the significance of this?
This function is used to execute a command on the Input/Output Processor (IOP). DOA2 has a custom sound module running on the IOP called TLSNDDRV that wraps around the default low-level sound driver. Using custom modules is common practice for PS2 games, and they can range anywhere from CDVD streaming to texture decompression. Interestingly, DOA2 uses the same buffer in memory for both sending the command to the IOP and receiving a reply - keep this in mind. When the third parameter is 1, sceSifCallRpc is asynchronous, meaning that the EE will continue executing code after it sends the command. When this parameter is set to 0, the function is synchronous, meaning that the EE stalls until the IOP is finished with the command.
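The difference between the two modes can be sketched with a stand-in for the IOP that needs a few ticks to finish a command. Everything here (FakeIop, call_rpc, the tick count) is hypothetical and only mirrors the 0/1 behaviour described above; it is not the SDK's implementation.

```python
class FakeIop:
    """Stand-in for the IOP: a command takes a few ticks to complete."""

    def __init__(self, ticks_per_command=3):
        self.ticks_per_command = ticks_per_command
        self.remaining = 0

    def start(self):
        self.remaining = self.ticks_per_command

    def tick(self):
        if self.remaining:
            self.remaining -= 1

    @property
    def busy(self):
        return self.remaining > 0


MODE_SYNC, MODE_ASYNC = 0, 1   # mirrors the 0/1 third parameter


def call_rpc(iop, mode):
    """Start a command and return how many ticks the caller was stalled."""
    iop.start()
    stalled = 0
    if mode == MODE_SYNC:
        while iop.busy:        # synchronous: wait here until the IOP is done
            iop.tick()
            stalled += 1
    return stalled             # asynchronous: return right away


iop = FakeIop()
sync_stall = call_rpc(iop, MODE_SYNC)
sync_busy_after = iop.busy     # False: the command finished inside the call

async_stall = call_rpc(iop, MODE_ASYNC)
async_busy_after = iop.busy    # True: the command is still in flight
```

The asynchronous call hands control back while the command is still pending, so anything the caller does next runs alongside the IOP's work.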
One might realize that this code can run into issues due to it being asynchronous. While this is true, the function call by itself isn't bugged; it's the context of how this function is used that is important.
A Simple Oversight
Sometime during boot, the game initializes the sound driver on the IOP side. During this process, it sends two asynchronous commands in a row, which is how our bug manifests. As mentioned before, since sceSifCallRpc is asynchronous, the EE continues executing code after sending the command to the IOP. What happens if the EE tries to send another command while the IOP is busy processing the first?
The EE driver is programmed to wait for the previous command to finish in this scenario. However, it makes a fatal mistake: before making this check, it copies the command data into the buffer that will be sent directly to the IOP. Take a look at the function again and remember that the send and receive buffers are the same. That's right - when the first command finishes, the IOP overwrites the second command with its reply! This means the command is now junk, and once it is sent, the sound driver gets confused and crashes. The EE then hangs because the IOP never responds.
Well, that's only true on emulators without data cache. When the cache is enabled, the command is stored in the cache. Then when the IOP sends its reply, it overwrites main memory but not the cache, keeping the command safe. Hence, the game boots. This bug was likely unintentional: whoever wrote the driver wanted to save a bit of memory by reusing the same buffer and didn't think about the proper order of events. Had this caused issues, it would have been fixed, but the game is fine on real hardware, so no one noticed. Meanwhile, the patch works because it makes the EE unable to copy a new command to the buffer until the previous one finishes.
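The whole sequence can be replayed in a small simulation. This is a sketch under stated assumptions, not the game's or the SDK's code: the command and reply strings are invented, the IOP is assumed to finish exactly while the EE waits, and the write-back of the cached command just before the send is inferred from the SDK's flush-before-hardware-access behaviour described earlier.

```python
def boot(data_cache, asynchronous):
    """Send two commands in a row and return what the IOP receives."""
    ram = bytearray(8)     # shared command/reply buffer in main memory
    cached = None          # dirty copy of the buffer held in the EE cache
    iop_busy = False
    received = []

    def iop_finish():
        nonlocal iop_busy
        ram[:] = b"REPLY___"          # the reply lands in main memory only
        iop_busy = False

    def call_rpc(command):
        nonlocal cached, iop_busy
        # 1. The driver copies the command into the shared buffer first...
        if data_cache:
            cached = bytes(command)   # the write stays in the cache
        else:
            ram[:] = command          # the write goes straight to memory
        # 2. ...and only then waits for the previous command to finish.
        if iop_busy:
            iop_finish()
        # 3. Assumed flush before the transfer: cached data is written back.
        if cached is not None:
            ram[:] = cached
            cached = None
        received.append(bytes(ram))   # what DMA actually delivers to the IOP
        iop_busy = True
        if not asynchronous:
            iop_finish()              # patched mode: stall until the reply

    call_rpc(b"CMD_ONE_")
    call_rpc(b"CMD_TWO_")
    return received


no_cache = boot(data_cache=False, asynchronous=True)
with_cache = boot(data_cache=True, asynchronous=True)
patched = boot(data_cache=False, asynchronous=False)
```

Here no_cache ends with the reply bytes being sent as the second command, while with_cache and patched both deliver CMD_TWO_ intact: the cache shields the command from the reply, and the synchronous call makes sure the reply has already arrived before the next command is copied.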
Unfortunately, data cache emulation is too costly. For every memory access, it must be determined if the access is cached or uncached. If it's cached, due to the PS2 using a "two-way" cache, two separate cache lines must be checked to see if the address is in the cache. If it isn't, a cache miss occurs: 64 bytes of memory must be read, and if the cache line being replaced is "dirty" due to the CPU having written to it before, 64 bytes of cache must also be written to memory before the read occurs. These checks will be happening millions of times a second, and it's just too much to ask from an emulator. Admittedly, PCSX2's implementation is unoptimized and only works on the EE interpreter, which is not renowned for its speed. That being said, I don't know if optimized data cache emulation could ever run at full speed. If it could, it would be reserved only for the most powerful computers - good luck convincing the general public to spend hundreds of dollars on upgrading their computers just to play a single PS2 game.
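The per-access work can be sketched as follows. The sizes come from the figures above (8 KB, two ways, 64-byte lines, which works out to 64 sets); the class names and the alternating replacement choice are assumptions made for illustration, and this is not PCSX2's implementation.

```python
LINE_SIZE = 64
WAYS = 2
CACHE_SIZE = 8 * 1024
NUM_SETS = CACHE_SIZE // (LINE_SIZE * WAYS)   # 64 sets


class Line:
    def __init__(self):
        self.valid = False
        self.dirty = False
        self.tag = 0
        self.data = bytearray(LINE_SIZE)


class TwoWayCache:
    def __init__(self, ram):
        self.ram = ram
        self.sets = [[Line() for _ in range(WAYS)] for _ in range(NUM_SETS)]
        self.next_victim = [0] * NUM_SETS     # assumed policy: alternate ways
        self.stats = {"hits": 0, "misses": 0, "writebacks": 0}

    def _lookup(self, addr):
        index = (addr // LINE_SIZE) % NUM_SETS
        tag = addr // (LINE_SIZE * NUM_SETS)
        for line in self.sets[index]:         # both ways must be checked
            if line.valid and line.tag == tag:
                self.stats["hits"] += 1
                return line
        return self._refill(index, tag)

    def _refill(self, index, tag):
        self.stats["misses"] += 1
        way = self.next_victim[index]
        self.next_victim[index] = (way + 1) % WAYS
        line = self.sets[index][way]
        if line.valid and line.dirty:         # write the old 64 bytes back
            old_base = (line.tag * NUM_SETS + index) * LINE_SIZE
            self.ram[old_base:old_base + LINE_SIZE] = line.data
            self.stats["writebacks"] += 1
        base = (tag * NUM_SETS + index) * LINE_SIZE
        line.data[:] = self.ram[base:base + LINE_SIZE]   # read the new 64 bytes
        line.valid, line.dirty, line.tag = True, False, tag
        return line

    def read8(self, addr):
        return self._lookup(addr).data[addr % LINE_SIZE]

    def write8(self, addr, value):
        line = self._lookup(addr)
        line.data[addr % LINE_SIZE] = value
        line.dirty = True


ram = bytearray(64 * 1024)
cache = TwoWayCache(ram)
stride = LINE_SIZE * NUM_SETS    # addresses this far apart map to the same set
cache.write8(0, 0xAA)            # miss; the line becomes dirty
cache.read8(1)                   # hit in the same line
cache.read8(stride)              # miss; fills the second way
cache.read8(2 * stride)          # miss; evicts the dirty line with a write-back
```

Even in this stripped-down form, every single byte access goes through an index calculation, up to two tag comparisons, and possibly two 64-byte copies, which gives a feel for why doing this for every emulated load and store is so expensive.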
The lesson here is that not all sins can be atoned for. While I dislike per-game hacks, sixth-gen consoles mandate them to get every game playable at full speed. For this reason, games like DOA2 that abuse the data cache can only be solved with patches. Games aren't preserved if no computer can reach double-digit FPS, after all.
Closing remarks: I've thought about making Sins of the PS2 a series, where I pick a problematic game I've analyzed in the past and write an article about it. Please let me know if you find this format enjoyable.
 

Matt Ponton

Founder
Staff member
Administrator
Standard Donor
Interesting. Is this only with the original Japanese release of DOA2, which was shipped off a dev build on Itagaki's desk without his approval? Or does it also occur in DOA2: Hardcore?
 