Replies: 3 comments 1 reply
At a quick glance, I'm surprised the (lack of a real) FIFO doesn't cause more problems in these games, particularly when it comes to the "half" handling.
I tend to agree with @angelosa, but I tend to believe the FIFO isn't entirely the issue (although "no polygons and/or corrupt ones" certainly sounds like it involves an outgoing data FIFO to the Voodoo). What's the deal with the difference in handling between ASTAT in the DRC/JIT and interpreter? That certainly bears investigation. Both of the numbers you cite, Vas, are below the half-threshold for the FIFO on Racing Jam, but I don't like that they're so different. Personally, my suggestion would be to address the arithmetic differences even if it's painful to do so from the interpreter, in order to be able to more precisely diff the register logs. I have a sneaking suspicion it has little to do with arithmetic and everything to do with external hardware interactions - either due to the block-based nature of the DRC/JIT system itself, or due to different handling of related things.
mamedev/mame@5345617 makes There's the previously-mentioned FIFO free space difference, and then there's this kind of thing:
@@ -250057,20 +249875,20 @@
0002E5FB 01410EFD 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200BA: DM(I1, 0x01) = R1
0002E5FB 01410EFD 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200BB: R0 = DM(I0, 0x01)
0004A5BD 01410EFD 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200BC: R0 = ROT R0 BY R6, R1 = DM(I0, 0x01)
-A5BD0004 00000E73 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200BD: R0 = ROT R0 BY R6, DM(I1, 0x01) = R0
-0004A5BD 00000E73 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200BE: R1 = ROT R1 BY R6, DM(I1, 0x01) = R0
-0004A5BD 0E730000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200BF: R1 = ROT R1 BY R6, DM(I1, 0x01) = R1
-0004A5BD 00000E73 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C0: DM(I1, 0x01) = R1
-0004A5BD 00000E73 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C1: R0 = DM(I0, 0x01)
-00004512 00000E73 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C2: R0 = ROT R0 BY R6, R1 = DM(I0, 0x01)
-45120000 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C3: R0 = ROT R0 BY R6, DM(I1, 0x01) = R0
-00004512 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C4: R1 = ROT R1 BY R6, DM(I1, 0x01) = R0
-00004512 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C5: R1 = ROT R1 BY R6, DM(I1, 0x01) = R1
-00004512 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C6: R0 = DM(I0, 0x01)
-00073238 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C7: R0 = ROT R0 BY R6, DM(I1, 0x01) = R1
-32380007 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C8: R0 = ROT R0 BY R6, DM(I1, 0x01) = R0
-00073238 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C9: DM(I1, 0x01) = R0
-00073238 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200CA: R0 = 0x00000014
+A5BD0004 00002386 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200BD: R0 = ROT R0 BY R6, DM(I1, 0x01) = R0
+0004A5BD 00002386 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200BE: R1 = ROT R1 BY R6, DM(I1, 0x01) = R0
+0004A5BD 23860000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200BF: R1 = ROT R1 BY R6, DM(I1, 0x01) = R1
+0004A5BD 00002386 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C0: DM(I1, 0x01) = R1
+0004A5BD 00002386 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C1: R0 = DM(I0, 0x01)
+00003622 00002386 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C2: R0 = ROT R0 BY R6, R1 = DM(I0, 0x01)
+36220000 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C3: R0 = ROT R0 BY R6, DM(I1, 0x01) = R0
+00003622 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C4: R1 = ROT R1 BY R6, DM(I1, 0x01) = R0
+00003622 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C5: R1 = ROT R1 BY R6, DM(I1, 0x01) = R1
+00003622 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C6: R0 = DM(I0, 0x01)
+00072C15 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C7: R0 = ROT R0 BY R6, DM(I1, 0x01) = R1
+2C150007 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C8: R0 = ROT R0 BY R6, DM(I1, 0x01) = R0
+00072C15 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200C9: DM(I1, 0x01) = R0
+00072C15 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200CA: R0 = 0x00000014
00000014 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200CB: DM(0x03400001) = R0
00000014 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200CC: DM(0x00400010) = R2
00000014 00000000 0000026E 00000000 3DC45000 3E530000 00000010 0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA 0200CD: BIT CLEAR USTAT1 0x00000040
It's doing a load with post-increment using DAG register I0. In this case, I0 is 0x02480054, which is Voodoo register 0x150 as a byte address. According to our Voodoo code, that's
In any case, it's looking even more like it's something to do with how the SHARC communicates with the outside world, rather than a SHARC ALU emulation bug.
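For paired lines like the -/+ pairs in the diff above, a small helper can report which of R0-R13 actually differ at a given PC. This is only a sketch: `TraceLine`, `parse_trace_line` and `differing_registers` are hypothetical names, not anything in MAME, and the parser assumes exactly the line layout quoted in this thread (fourteen hex words, then `PC:` and the disassembly).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// One trace line: R0-R13 contents plus the instruction address.
struct TraceLine {
    std::array<std::uint32_t, 14> r{};
    std::uint32_t pc = 0;
};

// Parse a line in the format quoted in this thread.
// A leading '+' or '-' diff marker is skipped; the disassembly is ignored.
bool parse_trace_line(const std::string &line, TraceLine &out)
{
    std::string text = line;
    if (!text.empty() && (text[0] == '+' || text[0] == '-'))
        text.erase(0, 1);

    std::istringstream in(text);
    in >> std::hex;
    for (auto &reg : out.r)
        if (!(in >> reg))
            return false;

    std::string pc_token; // e.g. "0200BD:"
    if (!(in >> pc_token) || pc_token.empty() || pc_token.back() != ':')
        return false;
    pc_token.pop_back();

    std::istringstream pc_in(pc_token);
    return static_cast<bool>(pc_in >> std::hex >> out.pc);
}

// Indices (0 = R0 ... 13 = R13) of registers whose contents differ.
std::vector<int> differing_registers(const TraceLine &a, const TraceLine &b)
{
    std::vector<int> diff;
    for (std::size_t i = 0; i < a.r.size(); ++i)
        if (a.r[i] != b.r[i])
            diff.push_back(static_cast<int>(i));
    return diff;
}

// The first removed/added pair from the diff above (PC 0200BD).
const std::string removed_line =
    "-A5BD0004 00000E73 0000026E 00000000 3DC45000 3E530000 00000010 "
    "0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA "
    "0200BD: R0 = ROT R0 BY R6, DM(I1, 0x01) = R0";
const std::string added_line =
    "+A5BD0004 00002386 0000026E 00000000 3DC45000 3E530000 00000010 "
    "0000EEE5 C6EA8CE5 C6E11693 480933D0 BF2ED8AC 00000000 3E559CAA "
    "0200BD: R0 = ROT R0 BY R6, DM(I1, 0x01) = R0";
```

Parsing `removed_line` and `added_line` and passing the results to `differing_registers` singles out R1 (00000E73 vs 00002386), the register loaded through I0 one instruction earlier, which fits the reading that the two runs got different values back from the Voodoo rather than computing different results.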
(Tag @galibert, @angelosa, @gm-matthew, @MooglyGuy who may be interested.)
I've been poking at the AD SHARC DSP emulation lately. I wasn't originally looking for bugs (I had the source file open for other reasons), but there were some pretty obvious ones that jumped out at me. Since then I've been looking at it occasionally when I'm taking a break from other stuff.
One thing that's sort of bugging me is missing/bad polygons in Konami PowerPC/SHARC/Voodoo games. You can reproduce it easily by enabling the recompiler for the first DSP on Racing Jam's NWK-TR board:
This will cause bad/missing polygons in every second frame in attract mode with -drc (it will still behave like it normally does with -nodrc).
I started looking at differences between the interpreter and recompiler by running to attract mode with the interpreter and taking a save state, then starting from that save state and tracing with both the interpreter and the recompiler. Some of the differences were from the recompiler not implementing the table-based RECIPS and RSQRTS instructions properly, so I addressed that. The traces look a lot closer now, but there are still differences. In the trace fragments here, the first fourteen numbers are register R0-R13 contents, followed by the instruction address (PC value) and the disassembled instruction.
There are differences that seem to be caused by the recompiler short-circuiting spinloops, e.g. this:
This is probably harmless.
Sometimes it seems to read a different value from the Voodoo, e.g. here:
The interpreter read 0x0e13c79d while the recompiler read 0x0e12679d from location 0x02480000 in the DSP's data memory space. The debugger says this is:
Looking at the implementation, the offset is in the range that's always passed through to the Voodoo device. This is a pure virtual member function in the base class, and NWK-TR instantiates VOODOO_1, so it should be calling voodoo_1_device::read, then getting passed through to map_register_r. Does that mean it's reading the status register? If it is, the part that differs is the FIFO free space (316 vs 294). Possibly different due to timing differences, or is this a clue pointing to what's wrong?
The FIX instruction produces slightly different results between the interpreter and recompiler:
The difference is that the interpreter rounds the midpoint away from zero while the recompiler rounds the midpoint towards the nearest even number. I believe the recompiler is correct, as the SHARC instruction set manual says it uses IEEE round to nearest, which specifies that the midpoint is rounded to the nearest even number. However, implementing this in C++ is a pain, so I haven't changed the interpreter. I honestly doubt a 1LSB difference is causing the bad polygons here anyway.
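To make the two midpoint rules concrete, here is a minimal sketch of both in standard C++. The function names are made up for illustration and this is not the code in either the interpreter or the recompiler. It ignores out-of-range inputs and any mode-dependent behaviour of the real instruction, and the ties-to-even version assumes the floating-point environment is in its default round-to-nearest mode, which is part of what makes a robust version awkward.

```cpp
#include <cmath>
#include <cstdint>

// Midpoints round away from zero (the behaviour described for the interpreter).
inline std::int32_t fix_half_away(float x)
{
    return static_cast<std::int32_t>(std::round(x));
}

// Midpoints round to the nearest even integer (IEEE round to nearest).
// Assumption: the current rounding mode is the default FE_TONEAREST;
// std::nearbyint follows whatever mode is active.
inline std::int32_t fix_half_even(float x)
{
    return static_cast<std::int32_t>(std::nearbyint(x));
}
```

The two only disagree on exact midpoints whose lower neighbour is even: 2.5 gives 3 vs 2, -2.5 gives -3 vs -2 and 0.5 gives 1 vs 0, while 3.5 gives 4 either way. That is where the 1LSB differences in the traces would come from.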
Then you get knock-on effects of these 1LSB differences later, for example:
Note that the value in R0 that it actually stores in the end is the same either way, so the 1LSB difference seems to come out in the wash.
So I'm not sure what's going on. I'm sure there are more maths bugs and flag calculation bugs in both the interpreter and recompiler, but it doesn't seem to be hitting them in a way that causes different results for this program. Is DMA somehow broken for the recompiler? Is something throwing the timing so things get out-of-sync? Is the different FIFO free space value a clue as to what the actual difference is?
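On the FIFO free space question, the two values read from 0x02480000 can be compared bit by bit to confirm that nothing else in the word changed. The sketch below uses hypothetical helper names, and the field position (12 bits starting at bit 12) is an assumption picked because it reproduces the 316 and 294 quoted above; the real status register layout should be checked against the Voodoo code.

```cpp
#include <cstdint>

// The two values read from 0x02480000 in the traces.
constexpr std::uint32_t interpreter_read = 0x0e13c79d;
constexpr std::uint32_t recompiler_read  = 0x0e12679d;

// Bits that differ between two reads.
constexpr std::uint32_t changed_bits(std::uint32_t a, std::uint32_t b)
{
    return a ^ b;
}

// Extract a bit field; width must be less than 32.
constexpr std::uint32_t field(std::uint32_t value, unsigned shift, unsigned width)
{
    return (value >> shift) & ((std::uint32_t(1) << width) - 1);
}

// Only bits 13, 15 and 16 differ between the two reads.
static_assert(changed_bits(interpreter_read, recompiler_read) == 0x0001a000,
              "unexpected differing bits");

// Assumed field position: reproduces the free space numbers from the post.
static_assert(field(interpreter_read, 12, 12) == 316, "interpreter free space");
static_assert(field(recompiler_read, 12, 12) == 294, "recompiler free space");
```

All the differing bits fall inside that one field, so the busy flags and everything else in the low 12 bits were identical at the moment of the read. That would fit a timing difference in how far the FIFO had drained rather than a different device state.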