And how exactly do Super Mario Sunshine and Super Mario Galaxy use EFB?
04-14-2020, 05:27 PM
(This post was last modified: 04-14-2020, 06:02 PM by MayImilae.)
As the GameCube’s ArtX GPU is rendering a frame, it needs a “workspace” to hold the pixels it has already finished, so the ArtX GPU has its own very close and very fast bit of dedicated memory specifically for this: the Embedded Frame Buffer (EFB). Once it is done rendering, it will copy the frame from the embedded frame buffer to the main memory of the GameCube. This is an EFB Copy. From main memory, the GPU can then load the copied frame back in, where it can edit the rendered frame for effects and other purposes. It is even common to do multiple EFB Copies per frame! Once the GPU is done with the frame, it will convert the frame to YUV (which the ArtX GPU can’t edit) before doing a special EFB Copy to move the frame to a region of main memory where the Video Interface (VI) will read the frame and perform scanout to the screen. We call the region of main memory that the frame goes to the “eXternal Frame Buffer” (XFB), and we call the special EFB Copy that moves the data to the XFB an “XFB Copy”. EFB is a very official term, but XFB is more or less something we invented, as far as I know.
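To make the flow above a bit more concrete, here is a toy sketch of it in Python. Everything in it is made up for illustration: `ToyConsole`, its method names, and the `brighten` effect are hypothetical, and the YUV math is an approximate full-range BT.601 conversion that I am assuming for the example, not the console's exact behavior.

```python
from dataclasses import dataclass, field


def rgb_to_yuv(rgb):
    """Approximate full-range BT.601 conversion (assumed for illustration)."""
    r, g, b = rgb
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.5 * b + 128
    v = 0.5 * r - 0.419 * g - 0.081 * b + 128
    return tuple(max(0, min(255, round(c))) for c in (y, u, v))


@dataclass
class ToyConsole:
    """Hypothetical model: a small EFB on the GPU plus unified main memory."""
    efb: list = field(default_factory=list)
    main_ram: dict = field(default_factory=dict)

    def render(self, pixels):
        # The GPU draws into its embedded frame buffer.
        self.efb = list(pixels)

    def efb_copy(self, name):
        # EFB Copy: the rendered frame goes out to main memory.
        self.main_ram[name] = list(self.efb)

    def draw_with_texture(self, name, effect):
        # The GPU reads the copy back as a texture and renders an effect.
        self.efb = [effect(p) for p in self.main_ram[name]]

    def xfb_copy(self):
        # XFB Copy: convert to YUV and place the frame where VI scans out.
        self.main_ram["xfb"] = [rgb_to_yuv(p) for p in self.efb]


def brighten(rgb):
    """Stand-in for a post-processing effect such as bloom."""
    return tuple(min(255, c + 40) for c in rgb)


console = ToyConsole()
console.render([(10, 20, 30), (250, 250, 250)])
console.efb_copy("scene")                     # first EFB Copy
console.draw_with_texture("scene", brighten)  # effect pass using the copy
console.xfb_copy()                            # final copy for scanout
```

Note that the untouched copy stays in `main_ram["scene"]` while the EFB holds the edited frame, which is why a game can do several EFB Copies per frame.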
Super Mario Sunshine and Super Mario Galaxy both use EFB Copies for effects as I said above, copying the rendered frame back to the GPU to modify it. Reflections, heat haze, blur, and bloom are common tricks on the GameCube and Wii. Sunshine uses EFB Copies for heat haze and other effects, and Galaxy uses them for bloom, lens flare, and other effects.
How does this relate to Dolphin’s settings? OK, first, EFB. The GameCube’s main memory is unified, so both the GPU and CPU can edit it. So the CPU can edit the frame buffer in main memory just like the GPU can. In fact, it can do more, since the CPU can read YUV when the GPU can’t, so it can edit the XFB too! Anyway, this means that the CPU can perform edits on frame buffers for free on the GameCube and Wii, and many games took advantage of this fact. Modern PCs do not use unified memory, but we have to emulate this behavior somehow, so we have two options. First is “Store EFB Copies to Texture and RAM”, which sends the frame buffer to both host VRAM and host system memory. This way, if the game wants to use the CPU to edit the frame buffer, it can. However, this requires us to move the rendered frame from host GPU VRAM over to host system memory every single frame, and moving all of that data around hurts performance. That is the cost of accuracy for many games. However, most games only perform EFB Copy effects with the GPU and are fine with the other option: “Store EFB Copies to Texture Only”. This option just does not send the frame to the CPU, leaving it only on the GPU. This is how modern PCs are designed to behave, so it’s nice and fast. However, this is inaccurate, and if a game wants to use the CPU to edit the frame, it will just fail. Previously this failure would appear as black or random garbage, but recently we’ve made it show purple so it’s easy to spot.
To be clear, “most” is not “all”. The percentage of games that have something break with Store EFB Copies to Texture Only, guessing off the top of my head, is 40%ish. That may not sound like much, but that’s out of more than 3000 games. It’s a LOT of games! And many of them are popular ones.
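The difference between the two EFB options can be sketched with a toy model. This is a simplification under my own assumptions, not Dolphin's actual code: `ToyEmulator`, its fields, and the exact purple value are all hypothetical.

```python
PURPLE = (255, 0, 255)  # illustrative placeholder color, not Dolphin's exact value


class ToyEmulator:
    """Hypothetical model of host VRAM and host system memory."""

    def __init__(self, store_to_ram):
        self.store_to_ram = store_to_ram  # True = "Texture and RAM"
        self.vram = {}  # host GPU memory
        self.ram = {}   # host system memory, visible to the emulated CPU

    def efb_copy(self, name, frame):
        # The copy always stays on the host GPU as a texture.
        self.vram[name] = list(frame)
        if self.store_to_ram:
            # The accurate but slow path: read the frame back to system memory.
            self.ram[name] = list(frame)

    def cpu_read(self, name):
        # With "Texture Only" there is nothing real for the CPU to read.
        if name in self.ram:
            return self.ram[name]
        return [PURPLE] * len(self.vram[name])


frame = [(1, 2, 3), (4, 5, 6)]

accurate = ToyEmulator(store_to_ram=True)
accurate.efb_copy("copy0", frame)

fast = ToyEmulator(store_to_ram=False)
fast.efb_copy("copy0", frame)
```

Here `accurate.cpu_read("copy0")` gives back the real pixels, while `fast.cpu_read("copy0")` only gives placeholder purple, which is the kind of breakage you would see in a game that edits frames with the CPU.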
By default we use Store EFB Copies to Texture Only, and if a game is known to break with this option, we override it to Store EFB Copies to Texture and RAM for that game via a GameINI. If you see Store EFB Copies to Texture Only in bold and unchecked, don’t mess with it if you don’t know what you are doing, as it will break something.
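As a rough sketch of how a per-game override like that resolves, here is a small Python example using the standard `configparser` module. The `[Video_Hacks]` section and the `EFBToTextureEnable` key are how I understand Dolphin's GameINI files to spell this setting, so treat both names as assumptions. `DEFAULTS` and `effective_settings` are hypothetical helpers for the example.

```python
import configparser

# Global default: "Texture Only" is on, as described above.
DEFAULTS = {"EFBToTextureEnable": True}

# Assumed GameINI contents for a game known to need the accurate path.
GAME_INI = """
[Video_Hacks]
EFBToTextureEnable = False
"""


def effective_settings(game_ini_text):
    """Return the defaults with any per-game overrides applied."""
    parser = configparser.ConfigParser()
    parser.read_string(game_ini_text)
    settings = dict(DEFAULTS)
    for key in settings:
        # Fall back to the default when the GameINI does not mention the key.
        settings[key] = parser.getboolean("Video_Hacks", key, fallback=settings[key])
    return settings


overridden = effective_settings(GAME_INI)
untouched = effective_settings("")
```

With the override present, `overridden["EFBToTextureEnable"]` is `False` (so the game gets Texture and RAM), and a game without a GameINI entry keeps the fast default.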
Next is XFB. Our XFB emulation emulates the XFB region of main memory so that the game can use the CPU to edit it there. Since XFB is more or less just a special EFB Copy and memory region, most of what I said above applies. Even the settings are the same: “Store XFB Copies to Texture Only” and “Store XFB Copies to Texture and RAM”. However, there are a few differences, such as the fact that all XFB effects are entirely CPU driven since the GPU can’t read the YUV format. Plus, XFB is tied to VI and all of the weirdness it can do during scanout. I won’t go into detail on this since we wrote an article all about it! https://dolphin-emu.org/blog/2017/11/19/hybridxfb/
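For a feel of what a CPU-driven XFB effect could look like, here is a minimal sketch of a fade done directly on YUV data. It assumes the frame is a simple list of full-range (Y, U, V) tuples, which is a simplification of how the data is really laid out in memory, and `cpu_fade` is a hypothetical function, not something from a real game.

```python
def cpu_fade(xfb, factor):
    """Dim a YUV frame by scaling luma only; chroma is left untouched."""
    return [(round(y * factor), u, v) for (y, u, v) in xfb]


xfb = [(200, 128, 128), (100, 90, 160)]
dimmed = cpu_fade(xfb, 0.5)
```

Because brightness lives in the Y channel alone, the CPU only has to touch one value per pixel to fade the picture, with no GPU involvement at all.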
this post is awesome