+++
date = "2016-03-09T15:30:00-05:00"
title = "Citra Progress Report - 2015 P2"
tags = [ "progress-report" ]
author = "bunnei"
forum = 36
+++
This month we bring you the second installment of our two-part progress report on Citra in 2015! In this part, we discuss Citra's evolution from barely running a few commercial games at a few frames-per-second to where it is in 2016: running many retail games at reasonable speeds, some of which are fully playable with near-flawless graphics! We discuss Citra's new "dyncom" CPU core, the OpenGL renderer, per-pixel lighting, and various bug fixes. Lastly, we wrap up with an outlook for 2016, and a special thanks to everyone who has helped make Citra what it is today!
Spring 2015: A New CPU, Renderer and More
The arms race to get games booting ran into a wall. Often, testing games and getting to crash points would be incredibly painful because of how slowly the emulator ran. Instead of measuring frames-per-second, testers often referred to seconds-per-frame. Donkey Kong Country Returns 3D would sometimes require three real-time seconds to render a single in-game frame in the software renderer! It took over 90 minutes to capture all of the footage for a three-minute video of Super Monkey Ball 3D!
{{< youtube t0TGSeQe1wE >}}
Within the next few months, bunnei and Lioncash replaced the old "ARMULATOR" CPU core in Citra with a better implementation that was both several times faster and much more accurate. Although it was still an interpreter, the new core was efficient enough to make a huge difference in performance. With this new core, games that were light on graphics could reach above half-speed on very strong processors.
Shortly thereafter, a new developer joined the Citra team – tfarley – with the ambitious goal of implementing a hardware renderer using OpenGL. Up until this point, Citra had used a software renderer primarily developed by neobrain to render graphics. While a software renderer is great for development and achieving pixel-perfect accuracy, Citra’s GPU emulation had become the major performance bottleneck. Even with infinitely fast CPU emulation, the software renderer was so slow that no games would run at full speed.
While the rest of the team continued with other development efforts, tfarley fought a month-long battle to develop this new OpenGL renderer. All of this resulted in Ocarina of Time 3D running nearly perfectly in Citra using OpenGL, and at a fairly decent speed! Below is a very early video of Ocarina of Time, before the renderer was completed and merged into Citra's mainline repository.
{{< youtube Hj8sPsB5qXQ >}}
Summer: More Accuracy, More Performance
With the summer, development slowed down a bit – but several additional improvements were made. By this point, the team had already made lots of incremental advances in 3DS emulation, resulting in a significant number of retail games booting, such as Super Mario 3D Land, Fire Emblem: Awakening, The Legend of Zelda: A Link Between Worlds, Mario Kart 7, and many more.
{{< img src="earlyunknown1.png" center="true" >}}
{{< img src="earlyunknown2.png" center="true" >}}
{{< img src="earlycrush3d.png" center="true" >}}
{{< img src="earlymario3dland.png" center="true" >}}
{{< img src="earlysteeldiver.png" center="true" >}}
{{< img src="earlyluigi.png" center="true" >}}
{{< img src="earlylinkbetweenworlds.png" center="true" >}}
{{< img src="earlymajorasmask.png" center="true" >}}
{{< img src="earlymk7.png" center="true" >}}
{{< img src="earlyfireemblem.png" center="true" >}}
But there was still one major issue that was blocking many games: video playback. 3DS games use a proprietary format
known as “MOFLEX” to play video clips, which are commonly used for intro logos, cut scenes, and more. Not only was the
video format unknown, but any time a MOFLEX video was used, Citra would hang in an infinite loop for unknown reasons.
However, this issue proved to be no match for our team – within a matter of weeks, Citra developers
yuriks and Subv reverse-engineered and implemented all of the
mechanisms necessary to prevent hanging and play MOFLEX videos!
{{< figure src="3.png" alt="Bravely Default" title="Bravely Default's intro sequence relies on MOFLEX video support" >}}
With many games now running stably with fairly accurate rendering, it became evident that with a bit more speed some titles would not only be playable in Citra, but enjoyable to play. On top of this, more speed would make testing even easier, so with a huge library of games now working, the task once again became making things faster.
The cause of the slowdown was very obvious: Citra’s emulation of 3DS vertex shaders. With emulators like Dolphin and PPSSPP, your GPU uses shaders to emulate the target system’s GPU, which does not actually use any shaders of its own. The 3DS, on the other hand, has a more modern GPU that natively supports its own shaders – which are not as trivial to emulate in the same manner. The approach that we took was similar to CPU emulation – and we were using a pure interpreter for the job. As such, the solution was clear: even a naively implemented Just-In-Time (JIT) compiler would make our shader emulation scream. With that in mind, bunnei set out to implement a vertex shader JIT.
While it took several weeks to develop, the difference in speed was unmistakable. Vertex-heavy games were sometimes two or three times as fast, allowing some games like Ocarina of Time 3D to approach full speed in many areas!
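To give a rough feel for why moving from an interpreter to a JIT helps, here is a minimal Python sketch. It is not Citra's code: the three-opcode instruction set, the register layout, and the names `interpret` and `compile_program` are all hypothetical, and generating Python source stands in for the native machine code a real JIT would emit. The point is only that the interpreter decodes every instruction again for every vertex, while the compiled version pays the decoding cost once.

```python
from typing import Callable, List, Tuple

Vec4 = Tuple[float, ...]
# Hypothetical toy instruction: (opcode, dest register, source a, source b).
Instr = Tuple[str, int, int, int]


def interpret(program: List[Instr], regs: List[Vec4]) -> List[Vec4]:
    """Pure interpreter: re-decodes each instruction on every call."""
    regs = list(regs)
    for op, dst, a, b in program:
        if op == "add":
            regs[dst] = tuple(x + y for x, y in zip(regs[a], regs[b]))
        elif op == "mul":
            regs[dst] = tuple(x * y for x, y in zip(regs[a], regs[b]))
        elif op == "dp4":
            d = sum(x * y for x, y in zip(regs[a], regs[b]))
            regs[dst] = (d, d, d, d)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs


def compile_program(program: List[Instr]) -> Callable[[List[Vec4]], List[Vec4]]:
    """Decode the program once, emit straight-line code, reuse it per vertex."""
    lines = ["def shader(regs):", "    regs = list(regs)"]
    for op, dst, a, b in program:
        if op == "add":
            expr = f"tuple(x + y for x, y in zip(regs[{a}], regs[{b}]))"
        elif op == "mul":
            expr = f"tuple(x * y for x, y in zip(regs[{a}], regs[{b}]))"
        elif op == "dp4":
            expr = f"(sum(x * y for x, y in zip(regs[{a}], regs[{b}])),) * 4"
        else:
            raise ValueError(f"unknown opcode: {op}")
        lines.append(f"    regs[{dst}] = {expr}")
    lines.append("    return regs")
    namespace: dict = {}
    exec("\n".join(lines), namespace)  # stand-in for emitting native code
    return namespace["shader"]


# Usage: compile once, then run the same shader for as many vertices as needed.
program = [("mul", 2, 0, 1), ("dp4", 3, 0, 1), ("add", 2, 2, 3)]
registers = [(1.0, 2.0, 3.0, 4.0), (2.0, 2.0, 2.0, 2.0), (0.0,) * 4, (0.0,) * 4]
shader = compile_program(program)
assert shader(registers) == interpret(program, registers)
```

In this toy, the generated function contains no opcode branching at all, which is the same kind of overhead a real shader JIT removes from the per-vertex hot path.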
{{< youtube yhLEs4yEmlU >}}
Autumn: Closing out the Year with a Bang!
After a brief summer hiatus, development on Citra began to pick up again with autumn 2015. While Citra now had a library of games running without severe problems, many still relied on more advanced graphics features that had yet to be reverse-engineered. One of the more widely used of these features is fragment lighting – a feature that enables complex lighting calculations to be performed on a per-pixel basis. A breakthrough came when hacker and homebrew developer fincs made major strides in figuring out the 3DS fragment lighting implementation. Immediately, bunnei began incorporating this work into Citra.
{{< figure src="rotatecube.gif" alt="Fragment shader test" title="An early fragment lighting demo running in Citra" >}}
Although it is not yet a complete implementation of fragment lighting, the results have significantly improved Citra’s visuals!
{{< figure src="moonbefore.png" title="It's a super moon!" >}}
{{< figure src="moonafter.png" title="With fragment lighting implemented, the moon became even scarier! H-hurray?" >}}
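For readers wondering what "per-pixel" buys you, the following Python sketch compares lighting computed only at two vertices against lighting computed at every pixel between them. It assumes a plain Lambertian diffuse term and linear interpolation along a single span; this is a generic illustration, not the 3DS fragment lighting pipeline, and every function name here is made up for the example.

```python
import math
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def normalize(v: Vec3) -> Vec3:
    length = math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])
    return (v[0] / length, v[1] / length, v[2] / length)


def lerp(a: Vec3, b: Vec3, t: float) -> Vec3:
    return (a[0] + (b[0] - a[0]) * t,
            a[1] + (b[1] - a[1]) * t,
            a[2] + (b[2] - a[2]) * t)


def diffuse(normal: Vec3, light_dir: Vec3) -> float:
    """Lambertian term: max(0, N . L) with both vectors normalized."""
    n, l = normalize(normal), normalize(light_dir)
    return max(0.0, n[0] * l[0] + n[1] * l[1] + n[2] * l[2])


def shade_span_per_vertex(n0: Vec3, n1: Vec3, light_dir: Vec3, pixels: int) -> List[float]:
    """Light the two endpoints only, then interpolate the intensity (pixels >= 2)."""
    i0, i1 = diffuse(n0, light_dir), diffuse(n1, light_dir)
    return [i0 + (i1 - i0) * (p / (pixels - 1)) for p in range(pixels)]


def shade_span_per_pixel(n0: Vec3, n1: Vec3, light_dir: Vec3, pixels: int) -> List[float]:
    """Interpolate the normal, then evaluate the lighting at every pixel (pixels >= 2)."""
    return [diffuse(lerp(n0, n1, p / (pixels - 1)), light_dir) for p in range(pixels)]


# Usage: a surface whose normals tilt away from the light at both ends.
left, right, light = (-1.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 0.0, 1.0)
per_vertex = shade_span_per_vertex(left, right, light, 5)
per_pixel = shade_span_per_pixel(left, right, light, 5)
```

With these inputs the per-vertex version produces a flat intensity across the span, while the per-pixel version peaks in the middle where the interpolated normal faces the light directly. Highlights like that are exactly what gets lost when lighting is only evaluated at vertices.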
While no more major features were merged into Citra by the turn of the year, Subv came up with one final teaser of things to come before letting 2015 come to a close. CROs, the dynamically linked libraries of the 3DS (similar to DLLs on Windows), blocked several very popular 3DS games from booting in Citra, such as Pokemon X, Y, Omega Ruby, Alpha Sapphire and Super Smash Bros. for 3DS. While the implementation was far from done, it was already able to boot some of the games that relied heavily on this feature.
{{< youtube 30NwGUYmIpU >}}
Citra in 2016
With 2016 already upon us, it's shaping up to be a pretty exciting year for Citra! In addition to the many tasks discussed in the 2015 progress reports that are still ongoing, we've got several exciting new features to look forward to:
- MerryMage has been making some exciting progress on an HLE implementation of the DSP, meaning audio support may come sooner than expected!
- tfarley has been working on a HW renderer optimization - "texture forwarding" - that minimizes copies of textures/framebuffers to/from emulated RAM. The result of this effort will be both a performance improvement as well as support for upscaled rendering!
- yuriks has plans to rewrite the vertex shader JIT to fix several inherent flaws, which will improve accuracy and address the crashing issues. yuriks also plans to improve memory and IPC emulation in the kernel HLE.
- Subv is still looking into CRO support, as well as mipmapping and scissor testing.
- ds84182 is currently working on an implementation of Pica's geometry shaders.
- bunnei has plans to work on Circle Pad Pro (required for getting Majora's Mask in game), CROs, and a JIT compiler for faster CPU emulation.
- Lioncash has plans to continue working on ARM11 CPU emulation, improving both accuracy and performance.
With all of these exciting new features on the way, it's quite possible that 2016 might be the year that Citra becomes official with a v1.0 release!
Special Thanks from Bunnei
While working on this article, I found that it was easy to find major changes to write about, but that these really only captured a fraction of the progress made in 2015 – as there was work done literally every single day to make Citra what it is now. While I wish I could draw attention to every contribution made, that’s just not reasonable to do in one article – so instead I’d like to personally thank each person that impacted Citra in the past year: Lioncash, yuriks, neobrain, Subv, archshift, linkmauve, tfarley, purpasmart96, aroulin, chinhodado, polaris-, zawata, kevinhartman, Cruel, Lectem, jroweboy, LittleWhite-tb, Normmatt, kemenaran, rohit-n, Yllodra, xsacha, darkf, filfat, SeannyM, uppfinnarn, Kingcom, Zaneo, wwylele, Zangetsu38, Apology11, Kloen, MoochMcGee, gwicks, vaguilar, chrisvj, Sethpaien, Gareth422, JSFernandes, esoteric-programmer, martinlindhe, clienthax, ILOVEPIE, zhuowei, Bentley, mailwl, Disruption, LFsWang, Antidote, ichfly.
I can’t wait to see what you guys bring for 2016!