citra-canary


This repository has been archived on 2024-03-23. You can view files and clone it, but cannot push or open issues or pull requests.


Lioncash b8d43d4dfb	2019-04-15 17:56:16 +02:00

common/swap: Improve codegen of the default swap fallbacks

Uses arithmetic that can be identified more trivially by compilers for
optimizations. e.g. Rather than shifting the halves of the value and then
swapping and combining them, we can swap them in place.

e.g. for the original swap32 code on x86-64, clang 8.0 would generate:

    mov     ecx, edi
    rol     cx, 8
    shl     ecx, 16
    shr     edi, 16
    rol     di, 8
    movzx   eax, di
    or      eax, ecx
    ret

while GCC 8.3 would generate the ideal:

    mov     eax, edi
    bswap   eax
    ret

now both generate the same optimal output.

MSVC used to generate the following with the old code:

    mov     eax, ecx
    rol     cx, 8
    shr     eax, 16
    rol     ax, 8
    movzx   ecx, cx
    movzx   eax, ax
    shl     ecx, 16
    or      eax, ecx
    ret     0

Now MSVC also generates a similar, but equally optimal result as clang/GCC:

    bswap   ecx
    mov     eax, ecx
    ret     0

====

In the swap64 case, for the original code, clang 8.0 would generate:

    mov     eax, edi
    bswap   eax
    shl     rax, 32
    shr     rdi, 32
    bswap   edi
    or      rax, rdi
    ret

(almost there, but still missing the mark)

while, again, GCC 8.3 would generate the more ideal:

    mov     rax, rdi
    bswap   rax
    ret

now clang also generates the optimal sequence for this fallback as well.

This is a case where MSVC unfortunately falls short, despite the new code,
this one still generates a doozy of an output.

    mov     r8, rcx
    mov     r9, rcx
    mov     rax, 71776119061217280
    mov     rdx, r8
    and     r9, rax
    and     edx, 65280
    mov     rax, rcx
    shr     rax, 16
    or      r9, rax
    mov     rax, rcx
    shr     r9, 16
    mov     rcx, 280375465082880
    and     rax, rcx
    mov     rcx, 1095216660480
    or      r9, rax
    mov     rax, r8
    and     rax, rcx
    shr     r9, 16
    or      r9, rax
    mov     rcx, r8
    mov     rax, r8
    shr     r9, 8
    shl     rax, 16
    and     ecx, 16711680
    or      rdx, rax
    mov     eax, -16777216
    and     rax, r8
    shl     rdx, 16
    or      rdx, rcx
    shl     rdx, 16
    or      rax, rdx
    shl     rax, 8
    or      rax, r9
    ret     0

which is pretty unfortunate.
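The contrast the commit message draws between swapping halves and swapping bytes in place can be sketched in C++ as follows. This is a minimal illustration, not the code from the commit: the function names (`swap16`, `swap32_halves`, `swap32_in_place`, `swap64_in_place`) are hypothetical, and the actual source in `common/swap` may differ in naming and detail.

```cpp
#include <cstdint>

// Hypothetical names throughout; a sketch of the two styles the commit
// message contrasts, not the project's actual implementation.

// 16-bit byte swap used by the "halves" variant below.
constexpr std::uint16_t swap16(std::uint16_t v) {
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

// "Halves" style: byte-swap each 16-bit half, then exchange and combine them.
constexpr std::uint32_t swap32_halves(std::uint32_t v) {
    return (static_cast<std::uint32_t>(swap16(static_cast<std::uint16_t>(v))) << 16) |
           swap16(static_cast<std::uint16_t>(v >> 16));
}

// "In place" style: mask each byte and shift it straight to its final position.
constexpr std::uint32_t swap32_in_place(std::uint32_t v) {
    return ((v & 0xFF000000u) >> 24) | ((v & 0x00FF0000u) >> 8) |
           ((v & 0x0000FF00u) << 8) | ((v & 0x000000FFu) << 24);
}

// The same mask-and-shift pattern extended to 64 bits.
constexpr std::uint64_t swap64_in_place(std::uint64_t v) {
    return ((v & 0xFF00000000000000ull) >> 56) | ((v & 0x00FF000000000000ull) >> 40) |
           ((v & 0x0000FF0000000000ull) >> 24) | ((v & 0x000000FF00000000ull) >> 8) |
           ((v & 0x00000000FF000000ull) << 8) | ((v & 0x0000000000FF0000ull) << 24) |
           ((v & 0x000000000000FF00ull) << 40) | ((v & 0x00000000000000FFull) << 56);
}
```

Both 32-bit variants return the same value for every input; the difference the commit message describes lies only in how readily a compiler recognizes the pattern as a single `bswap`. Whether a given compiler version does so is best checked by inspecting its generated assembly.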
android	android: add logging	2019-03-09 18:23:32 -06:00
audio_core	Destroy the callback after the stream is destroyed	2019-04-05 14:16:55 -06:00
citra	Merge pull request #4681 from FearlessTobi/port-2188-2190	2019-04-09 21:18:34 +02:00
citra_qt	Merge pull request #4726 from FearlessTobi/port-2312	2019-04-13 18:00:09 -04:00
common	common/swap: Improve codegen of the default swap fallbacks	2019-04-15 17:56:16 +02:00
core	Merge pull request #4716 from wwylele/client-is-known	2019-04-15 09:08:07 -04:00
dedicated_room	Fix getopt on systems where char is unsigned by default	2019-03-15 23:19:24 +00:00
input_common	general: Use deduction guides for std::lock_guard and std::unique_lock	2019-04-07 15:14:29 +02:00
network	general: Use deduction guides for std::lock_guard and std::unique_lock	2019-04-07 15:14:29 +02:00
tests	HLE/IPC: HLEContext can memorize the client thread and use it for SleepClientThread	2019-04-02 13:23:39 -04:00
video_core	Merge pull request #4726 from FearlessTobi/port-2312	2019-04-13 18:00:09 -04:00
web_service	Merge pull request #4726 from FearlessTobi/port-2312	2019-04-13 18:00:09 -04:00
.clang-format	add java to .clang-format	2019-02-22 16:29:19 -06:00
CMakeLists.txt	android: move cmakelist	2019-01-15 19:24:03 -06:00