yuzu-emu/yuzu-android
Archived
Commit Graph

218 Commits

Author SHA1 Message Date
Rodrigo Locatti 26f3e18c5c
Merge pull request #2976 from FernandoS27/cache-fast-brx-rebased
Implement Fast BRX, fix TXQ and adapt the Shader Cache for it
2019-10-26 16:56:13 -03:00
Fernando Sahmkow be856a38d6 Shader_IR: Address Feedback. 2019-10-26 15:38:30 -04:00
Rodrigo Locatti d52598173d
Merge pull request #3013 from FernandoS27/tld4s-fix
Shader_Ir: Fix TLD4S from using a component mask.
2019-10-25 20:06:26 -03:00
Fernando Sahmkow 33fcec3502 Shader_IR: allow lookup of texture samplers within the shader_ir for instructions that don't provide it 2019-10-25 09:01:30 -04:00
Lioncash 1f5401c89c video_core/shader: Resolve instances of variable shadowing
Silences a few -Wshadow warnings.
2019-10-23 23:00:31 -04:00
Fernando Sahmkow 1509d2ffbd Shader_Ir: Fix TLD4S from using a component mask.
TLD4S always outputs 4 values, the previous code checked a component 
mask and omitted those values that weren't part of it. This commit 
corrects that and makes sure all 4 values are set.
2019-10-22 10:59:07 -04:00
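The TLD4S fix described in the commit above can be illustrated with a small sketch. This is not the code from the commit: `Vec4`, `WriteMasked` and `WriteAll` are hypothetical names, and plain arrays stand in for shader registers. It only contrasts the old behaviour (skipping components outside a mask) with the corrected one (always writing all four gathered values).

```cpp
#include <array>

// Hypothetical stand-in for the four values produced by a gather operation.
using Vec4 = std::array<float, 4>;

// Old behaviour (as described in the commit message): components that are
// not enabled in the mask are skipped, so their destinations stay untouched.
void WriteMasked(const Vec4& gathered, unsigned mask, Vec4& dest) {
    for (unsigned i = 0; i < 4; ++i) {
        if ((mask & (1u << i)) != 0) {
            dest[i] = gathered[i];
        }
    }
}

// Corrected behaviour: all four values are always written.
void WriteAll(const Vec4& gathered, Vec4& dest) {
    dest = gathered;
}
```

With a mask of `0b0101`, `WriteMasked` leaves components 1 and 3 at their previous contents, which is the omission the commit message describes.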
ReinUsesLisp 1ea07954fb shader_ir/memory: Ignore global memory when tracking fails
When constant buffer tracking fails and we are blasting through asserts,
ignore global memory operations instead of invoking undefined behaviour.

In the case of LDG this means filling the destination registers with
zeroes; for STG this means ignore the instruction as a whole.

The default behaviour is still to abort execution on failure.
2019-10-22 02:49:17 -03:00
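A minimal sketch of the fallback described in the commit above, assuming a decoder that receives the tracking result as an optional value. `TrackedBuffer`, `Registers`, `DecodeLoadGlobal` and `DecodeStoreGlobal` are hypothetical names for illustration, not functions from the repository, and the successful path is left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical result of constant buffer tracking.
struct TrackedBuffer {
    std::uint32_t cbuf_index;
    std::uint32_t cbuf_offset;
};

// Hypothetical register file, small enough for a sketch.
using Registers = std::array<std::uint32_t, 8>;

// LDG: when tracking failed, fill the destination registers with zeroes.
// Returns true only when a real load would be emitted (omitted here).
bool DecodeLoadGlobal(const std::optional<TrackedBuffer>& tracked, Registers& regs,
                      std::size_t dest, std::size_t count) {
    if (!tracked) {
        for (std::size_t i = 0; i < count; ++i) {
            regs.at(dest + i) = 0; // at() throws instead of writing out of range
        }
        return false;
    }
    return true;
}

// STG: when tracking failed, ignore the instruction as a whole.
// Returns true only when a real store would be emitted (omitted here).
bool DecodeStoreGlobal(const std::optional<TrackedBuffer>& tracked) {
    return tracked.has_value();
}
```

Registers outside the destination range are left untouched, so a failed LDG only affects the values the instruction would have written.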
ReinUsesLisp 3d0f357307
shader/half_set_predicate: Fix HSETP2 for constant buffers
HSETP2 when used with a constant buffer parses the second operand type
as F32. This is not configurable.
2019-10-07 14:49:47 -03:00
ReinUsesLisp 632c9e4ee3
shader/half_set_predicate: Reduce DEBUG_ASSERT to LOG_DEBUG 2019-10-07 14:48:58 -03:00
bunnei 376f1a4432
Merge pull request #2869 from ReinUsesLisp/suld
shader/image: Implement SULD and fix SUATOM
2019-09-23 21:47:03 -04:00
Rodrigo Locatti 9286976948
Merge pull request #2878 from FernandoS27/icmp
shader_ir: Implement ICMP
2019-09-21 18:06:07 -03:00
ReinUsesLisp 44000971e2
gl_shader_decompiler: Use uint for images and fix SUATOM
In the process, remove the implementation of SUATOM.MIN and SUATOM.MAX, as
these require a distinction between U32 and S32. These have to be
implemented with an imageCompSwap loop.
2019-09-21 17:33:52 -03:00
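Why MIN and MAX need a signedness distinction, and what a compare-and-swap loop looks like, can be sketched on the CPU. This is an analogy using `std::atomic`, not the GLSL the commit refers to; `AtomicMinU32` and `AtomicMinS32` are hypothetical names. Both functions operate on the same 32-bit storage and differ only in how the bits are compared.

```cpp
#include <atomic>
#include <cstdint>

// Unsigned minimum via a compare-and-swap loop. Returns the previous value.
std::uint32_t AtomicMinU32(std::atomic<std::uint32_t>& cell, std::uint32_t operand) {
    std::uint32_t observed = cell.load();
    while (observed > operand) {
        // On failure, compare_exchange_weak reloads `observed` and we retry.
        if (cell.compare_exchange_weak(observed, operand)) {
            break;
        }
    }
    return observed;
}

// Signed minimum on the same storage: the bits are reinterpreted as S32
// before comparing. Returns the previous raw value.
std::uint32_t AtomicMinS32(std::atomic<std::uint32_t>& cell, std::int32_t operand) {
    std::uint32_t observed = cell.load();
    while (static_cast<std::int32_t>(observed) > operand) {
        if (cell.compare_exchange_weak(observed, static_cast<std::uint32_t>(operand))) {
            break;
        }
    }
    return observed;
}
```

For a cell holding 5 and an operand with the bit pattern `0xFFFFFFFF`, the unsigned version keeps 5 while the signed version stores -1, so a single untyped image format cannot serve both variants.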
ReinUsesLisp 675f23aedc
shader/image: Implement SULD and remove irrelevant code
* Implement SULD as float.
* Remove conditional declaration of GL_ARB_shader_viewport_layer_array.
2019-09-21 17:32:48 -03:00
Fernando Sahmkow 527b841c15 Shader_IR: ICMP corrections and fixes 2019-09-21 14:28:03 -04:00
bunnei 88d857499b
Merge pull request #2855 from ReinUsesLisp/shfl
shader_ir/warp: Implement SHFL for Nvidia devices
2019-09-20 17:10:42 -04:00
Fernando Sahmkow 4b81d19a1a Shader_IR: Implement ICMP. 2019-09-19 20:56:29 -04:00
bunnei b31880dc5e
Merge pull request #2784 from ReinUsesLisp/smem
shader_ir: Implement shared memory
2019-09-18 16:26:05 -04:00
ReinUsesLisp 0526bf1895 shader_ir/warp: Implement SHFL 2019-09-17 17:44:07 -03:00
ReinUsesLisp 36abf67e79 shader/image: Implement SUATOM and fix SUST 2019-09-10 20:22:31 -03:00
bunnei 34b2c60f95
Merge pull request #2823 from ReinUsesLisp/shr-clamp
shader/shift: Implement SHR wrapped and clamped variants
2019-09-10 11:56:17 -04:00
ReinUsesLisp 1f43e5296f gl_shader_decompiler: Keep track of written images and mark them as modified 2019-09-05 23:26:05 -03:00
ReinUsesLisp 4de04eba39 shader_ir: Implement LD_S
Loads from shared memory.
2019-09-05 01:38:37 -03:00
ReinUsesLisp f17415d431 shader_ir: Implement ST_S
This instruction writes to a memory buffer shared with threads within
the same work group. It is known as "shared" memory in GLSL.
2019-09-05 01:38:37 -03:00
ReinUsesLisp 77ef4fa907 shader/shift: Implement SHR wrapped and clamped variants
Nvidia defaults to wrapped shifts, but this is undefined behaviour in
OpenGL's spec. Explicitly mask/clamp according to what the guest shader
requires.
2019-09-04 01:55:24 -03:00
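The two shift variants named in the commit above can be sketched as follows. The helper names are hypothetical, and the exact ranges are assumptions: wrapping is taken to mean masking the shift amount to its low 5 bits, and clamping is taken to mean limiting it to 31. C++ has the same hazard as the one the commit mentions, since shifting a 32-bit value by 32 or more is undefined.

```cpp
#include <algorithm>
#include <cstdint>

// Wrapped variant (assumed semantics): only the low 5 bits of the shift
// amount are used, so a shift of 33 behaves like a shift of 1.
constexpr std::uint32_t ShiftRightWrapped(std::uint32_t value, std::uint32_t shift) {
    return value >> (shift & 31u);
}

// Clamped variant (assumed semantics): shift amounts above 31 are limited
// to 31, so a shift of 33 behaves like a shift of 31.
constexpr std::uint32_t ShiftRightClamped(std::uint32_t value, std::uint32_t shift) {
    return value >> std::min(shift, 31u);
}
```

Both helpers keep the shift amount within 0 to 31 before shifting, which is the explicit mask/clamp step the commit message describes. The sketch covers only the unsigned (logical) shift.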
ReinUsesLisp dfae2d141a half_set_predicate: Fix predicate assignments 2019-09-04 01:54:23 -03:00
bunnei 81fbc5370d
Merge pull request #2812 from ReinUsesLisp/f2i-selector
shader_ir/conversion: Implement F2I and F2F F16 selector
2019-09-03 22:35:33 -04:00
bunnei d4f33b822b
Merge pull request #2811 from ReinUsesLisp/fsetp-fix
float_set_predicate: Add missing negation bit for the second operand
2019-09-03 22:34:34 -04:00
Rodrigo Locatti 4d4f9cc104 video_core: Silence miscellaneous warnings (#2820)
* texture_cache/surface_params: Remove unused local variable

* rasterizer_interface: Add missing documentation commentary

* maxwell_dma: Remove unused rasterizer reference

* video_core/gpu: Sort member declaration order to silence -Wreorder warning

* fermi_2d: Remove unused MemoryManager reference

* video_core: Silence unused variable warnings

* buffer_cache: Silence -Wreorder warnings

* kepler_memory: Remove unused MemoryManager reference

* gl_texture_cache: Add missing override

* buffer_cache: Add missing include

* shader/decode: Remove unused variables
2019-08-30 14:08:00 -04:00
bunnei f8cc5668f8
Merge pull request #2758 from ReinUsesLisp/packed-tid
shader/decode: Implement S2R Tic
2019-08-29 12:58:43 -04:00
ReinUsesLisp e3534700d7 shader_ir/conversion: Split int and float selector and implement F2F H1 2019-08-28 16:09:33 -03:00
ReinUsesLisp b13fbc25b8 shader_ir/conversion: Implement F2I F16 Ra.H1 2019-08-27 23:40:40 -03:00
ReinUsesLisp 6207751b00 float_set_predicate: Add missing negation bit for the second operand 2019-08-27 21:57:43 -03:00
ReinUsesLisp 4e35177e23 shader_ir: Implement VOTE
Implement VOTE using Nvidia's intrinsics. Documentation about these can
be found here
https://developer.nvidia.com/reading-between-threads-shader-intrinsics

Instead of using portable ARB instructions I opted to use Nvidia
intrinsics because these are the closest we have to how Tegra X1
hardware renders.

To stub VOTE on non-Nvidia drivers (including nouveau) this commit
simulates a GPU with a warp size of one, returning what is meaningful
for the instruction being emulated:

* anyThreadNV(value) -> value
* allThreadsNV(value) -> value
* allThreadsEqualNV(value) -> true

ballotARB, also known as "uint64_t(activeThreadsNV())", emits

VOTE.ANY Rd, PT, PT;

on nouveau's compiler. This doesn't exactly match Nvidia's code,

VOTE.ALL Rd, PT, PT;

which is emulated with activeThreadsNV() by this commit. In theory this
shouldn't really matter since .ANY, .ALL and .EQ affect the predicates
(set to PT in those cases) and not the registers.
2019-08-21 14:50:38 -03:00
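The stub for non-Nvidia drivers listed in the commit above amounts to evaluating each vote as if the warp contained a single thread. A minimal sketch in C++ follows; the function names are hypothetical stand-ins for the intrinsics named in the commit message, not the emitted GLSL.

```cpp
// With a warp size of one, the calling thread is the only participant.

// "Any thread" is true exactly when this thread's value is true.
constexpr bool AnyThreadStub(bool value) {
    return value;
}

// "All threads" is also just this thread's value.
constexpr bool AllThreadsStub(bool value) {
    return value;
}

// A single thread always agrees with itself, whatever its value is.
constexpr bool AllThreadsEqualStub(bool /*value*/) {
    return true;
}
```

These three functions mirror the three mappings in the commit message one to one.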
bunnei dfdd20142e
Merge pull request #2777 from ReinUsesLisp/hsetp2-fe3h-fix
half_set_predicate: Fix HSETP2_C constant buffer offset
2019-08-21 10:29:17 -04:00
bunnei cedc1aab4a
Merge pull request #2753 from FernandoS27/float-convert
Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
2019-08-21 10:27:57 -04:00
bunnei ca61e298b3
Merge pull request #2778 from ReinUsesLisp/nop
shader_ir: Implement NOP
2019-08-18 08:51:34 -04:00
ReinUsesLisp 2ff8044806 shader_ir: Implement NOP 2019-08-04 03:02:55 -03:00
ReinUsesLisp ec0da3ef64 half_set_predicate: Fix HSETP2_C constant buffer offset 2019-08-04 02:50:55 -03:00
ReinUsesLisp 77f1a676a1 decode/half_set_predicate: Fix predicates 2019-07-26 00:12:38 -03:00
bunnei 31e8a61527
Merge pull request #2743 from FernandoS27/surpress-assert
Downgrade and suppress a series of GPU asserts and debug messages.
2019-07-25 12:34:36 -04:00
ReinUsesLisp 104641db07 shader/decode: Implement S2R Tic 2019-07-22 16:16:10 -03:00
Fernando Sahmkow 11f4e739bd Shader_Ir: Implement F16 Variants of F2F, F2I, I2F.
This commit takes care of implementing the F16 Variants of the 
conversion instructions and makes sure conversions are done.
2019-07-20 17:38:25 -04:00
Fernando Sahmkow 1158777737 Shader_Ir: Change Debug Asserts for Log Warnings 2019-07-19 22:15:34 -04:00
ReinUsesLisp 45c162444d shader/half_set_predicate: Fix HSETP2 implementation 2019-07-19 22:21:22 -03:00
ReinUsesLisp 6c4985edc9 shader/half_set_predicate: Implement missing HSETP2 variants 2019-07-19 22:20:47 -03:00
bunnei 63bda67a34
Merge pull request #2738 from lioncash/shader-ir
shader-ir: Minor cleanup-related changes
2019-07-18 13:52:01 -04:00
Fernando Sahmkow 5a06e33859 Shader_Ir: correct clang format 2019-07-18 10:09:26 -04:00
Fernando Sahmkow 0b65e9335e Shader_Ir: Downgrade precision and rounding asserts to debug asserts.
This commit reduces the severity of asserts for FP precision and
rounding, as these are well known and have little to no consequence for
the GPU's accuracy.
2019-07-18 08:17:19 -04:00
Fernando Sahmkow 223a535f3f
Merge pull request #2740 from lioncash/bra
shader/decode/other: Correct branch indirect argument within BRA handling
2019-07-17 14:25:08 -04:00
Lioncash 60926ac16b shader_ir: Rename Get/SetTemporal to Get/SetTemporary
This is more accurate in terms of describing what the functions are
actually doing. Temporal relates to time, not the setting of a temporary
itself.
2019-07-16 19:47:43 -04:00