41929371dc
Optimize AttributeBuffer to OutputVertex conversion. First I unrolled the inner loop, then I moved semantics validation out of the hot loop. I also added overflow slots to avoid conditional branches. Super Mario 3D Land's intro runs at almost full speed when compiled with Clang, and there's a noticeable speed increase with MSVC. GCC hasn't been tested, but I'm confident in its ability to optimize this code. |
||
---|---|---|
.. | ||
debug_data.h | ||
shader.cpp | ||
shader.h | ||
shader_interpreter.cpp | ||
shader_interpreter.h | ||
shader_jit_x64.cpp | ||
shader_jit_x64.h | ||
shader_jit_x64_compiler.cpp | ||
shader_jit_x64_compiler.h |
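
The three steps named in the commit message (unrolling the inner loop, validating semantics outside the hot loop, and using overflow slots instead of branches) can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names `SemanticMap`, `RegisterFile`, `OutVertex`, `SanitizeMap`, `ConvertVertex`, and all sizes and constants are assumptions made for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// All names and sizes below are hypothetical, chosen only for illustration.
constexpr std::size_t kNumRegs = 4;               // shader output registers
constexpr std::size_t kNumSlots = 8;              // real float slots in the vertex
constexpr std::size_t kOverflowSlot = kNumSlots;  // extra slot that absorbs unmapped writes
constexpr std::uint8_t kUnused = 0xFF;            // semantic value meaning "not mapped"

using SemanticMap = std::array<std::array<std::uint8_t, 4>, kNumRegs>;
using RegisterFile = std::array<std::array<float, 4>, kNumRegs>;

struct OutVertex {
    // One slot more than needed: the last entry is the overflow slot.
    std::array<float, kNumSlots + 1> slots{};
};

// Validation step, run once per shader configuration rather than per vertex.
// Every out-of-range semantic is redirected to the overflow slot.
SemanticMap SanitizeMap(const SemanticMap& raw) {
    SemanticMap clean{};
    for (std::size_t r = 0; r < kNumRegs; ++r) {
        for (std::size_t c = 0; c < 4; ++c) {
            if (raw[r][c] < kNumSlots) {
                clean[r][c] = raw[r][c];
            } else {
                clean[r][c] = static_cast<std::uint8_t>(kOverflowSlot);
            }
        }
    }
    return clean;
}

// Hot loop, run per vertex. The four component writes are unrolled by hand,
// and no bounds check is needed because the map was sanitized beforehand.
OutVertex ConvertVertex(const RegisterFile& regs, const SemanticMap& map) {
    OutVertex v;
    for (std::size_t r = 0; r < kNumRegs; ++r) {
        v.slots[map[r][0]] = regs[r][0];
        v.slots[map[r][1]] = regs[r][1];
        v.slots[map[r][2]] = regs[r][2];
        v.slots[map[r][3]] = regs[r][3];
    }
    return v;
}
```

A caller would run `SanitizeMap` once when the output configuration changes and then call `ConvertVertex` for every vertex. The contents of the overflow slot are never read, so whatever ends up there is harmless. The trade-off is one extra float per vertex in exchange for a branch-free inner loop.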