Some implementations can use the std::nullopt_t constructor of std::optional to avoid zeroing out the optional's entire internal buffer, and instead only set the validity byte within it.

For example, consider the following function:

    std::optional<std::vector<ShaderDiskCacheRaw>> fn() {
        return {};
    }

With libc++, this results in the following code generation on x86-64:

    fn():
        mov     rax, rdi
        vxorps  xmm0, xmm0, xmm0
        vmovups ymmword ptr [rdi], ymm0
        vzeroupper
        ret

With libstdc++, we get the equivalent:

    fn():
        vpxor   xmm0, xmm0, xmm0
        mov     rax, rdi
        vmovdqu XMMWORD PTR [rdi], xmm0
        vmovdqu XMMWORD PTR [rdi+16], xmm0
        ret

If we change this function to return std::nullopt instead, the code generation for both libc++ and libstdc++ simplifies to:

    fn():
        mov     BYTE PTR [rdi+24], 0
        mov     rax, rdi
        ret

Given how small a change is needed to get better code generation, this is essentially a "free", very minor optimization.

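The commit message describes the change in words only, so here is a minimal sketch of the two forms side by side. `RawEntry`, `Entries`, `LoadWithBraces`, and `LoadWithNullopt` are hypothetical names used for illustration and are not taken from the repository; `RawEntry` merely stands in for `ShaderDiskCacheRaw`. Both functions return an empty optional, so the difference is in the generated code, not in behavior.

```cpp
#include <optional>
#include <vector>

// Hypothetical stand-in for the project's ShaderDiskCacheRaw type.
struct RawEntry {
    int id;
};

using Entries = std::vector<RawEntry>;

// Empty braces select std::optional's default constructor. Per the commit
// message, some standard library implementations zero the whole object here.
std::optional<Entries> LoadWithBraces() {
    return {};
}

// Selects the std::nullopt_t constructor instead, which per the commit
// message may only need to clear the validity flag.
std::optional<Entries> LoadWithNullopt() {
    return std::nullopt;
}
```

A caller cannot tell the two apart: `has_value()` is `false` for both results. Whether the second form actually produces shorter code depends on the compiler, standard library version, and optimization level, so it is worth checking the generated assembly for the toolchain in use.
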
||
---|---|---|
texture_filters | ||
frame_dumper_opengl.cpp | ||
frame_dumper_opengl.h | ||
gl_format_reinterpreter.cpp | ||
gl_format_reinterpreter.h | ||
gl_rasterizer_cache.cpp | ||
gl_rasterizer_cache.h | ||
gl_rasterizer.cpp | ||
gl_rasterizer.h | ||
gl_resource_manager.cpp | ||
gl_resource_manager.h | ||
gl_shader_decompiler.cpp | ||
gl_shader_decompiler.h | ||
gl_shader_disk_cache.cpp | ||
gl_shader_disk_cache.h | ||
gl_shader_gen.cpp | ||
gl_shader_gen.h | ||
gl_shader_manager.cpp | ||
gl_shader_manager.h | ||
gl_shader_util.cpp | ||
gl_shader_util.h | ||
gl_state.cpp | ||
gl_state.h | ||
gl_stream_buffer.cpp | ||
gl_stream_buffer.h | ||
gl_surface_params.cpp | ||
gl_surface_params.h | ||
gl_vars.cpp | ||
gl_vars.h | ||
pica_to_gl.h | ||
post_processing_opengl.cpp | ||
post_processing_opengl.h | ||
renderer_opengl.cpp | ||
renderer_opengl.h |