Add code required to use OpenGL assembly programs based on
NV_gpu_program5. Decompilation for ARB programs is intended to be added
in a follow up commit. This does **not** include ARB decompilation and
it's not in an usable state.
The intention behind assembly programs is to reduce shader stutter
significantly on drivers supporting NV_gpu_program5 (and other required
extensions). Currently only Nvidia's proprietary driver supports these
extensions.
Add a UI option hidden for now to avoid people enabling this option
accidentally.
This code path has some limitations that OpenGL compatibility doesn't
have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single
OpenGL context state (I don't know if this is an intended limitation, an
specification issue or I am missing something). Currently causes issues
on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset
different to zero. The used workaround is to copy to a temporary buffer
(this doesn't happen often so it's not an issue).
On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over
individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and
hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike
ARB_separate_shader_objects.
Constant attributes (in OpenGL known disabled attributes) are not
supported on Vulkan, even with extensions. To emulate this behavior we
return zero on reads from disabled vertex attributes in shader code.
This has no caching cost because attribute formats are not dynamic state
on Vulkan and we have to store it in the pipeline cache anyway.
- Fixes Animal Crossing: New Horizons terrain borders
This was a left over from OpenGL when disabled buffers where not properly
emulated. We no longer have to assert this as it is checked in vertex
buffer initialization.
Previously we never cleared the states of the entries and the key would stay held down, also looping over the key bytes for each key lead to setting every bit for the key state instead of the key we wanted
"Not equal" operators on GLSL seem to behave as unordered when we expect
an ordered comparison.
Manually emulate this checking for LGE values (numbers, not-NaNs).