This also fixes Turing issues but it avoids doing more bitcasts. This
should improve the generated code while also avoiding more points where
compilers can flush floats.
Implements the common usages for VMNMX. Inputs with a different size
than 32 bits are not supported and sign mismatches aren't supported
either.
VMNMX works as follows:
It grabs Ra and Rb and applies a maximum/minimum on them (this is
defined by .MX), having in mind the input sign. This result can then be
saturated. After the intermediate result is calculated, it applies
another operation on it using Rc. These operations are merges,
accumulations or another min/max pass.
This instruction allows to implement with a more flexible approach GCN's
min3 and max3 instructions (for instance).
preserve_contents was always true. We can't assume we don't have to
preserve clears because scissored and color masked clears exist.
This removes preserve_contents and assumes it as true at all times.
This corrects the behavior of free buffer after witnessing it in an
unrelated hardware test. I haven't found any games affected by it but in
name of better accuracy we'll correct such behavior.
Since commit e22816a5bb we handle type mismatches from the CPU.
We don't need to hack our shader decoder due to game bugs anymore.
Removed in this commit.
On Windows, network shares use paths like \\server\share\file which were
being broken by FileUtil::SanitizePath() removing double slashes.
Changed the code in SanitizePath to permit a double-backslash if it
occurs at the start of a filepath (on Windows only).
Presentation context always has GL_DRAW_FRAMEBUFFER_BINDING as zero.
There is no need to bind the default framebuffer constantly.
According to Nsight this was using ~0.7ms per frame and it broke
renderdoc captures.
This is a reversed look up table extracted from
https://gist.github.com/rygorous/2203834#file-gistfile1-cpp-L41-L62
that is used in
04d4e9e587/source/maxwell/tsc_generate.cpp (L38)
Games usually bind 0xFD expecting a float texture border of 1.0f.
The conversion previous to this commit was multiplying the uint8 sRGB
texture border color by 255. This is close to 1.0f but when that
difference matters, some graphical glitches appear.
This look up table is manually changed in the edges, clamping towards
0.0f and 1.0f.
While we are at it, move this logic to its own translation unit.
Reimplements I2I adding sign extension, saturation (clamp source value
to the destination), selection and destination sizes that are not 32
bits wide.
It doesn't implement CC yet.