122 Commits

Author SHA1 Message Date
GPUCode
3f1f0aa7c2
arm: De-virtualize ThreadContext (#7119)
* arm: Move ARM_Interface to core namespace

* arm: De-virtualize ThreadContext
2023-11-06 17:55:30 -08:00
Wunk
e13735b624
video_core: Implement an arm64 shader-jit backend (#7002)
* externals: Add oaksim submodule

Used for emitting ARM64 assembly

* common: Implement aarch64 ABI

Utilize oaknut to implement a stack frame.

* tests: Allow shader-jit tests for x64 and a64

Run the shader-jit tests for both x86_64 and arm64 targets

* video_core: Initialize arm64 shader-jit backend

Passes all current unit tests!

* shader_jit_a64: protect/unprotect memory when jit-ing

Required on macOS. Memory must be fully unprotected before writing and
re-protected afterwards, or there will be memory access errors on
macOS.

* shader_jit_a64: Fix ARM64-Imm overflow

These conditionals were throwing exceptions since the immediate values
were overflowing the available space in the `EOR` instructions. Instead
they are generated from `MOV` and then `EOR`-ed after.

* shader_jit_a64: Fix Geometry shader conditional

* shader_jit_a64: Replace `ADRL` with `MOVP2R`

Fixes some immediate-generation exceptions.

* common/aarch64: Fix CallFarFunction

* shader_jit_a64: Optimize `SanitizedMul`

Co-authored-by: merryhime <merryhime@users.noreply.github.com>

* shader_jit_a64: Fix address register offset behavior

Based on https://github.com/citra-emu/citra/pull/6942
Passes unit tests.

* shader_jit_a64: Fix `RET` address offset

A64 stack is 16-byte aligned rather than 8. So a direct port of the x64
code won't work. Fixes weird branches into invalid memory for any
shaders with subroutines.
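
The alignment difference can be sketched as follows. This is a minimal illustration, not the code from this commit: `AlignUp` and `A64PushSize` are hypothetical helpers, and the sketch assumes the usual AArch64 configuration in which `SP` must be 16-byte aligned whenever it is used to access memory.

```cpp
#include <cassert>
#include <cstddef>

// Assumed A64 stack alignment (SP must stay a multiple of this on memory access).
constexpr std::size_t kA64StackAlignment = 16;

// Round `size` up to the next multiple of `alignment` (must be a power of two).
constexpr std::size_t AlignUp(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: stack bytes consumed when saving `count` 8-byte values.
// A single 8-byte return address still occupies a full 16-byte slot.
constexpr std::size_t A64PushSize(std::size_t count) {
    return AlignUp(count * 8, kA64StackAlignment);
}
```

With these definitions, saving one return address costs 16 bytes rather than the 8 bytes an x64 `push` would use, so offsets copied from the x64 code would point at the wrong slot.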

* shader_jit_a64: Increase max program size

Tuned for A64 program size.

* shader_jit_a64: Use `UBFX` for extracting loop-state

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Optimize `SUB+CMP` to `SUBS`

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Optimize `CMP+B` to `CBNZ`

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Use `FMOV` for `ONE` vector

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Remove x86-specific documentation

* shader_jit_a64: Use `UBFX` to extract exponent

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit_a64: Remove redundant MIN/MAX `SRC2`-NaN check

Special handling only needs to check SRC1 for NaN, not SRC2.
It would work as follows in the four possible cases:

No NaN: No special handling needed.
Only SRC1 is NaN: The special handling is triggered because SRC1 is NaN, and SRC2 is picked.
Only SRC2 is NaN: FMAX automatically picks SRC2 because it always picks the NaN if there is one.
Both SRC1 and SRC2 are NaN: The special handling is triggered because SRC1 is NaN, and SRC2 is picked.

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>
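
The four cases above can be checked with a small model. This is a sketch under stated assumptions, not the JIT code: `ReferenceMax` assumes the guest `MAX` behaves like an ordered compare that falls through to SRC2 (inferred from the cases listed), and `ModelFmax` is a simplified stand-in for `FMAX` that only models its NaN propagation.

```cpp
#include <cassert>
#include <cmath>

// Assumed guest semantics: any NaN makes the compare false, so SRC2 is picked.
inline float ReferenceMax(float src1, float src2) {
    return (src1 > src2) ? src1 : src2;
}

// Simplified model of A64 FMAX: a NaN input is returned as the result.
inline float ModelFmax(float a, float b) {
    if (std::isnan(a)) return a;
    if (std::isnan(b)) return b;
    return (a > b) ? a : b;
}

// Emulated MAX with the reduced check: only SRC1 is tested for NaN.
inline float EmulatedMax(float src1, float src2) {
    if (std::isnan(src1)) return src2; // special handling picks SRC2
    return ModelFmax(src1, src2);      // a NaN in SRC2 propagates on its own
}

// True if both values are NaN or compare equal.
inline bool SameResult(float a, float b) {
    return (std::isnan(a) && std::isnan(b)) || a == b;
}
```

Comparing `EmulatedMax` against `ReferenceMax` with `SameResult` for each combination of NaN and non-NaN inputs covers all four cases without a separate SRC2 check.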

* shader_jit/tests: Add catch-stringifier for vec2f/vec3f

* shader_jit/tests: Add Dest Mask unit test

* shader_jit_a64: Fix Dest-Mask `BSL` operand order

Passes the dest-mask unit tests now.

* shader_jit_a64: Use `MOVI` for DestEnable mask

Accelerate certain cases of masking with MOVI as well

Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>

* shader_jit/tests: Add source-swizzle unit test

This is not exhaustive. Generating all `4^4` cases seems to make Catch2
crash. So I've added some component-masking (non-reordering) tests based
on the Dest-Mask unit-test and some additional ones to test
broadcasts/splats and component re-ordering.

* shader_jit_a64: Fix swizzle index generation

This was still generating `SHUFPS` indices and not the ones that we wanted for the `TBL` instruction. Passes all unit tests now.
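
The difference between the two index formats can be sketched as below. This is an illustration only, not the code from this commit: `Selectors`, `ShufpsImmediate`, and `TblIndices` are hypothetical names, and the sketch assumes a swizzle is given as four component selectors (0 = x through 3 = w), one per destination lane. `SHUFPS` packs two bits per lane into an 8-bit immediate, while `TBL` needs one byte index per destination byte.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// selectors[i] names the source component (0 = x .. 3 = w) read by lane i.
using Selectors = std::array<std::uint8_t, 4>;

// SHUFPS-style immediate: 2 bits per destination lane, lane 0 in the low bits.
inline std::uint8_t ShufpsImmediate(const Selectors& sel) {
    return static_cast<std::uint8_t>(sel[0] | (sel[1] << 2) | (sel[2] << 4) | (sel[3] << 6));
}

// TBL-style indices: each 32-bit lane expands to four consecutive byte indices
// pointing at the selected source component.
inline std::array<std::uint8_t, 16> TblIndices(const Selectors& sel) {
    std::array<std::uint8_t, 16> indices{};
    for (std::size_t lane = 0; lane < 4; ++lane) {
        for (std::size_t byte = 0; byte < 4; ++byte) {
            indices[lane * 4 + byte] = static_cast<std::uint8_t>(sel[lane] * 4 + byte);
        }
    }
    return indices;
}
```

For the identity swizzle the immediate is `0xE4`, whereas the `TBL` vector is the bytes 0 through 15, so feeding one format to the other instruction selects the wrong data.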

* shader_jit/tests: Add `ShaderSetup` constructor to `ShaderTest`

Rather than using the direct output of `CompileShaderSetup`, allow a
`ShaderSetup` object to be passed in directly. This makes it possible to
emit assembly that is not directly supported by nihstro.

* shader_jit/tests: Add `CALL` unit-test

Tests nested `CALL` instructions to eventually reach an `EX2`
instruction.

EX2 is picked in particular since it is implemented as an even deeper
dispatch and ensures subroutines are properly implemented between `CALL`
instructions and implementation-calls.

* shader_jit_a64: Fix nested `BL` subroutines

`lr` was getting overwritten by nested calls to `BL`, causing undefined
behavior with mixtures of `CALL`, `EX2`, and `LG2` instructions.

Each usage of `BL` is now protected with a stack push/pop to preserve
and restore the `lr` register to allow nested subroutines to work
properly.

* shader_jit/tests: Allocate generated tests on heap

Each of these generated shader-test objects was causing the stack to
overflow. Allocate each of the generated tests on the heap and use
unique_ptr so they only exist within the lifetime of the `REQUIRE`
statement.

* shader_jit_a64: Preserve `lr` register from external function calls

`EMIT` makes an external function call, and should be preserving `lr`

* shader_jit/tests: Add `MAD` unit-test

The Inline Asm version requires an upstream fix:
https://github.com/neobrain/nihstro/issues/68

Instead, the program code is manually configured and added.

* shader_jit/tests: Fix uninitialized instructions

These `union`-type instruction-types were uninitialized, causing tests
to fail intermittently.

* shader_jit_a64: Remove unneeded `MOV`

Residue from the direct-port of x64 code.

* shader_jit_a64: Use `std::array` for `instr_table`

Add some type-safety and const-correctness around this type as well.

* shader_jit_a64: Avoid c-style offset casting

Add some more const-correctness to this function as well.

* video_core: Add arch preprocessor comments

* common/aarch64: Use X16 as the veneer register

https://developer.arm.com/documentation/102374/0101/Procedure-Call-Standard

* shader_jit/tests: Add uniform reading unit-test

Particularly to ensure that addresses are being properly truncated

* common/aarch64: Use `X0` as `ABI_RETURN`

`X8` is used as the indirect return result value in the case that the
result is bigger than 128-bits. Principally `X0` is the general-case
return register though.

* common/aarch64: Add veneer register note

`LR` is generally overwritten by `BLR` anyways, and would also be a safe
veneer to utilize for far-calls.

* shader_jit_a64: Remove unneeded scratch register from `SanitizedMul`

* shader_jit_a64: Fix CALLU condition

Should be `EQ` not `NE`. Fixes the regression on Kid Icarus.
No known regressions anymore!

---------

Co-authored-by: merryhime <merryhime@users.noreply.github.com>
Co-authored-by: JosJuice <JosJuice@users.noreply.github.com>
2023-11-05 21:40:31 +01:00
Steveice10
27bad3a699
audio_core: Replace AAC decoders with single FAAD2-based decoder. (#7098) 2023-11-04 14:56:13 -07:00
Castor215
ec55807669
build: fix build failure when not using precompiled headers (#7087)
Co-authored-by: vitor-k <vitor-kiguchi@hotmail.com>
2023-10-23 17:21:35 -03:00
Wunk
597297ffb4
tests: Fix out-of-bounds access (#7085) 2023-10-22 11:07:06 -07:00
GPUCode
ef43776c7b
shader: Fix address register offset behavior in x64 Jit (#6942)
* shader: Fix address register offset behavior in x64 Jit

* shader: Remove redundant jump

* tests: Add address register tests

* shader: Remove additional pre-multiplications by 16

* tests: Add catch-stringifier for vec4f

* tests: Format
2023-10-18 19:41:36 +03:00
SachinVin
72ff0c5337
AudioCore: Refactor DSP interrupt handling (#7026) 2023-10-04 15:44:59 +02:00
Vitor K
a35f8cbb78
fix include/namespace related compilation errors (#7019)
A user on Discord reported compilation errors when trying to compile
on Linux with GCC 13 and Clang 16.
2023-09-28 18:36:50 +05:30
GPUCode
88ea66053e
Miscellaneous fixes to gl backend and qt frontend (#6834)
* renderer_gl: Make rasterizer normal class member

* It doesn't need to be heap allocated anymore

* gl_rasterizer: Remove default_texture

* It's unused

* gl_rasterizer: General cleanup

* gl_rasterizer: Lower case lambdas

* Match style with review comments from vulkan backend

* rasterizer_cache: Prevent memory leak

* Since the switch from shared_ptr these surfaces were no longer being destroyed properly. Use our garbage collector for that purpose to destroy it safely for both backends

* rasterizer_cache: Make temp copy of old surface

* The custom surface would override the memory region of the old region resulting in garbage data, this ensures the custom surface is constructed correctly

* citra_qt: Manually create dialog tabs

* Allows for custom constructors which is very useful. While at it, global state is now eliminated from configuration

* citra_qt: Eliminate global system usage

* core: Remove global system usage in memory and HIO

* citra_qt: Use qOverload

* tests: Run clang format

* gl_texture_runtime: Fix surface scaling
2023-08-02 01:40:39 +03:00
GPUCode
f8b8b6e53c
core: De-globalize movie (#6659) 2023-08-01 02:57:38 +02:00
Steveice10
bb364d9bc0
service/apt: Add and implement more service commands. (#6721)
* service/apt: Add and implement more service commands.

* service/apt: Implement power button.

* Address review comments and fix GetApplicationRunningMode bug.
2023-07-29 00:26:16 -07:00
SachinVin
51996c54f0
audio_core\hle\adts_reader.cpp: Use BitField to parse ADTS header (#6719) 2023-07-28 12:15:58 -07:00
GPUCode
cf9bb90ae3
code: Use std::span where appropriate (#6658)
* code: Use std::span when possible

* code: Prefix memcpy and memcmp with std::
2023-07-07 01:52:40 +03:00
GPUCode
4ccd9f24fb
Merge pull request #6638 from GPUCode/new-log
common: Backport yuzu log improvements
2023-07-06 23:44:54 +03:00
Steveice10
9d4609e29a
build: Bundle libraries in-place as well on MSVC. (#6665) 2023-07-06 02:37:06 +02:00
Steveice10
13a8969824
build: Clear out remaining compile warnings. (#6662) 2023-07-04 21:00:24 -07:00
Wunk
0b37c1da57
shader_jit/tests: Add additional shader-jit tests (#6648)
* shader_jit/tests: Add support for multiple inputs

Allows for multiple `Vec4f` inputs for each run

* shader_jit/tests: Add additional shader-jit tests

Add some more expansive tests for each of the shader-instructions for
regression-testing.  `MAD`/`MADI` is not added due to an upstream bug in
nihstro:

https://github.com/neobrain/nihstro/issues/68
2023-07-03 02:44:56 +03:00
yzct12345
3641b9891d logging: Simplify and make thread-safe
This simplifies the logging system.

This also fixes some lost messages on startup.

The simplification is simple. I removed unused functions and moved most things in the .h to the .cpp. I replaced the unnecessary linked list with its contents laid out as three member variables. Anything that went through the linked list now directly accesses the backends. Generic functions are replaced with those for each specific use case and there aren't many. This change increases coupling but we gain back more KISS and encapsulation.

With those changes it was easy to make it thread-safe. I just removed the mutex and made a boolean atomic. I was planning to use this thread-safety in my next PR about stacktraces. It was actually async-signal-safety at first but I ended up using a different approach. Anyway, getting rid of the linked list is important for that because having the list of backends constantly changing complicates things.
2023-06-30 12:15:51 +03:00
GPUCode
9b82de6b24
Refactor software renderer (#6621) 2023-06-24 00:59:18 +02:00
SachinVin
5311c939a2 tests/audio_core: add sanity test cases for LLE vs HLE 2023-05-25 20:23:21 +05:30
SachinVin
8cada619b3 audio_core/hle: Refactor Binary Pipe data structures
audio_core\hle\ffmpeg_decoder.cpp: renames
2023-05-25 20:23:19 +05:30
SachinVin
41f13456c0
Chore: Enable warnings as errors on MSVC (#6456)
* tests: add Sanity test for SplitFilename83

fix test

fix test

* disable `C4715:not all control paths return a value` for nihstro includes

nihstro: no warn

* Chore: Enable warnings as errors on msvc + fix warnings

fixes

some more warnings

clang-format

* more fixes

* Externals: Add target_compile_options `/W0` nihstro-headers and ...

Revert "disable `C4715:not all control paths return a value` for nihstro includes"
This reverts commit 606d79b55d3044b744fb835025b8eb0f4ea5b757.

* src\citra\config.cpp: ReadSetting: simplify type casting

* settings.cpp: Get*Name: remove superfluous logs
2023-05-01 22:38:58 +03:00
Steveice10
ea649263b7
build: Improvements to bundled libraries support. (#6435) 2023-04-28 13:02:53 -07:00
Steveice10
a8848cce43 build: Update to support multi-arch builds. 2023-01-07 01:09:32 -08:00
Tobias
ccb50e7f2c
Port yuzu-emu/yuzu#9300: "CMake: Use precompiled headers to improve compile times" (#6213)
Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>
2022-12-17 16:06:38 +01:00
PabloMK7
016ce6c286
Add 3GX plugin loader (#6172)
* Initial plugin loader support

* More plugin loader progress

* Organize code and more plugin features

* Fix clang-format

* Fix compilation and add android gui

* Fix clang-format

* Fix macos build

* Fix copy-paste bug and clang-format

* More merge fixes

* Make suggestions

* Move global variable to static member

* Fix typo

* Apply suggestions

* Proper initialization order

* Allocate plugin memory from SYSTEM instead of APPLICATION

* Do not mark free pages as RWX

* Fix plugins in old 3DS mode.

* Implement KernelSetState and notif 0x203

* Apply changes

* Remove unused variable

* Fix dynarmic commit

* Sublicense files with MIT License

* Remove non-ascii characters from license
2022-12-11 10:08:58 +02:00
Tobias
3201943423
Port yuzu-emu/yuzu#4437: "core_timing: Make use of uintptr_t to represent user_data" (#5499)
Co-authored-by: LC <lioncash@users.noreply.github.com>
2022-11-06 02:24:45 +01:00
Tobias
1ddea27ac8
code: Cleanup and warning fixes from the Vulkan PR (#6163)
Co-authored-by: emufan4568 <geoster3d@gmail.com>
Co-authored-by: Kyle Kienapfel <Docteh@users.noreply.github.com>
2022-11-04 23:32:57 +01:00
GPUCode
cbd5d1c15c
Upgrade codebase to C++ 20 + fix warnings + update submodules (#6115) 2022-09-21 18:36:12 +02:00
BreadFish64
353aaaf665
Merge pull request #6010 from SachinVin/gunman
shader_jit: Fixes for Gunman Clive
2022-07-06 23:45:44 -05:00
SachinVin
65611e5b51 Shader jit: Save and restore LOOPCOUNT_REG for nested loops,
also add the assert back for nested loops
update test
2022-05-21 11:24:32 +05:30
Morph
adcc786ef2 tests: Resolve C4267 warning on MSVC 2022-05-18 00:05:41 -04:00
SachinVin
047e238d09 shader_jit: Compile nested loops
and use `T_NEAR` instead of the default in Compile_BREAKC
2022-04-24 23:12:53 +05:30
Tobias
664f5da105
tests: Fix warning about comparison between signed and unsigned (#5632)
Co-authored-by: comex <comexk@gmail.com>
2020-12-05 22:20:50 +01:00
Tobias
f6b543886c
Port yuzu-emu/yuzu#4528: "common: Make use of [[nodiscard]] where applicable" (#5535)
Co-authored-by: LC <712067+lioncash@users.noreply.github.com>
2020-08-31 21:06:16 +02:00
Ben
57aa18f52e
Improve core timing accuracy (#5257)
* Improve core timing accuracy

* remove wrong global_ticks, use biggest ticks over all cores for GetGlobalTicks

* merge max slice length change
2020-05-12 22:48:30 +02:00
Hamish Milne
f156fdd332 clang format fixes etc. 2020-03-31 18:27:33 +01:00
Hamish Milne
9bd189a155 More cleaning up 2020-03-29 19:07:56 +01:00
Hamish Milne
03379b2072 Merge remote-tracking branch 'upstream/master' into feature/savestates-2 2020-03-28 12:46:24 +00:00
Hamish Milne
1ff8d002a9
Merge pull request #5025 from jroweboy/tomoscrewme
Add CPU Clock Frequency slider
2020-03-28 12:31:41 +00:00
Hamish Milne
da3ab3d56e Merge branch 'master' into feature/savestates-2 2020-03-07 21:23:08 +00:00
Tobias
6d3d9f7a8a
core: Add support for N3DS memory mappings (#5103)
* core: Add support for N3DS memory mappings

* Address review comments
2020-02-29 19:48:27 +01:00
James Rowe
276d56ca9b Add CPU Clock Frequency slider
This slider affects the number of cycles that the guest cpu emulation
reports that have passed since the last time slice. This option scales
the result returned by a percentage that the user selects. In some games
underclocking the CPU can give a major speedup. Exposing this as an
option will give users something to toy with for performance, while also
potentially enhancing games that experience lag on the real console
2020-02-21 16:03:07 -07:00
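
The scaling described in the commit above might look roughly like this. It is a sketch, not the emulator's implementation: `ScaleReportedCycles` is a hypothetical helper, and the direction of the scaling (a setting below 100 makes each executed cycle count as more elapsed guest time) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the emulator's API.
// Assumption: 100 means stock speed; a lower value (underclock) inflates the
// number of cycles reported for the same amount of executed guest code.
inline std::uint64_t ScaleReportedCycles(std::uint64_t executed_cycles,
                                         std::uint32_t clock_percentage) {
    assert(clock_percentage > 0); // a zero percentage would divide by zero
    return executed_cycles * 100 / clock_percentage;
}
```

Under this assumption, a 50% setting reports 2000 cycles for 1000 executed ones, so a time slice ends after half as much emulated CPU work.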
Ben
55ec7031cc
Core timing 2.0 (#4913)
* Core::Timing: Add multiple timer, one for each core

* revert clang-format; work on tests for CoreTiming

* Kernel:: Add support for multiple cores, asserts in HandleSyncRequest because Thread->status == WaitIPC

* Add some TRACE_LOGs

* fix tests

* make some adjustments to qt-debugger, cheats and gdbstub(probably still broken)

* Make ARM_Interface::id private, rework ARM_Interface ctor

* ReRename TimingManager to Timing for smaller diff

* addressed review comments
2020-02-21 19:31:32 +01:00
Hamish Milne
116d22d562 Refactor out the wakeup_callback function pointer 2020-02-13 17:42:05 +08:00
Hamish Milne
96432589bd Use shared_ptr for PageTable 2020-02-13 17:42:04 +08:00
Hamish Milne
65d96bf6c1 Changed u8* to MemoryRef 2020-02-13 17:42:00 +08:00
Hamish Milne
d6862c2fca Some CI fixes 2020-02-13 17:40:52 +08:00
Hamish Milne
7b846ffa98 clang-format fixes 2020-02-13 17:39:15 +08:00
Hamish Milne
3ed8d95866 Serialize FS service; some compiler fixes 2020-02-13 17:38:24 +08:00