Born from years of industry experience. A modern runtime in active development on Windows, macOS, Xbox Series X|S, and Linux, designed from day one for the full console matrix. GPU-first where it matters, data-oriented in spirit, and explicit about ownership and fallback behavior. Targeting late 2027 release.
22 debug visualizers across all backends.
Hardware ray-traced shadows and reflections upgraded to path-traced GI as emissives light the scene.
RTXDI ReSTIR path tracing throughout. Path tracing runs with vendor-specific and standardized implementations through DXR and Vulkan KHR-RT.
Editor on the Metal 4 backend with world streaming cells overlaid, meshlet visualization on, and live culling stats. Resident, loaded, spawned, and despawned counts ticking through as the camera moves.
GPU-skinned characters with self-shadowing; full traced render pipeline including shadows, GI, and post-processing.
Sponza authoring world running on the D3D12 backend with deferred lighting, ray-traced features, HDR pipeline.
Same authoring world on the Vulkan backend, demonstrating cross-backend parity, plus dynamic and realistic atmospheric effects using probe-based DDGI.
Most engine pages are a wall of features. Here's what you actually do differently with Kapi: the workflows it enables, the iteration loops it shortens, and the things it lets you ship.
Mount additional viewports on different RHIs and inspect them together: final frame, render-graph passes, shader output, motion vectors, depth, debug surfaces. Catch cross-vendor regressions before they ship.
Editor
TLRC, DDGI, RTXDI ReSTIR, full path tracing: all swappable on the same scene at runtime. Look-dev moves at thinking speed.
Look-Dev
Recompile a game module, swap in the DLL, world state preserved. Shipping builds collapse to a static link with zero overhead.
Iteration
glTF and OpenUSD on import, Slang shaders compiled once for every backend. Whatever your DCC tool exports is what the engine consumes; cold-start is bounded by what changed since last run. PBR with glTF-extension surface model (clear coat, sheen, anisotropy, SSS, transmission) wired from day one.
Artist Workflow
Texture compression, mesh optimisation, meshlet generation, and shader variants dispatch across every CPU core, with multi-GPU acceleration where it pays. Content-addressed caching skips what didn't change; iteration loops in seconds, not coffee breaks.
Fast Imports
Every runtime and pipeline operation has a CLI path: cook, render, capture, replay, test. Software mode anywhere; multi-GPU acceleration when present.
Headless / CLI
Parallel-first end to end. Job system scales to extreme core counts (XCR), NUMA-group aware, multi-GPU pipeline. Add hardware, the engine uses it.
Parallel / Scalable
Standalone deterministic asset cooker with content-addressed cache and incremental DAG builds. Build farms work without graphics drivers; dev machines pull cached blobs.
Pipeline / CI
Tracy hooks ship in the engine: per-pass GPU markers, per-job CPU markers, per-system phase timing. PIX and RenderDoc captures work too, with real names.
Profiling
NVIDIA, AMD, Intel, Apple: all four vendors run the bindless heap, the visibility buffer, and the path tracer. One code path, one feature contract.
Cross-Vendor
Same authoring data drives all four profiles (P0–P3). Lower tier means lighter lighting and fewer rays, never different gameplay or different content.
Scalability
Small, layered, explicit. No proprietary scripting layer mediating every system, no reflection compiler regenerating headers. When a frame takes 18 ms instead of 16, you can find out why.
Transparency
A subset of the friction that shows up in big-engine workflows, and how Kapi is designed differently. Not a flame; just the design choices that change your day.
Recent renderer milestones landed in the engine codebase: full path tracing across every GPU vendor, the modern temporal stack, vendor upscalers, and multi-backend parity. The engine is targeting late 2027 release.
Temporal Light Radiance Cache. A modern lightweight GI option live on D3D12 & Vulkan, with screen-probe resolve, temporal reprojection, adaptive ray allocation, and a world-cache feedback loop.
Modern GI
macOS backend on the Metal 4 API with full path tracing, MetalFX upscaling, residency sets, and TBDR-aware pass ordering.
Apple · Metal 4 · MetalFX
NVIDIA NGX integration: DLSS Super Resolution and Ray Reconstruction wired into the render graph. Frame Generation capability is detected and reported; presentation routing is the next phase.
NVIDIA · SR + RR
AMD FidelityFX Super Resolution 3 frame interpolation wired across both D3D12 and Vulkan paths, sharing the engine's motion-vector and depth contracts.
AMD · FSR3-FG
RTXDI ReSTIR direct illumination and full path tracing running across all four GPU vendors (NVIDIA, AMD, Intel, and Apple) via DXR, Vulkan KHR-RT, and Metal 4 ray tracing.
NVIDIA · AMD · Intel · Apple
Dynamic Diffuse Global Illumination using ray-traced irradiance probes laid out on a regular grid, classifier-driven probe relocation, and 8-direction visibility. Selectable alongside TLRC.
Probe-Based GI
Complete HDR ownership from lighting through post-processing. HDR10 output on D3D12 & Vulkan with correct EOTF/PQ encoding at present.
HDR10
NVIDIA Real-time Denoiser integration covering REBLUR/RELAX for raytraced shadows, reflections, and GI, with a custom temporal denoiser as a feature-flagged fallback.
NRD · Custom
Unified bindless slot tracking across D3D12 and Vulkan. Materials, textures, structured buffers, samplers and acceleration structures all routed through a single index space.
D3D12 + Vulkan
Production VisBuffer pipeline with 12-bit drawcall, 8-bit triangle, 12-bit meshlet packed into a 32-bit payload. 64-vertex/124-triangle meshlets driven by GPU culling.
GPU-Driven
Graph-owned motion vector texture with camera/static reprojection. In-house TAA with neighborhood clamping, plus camera view history available to any temporal consumer.
Temporal Foundation
Ground Truth Ambient Occlusion and Conservative Morphological AA, both production-validated and selectable through the post-process path enum on either backend.
AO + AA
Feature parity across the live RHIs (DirectX 12, Vulkan, Metal 4, GDK / D3D12X for Xbox Series X|S, Vulkan on Linux). Side-by-side viewport comparison lets you A/B any pair in the editor. Remaining console SDKs on the roadmap.
Multi-RHI
Six points that shape how the engine is built. Expand for the full set of architectural invariants and code-philosophy rules.
Work moves to the GPU when the throughput gain is real. GPU-driven culling, skinning dispatch, scene BVH upload, and visibility-buffer material evaluation run on the device today.
32-bit packed visibility payload (12+8+12 bits) feeds compute material evaluation. 64-vertex/124-triangle meshlets driven by GPU frustum/occlusion/backface culling.
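As a rough illustration of the 12+8+12 split, the sketch below packs and unpacks a 32-bit visibility value. The field widths come from the text; the bit order, the struct, and the function names (`VisPayload`, `packVisibility`, `unpackVisibility`) are assumptions made for this example, not Kapi's actual layout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout: only the widths (12 + 8 + 12) are stated in the text.
// The bit order chosen here (drawcall high, meshlet middle, triangle low)
// is an assumption for illustration.
constexpr std::uint32_t kDrawBits = 12;
constexpr std::uint32_t kTriBits = 8;
constexpr std::uint32_t kMeshletBits = 12;
static_assert(kDrawBits + kTriBits + kMeshletBits == 32, "payload must fill 32 bits");

constexpr std::uint32_t fieldMask(std::uint32_t bits) { return (1u << bits) - 1u; }

struct VisPayload {
    std::uint32_t drawcall;  // 0..4095
    std::uint32_t triangle;  // 0..255, so a 124-triangle meshlet fits
    std::uint32_t meshlet;   // 0..4095
};

inline std::uint32_t packVisibility(const VisPayload& p) {
    // Out-of-range fields would silently corrupt neighbours, so check them.
    assert(p.drawcall <= fieldMask(kDrawBits));
    assert(p.triangle <= fieldMask(kTriBits));
    assert(p.meshlet <= fieldMask(kMeshletBits));
    return (p.drawcall << (kTriBits + kMeshletBits)) |
           (p.meshlet << kTriBits) |
           p.triangle;
}

inline VisPayload unpackVisibility(std::uint32_t v) {
    VisPayload p;
    p.drawcall = (v >> (kTriBits + kMeshletBits)) & fieldMask(kDrawBits);
    p.meshlet = (v >> kTriBits) & fieldMask(kMeshletBits);
    p.triangle = v & fieldMask(kTriBits);
    return p;
}
```

A material-evaluation pass would read one such value per pixel and unpack it to find the draw, meshlet, and triangle to shade; the 8-bit triangle field leaves headroom above the 124-triangle meshlet limit.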
Job system scales to all available CPU cores; NUMA-aware worker pinning via env flag. Lock-free atomic queues, DAG dependencies, deterministic mode for repro.
Render-graph DAG with automatic barrier coalescing and transient resource aliasing for VRAM savings. Live on DirectX 12, Vulkan, Metal 4, Xbox GDK (D3D12X), and Linux (Vulkan); remaining consoles on the roadmap.
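One common way to realise transient resource aliasing is to treat each resource as a lifetime interval over pass indices and let resources whose intervals never overlap share memory. The sketch below is a greedy version of that idea; the types and `assignAliasSlots` are hypothetical names for this example, and the engine's real allocator may work quite differently.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of a transient render-graph resource.
struct TransientResource {
    std::uint32_t firstPass;  // first pass that touches the resource
    std::uint32_t lastPass;   // last pass that touches it (inclusive)
    std::size_t sizeBytes;
};

// One block of memory that several non-overlapping resources can share.
struct AliasSlot {
    std::size_t sizeBytes;    // grows to fit the largest tenant
    std::uint32_t busyUntil;  // last pass of the most recent tenant
};

// Greedy interval assignment. Assumes `resources` is sorted by firstPass.
// Returns the slot index chosen for each resource and appends to `slots`.
inline std::vector<std::size_t> assignAliasSlots(
    const std::vector<TransientResource>& resources,
    std::vector<AliasSlot>& slots) {
    std::vector<std::size_t> slotOf;
    slotOf.reserve(resources.size());
    for (const TransientResource& r : resources) {
        std::size_t chosen = slots.size();
        for (std::size_t i = 0; i < slots.size(); ++i) {
            // A slot is reusable once its tenant's last pass has finished.
            if (slots[i].busyUntil < r.firstPass) {
                chosen = i;
                break;
            }
        }
        if (chosen == slots.size()) {
            slots.push_back(AliasSlot{0, 0});
        }
        slots[chosen].sizeBytes = std::max(slots[chosen].sizeBytes, r.sizeBytes);
        slots[chosen].busyUntil = r.lastPass;
        slotOf.push_back(chosen);
    }
    return slotOf;
}
```

The VRAM saving is the difference between the sum of all resource sizes and the sum of slot sizes. Reusing a slot also implies a barrier between the old and new tenant, which is where barrier placement and aliasing interact.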
POD-by-default data, contiguous storage, deterministic job scheduling. The full archetype ECS is on the roadmap; current systems use scene-graph storage with cache-friendly access.
Same data contract, different execution path. Lower-tier fallbacks reduce quality but never require different gameplay code. One codebase, all platforms.
Deterministic content-addressed cooking with DDC. Budget validation at cook time. Runtime consumes prebuilt streaming-friendly formats exclusively.
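To make "content-addressed" concrete: the cache key is derived from the bytes that go into a cook, so identical inputs always map to the same cached blob. The sketch below uses 64-bit FNV-1a purely because it is short; a production cooker would more likely use a cryptographic or 128-bit hash. `CookCache`, `contentKey`, and `cookOrFetch` are hypothetical names, and the "cook" step is a stand-in string operation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

constexpr std::uint64_t kFnvOffset = 0xcbf29ce484222325ULL;
constexpr std::uint64_t kFnvPrime = 0x100000001b3ULL;

// 64-bit FNV-1a over a byte range; `hash` lets calls be chained.
inline std::uint64_t fnv1a64(const char* data, std::size_t size,
                             std::uint64_t hash = kFnvOffset) {
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= static_cast<std::uint8_t>(data[i]);
        hash *= kFnvPrime;
    }
    return hash;
}

// Key covers source bytes and cook settings. The source length is mixed in
// so ("ab", "c") and ("a", "bc") cannot collide by concatenation.
inline std::uint64_t contentKey(const std::string& source,
                                const std::string& settings) {
    std::uint64_t h = fnv1a64(source.data(), source.size());
    const std::uint64_t length = source.size();
    for (int i = 0; i < 8; ++i) {
        h ^= (length >> (8 * i)) & 0xffu;
        h *= kFnvPrime;
    }
    return fnv1a64(settings.data(), settings.size(), h);
}

struct CookCache {
    std::unordered_map<std::uint64_t, std::string> blobs;
    int cookCount = 0;  // number of real cooks performed (cache misses)
};

inline std::string cookOrFetch(CookCache& cache, const std::string& source,
                               const std::string& settings) {
    const std::uint64_t key = contentKey(source, settings);
    const auto hit = cache.blobs.find(key);
    if (hit != cache.blobs.end()) {
        return hit->second;  // unchanged input: skip the cook entirely
    }
    ++cache.cookCount;
    std::string blob = "cooked(" + settings + "):" + source;  // stand-in work
    cache.blobs.emplace(key, blob);
    return blob;
}
```

Because the key depends only on content, a build farm and a developer machine compute the same key for the same inputs, which is what allows cached blobs to be shared between them.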
Module / DLL reload via DynamicModuleLibrary today; shader reload, engine hot-restart, and Live++ C++ patching designed for sub-second iteration.
Systems schedule jobs and declare dependencies. No direct OS threads as the authoring model. Synchronization on hot paths is minimized via lock-free atomics.
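The "declare dependencies, let the scheduler decide" model can be pictured with a small single-threaded sketch: jobs are nodes, dependencies are edges, and a job becomes ready when its pending-dependency count reaches zero. Picking the lowest-index ready job each time gives a repeatable order, which is one plausible reading of a deterministic mode. This is an assumption-laden illustration (`JobEdge` and `deterministicOrder` are made-up names), not the engine's lock-free scheduler.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical dependency declaration: `before` must finish before `after`.
struct JobEdge {
    std::size_t before;
    std::size_t after;
};

// Kahn's algorithm with a min-heap so ties always resolve to the lowest
// job index. Returns fewer than jobCount entries if the graph has a cycle.
inline std::vector<std::size_t> deterministicOrder(
    std::size_t jobCount, const std::vector<JobEdge>& edges) {
    std::vector<std::vector<std::size_t>> dependents(jobCount);
    std::vector<std::size_t> pending(jobCount, 0);
    for (const JobEdge& e : edges) {
        assert(e.before < jobCount && e.after < jobCount);
        dependents[e.before].push_back(e.after);
        ++pending[e.after];
    }

    std::priority_queue<std::size_t, std::vector<std::size_t>,
                        std::greater<std::size_t>> ready;
    for (std::size_t i = 0; i < jobCount; ++i) {
        if (pending[i] == 0) ready.push(i);
    }

    std::vector<std::size_t> order;
    order.reserve(jobCount);
    while (!ready.empty()) {
        const std::size_t job = ready.top();
        ready.pop();
        order.push_back(job);
        for (std::size_t d : dependents[job]) {
            if (--pending[d] == 0) ready.push(d);  // last dependency done
        }
    }
    return order;
}
```

In a multithreaded scheduler the `pending` counters would typically be atomics decremented by whichever worker finishes a job, which is where the lock-free wording in the text would apply.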
Game state structured for locality, batching, and vectorization. ECS with contiguous storage, SIMD-friendly data layout, and predictable iteration patterns.
Render-graph style dependency control with deliberate pass dependencies, resource transitions, and feature fallbacks. No hidden state machines.
Throughput-critical work moves to GPU. Gameplay-critical, deterministic, or rollback-sensitive work may remain CPU-owned when latency demands it.
Every simulation product declares CPU-owned, GPU-owned, or mirrored. Sync phases with explicit visibility rules prevent readback stalls.
CPU, GPU, memory, I/O, and thermal envelopes per subsystem. Cook-time validation. Runtime budget broker with degradation ladders and hysteresis.
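A degradation ladder with hysteresis can be sketched as a small state machine: step down a quality rung when a subsystem exceeds its budget, but only step back up once the cost falls clearly below the budget, so the level does not flicker around the threshold. The struct, its field names, and the default numbers below are illustrative assumptions, not values from the engine.

```cpp
// Hypothetical per-subsystem ladder. Level 0 is full quality; higher levels
// are cheaper. All defaults are made-up numbers for illustration.
struct DegradationLadder {
    int level = 0;
    int maxLevel = 3;
    float budgetMs = 4.0f;     // cost the subsystem is allowed per frame
    float hysteresis = 0.15f;  // fraction of budget required as headroom

    // Feed the measured cost for the last frame; returns the level to use next.
    int update(float measuredMs) {
        if (measuredMs > budgetMs) {
            if (level < maxLevel) ++level;  // over budget: degrade one rung
        } else if (measuredMs < budgetMs * (1.0f - hysteresis)) {
            if (level > 0) --level;         // clear headroom: restore one rung
        }
        // Inside the dead band [budget * (1 - hysteresis), budget]: hold.
        return level;
    }
};
```

With these defaults the dead band is roughly 3.4 ms to 4.0 ms: a 4.5 ms frame degrades, a following 3.8 ms frame holds, and only a frame under about 3.4 ms restores quality.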
No exceptions, no RTTI. Engine-owned types with explicit allocators in hot paths; std used pragmatically where it pulls its weight. constexpr/consteval over template metaprogramming.
No tracing garbage collection. Exclusively manual or deterministic reference counting with cycle-free invariants validated by telemetry.
Interface + implementation only. No virtual dispatch on hot paths. No dynamic_cast. Composition over inheritance throughout.
No SFINAE, CRTP frameworks, expression templates, or policy-template design. Narrow typed templates for containers and math only.
No global new/delete. Arena, pool, frame, stack, and TLSF allocators. Every allocation has a known lifetime and budget owner.
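For readers unfamiliar with the allocator families listed above, here is a minimal frame (bump) arena: allocation is a pointer increment, failure returns `nullptr` rather than throwing, and everything is released at once with `reset()`. `FrameArena` is a hypothetical name and this is a teaching sketch under those assumptions, not the engine's allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bump allocator over caller-owned memory. The caller decides where the
// backing buffer lives and how large it is, so the budget is explicit.
class FrameArena {
public:
    FrameArena(void* memory, std::size_t capacity)
        : base_(static_cast<std::uint8_t*>(memory)), capacity_(capacity) {}

    // `alignment` must be a power of two. Returns nullptr when exhausted.
    void* allocate(std::size_t size, std::size_t alignment) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        const std::uintptr_t start =
            reinterpret_cast<std::uintptr_t>(base_) + offset_;
        const std::uintptr_t aligned =
            (start + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
        const std::size_t padding = static_cast<std::size_t>(aligned - start);
        const std::size_t remaining = capacity_ - offset_;
        if (padding > remaining || size > remaining - padding) {
            return nullptr;  // out of space: no exception, caller handles it
        }
        void* result = base_ + offset_ + padding;
        offset_ += padding + size;
        return result;
    }

    // Frees every allocation at once, e.g. at the end of a frame.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::uint8_t* base_;
    std::size_t capacity_;
    std::size_t offset_ = 0;
};
```

Lifetime is tied to the arena rather than to individual objects: anything allocated during a frame is valid until `reset()`, which is what makes the lifetime and budget owner of each allocation easy to state.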
Zig and Rust permitted for offline tools (asset processors, build utilities). AngelScript for gameplay scripting with GC-dormant contract.
Four profiles drive every subsystem decision. Same content, same contract, scaled execution.
| | P0 Compatibility (Low-end / Fallback) | P1 Sustained Mobile (Phones & Tablets) | P2 Fixed High-BW (PS5 · Xbox Series X) | P3 Scalable Discrete (High-End PC) |
|---|---|---|---|---|
| Render path | Forward | Forward, TBDR-optimised | Visibility buffer | VisBuffer + mesh shaders + RT |
| Culling | CPU BVH frustum | CPU BVH frustum | GPU-driven | Meshlet cluster + Hi-Z |
| GI / lighting | Baked + SSGI fallback | Baked + selective probes | RT shadows, screen-probe GI | Full path tracing · RT GI · RT reflections |
| Post / AA | FXAA, basic bloom | TAA, GTAO | TAA + virtual textures | DLSS-RR, FSR3, full post chain |
| Simulation | CPU only | CPU + light GPU | GPU particles & audio | Full GPU simulation |
| Notable | — | Thermal governance, unified memory | Tier-1/2/3 RT detection | Multi-GPU pipeline acceleration |
For when you want to drill into a specific subsystem. 40+ engine systems organised into seven categories. Click any category to browse, or open the full reference grid below.
Flags: KAPI_JOBS_NUMA_PIN (env flag), KAPI_DETERMINISTIC_JOBS, KAPI_TRACY_ENABLED
Companion tools that let you drill through Kapi's execution layers, render passes, and runtime steps, without launching the engine.
Interactive explorer for the engine's execution layers and runtime steps. Drill from frame boundary down through render-graph passes, job graphs, GPU dispatches, and tooling hand-offs.
Live · Interactive
Static reference: every layer of the runtime from app host through render graph to GPU backends. Visibility-buffer payload, pipeline flow, and per-pass tables.
Reference
Live on Windows (DirectX 12 & Vulkan), macOS (Metal 4), Xbox Series X|S (GDK · D3D12X), and Linux (Vulkan). Remaining console matrix in development. Engine targets late 2027 release.