You Can Use Vulkan Without Pipelines Today - Khronos Blog

Today, Khronos® is releasing a new multi-vendor Vulkan® extension that aims to radically simplify how applications specify shaders and shader state while maintaining Vulkan’s ethos of being a highly performant “API without secrets.”

This extension is VK_EXT_shader_object. It introduces a new VkShaderEXT object type, which represents a single compiled shader stage, along with four new functions to manipulate and use VkShaderEXT objects.

Shader objects serve a role similar to pipelines but expose a simpler, more flexible interface designed to empower applications whose architectures and/or needs for “dynamism” might have previously made the use of pipelines difficult. In environments where VK_EXT_shader_object is supported, applications can choose to use only pipelines, only shader objects, or an arbitrary mix of the two.

In comparison with pipelines, shader objects differ in a number of key ways:

  • Pipelines require every desired combination of shaders (all stages when using conventional “monolithic” pipelines, or certain pre-defined combinations of stages when using pipeline libraries) to be compiled together, whereas shader objects allow stages to be compiled in arbitrary combinations, including compiling every shader individually.
  • With pipelines, linking is an explicit step that creates a new object with its own lifetime that needs to be managed by the application. With shader objects, linking is simply a creation-time promise from the application to the implementation that it will always use certain combinations of shader objects together.
  • Pipelines must always be linked before being used, whereas linking shader objects is optional.
  • Pipelines allow implementations to require some state to be statically provided at compile time, whereas with shader objects all state is always set dynamically and is independent from shaders.
  • Pipelines require rendering attachment formats to be specified at pipeline creation time, whereas shader objects don’t. Shader objects can be used with any valid combination of attachment formats supported by the device.
  • With pipelines, compiled shader code can be retrieved and reused by the application in the form of a pipeline cache, but this data is not usable by itself to create new pipelines. With shader objects, compiled shader code can be retrieved directly from any shader object and is guaranteed to be usable to create an equivalent shader object on any compatible physical device without the need to provide the original SPIR-V.

In summary, shader objects impose substantially fewer restrictions on applications compared to pipelines, and enable dynamism-heavy applications like games and game engines to avoid the explosive pipeline permutation combinatorics, which until now might have been seen as a cost of admission for access to modern graphics APIs.
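To make the permutation problem concrete, here is a minimal arithmetic sketch. The function names and the example counts are hypothetical, chosen only for illustration; a real application rarely draws with every possible combination, so the pipeline figure is a worst case rather than a measurement.

```c
#include <stdint.h>

/* Worst case for monolithic pipelines: one pipeline object for every
 * (vertex shader, fragment shader, static state variant) combination. */
uint64_t worst_case_pipeline_count(uint64_t vertex_shaders,
                                   uint64_t fragment_shaders,
                                   uint64_t static_state_variants)
{
    return vertex_shaders * fragment_shaders * static_state_variants;
}

/* Shader objects compiled individually: one object per shader. State is
 * set dynamically, so state variants add no extra objects. */
uint64_t unlinked_shader_object_count(uint64_t vertex_shaders,
                                      uint64_t fragment_shaders)
{
    return vertex_shaders + fragment_shaders;
}
```

With 20 vertex shaders, 50 fragment shaders, and 8 state variants, the first function returns 8000 while the second returns 70.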

For developers who may have considered Vulkan in the past but were intimidated by the architectural implications of using pipelines, Vulkan with shader objects should warrant another look.

Using Shader Objects

Applications can create one or more VkShaderEXT objects by calling vkCreateShadersEXT() with an array of creation parameters, one for each shader:

VKAPI_ATTR VkResult VKAPI_CALL vkCreateShadersEXT(
    VkDevice                                    device,
    uint32_t                                    createInfoCount,
    const VkShaderCreateInfoEXT*                pCreateInfos,
    const VkAllocationCallbacks*                pAllocator,
    VkShaderEXT*                                pShaders);

The creation parameters are specified by VkShaderCreateInfoEXT, which is defined as follows:

typedef struct VkShaderCreateInfoEXT {
    VkStructureType                 sType;
    const void*                     pNext;
    VkShaderCreateFlagsEXT          flags;
    VkShaderStageFlagBits           stage;
    VkShaderStageFlags              nextStage;
    VkShaderCodeTypeEXT             codeType;
    size_t                          codeSize;
    const void*                     pCode;
    const char*                     pName;
    uint32_t                        setLayoutCount;
    const VkDescriptorSetLayout*    pSetLayouts;
    uint32_t                        pushConstantRangeCount;
    const VkPushConstantRange*      pPushConstantRanges;
    const VkSpecializationInfo*     pSpecializationInfo;
} VkShaderCreateInfoEXT;

Any valid shader object can be queried by an application to retrieve a binary representation of its compiled shader code. This binary shader code can be passed to a future vkCreateShadersEXT() call on any compatible device to create a functionally equivalent shader object. The full details of this mechanism are described in the specification.

At command buffer recording time, instead of binding a pipeline, applications directly bind one or more shader objects using vkCmdBindShadersEXT():

VKAPI_ATTR void VKAPI_CALL vkCmdBindShadersEXT(
    VkCommandBuffer                             commandBuffer,
    uint32_t                                    stageCount,
    const VkShaderStageFlagBits*                pStages,
    const VkShaderEXT*                          pShaders);

State is set using the same vkCmdSet*() functions that are part of core Vulkan or that were introduced by the VK_EXT_extended_dynamic_state, VK_EXT_extended_dynamic_state2, VK_EXT_extended_dynamic_state3, or VK_EXT_vertex_input_dynamic_state extensions. As long as the VK_EXT_shader_object extension and shaderObject feature are enabled, these dynamic state extensions and their features don’t need to be explicitly enabled (or even advertised as supported by the driver) in order to use their functions with shader objects.

This means that when drawing with shader objects, all state is dynamic. However, not all state always needs to be set. Applications are only required to set those pieces of state that affect enabled functionality.

For example, if vkCmdSetDepthTestEnable() was most recently called with depthTestEnable set to VK_FALSE, vkCmdSetDepthCompareOp() doesn’t need to have ever been called before drawing.

The full specifics of exactly which states must be set under which circumstances are spelled out explicitly in the specification. Of course, applications are free to ignore these rules and set as much “extra” state as they want.

Performance

One natural question to ask at this point is whether all this new flexibility comes at some performance cost. After all, if pipelines as they were originally conceived needed so many more restrictions, how can those restrictions be rolled back without negative consequences?

On some implementations, there is no downside. On these implementations, unless your application calls every state setter before every draw, shader objects outperform pipelines on the CPU and perform no worse than pipelines on the GPU. Unlocking the full potential of these implementations has been one of the biggest motivating factors driving the development of this extension.

On other implementations, shader objects carry some extra implementation overhead. Even so, the CPU time saved by simpler application code written against the shader object APIs can exceed that overhead, so the application as a whole can still outperform an equivalent one redesigned to use pipelines.

In either case, all conformant VK_EXT_shader_object implementations are tested to meet specific performance requirements:

  • Draw calls using shader objects must not take more than 150% of the CPU time of draw calls using fully static graphics pipelines
  • Draw calls using shader objects must not take more than 120% of the CPU time of draw calls using maximally dynamic graphics pipelines
  • Dispatch calls using compute shader objects must not be measurably slower than dispatch calls using compute pipelines
  • Creating a shader object from binary shader code must not take more than 150% of the CPU time of copying an equivalent amount of data into device local memory

These tests are intended to establish a minimum performance bar for VK_EXT_shader_object implementations that developers can rely on. This means that if a driver advertises support for VK_EXT_shader_object, you can depend on it to perform well.
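For developers who want to compare their own measurements against these ratios, the check reduces to simple integer arithmetic. The sketch below is not the conformance test itself. The function name and the nanosecond inputs are hypothetical, and the timings are assumed to have been collected by the application's own profiling.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when shader_object_ns is at most max_percent percent of
 * pipeline_ns. Multiplying both sides by 100 avoids a division and the
 * rounding that would come with it. */
bool within_cpu_budget(uint64_t shader_object_ns,
                       uint64_t pipeline_ns,
                       uint32_t max_percent)
{
    return shader_object_ns * 100u <= pipeline_ns * (uint64_t)max_percent;
}
```

For example, a shader object draw measured at 1500 ns against a fully static pipeline draw at 1000 ns sits exactly on the 150% limit, while 1300 ns against a maximally dynamic pipeline draw at 1000 ns exceeds the 120% limit.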

If you’re interested in the details of this extension’s performance goals and design criteria, or just more information about some of the motivations that drove the development of this extension, please see the formal extension proposal.

Using Shader Objects Today

An NVIDIA beta driver with support for VK_EXT_shader_object is being released today. It can be downloaded from https://developer.nvidia.com/vulkan-driver.

Also today, the source code for a new layer that implements VK_EXT_shader_object support on top of existing functionality already shipping across a wide range of Vulkan drivers is being released in the Vulkan-ExtensionLayer repository. Binary builds of this layer will ship as part of the Vulkan SDK beginning with the next release.

The layer has been designed from the ground up with integration into game codebases in mind, and every care has been taken to meet the expectations of game developers in terms of performance, disciplined allocation and use of memory, debuggability, and predictability of performance and other important runtime characteristics. It aims to silently do the best possible job of translating the application's use of shader object APIs into use of pipeline APIs by internally making use of any available dynamic state, plus the VK_EXT_graphics_pipeline_library extension if available. When running on top of a driver with native support for VK_EXT_shader_object, the layer turns itself off.

This means that you can start writing shader object code today without having to maintain pipeline-based code as a fallback. You can even ship the layer alongside your application. As support for VK_EXT_shader_object becomes more widespread and users update their drivers, your application will gradually experience more benefits from more performant native implementations.

Ray Tracing

The base VK_EXT_shader_object extension does not include support for ray tracing.

However, Vulkan Working Group interest in a separate future extension to support ray tracing shader objects has been accounted for in this extension’s design, and so a solid foundation exists for when work on such an extension begins.

If you are a Vulkan developer interested in using ray tracing with shader objects, please don’t hesitate to bring it up in the Vulkan Discord, a GitHub issue, or in an upcoming Vulkan Ecosystem Survey.

Conclusion

VK_EXT_shader_object represents the culmination of years of work to bring Vulkan developers ever more powerful and easier-to-use ways to manage shaders and state. We’re proud of what we have achieved and can’t wait to see what you can do with this extension once you get your hands on it.

If you’re ready to dive in, you can find more detailed information in the proposal and the Shader Objects chapter of the Vulkan specification.