Occupancy and Resource Usage Optimization with Large Thread Groups

Welcome to our guest posting from Sebastian Aaltonen, co-founder of Second Order LTD and previously senior rendering lead at Ubisoft®. Second Order have recently announced their first game, Claybook! Alongside the game looking like really great fun, its renderer is novel, using the GPU in very non-traditional ways in order to achieve its look. Sebastian is going to cover an interesting problem he faced while working on Claybook: how you can optimize GPU occupancy and resource usage of compute shaders that use large thread groups.

When using a compute shader, it is important to consider the impact of thread group size on performance. Limited register space, memory latency and SIMD occupancy each affect shader performance in different ways. This article discusses potential performance issues, along with techniques and optimizations that can dramatically increase performance if correctly applied. It focuses on the problem set of large thread groups, but these tips and tricks are helpful in the common case as well.

The DirectX® 11 Shader Model 5 compute shader specification (2009) mandates a maximum allowable memory size per thread group of 32 KiB, and a maximum thread group size of 1024 threads. There is no specified maximum register count, and the compiler can spill registers to memory if necessitated by register pressure.
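As a rough illustration of the Shader Model 5 limits quoted in this post, the sketch below divides the 32 KiB thread group memory budget evenly across the threads of a group. The helper name `shared_bytes_per_thread` is hypothetical, and the even split is only an assumption made for illustration; a real shader is free to distribute that memory between its threads however it likes.

```python
# Limits quoted in the post for DirectX 11 Shader Model 5 compute shaders.
MAX_GROUP_SHARED_BYTES = 32 * 1024  # 32 KiB of memory per thread group
MAX_GROUP_THREADS = 1024            # maximum thread group size


def shared_bytes_per_thread(group_threads, group_shared_bytes=MAX_GROUP_SHARED_BYTES):
    """Hypothetical helper: bytes of group memory per thread, assuming an even split."""
    if not 1 <= group_threads <= MAX_GROUP_THREADS:
        raise ValueError("thread group size must be between 1 and 1024")
    if not 0 <= group_shared_bytes <= MAX_GROUP_SHARED_BYTES:
        raise ValueError("group memory must be between 0 and 32 KiB")
    # Integer division: any remainder is simply left unassigned in this sketch.
    return group_shared_bytes // group_threads


if __name__ == "__main__":
    for threads in (64, 256, 1024):
        print(threads, "threads ->", shared_bytes_per_thread(threads), "bytes per thread")
    # 64 threads -> 512 bytes per thread
    # 256 threads -> 128 bytes per thread
    # 1024 threads -> 32 bytes per thread
```

At the maximum group size of 1024 threads, the full 32 KiB budget works out to just 32 bytes per thread, which is eight 32-bit values. This is one reason resource usage deserves extra attention when thread groups are large.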