I just started using the Compute Shader stage in DirectX 11 and ran into unwanted behavior when writing to an output resource from the compute shader. I get only zeros as output, which, as I understand it, suggests that out-of-bounds reads were performed in the compute shader. (Out-of-bounds writes result in no-ops.)
Creating the compute shader components
Input resources
First I create an ID3D11Buffer* for the input. This is passed as the resource when creating the SRV that serves as input to the compute shader stage. If the input never changed, we could release the ID3D11Buffer* object after creating the SRV, since the SRV would act as a handle to the resource.
However, I want to update the input every frame, so I keep the buffer around for mapping.
Create SRV using the newly created buffer as a resource
    D3D11_SHADER_RESOURCE_VIEW_DESC srvDesc;
    srvDesc.Format = DXGI_FORMAT_UNKNOWN;
    srvDesc.ViewDimension = D3D11_SRV_DIMENSION_BUFFEREX;
    srvDesc.BufferEx.FirstElement = 0;
    srvDesc.BufferEx.Flags = 0;
    srvDesc.BufferEx.NumElements = NUM_PARTICLES;
    hr = device->CreateShaderResourceView( mInputBuffer, &srvDesc, &mInputView );
Output resources
Now I need a resource for the compute shader to write to. I will also create a system-memory (staging) version of the buffer for reading the results back. I will use the ID3D11DeviceContext::CopyResource method to copy data from the compute shader's output buffer, which is bound through the UAV, to the system-memory version, and then map that version to read and save its contents.
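To make the direction of that copy explicit, here is a minimal sketch that does not touch Direct3D at all. FakeBuffer, copyResource and readBack are hypothetical stand-ins made up for illustration; the only thing they mirror is the documented parameter order of ID3D11DeviceContext::CopyResource, which takes the destination resource first and the source resource second.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GPU or staging buffer (not a D3D11 type).
struct FakeBuffer {
    std::vector<float> data;
};

// Hypothetical stand-in that mirrors the parameter order of
// ID3D11DeviceContext::CopyResource( pDstResource, pSrcResource ):
// destination first, source second. Both buffers must have the same size.
inline void copyResource( FakeBuffer& dst, const FakeBuffer& src ) {
    assert( dst.data.size() == src.data.size() );
    dst.data = src.data;
}

// Copies the "GPU" output into a zero-initialized staging stand-in
// and returns the staging copy for CPU-side reading.
inline FakeBuffer readBack( const FakeBuffer& gpuOutput ) {
    FakeBuffer staging{ std::vector<float>( gpuOutput.data.size(), 0.0f ) };
    copyResource( staging, gpuOutput );  // staging is the destination
    return staging;
}
```

In this stand-in, passing the two buffers in the opposite order would leave the staging copy at its initial zeros, so the argument order is worth double-checking in the real call.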
Create a read-write buffer that the compute shader can write to (D3D11_BIND_UNORDERED_ACCESS).
    D3D11_BUFFER_DESC outputDesc;
    outputDesc.Usage = D3D11_USAGE_DEFAULT;
    outputDesc.ByteWidth = sizeof(ParticleData) * NUM_PARTICLES;
    outputDesc.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
    outputDesc.CPUAccessFlags = 0;
    outputDesc.StructureByteStride = sizeof(ParticleData);
    outputDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_STRUCTURED;
    hr = ( device->CreateBuffer( &outputDesc, 0, &mOutputBuffer ) );
Create a system-memory version of the buffer to read the results back from
    outputDesc.Usage = D3D11_USAGE_STAGING;
    outputDesc.BindFlags = 0;
    outputDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
    hr = ( device->CreateBuffer( &outputDesc, 0, &mOutputResultBuffer ) );
Create a UAV for the compute shader to write its results to
    D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc;
    uavDesc.Buffer.FirstElement = 0;
    uavDesc.Buffer.Flags = 0;
    uavDesc.Buffer.NumElements = NUM_PARTICLES;
    uavDesc.Format = DXGI_FORMAT_UNKNOWN;
    uavDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
    hr = device->CreateUnorderedAccessView( mOutputBuffer, &uavDesc, &mOutputUAV );
Executing the compute shader (each frame)
C++
    mParticleSystem.FillConstantDataBuffer( mDeviceContext, mInputBuffer );

    // Enable Compute Shader
    mDeviceContext->CSSetShader( mComputeShader, nullptr, 0 );
    mDeviceContext->CSSetShaderResources( 0, 1, &mInputView );
    mDeviceContext->CSSetUnorderedAccessViews( 0, 1, &mOutputUAV, 0 );

    // Dispatch
    mDeviceContext->Dispatch( 1, 1, 1 );

    // Unbind the input textures from the CS for good housekeeping
    ID3D11ShaderResourceView* nullSRV[] = { NULL };
    mDeviceContext->CSSetShaderResources( 0, 1, nullSRV );

    // Unbind output from compute shader
    ID3D11UnorderedAccessView* nullUAV[] = { NULL };
    mDeviceContext->CSSetUnorderedAccessViews( 0, 1, nullUAV, 0 );

    // Disable Compute Shader
    mDeviceContext->CSSetShader( nullptr, nullptr, 0 );

    // Copy result
    mDeviceContext->CopyResource( mOutputBuffer, mOutputResultBuffer );

    // Update particle system data with output from Compute Shader
    D3D11_MAPPED_SUBRESOURCE mappedResource;
    HRESULT hr = mDeviceContext->Map( mOutputResultBuffer, 0, D3D11_MAP_READ, 0, &mappedResource );
    if( SUCCEEDED( hr ) )
    {
        ParticleData* dataView = reinterpret_cast<ParticleData*>(mappedResource.pData);

        // Update particle positions and velocities
        mParticleSystem.UpdatePositionAndVelocity( dataView );

        mDeviceContext->Unmap( mOutputResultBuffer, 0 );
    }
HLSL
    struct ConstantParticleData
    {
        float3 position;
        float3 velocity;
        float3 initialVelocity;
    };

    struct ParticleData
    {
        float3 position;
        float3 velocity;
    };

    StructuredBuffer<ConstantParticleData> inputConstantParticleData : register( t0 );
    RWStructuredBuffer<ParticleData> outputParticleData : register( u0 );

    [numthreads(32, 1, 1)]
    void CS_main( int3 dispatchThreadID : SV_DispatchThreadID )
    {
        outputParticleData[dispatchThreadID.x].position = inputConstantParticleData[dispatchThreadID.x].position;
    }
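Since the structured buffer stride on the C++ side has to match the HLSL struct sizes, here is a small compile-time check as a sketch. Float3, ConstantParticleDataCpu and ParticleDataCpu are assumed mirrors of the HLSL structs above (my actual C++ definitions are not shown in this post); the sizes rely on structured buffer elements being tightly packed, with each float3 taking 12 bytes.

```cpp
#include <cstddef>

// Assumed C++ mirror of HLSL float3 (hypothetical name, not a D3D11 type).
struct Float3 { float x, y, z; };

// Assumed mirror of the HLSL ConstantParticleData struct.
struct ConstantParticleDataCpu {
    Float3 position;
    Float3 velocity;
    Float3 initialVelocity;
};

// Assumed mirror of the HLSL ParticleData struct.
struct ParticleDataCpu {
    Float3 position;
    Float3 velocity;
};

// Structured buffer elements are tightly packed: 3 floats = 12 bytes.
static_assert( sizeof( Float3 ) == 12, "float3 mirror must be 12 bytes" );
static_assert( sizeof( ConstantParticleDataCpu ) == 36, "input stride must be 36 bytes" );
static_assert( sizeof( ParticleDataCpu ) == 24, "output stride must be 24 bytes" );

// StructureByteStride must be a multiple of 4.
static_assert( sizeof( ParticleDataCpu ) % 4 == 0, "stride must be a multiple of 4" );

// ByteWidth for a buffer holding `count` output elements.
constexpr std::size_t outputByteWidth( std::size_t count ) {
    return sizeof( ParticleDataCpu ) * count;
}
```

If any of these assertions fired, the stride passed in StructureByteStride would disagree with what the shader expects per element.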
Sorry for the amount of content in this question. I structured it carefully so that it would be easier to get an overview.
The number of elements passed to the shader is 32.
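With 32 elements and [numthreads(32, 1, 1)], a single thread group from Dispatch( 1, 1, 1 ) covers every element exactly once. As a sketch of the general case, the group count is a ceiling division; kThreadsPerGroup and groupCount are names I made up for illustration.

```cpp
#include <cstdint>

// Must match [numthreads(32, 1, 1)] in the shader.
constexpr std::uint32_t kThreadsPerGroup = 32;

// Number of thread groups needed so that every element gets at least
// one thread: ceiling of elementCount / threadsPerGroup.
constexpr std::uint32_t groupCount( std::uint32_t elementCount,
                                    std::uint32_t threadsPerGroup = kThreadsPerGroup ) {
    return ( elementCount + threadsPerGroup - 1 ) / threadsPerGroup;
}

// 32 elements fit in exactly one group, matching Dispatch( 1, 1, 1 ).
static_assert( groupCount( 32 ) == 1, "32 elements need one group of 32 threads" );
```

When the element count is not a multiple of 32, the last group contains threads whose index lies past the end of the buffers, so the shader would normally need a bounds check on dispatchThreadID.x.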
Any suggestions on my issue? Thanks!