Summary

This project dynamically generates terrain using GPU-accelerated compute shaders.

Dynamic Terrain Generation Project

The key functionality of this project is implemented in the following function:

void Tutorial::run_terrain_generation(std::vector<BlockCoord> &blocks) {
    // Use a hash map to handle evicted blocks in case of collisions

    for (auto &block : blocks) {
        // ---- First Pipeline: Generate Density and Normal Maps ----
        vkCmdBindPipeline(terrain_cmd_buf, VK_PIPELINE_BIND_POINT_COMPUTE, pterrain_pipeline.handle);
        vkCmdDispatch(terrain_cmd_buf, groupCountX, groupCountY, groupCountZ);

        // ---- Second Pipeline: Generate Mesh Vertices and Triangle Count ----
        vkCmdBindPipeline(terrain_cmd_buf, VK_PIPELINE_BIND_POINT_COMPUTE, pterrain_triangle_pipeline.handle);
        vkCmdDispatch(terrain_cmd_buf, groupCountX, groupCountY, groupCountZ);
    }

    // Update global variable
}

This function uses two compute shaders to generate terrain dynamically: the first produces the density and normal maps, and the second produces the mesh vertices and triangle count. The program allocates GPU memory and descriptor sets to store the resulting vertices, indexed by their coordinates. A hash map handles evicted blocks in case of collisions. During the rendering stage, only the blocks inside the camera's view frustum are drawn.
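The eviction behaviour described above can be illustrated with a small sketch. It assumes a fixed number of GPU slots (one buffer and descriptor set per slot) and a table in which each block coordinate hashes to exactly one slot; if a different block already occupies that slot, it is evicted and the slot is reused. The names BlockSlotTable, AcquireResult, and the fields of BlockCoord are hypothetical, as is the hash function; the project's actual data structure is not shown in the source.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical block coordinate; the project's real BlockCoord is not shown.
struct BlockCoord {
    int32_t x, y, z;
};

inline bool same_block(const BlockCoord &a, const BlockCoord &b) {
    return a.x == b.x && a.y == b.y && a.z == b.z;
}

// Outcome of requesting a slot for a block.
struct AcquireResult {
    std::size_t slot;          // index of the GPU buffer / descriptor set to use
    bool needs_generation;     // false if the block is already resident
    bool evicted;              // true if another block was displaced
    BlockCoord evicted_coord;  // valid only when evicted is true
};

// Assumed design: one slot per hash bucket, colliding blocks evict each other.
class BlockSlotTable {
public:
    explicit BlockSlotTable(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    AcquireResult acquire(const BlockCoord &c) {
        const std::size_t i = hash(c) % slots_.size();
        Slot &s = slots_[i];

        AcquireResult r;
        r.slot = i;
        r.needs_generation = true;
        r.evicted = false;
        r.evicted_coord = BlockCoord();

        if (s.occupied && same_block(s.coord, c)) {
            r.needs_generation = false;  // already generated, reuse as is
            return r;
        }
        if (s.occupied) {                // collision: displace the old block
            r.evicted = true;
            r.evicted_coord = s.coord;
        }
        s.occupied = true;
        s.coord = c;
        return r;
    }

private:
    struct Slot {
        bool occupied;
        BlockCoord coord;
    };

    // Illustrative spatial hash (multiply by large primes, then XOR).
    static std::size_t hash(const BlockCoord &c) {
        const uint64_t hx = static_cast<uint64_t>(static_cast<uint32_t>(c.x)) * 73856093u;
        const uint64_t hy = static_cast<uint64_t>(static_cast<uint32_t>(c.y)) * 19349663u;
        const uint64_t hz = static_cast<uint64_t>(static_cast<uint32_t>(c.z)) * 83492791u;
        return static_cast<std::size_t>(hx ^ hy ^ hz);
    }

    std::vector<Slot> slots_;  // value-initialized: every slot starts empty
};
```

In this sketch a caller would dispatch the two compute pipelines only when needs_generation is true, writing into the GPU resources that belong to the returned slot.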

Pipeline Diagram for Terrain Generation

Function Descriptions

| Function | Description |
|----------|-------------|
| update() | Sets 27 blocks in a 3×3×3 grid around the center view direction in the z = 0 plane as target blocks and updates index_terrain_map. |
| render() | Iterates over all blocks in index_terrain_map. Performs view frustum culling, sorts the blocks by their distance to the camera, and executes vkCmdDraw for each block. |

Table: Description of update() and render() Functions
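As a rough illustration of the two steps in the table, the sketch below enumerates the 27 coordinates of a 3×3×3 neighbourhood around a center block and tests a block's bounding box against six frustum planes. It is not the project's code: the names target_blocks, block_visible, and Plane, the plane convention, and the assumption that block (x, y, z) occupies an axis-aligned cube of edge block_size starting at (x, y, z) * block_size are all chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical types; the project's own definitions are not shown.
struct BlockCoord { int x, y, z; };
struct Plane { float nx, ny, nz, d; };  // a point p is inside when n.p + d >= 0

// The 27 block coordinates of the 3x3x3 neighbourhood centred on `center`.
std::vector<BlockCoord> target_blocks(const BlockCoord &center) {
    std::vector<BlockCoord> out;
    out.reserve(27);
    for (int dz = -1; dz <= 1; ++dz)
        for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx)
                out.push_back({center.x + dx, center.y + dy, center.z + dz});
    return out;
}

// Conservative box-versus-frustum test: a block is rejected only if its
// bounding box lies entirely on the outside of at least one plane.
bool block_visible(const std::array<Plane, 6> &frustum,
                   const BlockCoord &b, float block_size) {
    assert(block_size > 0.0f);
    const float min_x = b.x * block_size;
    const float min_y = b.y * block_size;
    const float min_z = b.z * block_size;
    for (const Plane &p : frustum) {
        // Pick the box corner that lies farthest along the plane normal.
        const float px = min_x + (p.nx >= 0.0f ? block_size : 0.0f);
        const float py = min_y + (p.ny >= 0.0f ? block_size : 0.0f);
        const float pz = min_z + (p.nz >= 0.0f ? block_size : 0.0f);
        if (p.nx * px + p.ny * py + p.nz * pz + p.d < 0.0f)
            return false;  // even the farthest corner is outside this plane
    }
    return true;
}
```

The test is conservative: a box near a frustum corner can pass all six plane checks while still being outside, which costs an unnecessary draw but never hides visible terrain.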

Performance

The performance impact of terrain block sizes is compared in a demo terrain. Larger blocks reduce CPU workload but increase GPU workload and rendering time.

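| Texture Size and Blocks | CPU Run Time (µs) | GPU Run Time (µs) |
|-------------------------|-------------------|-------------------|
| 32 × 32 × 32 (9 × 9 blocks) | 160.762 | 10778.140 |
| 64 × 64 × 64 (4 × 5 blocks) | 39.433 | 12133.002 |
| 96 × 96 × 96 (3 × 3 blocks) | 21.316 | 13167.654 |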

Table: Run time comparison for 3D textures on CPU and GPU.

The CPU workload consists mainly of recording the blocks' distances to the camera and sorting them, an O(n log n) operation in the number of blocks n. As the texture size increases, the number of blocks decreases, which reduces this cost.
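The sorting step can be sketched as follows. The sketch assumes each block is represented by its center point and that blocks are ordered nearest first; the source does not state the sort direction, and Vec3, dist_sq, and sort_by_distance are hypothetical names. Squared distances are compared, which gives the same ordering as true distances without a square root.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical 3D point type used for block centres and the camera position.
struct Vec3 { float x, y, z; };

// Squared Euclidean distance between two points.
inline float dist_sq(const Vec3 &a, const Vec3 &b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Orders block centres nearest-first relative to the camera.
// std::sort performs O(n log n) comparisons for n blocks.
void sort_by_distance(std::vector<Vec3> &centres, const Vec3 &camera) {
    auto nearer = [&camera](const Vec3 &a, const Vec3 &b) {
        return dist_sq(a, camera) < dist_sq(b, camera);
    };
    std::sort(centres.begin(), centres.end(), nearer);
    assert(std::is_sorted(centres.begin(), centres.end(), nearer));  // debug check
}
```

With 81, 20, and 9 blocks in the three configurations measured above, the sort handles progressively fewer elements, in line with the falling CPU times.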

The GPU run time measures block generation. With a fixed local workgroup size of 4 × 4 × 4, larger textures require more workgroups per dispatch, which increases the overhead of managing them.
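The relationship between texture size and workgroup count can be made concrete with a short calculation. The sketch assumes one shader invocation per texel and rounds up so that every texel is covered; group_count and groups_per_dispatch are hypothetical helper names, and only the local size of 4 comes from the text.

```cpp
#include <cassert>
#include <cstdint>

// Local workgroup size along each axis (4 x 4 x 4, as stated in the text).
constexpr uint32_t kLocalSize = 4;

// Workgroups needed along one axis, rounded up to cover every texel.
uint32_t group_count(uint32_t texture_extent) {
    assert(texture_extent > 0);
    return (texture_extent + kLocalSize - 1) / kLocalSize;
}

// Total workgroups in one vkCmdDispatch for a cubic texture of this extent.
uint64_t groups_per_dispatch(uint32_t texture_extent) {
    const uint64_t g = group_count(texture_extent);
    return g * g * g;
}
```

For the three texture sizes in the table this gives 8, 16, and 24 groups per axis, or 512, 4096, and 13824 workgroups per dispatch. If every listed block were generated once (an assumption; the source does not say how many blocks each measurement covers), the totals per pipeline would be 81 × 512 = 41472, 20 × 4096 = 81920, and 9 × 13824 = 124416 workgroups.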

Final Visual Effect