I’m trying to write a bare minimum GPU raycaster using compute shaders in OpenGL. I’m confident the raycasting itself is functional, as I’ve gotten clean outlines of bounding boxes via a ray-box intersection algorithm.

However, when attempting ray-triangle intersection, I get strange artifacts. My shader is programmed to simply test for a ray-triangle intersection, and color the pixel white if an intersection was found and black otherwise. When the triangle should be visible onscreen, the screen is instead filled with black and white squares/blocks/tiles which flicker randomly like TV static. The squares are at most 8×8 pixels (the size of my compute shader blocks), although there are dots as small as single pixels as well. The white blocks generally lie in the expected area of my triangle, although sometimes they are spread out across the bottom of the screen as well.

Here is a video of the artifact. In my full shader the camera can be rotated around and the shape appears more triangle-like, but the flickering artifact is the key issue and still appears in this video which I generated from the following minimal version of my shader code:

```
layout(local_size_x = 8, local_size_y = 8, local_size_z = 1) in;
uvec2 DIMS = gl_NumWorkGroups.xy*gl_WorkGroupSize.xy;
uvec2 UV = gl_GlobalInvocationID.xy;
vec2 uvf = vec2(UV) / vec2(DIMS);
layout(location = 1, rgba8) uniform writeonly image2D brightnessOut;
struct Triangle
{
vec3 v0;
vec3 v1;
vec3 v2;
};
struct Ray
{
vec3 origin;
vec3 direction;
vec3 inv;
};
// Wikipedia Moller-Trumbore algorithm, GLSL-ified
bool ray_triangle_intersection(vec3 rayOrigin, vec3 rayVector,
in Triangle inTriangle, out vec3 outIntersectionPoint)
{
const float EPSILON = 0.0000001;
vec3 vertex0 = inTriangle.v0;
vec3 vertex1 = inTriangle.v1;
vec3 vertex2 = inTriangle.v2;
vec3 edge1 = {0.0, 0.0, 0.0};
vec3 edge2 = {0.0, 0.0, 0.0};
vec3 h = {0.0, 0.0, 0.0};
vec3 s = {0.0, 0.0, 0.0};
vec3 q = {0.0, 0.0, 0.0};
float a = 0.0, f = 0.0, u = 0.0, v = 0.0;
edge1 = vertex1 - vertex0;
edge2 = vertex2 - vertex0;
h = cross(rayVector, edge2);
a = dot(edge1, h);
// Test if ray is parallel to this triangle.
if (a > -EPSILON && a < EPSILON)
{
return false;
}
f = 1.0/a;
s = rayOrigin - vertex0;
u = f * dot(s, h);
if (u < 0.0 || u > 1.0)
{
return false;
}
q = cross(s, edge1);
v = f * dot(rayVector, q);
if (v < 0.0 || u + v > 1.0)
{
return false;
}
// At this stage we can compute t to find out where the intersection point is on the line.
float t = f * dot(edge2, q);
if (t > EPSILON) // ray intersection
{
outIntersectionPoint = rayOrigin + rayVector * t;
return true;
}
return false;
}
void main()
{
// Generate rays by calculating the distance from the eye
// point to the screen and combining it with the pixel indices
// to produce a ray through this invocation's pixel
const float HFOV = (3.14159265359/180.0)*45.0;
const float WIDTH_PX = 1280.0;
const float HEIGHT_PX = 720.0;
float VIEW_PLANE_D = (WIDTH_PX/2.0)/tan(HFOV/2.0);
vec2 rayXY = vec2(UV) - vec2(WIDTH_PX/2.0, HEIGHT_PX/2.0);
// Rays have origin at (0, 0, 20) and generally point towards (0, 0, -1)
Ray r;
r.origin = vec3(0.0, 0.0, 20.0);
r.direction = normalize(vec3(rayXY, -VIEW_PLANE_D));
r.inv = 1.0 / r.direction;
// Triangle in XY plane at Z=0
Triangle debugTri;
debugTri.v0 = vec3(-20.0, 0.0, 0.0);
debugTri.v1 = vec3(20.0, 0.0, 0.0);
debugTri.v0 = vec3(0.0, 40.0, 0.0);
// Test triangle intersection; write 1.0 if hit, else 0.0
vec3 hitPosDebug = vec3(0.0);
bool hitDebug = ray_triangle_intersection(r.origin, r.direction, debugTri, hitPosDebug);
imageStore(brightnessOut, ivec2(UV), vec4(vec3(float(hitDebug)), 1.0));
}
```

I render the image to a fullscreen triangle using a normal `sampler2D`

and rasterized triangle UVs chosen to map to screen space.

None of this code should be time dependent, and I’ve tried multiple ray-triangle algorithms from various sources including both branching and branch-free versions and all exhibit the same problem which leads me to suspect some sort of memory incoherency behavior I’m not familiar with, a driver issue, or a mistake I’ve made in configuring or dispatching my compute (I dispatch 160x90x1 of my 8x8x1 blocks to cover my 1280×720 framebuffer texture).

I’ve found a few similar issues like this one on SE and the general internet, but they seem to almost exclusively be caused by using uninitialized variables, which I am not doing as far as I can tell. They mention that the pattern continue to move when viewed in the NSight debugger; while RenderDoc doesn’t do that, the contents of the image *do* vary between draw calls even after the compute shader has finished. E.g. when inspecting the image in the compute draw call there is one pattern of artifacts, but when I scrub to the subsequent draw calls which use my image as input, the pattern in the image has changed.

I also found this post which seems very similar, but that one also seems to be caused by an uninitialized variable, which again I’ve been careful to avoid. I’ve also not been able to alleviate the issue by tweaking the code as they have done.

I’m running the latest NVidia drivers (461.92) on a GTX 1070. I’ve tried inserting `glMemoryBarrier(GL_TEXTURE_FETCH_BARRIER_BIT);`

(as well as some of the other barrier types) after my compute shader dispatch which I believe is the correct barrier to use if using a `sampler2D`

to draw a texture that was previously modified by an image load/store operation, but it doesn’t seem to change anything.

Odds are that the cause of the problem lies somewhere between my chair and keyboard, but this kind of problem lies outside my usual shader debugging abilities as I’m relatively new to OpenGL. Any ideas would be appreciated! Thanks.