drtk.edge_grad_estimator

edge_grad_estimator

Makes the rasterized image img differentiable at visibility discontinuities and backpropagates the gradients to v_pix.

edge_grad_estimator_ref

Python reference implementation for drtk.edge_grad_estimator().

drtk.edge_grad_estimator(v_pix, vi, bary_img, img, index_img, v_pix_img_hook=None)

Makes the rasterized image img differentiable at visibility discontinuities and backpropagates the gradients to v_pix.

This function takes a rasterized image img that is assumed to be differentiable in continuous regions but not at discontinuities. In some cases, img may carry no useful gradient at all. For example, a rendered segmentation mask is constant within continuous regions, so its gradient there is zero. Even then, edge_grad_estimator can still compute gradients with respect to v_pix at the discontinuities.

The arguments bary_img and index_img must correspond exactly to the rasterized image img. Each pixel in img should correspond to a fragment originating from the primitive specified by index_img, with the barycentric coordinates specified by bary_img. This means that with a small change to v_pix, the pixels in img should change accordingly. A frequent mistake that violates this condition is applying a mask to the rendered image to exclude unwanted regions, which leads to erroneous gradients.

The function returns the img unchanged but with added differentiability at the discontinuities. Note that it is not necessary for the input img to require gradients, but the returned img will always require gradients.

Parameters:
  • v_pix (Tensor) – Pixel-space vertex coordinates, preserving the original camera-space Z-values. Shape: \((N, V, 3)\).

  • vi (Tensor) – Face vertex index list tensor. Shape: \((V, 3)\).

  • bary_img (Tensor) – 3D barycentric coordinate image tensor. Shape: \((N, 3, H, W)\).

  • img (Tensor) – The rendered image. Shape: \((N, C, H, W)\).

  • index_img (Tensor) – Index image tensor. Shape: \((N, H, W)\).

  • v_pix_img_hook (Optional[Callable[[th.Tensor], None]]) – An optional backward hook that will be registered to v_pix_img. Useful for examining the generated image space. Default is None.

Returns:

The input img, unchanged in value but with added differentiability at visibility discontinuities. Use this returned image when computing losses.

Return type:

Tensor
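
The shape constraints listed under Parameters can be checked before the call. The following is a minimal sketch, not part of drtk: check_edge_grad_shapes is a hypothetical helper that accepts any objects exposing a .shape attribute (torch tensors would qualify) and verifies only what the parameter list above states.

```python
from types import SimpleNamespace


def check_edge_grad_shapes(v_pix, vi, bary_img, img, index_img):
    """Raise ValueError if the documented shapes disagree.

    Hypothetical helper (not a drtk function). Returns (N, H, W) on success.
    """
    def shape(x):
        return tuple(x.shape)

    if len(shape(v_pix)) != 3 or shape(v_pix)[-1] != 3:
        raise ValueError(f"v_pix must be (N, V, 3), got {shape(v_pix)}")
    if len(shape(vi)) != 2 or shape(vi)[-1] != 3:
        raise ValueError(f"vi must have three indices per face, got {shape(vi)}")
    if len(shape(img)) != 4:
        raise ValueError(f"img must be (N, C, H, W), got {shape(img)}")

    n = shape(v_pix)[0]
    _, _, h, w = shape(img)
    if shape(img)[0] != n:
        raise ValueError("img and v_pix disagree on the batch size N")
    if shape(bary_img) != (n, 3, h, w):
        raise ValueError(f"bary_img must be {(n, 3, h, w)}, got {shape(bary_img)}")
    if shape(index_img) != (n, h, w):
        raise ValueError(f"index_img must be {(n, h, w)}, got {shape(index_img)}")
    return n, h, w


# Stand-in objects that only carry a shape, for demonstration purposes.
demo = dict(
    v_pix=SimpleNamespace(shape=(2, 100, 3)),
    vi=SimpleNamespace(shape=(196, 3)),
    bary_img=SimpleNamespace(shape=(2, 3, 64, 48)),
    img=SimpleNamespace(shape=(2, 3, 64, 48)),
    index_img=SimpleNamespace(shape=(2, 64, 48)),
)
print(check_edge_grad_shapes(**demo))
```

A check like this only catches shape mismatches. It cannot detect the more subtle error of passing an img that no longer corresponds to bary_img and index_img.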

Note

It is crucial not to spatially modify the rasterized image before passing it to edge_grad_estimator. This follows from the requirement that bary_img and index_img correspond exactly to the rasterized image img: the location of every discontinuity must be controlled by v_pix, so that it can be moved by modifying v_pix.

Operations that are allowed, as long as they are differentiable, include:
  • Pixel-wise MLP

  • Color mapping

  • Color correction, gamma correction

  • Anything that would be indistinguishable from processing fragments independently before their values get assigned to pixels of img

Operations that must be avoided before edge_grad_estimator include:
  • Gaussian blur

  • Warping or deformation

  • Masking, cropping, or introducing holes

There is, however, no issue with applying them after edge_grad_estimator.

If the operation is highly non-linear, it is recommended to perform it before calling edge_grad_estimator(). All sorts of clipping and clamping (e.g., x.clamp(min=0.0, max=1.0)) must also be done before invoking this function.
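
The distinction between allowed and disallowed operations can be illustrated on a toy 1D "image". The sketch below uses plain Python lists rather than tensors, and gamma_correct, box_blur, and is_pixelwise are hypothetical names introduced for illustration only. It perturbs one pixel at a time and checks whether any other output pixel moves: a per-pixel operation passes, a spatial one does not.

```python
def gamma_correct(row, gamma=2.2):
    """Per-pixel op: each output depends only on the same input pixel."""
    return [x ** (1.0 / gamma) for x in row]


def box_blur(row):
    """Spatial op: each output averages a pixel with its neighbours."""
    n = len(row)
    out = []
    for i in range(n):
        window = row[max(i - 1, 0): min(i + 2, n)]
        out.append(sum(window) / len(window))
    return out


def is_pixelwise(op, row, eps=0.5):
    """Heuristic check: perturbing pixel i must not change any output j != i."""
    base = op(row)
    for i in range(len(row)):
        perturbed = list(row)
        perturbed[i] += eps
        out = op(perturbed)
        for j in range(len(row)):
            if j != i and abs(out[j] - base[j]) > 1e-12:
                return False
    return True


row = [0.1, 0.1, 0.9, 0.9]  # toy scanline with one visibility edge

print(is_pixelwise(gamma_correct, row))  # safe before edge_grad_estimator
print(is_pixelwise(box_blur, row))       # apply only after edge_grad_estimator
```

This is a heuristic for a single input, not a proof. It also says nothing about differentiability, which the allowed operations additionally require.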

Usage Example:

import torch.nn.functional as thf
from drtk import transform, rasterize, render, interpolate, edge_grad_estimator

...

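# Project vertices to pixel space (camera-space Z is preserved).
v_pix = transform(v, campos, camrot, focal, princpt)
# Per-pixel primitive indices; background pixels are -1.
index_img = rasterize(v_pix, vi, width=512, height=512)
# Per-pixel barycentric coordinates (the first output is discarded).
_, bary_img = render(v_pix, vi, index_img)
# Interpolate texture coordinates across each primitive.
vt_img = interpolate(vt, vti, index_img, bary_img)

# Sample the texture at the interpolated coordinates.
img = thf.grid_sample(
    tex,
    vt_img.permute(0, 2, 3, 1),
    mode="bilinear",
    padding_mode="border",
    align_corners=False
)

# Zero out background pixels. This mask is derived from index_img itself,
# so it stays consistent with the rasterization.
mask = (index_img != -1)[:, None, :, :]

img = img * mask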

img = edge_grad_estimator(
    v_pix=v_pix,
    vi=vi,
    bary_img=bary_img,
    img=img,
    index_img=index_img
)

optim.zero_grad()
image_loss = loss_func(img, img_gt)
image_loss.backward()
optim.step()
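
The v_pix_img_hook parameter accepts any callable that takes a tensor and returns None. As a minimal sketch, the hypothetical factory make_shape_recorder below (not part of drtk) builds such a callable, which records the shape of whatever gradient it receives. It assumes only that the argument exposes a .shape attribute, as torch tensors do.

```python
from typing import Any, Callable, List, Tuple


def make_shape_recorder(log: List[Tuple[int, ...]]) -> Callable[[Any], None]:
    """Build a hook that appends the shape of each gradient it receives to log.

    Hypothetical helper for illustration. The hook returns None, matching
    the Callable[[Tensor], None] type documented for v_pix_img_hook.
    """
    def hook(grad: Any) -> None:
        log.append(tuple(grad.shape))

    return hook


shapes: List[Tuple[int, ...]] = []
hook = make_shape_recorder(shapes)
```

The resulting hook could then be passed as v_pix_img_hook=hook in the call to edge_grad_estimator above. After image_loss.backward(), shapes would hold whatever gradient shapes drtk passed to the hook.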

drtk.edge_grad_estimator_ref(v_pix, vi, bary_img, img, index_img, v_pix_img_hook=None)

Python reference implementation for drtk.edge_grad_estimator().