Re-annotate
Zihou Ng, 2025
A common visual convention in computer vision is the overlay of geometric annotations on images, such as bounding boxes for object detection or keypoints for human pose estimation. These annotations reflect how machine learning models operate: extracting, compressing, and reducing information from unstructured raw data. Artworks that engage with machine learning often embrace the same convention, overlaying geometric annotations on images or videos to subtly imply the presence of an underlying model.
In this work, we explore the inverse perspective, treating annotations not as an added layer but as the underlying essence. Specifically, we focus on human pose estimation data: we take joint positions from the AMASS dataset [Mahmood et al. 2019] and apply inverse rendering [Vicini et al. 2022] to partially reconstruct the human body from them. Through the resulting images, we invite viewers to reconsider the reductive nature of machine learning models and to reflect on the origins of the data they process.
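For readers curious about the reconstruction step, the sketch below is a minimal stand-in for the pipeline, not the method itself: it treats the joints of a single frame as a union of small spheres, renders a soft silhouette of them, and optimizes a dense SDF grid to match that silhouette by gradient descent in PyTorch. The file name frame_joints.npy, the sphere radius, and all resolutions are illustrative assumptions; the actual work uses the differentiable SDF renderer of Vicini et al. [2022], and AMASS itself stores SMPL body parameters from which joint positions must first be derived.

import numpy as np
import torch
import torch.nn.functional as F

# Hypothetical input: a (J, 3) array of joint positions for one frame,
# assumed to have been exported beforehand via the SMPL body model.
joints = torch.tensor(np.load("frame_joints.npy"), dtype=torch.float32)
joints = joints / joints.abs().max()      # normalize into [-1, 1]^3

R = 64                                    # resolution of the learnable SDF grid
grid = torch.full((1, 1, R, R, R), 0.05,  # init near the zero level set so the
                  requires_grad=True)     # sigmoid below is not saturated

def sdf(p):
    # Trilinear lookup of the SDF grid at points p in [-1, 1]^3, shape (N, 3).
    q = p.view(1, 1, 1, -1, 3)
    return F.grid_sample(grid, q, align_corners=True).view(-1)

def joint_sdf(p, radius=0.08):
    # Reference geometry: a union of small spheres centred on the joints,
    # i.e. the annotation itself treated as the object being photographed.
    return torch.cdist(p, joints).min(dim=1).values - radius

def soft_silhouette(sdf_fn, res=96, steps=48, k=40.0):
    # Orthographic rays along -z; a pixel is covered when the minimum SDF
    # along its ray is negative. A soft minimum and a sigmoid keep the
    # image differentiable with respect to the grid values.
    xs = torch.linspace(-1, 1, res)
    px, py = torch.meshgrid(xs, xs, indexing="xy")
    zs = torch.linspace(1, -1, steps)
    pts = torch.stack([px.flatten().repeat_interleave(steps),
                       py.flatten().repeat_interleave(steps),
                       zs.repeat(res * res)], dim=-1)
    d = sdf_fn(pts).view(res * res, steps)
    dmin = -torch.logsumexp(-k * d, dim=1) / k   # soft minimum along each ray
    return torch.sigmoid(-k * dmin).view(res, res)

target = soft_silhouette(joint_sdf).detach()     # "photograph" of the annotations

opt = torch.optim.Adam([grid], lr=2e-2)
for step in range(300):
    loss = ((soft_silhouette(sdf) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

The soft silhouette deliberately sidesteps the discontinuous visibility that the renderer of Vicini et al. [2022] is built to differentiate through; swapping the silhouette loss for rendered shading, or the dense grid for a neural SDF, would move this toy closer to the work's real pipeline.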



Bibliography
[1] Mahmood, N., Ghorbani, N., Troje, N., Pons-Moll, G., & Black, M. (2019). AMASS: Archive of Motion Capture as Surface Shapes. In International Conference on Computer Vision (pp. 5442–5451).
[2] Vicini, D., Speierer, S., & Jakob, W. (2022). Differentiable Signed Distance Function Rendering. ACM Transactions on Graphics (Proceedings of SIGGRAPH), 41(4), 125:1–125:18.