Tobias Hinz - Publications

Selected Publications

For a full list, have a look at my Google Scholar page.

ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models

A framework that enables text-to-multi-shot video generation with shot-specific conditioning and full attention across all frames.

O. Kara, K. Singh, F. Liu, D. Ceylan, J. M. Rehg, T. Hinz, Conference on Computer Vision and Pattern Recognition 2025.

Project Paper

Personalized Residuals for Concept-Driven Text-to-Image Generation

A novel and efficient approach for enabling personalized image generation with diffusion models.

C. Ham, M. Fisher, J. Hays, N. Kolkin, Y. Liu, R. Zhang, T. Hinz, Conference on Computer Vision and Pattern Recognition 2024.

Project Paper

Modulating Pretrained Diffusion Models for Multimodal Image Synthesis

A multimodal conditioning module (MCM) for enabling conditional image synthesis using pretrained diffusion models.

C. Ham, J. Hays, J. Lu, K. Singh, Z. Zhang, T. Hinz, SIGGRAPH Conference Proceedings 2023.

Project Paper

SmartBrush: Text and Shape Guided Object Inpainting with Diffusion Model

A diffusion model for shape-guided inpainting with better shape control and background preservation within the inpainted region.

S. Xie, Z. Zhang, Z. Lin, T. Hinz, K. Zhang, Conference on Computer Vision and Pattern Recognition 2023.

Paper

ASSET: Autoregressive Semantic Scene Editing with Transformers at High Resolutions

A neural architecture for automatically modifying an input high-resolution image according to a user's edits on its semantic segmentation map.

D. Liu, S. Shetty, T. Hinz, M. Fisher, R. Zhang, T. Park, E. Kalogerakis, ACM Transactions on Graphics (SIGGRAPH) 2022.

Project Code

Improved Techniques for Training Single-Image GANs

Improving the results and training speed of single-image GANs.

T. Hinz, M. Fisher, O. Wang, S. Wermter, IEEE Winter Conference on Applications of Computer Vision 2021.

Paper Blog Post Code

Semantic Object Accuracy for Generative Text-to-Image Synthesis

A novel GAN architecture and an improved metric to evaluate generative text-to-image synthesis models.

T. Hinz, S. Heinrich, S. Wermter, IEEE Transactions on Pattern Analysis and Machine Intelligence 2020.

Paper Blog Post Code

Generating Multiple Objects at Spatially Distinct Locations

Fine-grained control over the placement and identity of objects in images generated with a Generative Adversarial Network.

T. Hinz, S. Heinrich, S. Wermter, International Conference on Learning Representations 2019.

Paper Blog Post Code

Image Generation and Translation with Disentangled Representations

Controllable image generation and translation with very little supervision.

T. Hinz, S. Wermter, IEEE International Joint Conference on Neural Networks 2018.

Paper Code

Speeding Up the Hyperparameter Optimization of Deep Convolutional Neural Networks

How to use lower-dimensional data representations to speed up hyperparameter optimization for CNNs that process images.

T. Hinz, N. Navarro-Guerrero, S. Magg, S. Wermter, International Journal of Computational Intelligence and Applications 2018.

Paper

Inferencing Based on Unsupervised Learning of Disentangled Representations

Unsupervised learning of disentangled representations with a Bidirectional GAN.

T. Hinz, S. Wermter, European Symposium on Artificial Neural Networks 2018.

Paper Code

The Effects of Regularization on Learning Facial Expressions with Convolutional Neural Networks

How modern regularization techniques for CNNs affect the learned representations.

T. Hinz, P. Barros, S. Wermter, International Conference on Artificial Neural Networks 2016.

Paper
