A deep learning technique for finding semantically meaningful dense correspondences between images to enable visual attribute transfer.
Deep Image Analogy is a computer vision research project that finds semantically meaningful dense correspondences between two images using deep convolutional neural network features. It enables visual attribute transfer applications such as style transfer between photos and artworks, color transfer between photos, and converting sketches to photorealistic images. The technique adapts the classic image analogy framework to deep feature spaces to achieve more coherent and visually pleasing results.
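At its core, the method computes a nearest-neighbor field (NNF) between deep feature maps of the two images. The sketch below shows a brute-force version of that matching step in NumPy, assuming pre-extracted, same-channel feature maps; the function name and plain cosine matching are illustrative simplifications, not the repo's API, which runs a CUDA PatchMatch over Caffe/VGG-19 features.

```python
import numpy as np

def dense_correspondence(feat_a: np.ndarray, feat_b: np.ndarray) -> np.ndarray:
    """Brute-force nearest-neighbor field between two feature maps.

    feat_a, feat_b: (C, H, W) deep feature maps, e.g. one VGG-19 layer.
    Returns an (H, W, 2) array mapping each site in A to (y, x) in B.
    """
    c, h, w = feat_a.shape
    _, hb, wb = feat_b.shape
    # L2-normalize each feature vector so matching uses cosine similarity,
    # emphasizing semantic channel patterns over activation magnitude.
    a = feat_a.reshape(c, -1)
    a = a / (np.linalg.norm(a, axis=0, keepdims=True) + 1e-8)
    b = feat_b.reshape(c, -1)
    b = b / (np.linalg.norm(b, axis=0, keepdims=True) + 1e-8)
    sim = a.T @ b              # (H*W, Hb*Wb) similarity matrix
    best = sim.argmax(axis=1)  # best match in B for each site in A
    return np.stack([best // wb, best % wb], axis=1).reshape(h, w, 2)
```

Brute-force matching is quadratic in image area, which is why the official implementation replaces it with a PatchMatch-style randomized search.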
Computer vision researchers, graphics programmers, and developers working on image manipulation, style transfer, or visual correspondence problems. It's particularly relevant for those implementing advanced image editing tools or studying deep learning applications in graphics.
Unlike simpler style transfer methods, Deep Image Analogy establishes semantic correspondences between images, enabling more controlled and coherent attribute transfers. It combines the efficiency of patch-based methods with the semantic understanding of deep neural networks, offering researchers a powerful tool for visual attribute manipulation.
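One concrete consequence of that combination is the coarse-to-fine schedule: correspondences found at a coarse, semantically strong CNN layer seed the search at the next finer layer, so only local refinement is needed there. Below is a minimal sketch of that NNF upsampling step, assuming integer correspondences and a 2x resolution jump between layers (a hypothetical helper, not code from the repo).

```python
import numpy as np

def upsample_nnf(nnf_coarse: np.ndarray) -> np.ndarray:
    """Seed a finer level's nearest-neighbor field from a coarser one.

    nnf_coarse: (H, W, 2) integer correspondences at the coarse layer.
    Returns a (2H, 2W, 2) initialization with coordinates doubled, so the
    finer layer's patch search only needs to refine each match locally.
    """
    fine = nnf_coarse.repeat(2, axis=0).repeat(2, axis=1) * 2
    # Offset the duplicated entries so the four children of each coarse
    # cell point at the four corresponding fine-level positions.
    h, w, _ = fine.shape
    yy, xx = np.meshgrid(np.arange(h) % 2, np.arange(w) % 2, indexing="ij")
    fine[..., 0] += yy
    fine[..., 1] += xx
    return fine
```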
The source code of 'Visual Attribute Transfer through Deep Image Analogy'.
Uses deep CNN features from VGG-19 to find meaningful dense matches, enabling coherent attribute transfers as validated in the SIGGRAPH 2017 paper (see the feature-extraction sketch after this list).
Supports photo-to-style, style-to-style, style-to-photo, and photo-to-photo transfers, offering versatility for different image editing tasks as shown in the examples.
Provides adjustable parameters such as blend weight and ratio to control the output's appearance, with concrete recommendations in the Tips section for different use cases (the blend step is illustrated in the sketch after this list).
Official implementation of a peer-reviewed technique, ensuring reliability for academic and research purposes in computer vision and graphics.
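To make the two feature-related points above concrete, here is a hedged illustration of extracting VGG-19 features for matching and blending content features with a weight analogous to the repo's blend parameter. The official code uses Caffe; torchvision is substituted here purely for readability, and the layer indices and the default weight are assumptions, not values from the repo.

```python
import torch
from torchvision.models import vgg19
from torchvision.models.feature_extraction import create_feature_extractor

# relu4_1 and relu5_1 are among the layers matched coarse-to-fine in the
# paper; the numeric node names below follow torchvision's VGG-19 layout.
extractor = create_feature_extractor(
    vgg19(weights="IMAGENET1K_V1").eval(),
    return_nodes={"features.20": "relu4_1", "features.29": "relu5_1"},
)

def blend(feat_content: torch.Tensor, feat_warped: torch.Tensor,
          weight: float = 0.7) -> torch.Tensor:
    # Higher weight preserves more of the content image's own structure;
    # lower weight lets the warped counterpart's features dominate.
    return weight * feat_content + (1.0 - weight) * feat_warped
```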
Primarily tested on Windows with specific Nvidia GPUs (e.g., Titan X, K40) and CUDA 7.5/8, limiting accessibility and compatibility with modern systems.
Requires building Caffe first, along with Visual Studio 2013 and specific CUDA versions, making installation non-trivial and error-prone, as detailed in the Build section.
Input images should be no larger than 700x500 at ratio=1.0, so high-resolution material must be downscaled first, with a potential loss of quality (see the pre-processing sketch after this list).
Built on Caffe and legacy CUDA toolkits, both largely superseded by newer frameworks and releases, raising concerns about long-term maintenance and integration with modern tooling.
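Given the size limit noted above, a small pre-processing step can shrink inputs to fit before running the tool. The sketch below uses Pillow and the 700x500 bound from the README; the function name is hypothetical.

```python
from PIL import Image

MAX_W, MAX_H = 700, 500  # README's limit for ratio=1.0

def fit_within_limit(path_in: str, path_out: str) -> None:
    """Downscale an image to fit within 700x500, preserving aspect ratio."""
    img = Image.open(path_in)
    scale = min(MAX_W / img.width, MAX_H / img.height, 1.0)
    if scale < 1.0:
        img = img.resize((int(img.width * scale), int(img.height * scale)),
                         Image.LANCZOS)
    img.save(path_out)
```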