Impressions from CVPR 2017

By: Assaf Shocher, Weizmann Institute

Hi everyone,
I was asked to highlight some trends and papers from CVPR 2017 (Honolulu, July 21-26), which just ended. Here is my top ten list, along with some insights, all from my own point of view and according to my interests and opinions.

1. Best paper awards were given to the well-known DenseNet (Densely Connected Convolutional Networks, by authors from Cornell, Tsinghua and Facebook AI Research) and to Learning from Simulated and Unsupervised Images through Adversarial Training by Apple (Apple’s first ML paper).
A very impressive honorable mention is YOLO9000: Better, Faster, Stronger, which demonstrated live real-time object detection that was much more impressive than the previous 2016 version.

2. One of the most noticeable trends is using the concept of Attention, not only for image captioning; some good examples are:
• Residual Attention Network for Image Classification
• Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition
• Attentional Push: A Deep Convolutional Network for Augmenting Image Salience with Shared Attention Modeling in Social Scenes
• And also a very impressive improvement for image captioning by Richard Socher’s group: Knowing When to Look: Adaptive Attention via a Visual Sentinel for Image Captioning.

All in all, 19 different papers at CVPR had ‘Attention’ in their title.

3. Another trend is using Reinforcement Learning for challenges that used to be approached by other methods. Examples I found interesting:
• Deep Reinforcement Learning-Based Image Captioning With Embedding Reward
• Collaborative Deep Reinforcement Learning for Joint Object Search
• Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning

4. A group I found very impressive was Kyoung Mu Lee’s group from the Seoul National University vision lab. They won both 1st and 2nd places in the super-resolution challenge (NTIRE workshop) by a big margin and presented the state-of-the-art SR methods EDSR and MDSR, which actually beat the two state-of-the-art SR methods from CVPR 2016, DRCN and VDSR, both of which were also theirs.
They used the exact same architecture to get the best results for 2x, 3x and 4x SR for both known and unknown kernels. They also showed impressive extreme SR (up to 64x).
They also had a very impressive deblurring paper, Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring, and a recolorization paper, PaletteNet: Image Recolorization with Given Color Palette.

5. Image-to-image translation was already dealt with in several papers, e.g., Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (Isola, Efros) and Unsupervised Cross-Domain Image Generation (Yaniv Taigman, Adam Polyak, Lior Wolf).
In the NTIRE workshop, Jan Kautz presented his arXiv paper, which approaches this challenge in a different way and gets state-of-the-art results.

6. A paper I found very innovative, from CMU: From Red Wine to Red Tomato: Composition with Context.

7. Image segmentation per se has evolved into more challenging tasks: instance segmentation and image matting (all using deep learning).

• http://openaccess.thecvf.com/content_cvpr_2017/papers/Ren_End-To-End_Instance_Segmentation_CVPR_2017_paper.pdf
• http://openaccess.thecvf.com/content_cvpr_2017/papers/Arnab_Pixelwise_Instance_Segmentation_CVPR_2017_paper.pdf
• http://openaccess.thecvf.com/content_cvpr_2017/papers/Hayder_Boundary-Aware_Instance_Segmentation_CVPR_2017_paper.pdf
• http://openaccess.thecvf.com/content_cvpr_2017/papers/Liu_Matting_and_Depth_CVPR_2017_paper.pdf
• http://openaccess.thecvf.com/content_cvpr_2017/papers/Xu_Deep_Image_Matting_CVPR_2017_paper.pdf

8. There was a theoretical paper presented both as an oral and in the Mathematics of Deep Learning tutorial by Rene Vidal: Global Optimality in Neural Network Training. It attempts to explain deep learning optimization using matrix factorization.

9. There was a very interesting “Negative Results in Computer Vision” workshop featuring Jitendra Malik, Alyosha Efros and others.
Among other things, Dhruv Batra presented a super impressive work that was actually an ICCV 2017 paper: Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning. It is basically a game between two agents: one ‘sees’ an image, and the other needs to guess which image it is.

10. Some more papers I found worth mentioning:
• Emotion Recognition in Context (a poster; the first time I saw context, and not only the subject, used for emotion recognition)
• Wetness and Color from A Single Multispectral Image (nice and physics-based; the only one in my list not using Deep Learning at all)
• A very cool paper that was presented as an oral: The Amazing Mysteries of the Gutter: Drawing Inferences Between Panels in Comic Book Narratives

