@@ -65,16 +65,28 @@
<li><a href="#imagenet-classification">ImageNet Classification</a></li>
<li><a href="#object-detection">Object Detection</a></li>
<li><a href="#object-tracking">Object Tracking</a></li>
-<li><a href="#super-resolution">Super Resolution</a></li>
-<li><a href="#low-level-vision">Low-Level Vision</a></li>
+<li>
+<a href="#low-level-vision">Low-Level Vision</a>
+
+<ul>
+<li><a href="#super-resolution">Super-Resolution</a></li>
+<li><a href="#other-applications">Other Applications</a></li>
+</ul>
+</li>
<li><a href="#edge-detection">Edge Detection</a></li>
<li><a href="#semantic-segmentation">Semantic Segmentation</a></li>
<li><a href="#visual-attention-and-saliency">Visual Attention and Saliency</a></li>
<li><a href="#object-recognition">Object Recognition</a></li>
<li><a href="#understanding-cnn">Understanding CNN</a></li>
+<li>
+<a href="#image-and-language">Image and Language</a>
+
+<ul>
<li><a href="#image-captioning">Image Captioning</a></li>
<li><a href="#video-captioning">Video Captioning</a></li>
<li><a href="#question-answering">Question Answering</a></li>
+</ul>
+</li>
<li><a href="#other-topics">Other Topics</a></li>
</ul>
</li>
@@ -118,7 +130,7 @@
<li>GoogLeNet <a href="http://arxiv.org/pdf/1409.4842">[Paper]</a>

<ul>
-<li>Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, CVPR, 2015. </li>
+<li>Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, CVPR, 2015.</li>
</ul>
</li>
<li>VGG-Net <a href="http://www.robots.ox.ac.uk/%7Evgg/research/very_deep/">[Web]</a> <a href="http://arxiv.org/pdf/1409.1556">[Paper]</a>
@@ -201,7 +213,10 @@
</ul>

<h3>
-<a id="super-resolution" class="anchor" href="#super-resolution" aria-hidden="true"><span class="octicon octicon-link"></span></a>Super-Resolution</h3>
+<a id="low-level-vision" class="anchor" href="#low-level-vision" aria-hidden="true"><span class="octicon octicon-link"></span></a>Low-Level Vision</h3>
+
+<h4>
+<a id="super-resolution" class="anchor" href="#super-resolution" aria-hidden="true"><span class="octicon octicon-link"></span></a>Super-Resolution</h4>

<ul>
<li>Super-Resolution (SRCNN) <a href="http://mmlab.ie.cuhk.edu.hk/projects/SRCNN.html">[Web]</a> <a href="http://personal.ie.cuhk.edu.hk/%7Eccloy/files/eccv_2014_deepresolution.pdf">[Paper-ECCV14]</a> <a href="http://arxiv.org/pdf/1501.00092.pdf">[Paper-arXiv15]</a>
@@ -221,7 +236,8 @@
<li>Deeply-Recursive Convolutional Network

<ul>
-<li>Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee, Deeply-Recursive Convolutional Network for Image Super-Resolution, arXiv:1511.04491, 2015. <a href="http://arxiv.org/abs/1511.04491">[Paper]</a> </li>
+<li>Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee, Deeply-Recursive Convolutional Network for Image Super-Resolution, arXiv:1511.04491, 2015. <a href="http://arxiv.org/abs/1511.04491">[Paper]</a>
+</li>
</ul>
</li>
<li>Others
@@ -233,8 +249,8 @@
</li>
</ul>

-<h3>
-<a id="low-level-vision" class="anchor" href="#low-level-vision" aria-hidden="true"><span class="octicon octicon-link"></span></a>Low-Level Vision</h3>
+<h4>
+<a id="other-applications" class="anchor" href="#other-applications" aria-hidden="true"><span class="octicon octicon-link"></span></a>Other Applications</h4>

<ul>
<li>Optical Flow (FlowNet) <a href="http://arxiv.org/pdf/1504.06852">[Paper]</a>
@@ -249,10 +265,13 @@
<li>Chao Dong, Yubin Deng, Chen Change Loy, Xiaoou Tang, Compression Artifacts Reduction by a Deep Convolutional Network, arXiv:1504.06993.</li>
</ul>
</li>
-<li>Non-Uniform Motion Blur Removal <a href="http://arxiv.org/pdf/1503.00593">[Paper]</a>
+<li>Blur Removal

<ul>
-<li>Jian Sun, Wenfei Cao, Zongben Xu, Jean Ponce, Learning a Convolutional Neural Network for Non-uniform Motion Blur Removal, CVPR, 2015. </li>
+<li>Christian J. Schuler, Michael Hirsch, Stefan Harmeling, Bernhard Schölkopf, Learning to Deblur, arXiv:1406.7444. <a href="http://arxiv.org/pdf/1406.7444.pdf">[Paper]</a>
+</li>
+<li>Jian Sun, Wenfei Cao, Zongben Xu, Jean Ponce, Learning a Convolutional Neural Network for Non-uniform Motion Blur Removal, CVPR, 2015. <a href="http://arxiv.org/pdf/1503.00593">[Paper]</a>
+</li>
</ul>
</li>
<li>Image Deconvolution <a href="http://lxu.me/projects/dcnn/">[Web]</a> <a href="http://lxu.me/mypapers/dcnn_nips14.pdf">[Paper]</a>
@@ -285,7 +304,7 @@
<li>Holistically-Nested Edge Detection <a href="http://arxiv.org/pdf/1504.06375">[Paper]</a>

<ul>
-<li>Saining Xie, Zhuowen Tu, Holistically-Nested Edge Detection, arXiv:1504.06375. </li>
+<li>Saining Xie, Zhuowen Tu, Holistically-Nested Edge Detection, arXiv:1504.06375.</li>
</ul>
</li>
<li>DeepEdge <a href="http://arxiv.org/pdf/1412.1123">[Paper]</a>
@@ -309,8 +328,8 @@
(from Jifeng Dai, Kaiming He, Jian Sun, BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation, arXiv:1503.01640.)</p>

<ul>
-<li>PASCAL VOC2012 Challenge Top 10 (14 Aug. 2015)
-<img src="http://cv.snu.ac.kr/hmyeong/files/150814_pascal_voc.png" alt="VOC2012_top_10">
+<li>PASCAL VOC2012 Challenge Leaderboard (02 Dec. 2015)
+<img src="https://cloud.githubusercontent.com/assets/7778428/11527440/5724d2bc-9924-11e5-9614-01b863629af3.png" alt="VOC2012_top_rankings">
(from PASCAL VOC2012 <a href="http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php?challengeid=11&compid=6">leaderboards</a>)</li>
<li>Adelaide

@@ -319,32 +338,50 @@
<li>Guosheng Lin, Chunhua Shen, Ian Reid, Anton van den Hengel, Deeply Learning the Messages in Message Passing Inference, arXiv:1506.02108. <a href="http://arxiv.org/pdf/1506.02108">[Paper]</a> (4th ranked in VOC2012)</li>
</ul>
</li>
-<li>BoxSup <a href="http://arxiv.org/pdf/1503.01640">[Paper]</a>
+<li>Deep Parsing Network (DPN)

<ul>
-<li>Jifeng Dai, Kaiming He, Jian Sun, BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation, arXiv:1503.01640. (2nd ranked in VOC2012)</li>
+<li>Ziwei Liu, Xiaoxiao Li, Ping Luo, Chen Change Loy, Xiaoou Tang, Semantic Image Segmentation via Deep Parsing Network, arXiv:1509.02634 / ICCV 2015 <a href="http://arxiv.org/pdf/1509.02634.pdf">[Paper]</a> (2nd ranked in VOC2012)</li>
</ul>
</li>
-<li>Conditional Random Fields as Recurrent Neural Networks <a href="http://arxiv.org/pdf/1502.03240">[Paper]</a>
+<li>CentraleSuperBoundaries, INRIA <a href="http://arxiv.org/pdf/1511.07386">[Paper]</a>

<ul>
-<li>Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, Philip H. S. Torr, Conditional Random Fields as Recurrent Neural Networks, arXiv:1502.03240. (3rd ranked in VOC2012)</li>
+<li>Iasonas Kokkinos, Surpassing Humans in Boundary Detection using Deep Learning, arXiv:1511.07386. (4th ranked in VOC2012)</li>
</ul>
</li>
-<li>DeepLab
+<li>BoxSup <a href="http://arxiv.org/pdf/1503.01640">[Paper]</a>

<ul>
-<li> Liang-Chieh Chen, George Papandreou, Kevin Murphy, Alan L. Yuille, Weakly-and semi-supervised learning of a DCNN for semantic image segmentation, arXiv:1502.02734. <a href="http://arxiv.org/pdf/1502.02734">[Paper]</a> (5th ranked in VOC2012)</li>
+<li>Jifeng Dai, Kaiming He, Jian Sun, BoxSup: Exploiting Bounding Boxes to Supervise Convolutional Networks for Semantic Segmentation, arXiv:1503.01640. (6th ranked in VOC2012)</li>
</ul>
</li>
<li>POSTECH

<ul>
-<li>Hyeonwoo Noh, Seunghoon Hong, Bohyung Han, Learning Deconvolution Network for Semantic Segmentation, arXiv:1505.04366. <a href="http://arxiv.org/pdf/1505.04366">[Paper]</a> (6th ranked in VOC2012)</li>
+<li>Hyeonwoo Noh, Seunghoon Hong, Bohyung Han, Learning Deconvolution Network for Semantic Segmentation, arXiv:1505.04366. <a href="http://arxiv.org/pdf/1505.04366">[Paper]</a> (7th ranked in VOC2012)</li>
<li>Seunghoon Hong, Hyeonwoo Noh, Bohyung Han, Decoupled Deep Neural Network for Semi-supervised Semantic Segmentation, arXiv:1506.04924. <a href="http://arxiv.org/pdf/1506.04924">[Paper]</a>
</li>
</ul>
</li>
+<li>Conditional Random Fields as Recurrent Neural Networks <a href="http://arxiv.org/pdf/1502.03240">[Paper]</a>
+
+<ul>
+<li>Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, Philip H. S. Torr, Conditional Random Fields as Recurrent Neural Networks, arXiv:1502.03240. (8th ranked in VOC2012)</li>
+</ul>
+</li>
+<li>DeepLab
+
+<ul>
+<li>Liang-Chieh Chen, George Papandreou, Kevin Murphy, Alan L. Yuille, Weakly- and semi-supervised learning of a DCNN for semantic image segmentation, arXiv:1502.02734. <a href="http://arxiv.org/pdf/1502.02734">[Paper]</a> (9th ranked in VOC2012)</li>
+</ul>
+</li>
+<li>Zoom-out <a href="http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Mostajabi_Feedforward_Semantic_Segmentation_2015_CVPR_paper.pdf">[Paper]</a>
+
+<ul>
+<li>Mohammadreza Mostajabi, Payman Yadollahpour, Gregory Shakhnarovich, Feedforward Semantic Segmentation With Zoom-Out Features, CVPR, 2015.</li>
+</ul>
+</li>
<li>Joint Calibration <a href="http://arxiv.org/pdf/1507.01581">[Paper]</a>

<ul>
@@ -360,13 +397,7 @@
<li>Hypercolumn <a href="http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Hariharan_Hypercolumns_for_Object_2015_CVPR_paper.pdf">[Paper]</a>

<ul>
-<li>Bharath Hariharan, Pablo Arbelaez, Ross Girshick, Jitendra Malik, Hypercolumns for Object Segmentation and Fine-Grained Localization, CVPR, 2015. </li>
-</ul>
-</li>
-<li>Zoom-out <a href="http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Mostajabi_Feedforward_Semantic_Segmentation_2015_CVPR_paper.pdf">[Paper]</a>
-
-<ul>
-<li>Mohammadreza Mostajabi, Payman Yadollahpour, Gregory Shakhnarovich, Feedforward Semantic Segmentation With Zoom-Out Features, CVPR, 2015.</li>
+<li>Bharath Hariharan, Pablo Arbelaez, Ross Girshick, Jitendra Malik, Hypercolumns for Object Segmentation and Fine-Grained Localization, CVPR, 2015.</li>
</ul>
</li>
<li>Deep Hierarchical Parsing
@@ -383,7 +414,7 @@
<li>Clement Farabet, Camille Couprie, Laurent Najman, Yann LeCun, Learning Hierarchical Features for Scene Labeling, PAMI, 2013.</li>
</ul>
</li>
-<li>University of Cambridge <a href="http://mi.eng.cam.ac.uk/projects/segnet/">[Web]</a>
+<li>University of Cambridge <a href="http://mi.eng.cam.ac.uk/projects/segnet/">[Web]</a>

<ul>
<li>Vijay Badrinarayanan, Alex Kendall and Roberto Cipolla "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation." arXiv preprint arXiv:1511.00561, 2015. <a href="http://arxiv.org/abs/1511.00561">[Paper]</a>
@@ -493,7 +524,10 @@
</ul>

<h3>
-<a id="image-captioning" class="anchor" href="#image-captioning" aria-hidden="true"><span class="octicon octicon-link"></span></a>Image Captioning</h3>
+<a id="image-and-language" class="anchor" href="#image-and-language" aria-hidden="true"><span class="octicon octicon-link"></span></a>Image and Language</h3>
+
+<h4>
+<a id="image-captioning" class="anchor" href="#image-captioning" aria-hidden="true"><span class="octicon octicon-link"></span></a>Image Captioning</h4>

<p><img src="https://cloud.githubusercontent.com/assets/5226447/8452051/e8f81030-2022-11e5-85db-c68e7d8251ce.PNG" alt="image_captioning">
(from Andrej Karpathy, Li Fei-Fei, Deep Visual-Semantic Alignments for Generating Image Descriptions, CVPR, 2015.)</p>
@@ -532,7 +566,7 @@
<li>UML / UT <a href="http://arxiv.org/pdf/1412.4729">[Paper]</a>

<ul>
-<li>Subhashini Venugopalan, Huijuan Xu, Jeff Donahue, Marcus Rohrbach, Raymond Mooney, Kate Saenko, Translating Videos to Natural Language Using Deep Recurrent Neural Networks, NAACL-HLT, 2015. </li>
+<li>Subhashini Venugopalan, Huijuan Xu, Jeff Donahue, Marcus Rohrbach, Raymond Mooney, Kate Saenko, Translating Videos to Natural Language Using Deep Recurrent Neural Networks, NAACL-HLT, 2015.</li>
</ul>
</li>
<li>CMU / Microsoft <a href="http://arxiv.org/pdf/1411.5654">[Paper-arXiv]</a> <a href="http://www.cs.cmu.edu/%7Exinleic/papers/cvpr15_rnn.pdf">[Paper-CVPR]</a>
@@ -545,7 +579,7 @@
<li>Microsoft <a href="http://arxiv.org/pdf/1411.4952">[Paper]</a>

<ul>
-<li>Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollár, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C. Platt, C. Lawrence Zitnick, Geoffrey Zweig, From Captions to Visual Concepts and Back, CVPR, 2015. </li>
+<li>Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollár, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C. Platt, C. Lawrence Zitnick, Geoffrey Zweig, From Captions to Visual Concepts and Back, CVPR, 2015.</li>
</ul>
</li>
<li>Univ. Montreal / Univ. Toronto [<a href="http://kelvinxu.github.io/projects/capgen.html">Web</a>] [<a href="http://www.cs.toronto.edu/%7Ezemel/documents/captionAttn.pdf">Paper</a>]
@@ -594,13 +628,13 @@
<li>Cornell [<a href="http://arxiv.org/pdf/1508.02091.pdf">Paper</a>]

<ul>
-<li>Jack Hessel, Nicolas Savva, Michael J. Wilber, Image Representations and New Domains in Neural Image Captioning, arXiv:1508.02091 </li>
+<li>Jack Hessel, Nicolas Savva, Michael J. Wilber, Image Representations and New Domains in Neural Image Captioning, arXiv:1508.02091</li>
</ul>
</li>
</ul>

-<h3>
-<a id="video-captioning" class="anchor" href="#video-captioning" aria-hidden="true"><span class="octicon octicon-link"></span></a>Video Captioning</h3>
+<h4>
+<a id="video-captioning" class="anchor" href="#video-captioning" aria-hidden="true"><span class="octicon octicon-link"></span></a>Video Captioning</h4>

<ul>
<li>Berkeley <a href="http://jeffdonahue.com/lrcn/">[Web]</a> <a href="http://arxiv.org/pdf/1411.4389.pdf">[Paper]</a>
@@ -653,8 +687,8 @@
</li>
</ul>

-<h3>
-<a id="question-answering" class="anchor" href="#question-answering" aria-hidden="true"><span class="octicon octicon-link"></span></a>Question Answering</h3>
+<h4>
+<a id="question-answering" class="anchor" href="#question-answering" aria-hidden="true"><span class="octicon octicon-link"></span></a>Question Answering</h4>

<p><img src="https://cloud.githubusercontent.com/assets/5226447/8452068/ffe7b1f6-2022-11e5-87ab-4f6d4696c220.PNG" alt="question_answering">
(from Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh, VQA: Visual Question Answering, CVPR, 2015 SUNw:Scene Understanding workshop)</p>
@@ -738,7 +772,7 @@
<li>Human Gaze Estimation

<ul>
-<li>Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling, Appearance-Based Gaze Estimation in the Wild, CVPR, 2015. <a href="http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zhang_Appearance-Based_Gaze_Estimation_2015_CVPR_paper.pdf">[Paper]</a>
+<li>Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling, Appearance-Based Gaze Estimation in the Wild, CVPR, 2015. <a href="http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Zhang_Appearance-Based_Gaze_Estimation_2015_CVPR_paper.pdf">[Paper]</a> <a href="https://www.mpi-inf.mpg.de/departments/computer-vision-and-multimodal-computing/research/gaze-based-human-computer-interaction/appearance-based-gaze-estimation-in-the-wild/">[Website]</a>
</li>
</ul>
</li>
@@ -833,7 +867,7 @@
<a id="applications" class="anchor" href="#applications" aria-hidden="true"><span class="octicon octicon-link"></span></a>Applications</h3>

<ul>
-<li>Adversarial Training
+<li>Adversarial Training

<ul>
<li>Code and hyperparameters for the paper "Generative Adversarial Networks" <a href="https://github.com/goodfeli/adversarial">[Web]</a>