				@@ -29,6 +29,7 @@ Please feel free to [pull requests](https://github.com/kjw0612/awesome-deep-visi 
				   - [Semantic Segmentation](#semantic-segmentation) 
				   - [Visual Attention and Saliency](#visual-attention-and-saliency) 
				   - [Object Recognition](#object-recognition) 
				+  - [Human Pose Estimation](#human-pose-estimation) 
				   - [Understanding CNN](#understanding-cnn) 
				   - [Image and Language](#image-and-language) 
				     - [Image Captioning](#image-captioning) 
				@@ -231,6 +232,14 @@ Please feel free to [pull requests](https://github.com/kjw0612/awesome-deep-visi 
				 * FV-CNN [[Paper]](http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Cimpoi_Deep_Filter_Banks_2015_CVPR_paper.pdf) 
				   * Mircea Cimpoi, Subhransu Maji, Andrea Vedaldi, Deep Filter Banks for Texture Recognition and Segmentation, CVPR, 2015. 
				  
				+### Human Pose Estimation 
				+* Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh, Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, CVPR, 2017. 
				+* Leonid Pishchulin, Eldar Insafutdinov, Siyu Tang, Bjoern Andres, Mykhaylo Andriluka, Peter Gehler, and Bernt Schiele, DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation, CVPR, 2016. 
				+* Shih-En Wei, Varun Ramakrishna, Takeo Kanade, and Yaser Sheikh, Convolutional Pose Machines, CVPR, 2016. 
				+* Alejandro Newell, Kaiyu Yang, and Jia Deng, Stacked Hourglass Networks for Human Pose Estimation, ECCV, 2016. 
				+* Tomas Pfister, James Charles, and Andrew Zisserman, Flowing ConvNets for Human Pose Estimation in Videos, ICCV, 2015. 
				+* Jonathan J. Tompson, Arjun Jain, Yann LeCun, and Christoph Bregler, Joint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation, NIPS, 2014. 
				+ 
				 ### Understanding CNN 
				  
				 (from Aravindh Mahendran, Andrea Vedaldi, Understanding Deep Image Representations by Inverting Them, CVPR, 2015.) 