Ping Luo (羅平)

● I am moving to EE, CUHK as a Research Assistant Professor.
● Positions for research assistants and interns are available.
● ExpW, a facial expression and relation benchmark with 90k images, has been released. See paper, project, and dataset.
● Our team won a Gold medal in the 2017 Google YouTube-8M Video Understanding Challenge, ranking 9th out of 650 (top 1.5%). See leaderboard.
● Our team ranked 1st out of 22 teams in the 2017 DAVIS Challenge on Video Object Segmentation.
● 1 ICCV2017 and 2 CVPR2017 papers on Semantic Image Segmentation have appeared. We top the Pascal VOC2012 leaderboard.

Large-Scale Fashion Database #DeepFashion#

We contribute the DeepFashion database, a large-scale clothes database with over 800,000 diverse fashion images, ranging from well-posed shop images to unconstrained consumer photos. DeepFashion is annotated with rich information about clothing items. Each image in this dataset is labeled using 50 categories and 1,000 descriptive attributes, plus a bounding box and clothing landmarks. DeepFashion also contains over 300,000 cross-pose/cross-domain image pairs. The data can be downloaded from the Project Page.

A Face Detection Benchmark #WIDER Face#

We constructed a large-scale face detection benchmark containing 32,203 images and 393,703 face annotations, ten times larger than existing datasets. More details can be found in the Technical Report. The data can be downloaded from the Project Page.

A Large-Scale Face Attributes Dataset #CelebA Dataset#

We presented a large-scale face attribute database, which contains 200K face images. Each image was labeled with 40 facial attributes and five landmarks. More details can be found in this technical report.

A Large-Scale Car Database #CompCars Database#

The Comprehensive Cars (CompCars) dataset contains data from two scenarios: web-nature images and surveillance-nature images. The web-nature data contains 163 car makes with 1,716 car models. There are a total of 136,726 images capturing entire cars and 27,618 images capturing car parts. Please refer to our paper for details. Project Page

Pedestrian Attribute Recognition Database #PETA Database#

We release a new pedestrian attribute dataset, which is by far the largest and most diverse of its kind. We report the benchmark performance of an SVM-based method and propose an alternative approach that exploits the context of neighboring pedestrian images for improved attribute inference.
The PETA dataset consists of 19,000 images, with resolution ranging from 17×39 to 169×365 pixels, covering more than 60 attributes. Project Page

Robust Facial Landmark Detection and Attribute Learning #ECCV 2014#

Facial landmark detection has long been impeded by the problems of occlusion and pose variation. Instead of treating detection as a single, independent problem, we investigate optimizing facial landmark detection together with heterogeneous but subtly correlated tasks, e.g. head pose estimation and facial attribute inference.
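The idea of optimizing the main task together with related tasks can be illustrated with a small sketch. This is not the exact formulation from the paper: the weighted-sum form, the function names, and the per-task weights below are assumptions chosen for illustration. It combines a squared-error term for landmark coordinates with a softmax cross-entropy term for each auxiliary task (for example head pose).

```python
import math

def landmark_loss(pred, target):
    """Squared L2 error between predicted and ground-truth landmark coordinates."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))

def cross_entropy(logits, label):
    """Softmax cross-entropy for one auxiliary classification task."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def multitask_loss(pred_landmarks, true_landmarks,
                   aux_logits, aux_labels, aux_weights):
    """Main landmark loss plus a weighted sum of auxiliary task losses.

    aux_logits, aux_labels and aux_weights are dicts keyed by a
    hypothetical task name such as "pose" or "smiling".
    """
    main = landmark_loss(pred_landmarks, true_landmarks)
    aux = sum(
        aux_weights[name] * cross_entropy(aux_logits[name], aux_labels[name])
        for name in aux_logits
    )
    return main + aux

# Example: two landmark coordinates and one auxiliary task with three classes.
loss = multitask_loss(
    pred_landmarks=[0.40, 0.55],
    true_landmarks=[0.42, 0.50],
    aux_logits={"pose": [2.0, 0.1, -1.0]},
    aux_labels={"pose": 0},
    aux_weights={"pose": 0.5},
)
print(loss)
```

Setting an auxiliary weight to zero recovers plain single-task landmark regression, which makes it easy to compare the two settings.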

Our new model detects 68 landmarks and achieves the state-of-the-art result on the 300-W benchmark dataset (mean error of 9.15% on the challenging IBUG subset). See technical report for details.
Z. Zhang, P. Luo, C. C. Loy, X. Tang, Facial Landmark Detection by Deep Multi-task Learning, ECCV 2014
PDF Technical Report Project Page

Multi-View Face Reconstruction #Technical Report#

This Technical Report proposes a novel deep neural net, named multi-view perceptron (MVP), which can untangle the identity and view features and, at the same time, infer a full spectrum of multi-view images given a single 2D face image. The identity features of MVP achieve superior performance on the MultiPIE dataset. MVP is also capable of interpolating and predicting images under viewpoints that are unobserved in the training data.

Zhenyao Zhu, Ping Luo, Xiaogang Wang, Xiaoou Tang, Deep Learning Multi-View Representation for Face Recognition, Technical Report, arXiv:1406.6947, 2014