About Me


I currently work at Meituan Inc. Before that, I worked at Malong LLC from Apr. 2019 to Apr. 2021.

I completed my DPhil degree in the Visual Geometry Group (VGG), Department of Engineering Science, University of Oxford. My supervisors were Prof. Andrew Zisserman and Dr. Relja Arandjelović. Prior to this, I obtained my BA and MEng degrees at the University of Oxford.

My research interests include object detection, self-supervised learning, vision-language modeling, neural architecture search, image retrieval, and face recognition.

Email: jaszhong@hotmail.com


C. Han, Y. Zhong, D. Li, K. Han, L. Ma
Zero-Shot Semantic Segmentation with Decoupled One-Pass Network
Preprint, 2023.
PDF | arXiv | code

D. Li, S. Chen, Y. Zhong, L. Ma
DiP: Learning Discriminative Implicit Parts for Person Re-Identification
Preprint, 2023.
PDF | arXiv | code

X. Zhou, Y. Zhong, Z. Cheng, F. Liang, L. Ma
Adaptive Sparse Pairwise Loss for Object Re-Identification
CVPR, 2023.
PDF | arXiv | code

D. Shi, Y. Zhong, Q. Cao, L. Ma, J. Li, D. Tao
TriDet: Temporal Action Detection with Relative Boundary Modeling
CVPR, 2023. 
PDF | arXiv | code

C. Feng, Z. Jie, Y. Zhong, X. Chu, L. Ma
AeDet: Azimuth-invariant Multi-view 3D Object Detection
CVPR, 2023.
PDF | arXiv | page | code

C. Liu, Y. Zhong, A. Zisserman, W. Xie
CounTR: Transformer-based Generalised Visual Counting
BMVC, 2022. 
PDF | arXiv | code

Z. Wang, Y. Zhong, Y. Miao, L. Ma, L. Specia
Contrastive Video-Language Learning with Fine-grained Frame Sampling
PDF | arXiv

D. Shi, Y. Zhong, Q. Cao, J. Zhang, L. Ma, J. Li, D. Tao
ReAct: Temporal Action Detection with Relational Queries
ECCV, 2022. 
PDF | arXiv

C. Feng, Y. Zhong, Z. Jie, X. Chu, H. Ren, X. Wei, W. Xie, L. Ma
PromptDet: Towards Open-vocabulary Detection using Uncurated Images
ECCV, 2022.
PDF | arXiv | page | code

S. Guo, Z. Xiong, Y. Zhong, W. Li, X. Guo, B. Han, W. Huang
Cross-Architecture Self-supervised Video Representation Learning
CVPR, 2022. 
PDF | arXiv | Code

X. Chen, Q. Cao, Y. Zhong, J. Zhang, S. Gao, D. Tao
DearKD: Data-Efficient Early Knowledge Distillation for Vision Transformers
CVPR, 2022. 
PDF | arXiv

X. Chen, C. Chen, Q. Cao, J. Xu, Y. Zhong, J. Xu, Z. Li, J. Wang, S. Guo
OH-Former: Omni-Relational High-Order Transformer for Person Re-Identification
Preprint, 2022. 
PDF | arXiv

Z. Deng*, Y. Zhong*, S. Guo, W. Huang (* equal contribution)
InsCLR: Improving Instance Retrieval with Self-Supervision
AAAI, 2022.
PDF | arXiv | Code

C. Feng*, Y. Zhong*, Y. Gao, M. Scott, W. Huang (* equal contribution)
TOOD: Task-aligned One-stage Object Detection
ICCV, 2021. Oral.
PDF | arXiv | code 

C. Feng, Y. Zhong, W. Huang
Exploring Classification Equilibrium in Long-Tailed Object Detection
ICCV, 2021. 
PDF | arXiv | code 

G. Liu, Y. Zhong, S. Guo, M. Scott, W. Huang
Unchain the Search Space with Hierarchical Differentiable Architecture Search
AAAI, 2021. 
PDF | arXiv | code 

H. Tan, S. Guo, Y. Zhong, W. Huang
Mutually-aware Sub-Graphs Differentiable Architecture Search
Preprint, 2021.
PDF | arXiv

Y. Zhong, L. Xie, S. Wang, L. Specia, Y. Miao
Watch and Learn: Mapping Language and Noisy Real-world Videos with Self-supervision
NeurIPS, 2020. Self-Supervised Learning Workshop.
PDF | arXiv | Dataset

Y. Zhong, Z. Deng, S. Guo, M. Scott, W. Huang
Representation Sharing for Fast Object Detector Search and Beyond
ECCV, 2020.
PDF | arXiv | Code

Y. Zhong, R. Arandjelović, A. Zisserman
GhostVLAD for Set-based Face Recognition
ACCV, 2018.
PDF | arXiv

Y. Zhong, R. Arandjelović, A. Zisserman
Compact Deep Aggregation for Set Retrieval
ECCV Workshop on CEFRL, 2018. Oral, Best Paper Award.
PDF | Dataset | Extended version (arXiv)

Y. Zhong, R. Arandjelović, A. Zisserman
Faces in Places: Compound Query Retrieval
BMVC, 2016.
PDF | Dataset | Project Page

M. Malaspina, Y. Zhong
Image-matching Technology Applied to Fifteenth-century Printed Book Illustration
Lettera Matematica, Springer, 2017.


CVPR 2022 SoccerNet Challenge, Action Spotting Task
4th Place

CVPR 2022 SoccerNet Challenge, Re-Identification Task
3rd Place


Exploring the British Library’s 1 Million Images
Instance and object retrieval across 1 million images from 17th-19th century books.

Matching Ballad Illustrations
Instantly match and compare printed illustrations in the Bodleian library ballads.