imageCompression1995/CLIP

CLIP

A collection of recent papers on CLIP and prompt learning.

Core Ideas

  • Why does the CLIP prior work well?
    • CLIP learns relationships between vision and language from 400 million text-image pairs.
  • How can it be transferred to downstream tasks?
    • Zero-shot transfer
    • Prompt learning
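
As a rough illustration of zero-shot transfer, the sketch below scores one image embedding against one text embedding per class prompt using cosine similarity and a softmax. It is a minimal sketch, not CLIP's actual code: the embeddings are hand-written toy vectors standing in for encoder outputs, the function names are hypothetical, and the fixed `logit_scale` of 100 is an assumption (CLIP learns this temperature during training).

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(image_emb, text_embs, logit_scale=100.0):
    """Return class probabilities from scaled cosine similarities.

    logit_scale is an assumed fixed temperature for illustration only.
    """
    img = l2_normalize(image_emb)
    logits = []
    for t in text_embs:
        txt = l2_normalize(t)
        logits.append(logit_scale * sum(a * b for a, b in zip(img, txt)))
    return softmax(logits)

class_names = ["cat", "dog", "car"]
# Hand-written prompt template; a real text encoder would embed these strings.
prompts = [f"a photo of a {name}." for name in class_names]

# Toy embeddings (assumptions), one text vector per prompt and one image vector.
text_embs = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
image_emb = [0.8, 0.3, 0.1]

probs = zero_shot_classify(image_emb, text_embs)
predicted = class_names[probs.index(max(probs))]
print(predicted, [round(p, 3) for p in probs])
```

Prompt learning keeps the same scoring step but replaces the hand-written template with context vectors that are optimized on a small amount of labeled data, typically while the encoders stay frozen.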

Seminal Work

| Title | Year | Venue | Code | Notes |
| --- | --- | --- | --- | --- |
| Learning Transferable Visual Models From Natural Language Supervision | 2021 | ICML | Link | Link |

Improved Work

| Title | Year | Venue | Code |
| --- | --- | --- | --- |
| Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm | 2022 | ICLR | Link |
| BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation | 2022 | arXiv | Link |
| SLIP: Self-supervision meets Language-Image Pre-training | 2022 | arXiv | Link |

Applications

Image Classification

| Title | Year | Venue | Code |
| --- | --- | --- | --- |
| Learning to Prompt for Vision-Language Models | 2021 | arXiv | Link |
| Neural Prompt Search | 2022 | arXiv | None |
| Prompt Distribution Learning | 2022 | CVPR | None |
| Conditional Prompt Learning for Vision-Language Models | 2022 | CVPR | Link |
| CLIP-Adapter: Better Vision-Language Models with Feature Adapters | 2022 | arXiv | Link |
| Tip-Adapter: Training-free CLIP-Adapter for Better Vision-Language Modeling | 2022 | arXiv | Link |
| Unsupervised Prompt Learning for Vision-Language Models | 2022 | arXiv | Link |

Detection

| Title | Year | Venue | Code | Notes |
| --- | --- | --- | --- | --- |
| Learning to Prompt for Open-Vocabulary Object Detection with Vision-Language Model | 2022 | CVPR | Link | Link |
| DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting | 2022 | CVPR | Link | Link |

Image/Video Translation and Editing

| Title | Year | Venue | Code |
| --- | --- | --- | --- |
| HairCLIP: Design Your Hair by Text and Reference Image | 2022 | CVPR | Link |
| FlexIT: Towards Flexible Semantic Image Translation | 2022 | CVPR | None |
| VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance | 2022 | arXiv | Link |

Image/Video Understanding

| Title | Year | Venue | Code |
| --- | --- | --- | --- |
| Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos | 2022 | CVPR | Link |

Visual Grounding

| Title | Year | Venue | Code |
| --- | --- | --- | --- |
| ClipCap: CLIP Prefix for Image Captioning | 2021 | arXiv | Link |
| CPT: Colorful Prompt Tuning for Pre-trained Vision-Language Models | 2021 | arXiv | None |
| ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension | 2022 | ACL | None |
