Commit 40de438

authored
Update README.md
1 parent d4df890 commit 40de438

1 file changed: +4 -4 lines changed

README.md

Lines changed: 4 additions & 4 deletions
@@ -7,9 +7,9 @@ Recipes for shrinking, optimizing, customizing cutting edge vision and multimoda
 **NOTE:** GitHub refuses to render notebooks for a long time now, so the notebooks of smol-vision with rich outputs now lives [here](https://huggingface.co/merve/smol-vision). I still update this repository but it's inconvenient to read here.
 
 Latest examples 👇🏻
-- [Grounded Fine-tuning](https://github.com/merveenoyan/smol-vision/blob/main/Grounded_Fine_tuning%20GH.ipynb)
-- [Fine-tune Florence-2](https://github.com/merveenoyan/smol-vision/blob/main/Fine_tune_Florence_2.ipynb)
-- [Fine-tune DINOv3](https://github.com/merveenoyan/smol-vision/blob/main/DINOv3_FT.ipynb)
+- [Fine-tune Kosmos2.5 on OCR with bounding boxes](https://github.com/merveenoyan/smol-vision/blob/main/Grounded_Fine_tuning%20GH.ipynb)
+- [Fine-tune Florence-2 on document question answering](https://github.com/merveenoyan/smol-vision/blob/main/Fine_tune_Florence_2.ipynb)
+- [Fine-tune DINOv3 on image classification](https://github.com/merveenoyan/smol-vision/blob/main/DINOv3_FT.ipynb)
 
 **Note**: The script and notebook are updated to fix few issues related to QLoRA!
 
@@ -31,4 +31,4 @@ Latest examples 👇🏻
 | Any-to-Any Fine-tuning | [Fine-tune Gemma-3n for all modalities (audio-text-image)](https://github.com/merveenoyan/smol-vision/blob/main/Gemma3n_Fine_tuning_on_All_Modalities.ipynb) | Fine-tune Gemma-3n model to handle any modality: audio, text, and image. |
 | Any-to-Any RAG | [Any-to-Any (Video) RAG with OmniEmbed and Qwen](https://github.com/merveenoyan/smol-vision/blob/main/Any_to_Any_RAG.ipynb) | Do retrieval and generation across modalities (including video) using OmniEmbed and Qwen. |
 | Speed-up/Memory Optimization | Vision language model serving using TGI (SOON) | Explore speed-ups and memory improvements for vision-language model serving with text-generation inference |
-| Quantization/Optimum/ORT | All levels of quantization and graph optimizations for Image Segmentation using Optimum (SOON) | End-to-end model optimization using Optimum |
+| Quantization/Optimum/ORT | All levels of quantization and graph optimizations for Image Segmentation using Optimum (SOON) | End-to-end model optimization using Optimum |

0 commit comments
