Smaller models struggle when asked to produce structured output directly. That is why we use extractors; however, extractors become expensive when backed by big models and ineffective when backed by small ones.
That is why it would be awesome to have a dedicated model that can extract structured output from an LLM's answer.
There is such a model: https://huggingface.co/osmosis-ai/Osmosis-Structure-0.6B. However, it doesn't support sophisticated output formats like the one we have!
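To make the two-stage idea concrete, here is a minimal sketch: a big LLM answers in free text, and the small extractor model turns that answer into JSON matching a schema. The schema, example answer, and prompt convention are assumptions for illustration; the exact prompt format expected by Osmosis-Structure-0.6B should be taken from its model card.

```python
# Sketch: use a small extractor model to turn a free-text LLM answer into JSON.
# The schema and prompt layout below are assumptions, not the model's documented format.
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "osmosis-ai/Osmosis-Structure-0.6B"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Hypothetical target schema, a stand-in for our real (more sophisticated) format.
schema = {
    "type": "object",
    "properties": {"answer": {"type": "string"}, "confidence": {"type": "number"}},
    "required": ["answer", "confidence"],
}
llm_answer = "I'm fairly sure the capital of Australia is Canberra, maybe 90% sure."

messages = [
    {"role": "system", "content": f"Extract JSON matching this schema:\n{json.dumps(schema)}"},
    {"role": "user", "content": llm_answer},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
structured = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True)
print(structured)  # ideally something like: {"answer": "Canberra", "confidence": 0.9}
```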
The task is to fine-tune a small model that extracts our expected structured outputs and is cheap enough to run on edge devices.
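A minimal fine-tuning sketch, assuming supervised fine-tuning with TRL on pairs of (free-text LLM answer, expected structured output). The base model choice, the training pair, and the hyperparameters are placeholders, not the final recipe, and the exact TRL arguments may differ between versions.

```python
# Sketch: SFT a small base model to map free-text answers to our JSON format.
# Base model, data, and hyperparameters are assumptions for illustration.
import json
from datasets import Dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTConfig, SFTTrainer

BASE_MODEL = "Qwen/Qwen2.5-0.5B-Instruct"  # assumption: any small instruct model works here
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Hypothetical training pairs: raw LLM answer -> expected structured output.
pairs = [
    {
        "answer": "I'm fairly sure the capital of Australia is Canberra, about 90% sure.",
        "target": {"answer": "Canberra", "confidence": 0.9},
    },
]

def to_text(example):
    # Render each pair with the chat template so the extractor learns to emit our JSON.
    messages = [
        {"role": "user", "content": example["answer"]},
        {"role": "assistant", "content": json.dumps(example["target"])},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = Dataset.from_list(pairs).map(to_text, remove_columns=["answer", "target"])

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    args=SFTConfig(output_dir="structure-extractor", num_train_epochs=1),
)
trainer.train()
```

After training, the resulting checkpoint can be quantized and deployed the same way as the extractor sketch above, replacing the big-model extractor in the pipeline.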