Skip to content

Conversation

@notriddle
Copy link
Collaborator

This adds a callback handler that can parse any arbitrary block quote kind, as long as square brackets are correctly nested. You can choose your own data type for the payload, as long as it can be copied, and it should cover obsidion's stuff.

@Martin1887
Copy link
Collaborator

Interesting, thanks! However, fmt errors should be fixed.

Also, we initially integrated admonitions in a miscellaneous GFM flag, changing it in favor of an admonition flag would be a breaking change (maybe a better approach, we have to discuss it).

@Martin1887 Martin1887 requested a review from ollpu February 9, 2025 11:33
@notriddle
Copy link
Collaborator Author

  • okay, I've fixed the fmt errors
  • as for this being a breaking change, I'm pretty sure the new generic param is a breaking change anyway, so it's kind of moot
  • now that it supports non-GFM admonition tags, it's not really a GFM feature any more

@cestef
Copy link

cestef commented Feb 10, 2025

@notriddle push_html is missing a type parameter for the underlying Event<'a>:

pub fn push_html<'a, I>(s: &mut String, iter: I)
where
I: Iterator<Item = Event<'a>>,
{
write_html_fmt(s, iter).unwrap()
}

Which I think forces you to use the default implementation of Event.

@cestef
Copy link

cestef commented Feb 10, 2025

My idea was to implement a simple ToClass trait for admonitions that will be used with HtmlWriter:

pub trait ToClass<'a> {
    fn to_class(&self) -> &'a str;
}

impl<'a, I, W, AD> HtmlWriter<'a, I, W>
where
    I: Iterator<Item = Event<'a, AD>>,
    W: StrWrite,
    AD: ToClass<'a>,
// ...

The default html conversion then becomes:

Tag::BlockQuote(kind) => {
    let class_str = kind.map(|k| k.to_class()).unwrap_or("");
    if self.end_newline {
        self.write(&format!("<blockquote{}>\n", class_str))
    } else {
        self.write(&format!("\n<blockquote{}>\n", class_str))
    }
}

as for the implementation for the GFM BlockquoteKind:

impl ToClass<'_> for BlockQuoteKind {
    fn to_class(&self) -> &'static str {
        match self {
            BlockQuoteKind::Note => "markdown-alert-note",
            BlockQuoteKind::Tip => "markdown-alert-tip",
            BlockQuoteKind::Important => "markdown-alert-important",
            BlockQuoteKind::Warning => "markdown-alert-warning",
            BlockQuoteKind::Caution => "markdown-alert-caution",
        }
    }
}

This adds a callback handler that can parse any arbitrary block
quote kind, as long as square brackets are correctly nested.
You can choose your own data type for the payload, as long as
it can be copied, and it should cover obsidion's stuff.
@ollpu
Copy link
Collaborator

ollpu commented Mar 4, 2025

Apologies for not getting to this sooner.

I'm not really a fan of making Tag generic and adding further generics to Parser.

Could the callback traits be combined? So BrokenLinkCallback is replaced with something like:

pub trait ParserCallbacks<'input> {
    fn handle_broken_link(...) -> ...;
    fn handle_admonition_tag(...) -> ...;
}

(possibly with default bodies so you can override only one of them)

In hindsight, this is how it should have been named from the beginning.

As for avoiding the generic on Tag, could we do something like this?

pub enum BlockQuoteKind<'input> {
    Note,
    Tip,
    Important,
    Warning,
    Caution,
    Custom(CowStr<'input>),
}

I imagine putting the raw text (WARNING, MY_CUSTOM_ADMONITION) in the custom field is usable enough for the majority of use cases.

@notriddle
Copy link
Collaborator Author

I like the idea of combining the traits.

I don't like the "Note, Tip, Important, Warning, Caution, Custom(CowStr)" design. If making it generic isn't acceptable, then it seems simpler to pass the string through, even when it's one of GitHub's standard tags.

@ollpu
Copy link
Collaborator

ollpu commented Mar 4, 2025

pass the string through, even when it's one of GitHub's standard tags.

That sounds alright too

@Martin1887
Copy link
Collaborator

Martin1887 commented Mar 5, 2025

But if we use just a string, the renderer behaviour to display built-in icons and so would have to match strings instead of enum variants, that does not seem good... Well, it could just append the lowercase trimmed string to the text ' markdown-alert-`, but not making distinctions between standard and non-standard tags seems weird to me.

@notriddle
Copy link
Collaborator Author

Because the set of supported tags is fixed, we don't actually have to allocate anything. The 'static lifetime is a subtype of 'input. And we can define the return value as the class, not just the tag, making the renderer simpler.

pub trait ParserCallbacks<'input> {
    fn handle_broken_link(
        &mut self,
        link: BrokenLink<'input>,
    ) -> Option<(CowStr<'input>, CowStr<'input>)> {
        None
    }
    fn handle_admonition_tag(
        &mut self,
        tag: &'input str,
    ) -> Option<CowStr<'input>> {
        Some(CowStr::Borrowed(if tag.eq_ignore_ascii_case("note") {
            "markdown-alert-note"
        } else if tag.eq_ignore_ascii_case("tip") {
            "markdown-alert-tip"
        } else if tag.eq_ignore_ascii_case("important") {
            "markdown-alert-important"
        } else if tag.eq_ignore_ascii_case("warning") {
            "markdown-alert-warning"
        } else if tag.eq_ignore_ascii_case("caution") {
            "markdown-alert-caution"
        } else {
            return None;
        }))
    }
}

The biggest problem with involving strings in this data structure at all, even if we don't ordinarily use it, comes from producing a valid TagEnd. We have to keep that string around to be able to emit it on both events.

Copy link
Collaborator

@ollpu ollpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we move forward with what @notriddle suggested above, i.e., have the callback return the class name directly.

The biggest problem comes from producing a valid TagEnd. We have to keep that string around to be able to emit it on both events.

But do we have to include the string in TagEnd? It may be a ltitle asymmetric, but there is precedent in that Tag::List(Option<..>) turns into TagEnd::List(bool). Similarly, we could have TagEnd::BlockQuote(bool). That is enough for our renderer at least, though it is potentially a challenge in some custom rendering situations.

In any case, I don't think the user-provided callback should be called twice (for both begin and end events). It could be annoying if you want to make it stateful or some kind of expensive lookup. Better to store it in the tree (via CowIndex).

Copy link
Collaborator

@ollpu ollpu Jul 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather keep these svgs out of the repo, as they are quite large and quickly obsolete.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But do we have to include the string in TagEnd?

Yes, other renderers want enough information to avoid maintaining a stack. #889

The bool approach is good enough for some of them, but someone trying to parse MSDN docs won't find it sufficient.

I would rather keep these svgs out of the repo

You're right. That's a mistake. I didn't intend to commit a flamegraph to the repo.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, other renderers want enough information to avoid maintaining a stack.

Alright, then let's have it in TagEnd too.

@ollpu
Copy link
Collaborator

ollpu commented Jul 12, 2025

I might make an initial PR to rename / adjust BrokenLinkCallback to ParserCallbacks in case we want to use it in #1043, if that's ok.

At the same time I'm considering moving the 'input lifetime parameter from the trait itself to the individual callbacks.

@ollpu
Copy link
Collaborator

ollpu commented Jul 28, 2025

I guess #1043 won't be needing a callback so my PR #1049 is not crucial. Use/review it if you like.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants