Investigate what data can be pulled from the Wikitext and HTML to simplify our API calls
Closed, Resolved · Public · 8 Estimated Story Points

Assigned To

Authored By

	Protsack.stephan
	Sep 8 2022, 10:58 AM

Description

When we fetch the metadata for an article, we make a huge Actions API call to do it. To make our API requests leaner and faster, we want to investigate what we can pull from the Wikitext and HTML instead of making that Actions API call.

Acceptance criteria
Tickets with actionable items describing how and what we can extract are created.

Things to consider

We can extract Templates
We can extract Categories
What other things can we extract?
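
To make the Wikitext side of this concrete, here is a minimal sketch (not what the team implemented) that pulls template names and category names out of raw Wikitext using only Python's standard library. The function name `extract_from_wikitext`, both regular expressions, and the sample text are illustrative assumptions. The patterns assume the English `Category:` namespace prefix, and they do not distinguish magic words such as `{{DEFAULTSORT:...}}` from real templates.

```python
import re

# Assumption: the wiki uses the English "Category:" prefix; other wikis localize it.
CATEGORY_RE = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]", re.IGNORECASE
)

# Captures the name that follows "{{" up to the first "|" or "}}".
# Skips template parameters ("{{{x}}}") and parser functions ("{{#if:...}}").
TEMPLATE_RE = re.compile(r"(?<!\{)\{\{\s*([^\s{}|#][^{}|\n]*?)\s*(?=\||\}\})")


def extract_from_wikitext(wikitext):
    """Return category and template names found in a Wikitext string."""
    categories = [m.group(1) for m in CATEGORY_RE.finditer(wikitext)]
    templates = [m.group(1) for m in TEMPLATE_RE.finditer(wikitext)]
    return {"categories": categories, "templates": templates}


# Hypothetical sample article text, for illustration only.
SAMPLE_WIKITEXT = """{{Short description|English mathematician}}
{{Infobox person|name=Ada Lovelace}}
'''Ada Lovelace''' was a mathematician.
{{Cite web|url=https://example.org|title={{lang|fr|Bonjour}}}}
{{{unnamed_param}}} {{#if: x | y}}
[[Category:1815 births]]
[[Category:English mathematicians|Lovelace, Ada]]
"""

result = extract_from_wikitext(SAMPLE_WIKITEXT)
print(result["templates"])
print(result["categories"])
```

Regular expressions are enough to show the idea, but they are fragile with deeply nested or multi-line template names. A dedicated Wikitext parser is likely the safer choice for production use.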

Parsers to evaluate
https://gitlab.wikimedia.org/repos/research/html-dumps
https://github.com/earwig/mwparserfromhell
https://www.mediawiki.org/wiki/Alternative_parsers
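
Independent of the parsers listed above, the HTML side can be illustrated with a rough standard-library sketch. It assumes the article HTML is Parsoid-style output, where categories appear as `<link rel="mw:PageProp/Category" href="./Category:...">` and transcluded templates carry `typeof="mw:Transclusion"` with a JSON `data-mw` attribute. The class name `ParsoidMetadataParser`, the helper `extract_from_html`, and the sample markup are hypothetical. The attribute layout should be checked against the actual HTML we receive before relying on it.

```python
import json
from html.parser import HTMLParser
from urllib.parse import unquote


class ParsoidMetadataParser(HTMLParser):
    """Collects category and template names from Parsoid-style HTML (assumed layout)."""

    def __init__(self):
        super().__init__()
        self.categories = []
        self.templates = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)

        # Categories: <link rel="mw:PageProp/Category" href="./Category:Name#sortkey"/>
        if tag == "link" and "mw:PageProp/Category" in (attrs.get("rel") or "").split():
            title = unquote((attrs.get("href") or "").split("#", 1)[0])
            if title.startswith("./"):
                title = title[2:]
            # Drop the namespace prefix and turn underscores back into spaces.
            self.categories.append(title.partition(":")[2].replace("_", " "))

        # Templates: typeof="mw:Transclusion" with the template target in data-mw JSON.
        if "mw:Transclusion" in (attrs.get("typeof") or "").split() and attrs.get("data-mw"):
            data = json.loads(attrs["data-mw"])
            for part in data.get("parts", []):
                # Parts may also be plain strings (literal wikitext between templates).
                if isinstance(part, dict) and "template" in part:
                    self.templates.append(part["template"]["target"]["wt"].strip())


def extract_from_html(html):
    parser = ParsoidMetadataParser()
    parser.feed(html)
    parser.close()
    return {"categories": parser.categories, "templates": parser.templates}


# Hypothetical sample markup, for illustration only.
SAMPLE_HTML = """
<p><span about="#mwt1" typeof="mw:Transclusion"
 data-mw='{"parts":[{"template":{"target":{"wt":"Infobox person","href":"./Template:Infobox_person"},"params":{},"i":0}}]}'>x</span></p>
<link rel="mw:PageProp/Category" href="./Category:1815_births"/>
<link rel="mw:PageProp/Category" href="./Category:English_mathematicians#Lovelace,%20Ada"/>
"""

html_result = extract_from_html(SAMPLE_HTML)
print(html_result["templates"])
print(html_result["categories"])
```

If the HTML we already fetch exposes these attributes, both lists could be derived without an extra Actions API request, which is the saving this task is investigating.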

Event Timeline

Protsack.stephan created this task. Sep 8 2022, 10:58 AM

Restricted Application added a subscriber: Aklapper. Sep 8 2022, 10:58 AM

Protsack.stephan triaged this task as Medium priority. Sep 8 2022, 11:22 AM

Protsack.stephan updated the task description.

Protsack.stephan added a project: Wikimedia Enterprise Engineering. Sep 8 2022, 11:29 AM

Protsack.stephan updated the task description. Sep 8 2022, 2:39 PM

Lena.Milenko set the point value for this task to 8. Sep 8 2022, 2:43 PM

Lena.Milenko moved this task from To Be Estimated/To Be Discussed- Migrating to Estimated /Discussed- Migrating on the Wikimedia Enterprise board. Sep 9 2022, 11:04 AM

Protsack.stephan assigned this task to prabhat. Oct 4 2022, 2:15 PM

Protsack.stephan moved this task from Estimated /Discussed- Migrating to In Progress on the Wikimedia Enterprise board.

AnnaMikla changed the task status from Open to In Progress. Oct 6 2022, 12:26 PM

Protsack.stephan claimed this task. Oct 12 2022, 9:47 AM

Protsack.stephan added a subscriber: prabhat.

Protsack.stephan moved this task from In Progress to Estimated /Discussed- Migrating on the Wikimedia Enterprise board. Oct 20 2022, 11:49 AM

Protsack.stephan moved this task from Estimated /Discussed- Migrating to Merge Request on the Wikimedia Enterprise board. Oct 27 2022, 2:16 PM

Protsack.stephan moved this task from Merge Request to In Progress on the Wikimedia Enterprise board.

AnnaMikla changed the task status from In Progress to Open. Oct 28 2022, 12:27 PM

AnnaMikla changed the task status from Open to In Progress.

Protsack.stephan moved this task from In Progress to Machine Readability PB- Migrating on the Wikimedia Enterprise board. Nov 2 2022, 10:26 AM

Protsack.stephan moved this task from Machine Readability PB- Migrating to Done Sprint 27 (20Oct-2Nov2022) on the Wikimedia Enterprise board. Nov 3 2022, 7:05 PM

Daria_Kevana changed the task status from In Progress to Open. Nov 6 2022, 11:17 PM

JArguello-WMF moved this task from Done Sprint 27 (20Oct-2Nov2022) to Done Sprint 28 (3Nov-16Nov2022) on the Wikimedia Enterprise board. Jul 14 2023, 3:30 PM

JArguello-WMF moved this task from Done Sprint 28 (3Nov-16Nov2022) to Done 31 on the Wikimedia Enterprise board.

JArguello-WMF closed this task as Resolved. Jul 14 2023, 3:36 PM
