Investigate How Wikidata Edits Appear on Wikipedia Changelogs
Closed, Resolved (Public)

Description

As a researcher, I want to understand how links between Wikidata and Wikipedia are structured so that I can determine how changelogs are currently generated and displayed.

Acceptance Criteria:

  • A documented review of how Wikidata changes are reflected in Wikipedia changelogs
    • What do we see on the screen as an editor?
    • What happens in the background? Which extensions, functions and Databases are involved?
  • Identification of key data sources and how they contribute to the changelogs.
    • Ensuring that we also check the information that is not in use in the current UI but could be helpful in future implementations.
  • Ticket owner(s) hold a knowledge-sharing session for other engineers to ensure everyone is informed and can ask questions about the documentation

Check-list

  • Identify key sources where changelogs originate (e.g. Wikidata change logs, Wikipedia edit history, API endpoints, etc.).
  • Analyze how different types of Wikidata edits (e.g. adding/removing statements, merging items) appear on Wikipedia.
  • Document how these changelogs are formatted and displayed in Wikipedia edit histories.
  • Summarize findings and highlight inconsistencies or unclear elements.
  • What is the 'related changes' page? (out of scope)

Details

Other Assignee
WMDECyn

Event Timeline

Weekly progress update:

  • Local, beta, live frontend investigation
    • Wikidata changes are grouped with an arrow
    • There are differences in format for local vs. live - to research:
      • Which changes don’t come through at all
      • Which changes come through but without enough detail in summary
      • Example: when the value of an item-type property is changed, the summary should show the item's label; this works correctly on local but not in production
  • Documenting examples in progress
  • Backend investigation in progress
  • Languages investigation in progress

Ahh, I did not refresh the page, so I did not see that Suzie had already written almost the same summary :')

There are slight differences, so I am keeping mine as well.

A very brief update about investigation so far:

We followed a similar scenario each time to observe current behaviour.

  1. Login
  2. Turn on the setting to display Wikidata changes
  3. Make sure a Wikidata item is related (sitelink, parser function or Lua)
  4. Edit the item on Wikidata
  5. Check the watchlist on Wikipedia for the Wikidata edit
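
The last step can also be checked outside the UI. Below is a minimal sketch, assuming the client wiki exposes the standard MediaWiki Action API and that Wikidata changes show up there as `external` change entries. It queries recent changes rather than the watchlist, because the watchlist needs an authenticated session. The function names and the example API URL are hypothetical, chosen only for illustration.

```python
from urllib.parse import urlencode


def build_external_changes_url(api_url: str, limit: int = 10) -> str:
    """Build a recentchanges query limited to external (e.g. Wikidata) changes."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctype": "external",  # assumption: Wikidata edits are injected as external changes
        "rcprop": "title|comment|timestamp|user",
        "rclimit": limit,
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


def extract_changes(response: dict) -> list[tuple[str, str]]:
    """Return (page title, edit summary) pairs from a decoded API response."""
    changes = response.get("query", {}).get("recentchanges", [])
    return [(rc.get("title", ""), rc.get("comment", "")) for rc in changes]
```

Fetching the generated URL with any HTTP client and passing the decoded JSON to `extract_changes` gives the raw summaries, which can then be compared across local, beta and live.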

Different ways of making a Wikidata edit appear on the Wikipedia watchlist were explored:

  • Lua
  • Parser function
  • Sitelinks

The same scenario was run in different test environments:

  • Local
  • Live
  • Beta

Some differences between local and live are visible:

  • PIDs and QIDs are resolved to their labels on local but not on live.
  • Entries show "curr, prev" links on local and "diff, hist" links on live.

Possibly related code:
The extractTextAndLanguage function in DefaultEntityLinkFormatter.php seems to be responsible for resolving labels from QIDs.
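
To make the local vs. live difference concrete, here is a rough sketch of what "resolving" means for a summary. It is not the Wikibase implementation and does not reflect what extractTextAndLanguage actually does. The regular expression, the `resolve_labels` helper and the in-memory label table are all assumptions for illustration.

```python
import re

# Matches wiki links to entities such as [[Q5]], [[d:Q5]] or [[d:Property:P31]].
ENTITY_LINK = re.compile(r"\[\[(?:d:)?(?:Property:)?([PQ]\d+)\]\]")


def resolve_labels(summary: str, labels: dict[str, str]) -> str:
    """Replace entity links with 'label (ID)'; leave unknown IDs untouched."""

    def replace(match: re.Match) -> str:
        entity_id = match.group(1)
        label = labels.get(entity_id)
        return f"{label} ({entity_id})" if label else match.group(0)

    return ENTITY_LINK.sub(replace, summary)


# Hypothetical label table; a real setup would look labels up in Wikibase.
labels = {"P31": "instance of", "Q5": "human"}
print(resolve_labels("Created claim: [[d:Property:P31]]: [[d:Q5]]", labels))
```

Links whose ID is missing from the table are left as they are, which roughly mirrors the unresolved output seen on live.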

Weekly progress update:

  • Collation of formats of change logs in progress
  • Setup of local to match live with client-only connected to client+repo
  • Investigated how client watchlists and recent change tables are notified after repo-side changes

Weekly progress update:

  • Exploration of local setup
    • Mapping out how changes are propagated from repo to client
    • Finding which hook handler might control QID/PID resolution in repo (HtmlPageLinkRendererEndHookHandler)
  • Investigating how interwiki links ([[d:Q1]]) are rendered in content and edit summaries
  • Recognizing that the namespace ID could determine whether QIDs are resolved or not
  • Research about grouping searches (only happens on local afaik)

Weekly progress update:

Gathering findings from this investigation into a summary - this is being finalised and will include:

  • Summary of all Wikidata change types
  • Client / repo+client setups
  • Possible methods to see WD edits on Watchlist
  • Difference between grouped and ungrouped view
  • Difference between Summary and Changelogs
  • Wikidata edit formats
  • Related possible usages with [[]]
  • Namespaces
  • Change propagation diagram

Progress with planning how to implement adding labels into the summaries:

  • Found a potential place in the code for adding labels into the summaries: LinkerMakeExternalLinkHook. Exploring this further by looking at different ways to resolve the label through the hook, e.g. a new implementation, or moving the label-resolving mechanism to a shared lib so that both repo and client can access it, since they share the same logic
  • On the client, labels are resolved again, as external links
  • The hook interface provides different parameters than the one in repo
  • Some functions exist only in repo, not in the shared lib, so the client cannot access them

Change #1129955 had a related patch set uploaded (by Seanleong-wmde; author: Neslihan Turan):

[mediawiki/extensions/Wikibase@master] Resolve labels on Watchlist

https://gerrit.wikimedia.org/r/1129955

Weekly progress update

  • Report almost finalised

Still to investigate:

  • Why “Wikibase Item Changed” sometimes comes up
  • How the hooks interact
  • Check with product: do we need to resolve the square bracket links from client-side summaries
