Planning work related to the Linked Open Data Network Program - see https://meta.wikimedia.org/wiki/LinkedOpenData
Archived per T404303
Done.
Yes, let's!
This looks to me like it could be archived. Do you agree, @Lydia_Pintscher?
Hey all -- I'm closing out this task. Based on the discussion above, this prototype is complete, but we appear to be stalled on follow-up work. I have quickly drafted a model card to capture where we're at in a slightly more accessible/discoverable manner: https://meta.wikimedia.org/wiki/Machine_learning_models/Proposed/Wikidata_item_completeness.
Sounds good. Thanks, Miriam!
@Mayakp.wiki no action items for you right now. We are going to revise our timelines and priorities for this work in a few months' time.
@Miriam, can you please let us know if there is an action item for Movement-Insights?
In T321224#10072541, @Lydia_Pintscher wrote: In T321224#10054680, @XiaoXiao-WMF wrote: from product side:
- the product adoption part is not clear; @Lydia_Pintscher, please clarify
From my side this would still be very useful to have, especially for tracking how the content evolves over time at scale. It is however less important than the revert risk model for Wikidata.
In T321224#10054680, @XiaoXiao-WMF wrote: from product side:
- the product adoption part is not clear; @Lydia_Pintscher, please clarify
Summary from discussion with Isaac:
The points above are mostly infrastructure constraints.
@isarantopoulos fyi
From my side it'd still be great to move this forward and have a better Item Quality model in Liftwing. If there is anything I can do to help, please let me know.
Thanks, @Isaac! I will keep an eye on this task.
Please reach out to the Movement-Insights team if you need help or support with use cases, or to chime in on prioritization.
Just double-checking: what is the status of this? Should we close this / move it to the freezer? Any update we can add here?
@Miriam, thanks for checking - this seems to be a victim of my sabbatical last year. Summary of where we are at:
Will this have an impact, or help improve our existing way of measuring content gaps? If yes, then I would be happy to weigh in on some questions - like providing use cases associated with this model.
@Isaac just double-checking: what is the status of this? Should we close this / move it to the freezer? Any update we can add here?
Removing inactive task assignee. (Please do so as part of offboarding - thanks.)
I'm going to be out for the next several weeks, so FYI you likely won't hear updates on this until mid-September. Thanks for these additional details though!
Thanks for this!
So in general it is pretty important for Items to be classified and put into the right place in the larger ontology. These classification statements therefore do, imho, deserve some sort of special status, as they are generally more important than other statements.
Now there are several Properties that can represent such relations. The main ones we should probably focus on are instance of, subclass of, and part of, as explained on https://www.wikidata.org/wiki/Help:Basic_membership_properties.
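For illustration only, here is a minimal Python sketch of what checking an Item for these basic membership properties could look like. It uses the public Special:EntityData endpoint; the helper name is made up and this is not part of any existing model code.

```python
# Minimal sketch: count the basic membership statements (instance of,
# subclass of, part of) on a single Item via the public entity data endpoint.
import requests

MEMBERSHIP_PROPERTIES = {
    "P31": "instance of",
    "P279": "subclass of",
    "P361": "part of",
}

def membership_statements(qid: str) -> dict:
    """Return how many statements the Item has for each basic membership Property."""
    url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
    entity = requests.get(url, timeout=30).json()["entities"][qid]
    claims = entity.get("claims", {})
    return {label: len(claims.get(pid, []))
            for pid, label in MEMBERSHIP_PROPERTIES.items()}

print(membership_statements("Q42"))
# e.g. {'instance of': 1, 'subclass of': 0, 'part of': 0}
```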
That's quite an interesting table! Would it be possible to get the actual Item IDs for the last two rows? It could be instructive to know which Items the model thinks are very incomplete but have excellent quality :)
@Michael, thanks for the questions! Some context: I think the completeness model is better suited for evaluating items (it's much more nuanced than the quality model, which largely just takes into consideration the number of statements an item has). This analysis will hopefully do two things: 1) help us find some places where the completeness model doesn't do great and where we could tweak it, and 2) build a sample of items to give to Wikidata experts to ensure that the completeness model is in fact capturing their expectations better than the quality model.
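As a toy illustration only (not the real model code; the feature names and numbers are made up), the difference described here and in the quoted comment below roughly amounts to:

```python
# Toy illustration, not the actual models: both look at how many of the
# "expected" claims/refs/labels an Item has, but the quality model also
# feeds in the raw number of statements as a feature.
def completeness_features(expected_present: int, expected_total: int) -> list[float]:
    # Share of expected claims/refs/labels that are actually present.
    return [expected_present / max(expected_total, 1)]

def quality_features(expected_present: int, expected_total: int,
                     total_claims: int) -> list[float]:
    # Same signal, plus the total statement count as an extra feature.
    return completeness_features(expected_present, expected_total) + [float(total_claims)]

# An Item with few statements overall but all of its expected ones present
# could plausibly get a high completeness label and a lower quality label.
print(completeness_features(5, 5), quality_features(5, 5, 5))
```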
In T321224#9035684, @Isaac wrote: Oooh and the job worked! High-level data on overlap between the two scores, where they are the same except completeness just takes into account how many of the expected claims/refs/labels are there, and quality adds the total number of claims to the features too:
+------------------+-------------+---------+
|completeness_label|quality_label|num_items|
+------------------+-------------+---------+
|D                 |D            |29955491 |
[..]
|E                 |A            |241      |
|A                 |E            |6        |
+------------------+-------------+---------+
Oooh and the job worked! High-level data on overlap between the two scores, where they are the same except completeness just takes into account how many of the expected claims/refs/labels are there, and quality adds the total number of claims to the features too:
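A minimal PySpark sketch of how an overlap count like the one above could be produced, assuming a DataFrame with one row per Item and both predicted labels already exists (the real job, input tables, and schema are not shown in this task):

```python
# Hypothetical sketch of the overlap count; the input data and names here
# are stand-ins, not the actual scored tables.
from pyspark.sql import SparkSession
import pyspark.sql.functions as F

spark = SparkSession.builder.getOrCreate()

# Stand-in for the real scored data: one row per Item with both letter grades.
scores = spark.createDataFrame(
    [("Q1", "D", "D"), ("Q2", "E", "A"), ("Q3", "A", "E")],
    ["item_id", "completeness_label", "quality_label"],
)

overlap = (
    scores
    .groupBy("completeness_label", "quality_label")
    .agg(F.count("*").alias("num_items"))
    .orderBy(F.desc("num_items"))
)
overlap.show()
```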
Updates:
Still no updates because of prep for wikiworkshop/hackathon, but after next week I'm hoping to get back to this!
From discussion with Lydia/Diego: