Tools and data for metrics on Wikimedia's technical community at wikimedia.biterg.io (hosted by Bitergia, based on the GrimoireLab software suite).
See mw:Community metrics for more information.
unlicking the cookie
@Aklapper @thcipriani I think this needs you or we need to chat. I don't have an immediately actionable next step and don't want this to get stuck.
It sounds like many of the repos you're unable to access are deleted or private. I suppose what we should work on is how to keep this list up to date? What do you think, @Izubiaurre?
I am thinking the ideal outcome would be if we stop maintaining a manual list and switch to a model where we pull all repositories that are publicly available, and simply skip 404s.
Hi all!
...our current config, it says "Wikimedia sync by RepOSSync" updated it a couple days ago. So there is clearly more to this than I thought.
See T385529: Automatically export and publish a list of WMF deployed code repositories in Bitergia's JSON format for some related background.
Oh, I see. Thank you for the extra details. Let me try to sync with people on our side who were more involved with that and get back to you.
@Izubiaurre I see! I think it might be best if we replace a manual list with a programmatic way to get a list of all (accessible) repos.
It seems a manual list would be outdated almost instantly.
Then the approach would be to fetch all repos you can access, ignoring 404s.
We would always have the current set of repos and nothing would have to be maintained manually (covering both new repos being added and old repos being deleted or moved).
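A minimal sketch of what that could look like, assuming gitlab.wikimedia.org exposes the standard GitLab REST API (the /api/v4/projects endpoint, its pagination, and the http_url_to_repo field are standard GitLab; none of this is verified against our instance specifically):

```python
# Enumerate all publicly visible projects on a GitLab instance via the
# standard REST API, instead of maintaining a manual repo list.
import requests

BASE = "https://gitlab.wikimedia.org/api/v4"

def list_public_repos():
    repos = []
    page = 1
    while True:
        resp = requests.get(
            f"{BASE}/projects",
            params={"visibility": "public", "simple": "true",
                    "per_page": 100, "page": page},
            timeout=30,
        )
        resp.raise_for_status()
        batch = resp.json()
        if not batch:  # an empty page means we have seen everything
            break
        repos.extend(project["http_url_to_repo"] for project in batch)
        page += 1
    return repos

if __name__ == "__main__":
    for url in list_public_repos():
        print(url)
```

Anything deleted or made private simply stops showing up, so there is no list to maintain and no 404s to special-case on the crawling side.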
@Izubiaurre Maybe let's take a step back for a second and let me ask this: How do you get the list of all repos you are crawling to begin with?
If these are private repos, then we probably would not want to extract their data to an external service.
Can we allowlist their IPs (or their user agent, which I believe is owlbot)?
The errors are HTTP 404. But if we browse the URLs we get asked to log in.
https://gitlab.wikimedia.org/sonarbot/ is a user; should it be offering a git link?
I get a 404 from https://gitlab.wikimedia.org/sonarbot/sonarqube.git when logged in and an IDP login prompt when not logged in. Do we have any other examples? Can Bitergia identify a date when the issues started?
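For anyone debugging this by hand, a quick way to tell a true 404 from an IDP login wall is to fetch the URL without following redirects and look at the raw status code (a hypothetical helper for spot checks, not part of any existing tooling):

```python
# Distinguish a real 404 from an auth redirect: anonymous requests that
# hit a login wall typically come back as a 3xx pointing at the IDP.
import requests

def classify(url):
    resp = requests.get(url, allow_redirects=False, timeout=30)
    if resp.status_code == 404:
        return "missing (true 404)"
    if resp.status_code in (301, 302, 303, 307, 308):
        return f"redirect to {resp.headers.get('Location')} (likely a login wall)"
    return f"HTTP {resp.status_code}"

print(classify("https://gitlab.wikimedia.org/sonarbot/sonarqube.git"))
```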
Uh, thanks for filing this! Appreciated!
@ldelench_wmf No reply; assuming this is resolved. Please reopen if there is more to sort out - thanks!
@ldelench_wmf: Hi, could you please confirm whether you received an email and if you can log in? Thanks!
Also, per https://www.mediawiki.org/wiki/Community_metrics, please note that
In general, the name of a custom object (such as a custom visualization or custom dashboard) created manually by an admin must have the prefix C_ so it does not get overwritten by the next upstream software update.
@ldelench_wmf: The credentials should be in your mail inbox. Please confirm and if all works well, please set the status of this task to resolved. Thanks!
Requested in https://support.bitergia.com/support/tickets/1586
After talking to Bitergia folks, we'll need to make some tweaks.
Rough solution starting from @thcipriani's initial sketch:
In T306770#7933958, @Aklapper wrote: Thanks. That means that I could at least run manual checks on https://ldap.toolforge.org/user/username (replace username), which lists email addresses and group membership (such as wmf).
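Such a manual check is easy to script. Here is a rough sketch that fetches the Toolforge LDAP page and does a naive substring match for the group name; the page is HTML, so treating a substring hit as membership is an assumption that is only good enough for spot checks:

```python
# Spot-check group membership via the public Toolforge LDAP viewer.
# This naively greps the rendered HTML for the group name, which can
# produce false positives; fine for a quick manual check only.
import requests

def appears_in_group(username, group="wmf"):
    resp = requests.get(f"https://ldap.toolforge.org/user/{username}",
                        timeout=30)
    resp.raise_for_status()
    return group in resp.text

print(appears_in_group("example-user"))  # hypothetical username
```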
At some point months ago I switched on "Affiliate to organizations automatically" at https://wikimedia.biterg.io/identities/settings/general, set to run once a week.
Currently our release tools repository has a dump-bundle.py script that allows us to dump the list of repos we branch each week:
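The script's output is not included here, but as a sketch of the other end of the pipeline: T385529 asks for the repo list in Bitergia's JSON format, and GrimoireLab's projects.json (to the best of my understanding; treat the exact layout as an assumption) maps a project name to per-datasource URL lists. Converting a plain list of repo URLs would then be something like:

```python
# Turn a plain list of repo URLs (e.g. one URL per line, as a weekly
# dump might produce) into a GrimoireLab-style projects.json document.
# The "Wikimedia" project name and the input file name are placeholders.
import json

def to_projects_json(repo_urls, project="Wikimedia"):
    return {project: {"git": sorted(set(repo_urls))}}

with open("branched-repos.txt") as f:  # hypothetical input file
    urls = [line.strip() for line in f if line.strip()]

print(json.dumps(to_projects_json(urls), indent=2))
```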