
T403500 Add DAG for calculating Wikipedia template usage via flag table

Contributor checklist

Description

  • T403500: The DAG runs HQL queries monthly to identify the pages that contain infoboxes, databoxes and Listeria Wikidata lists, and then calculates the total usage of each.
    • DAG ID: wikipedia_template_monthly
    • Destination: wmde.wikipedia_template_flags_monthly
    • Destination: wmde.wikipedia_template_usage_monthly
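
To illustrate how the two destinations relate, here is a minimal Python sketch of the flags-to-usage aggregation. It is not the HQL from this MR: the column names come from the destination table summary, while `FlagRow`, `total_usage`, and the sample wiki names are hypothetical, and it assumes one flag row per page per wiki per month.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FlagRow:
    """One row of the flags table (hypothetical in-memory stand-in)."""
    month: date
    wikipedia: str
    page_id: int
    infobox_found: bool
    databox_found: bool
    listeria_found: bool


def total_usage(rows):
    """Count flagged pages per (month, wikipedia), mirroring the usage table."""
    totals = defaultdict(lambda: [0, 0, 0])
    for row in rows:
        counts = totals[(row.month, row.wikipedia)]
        # Booleans add as 0 or 1, so each flag contributes at most one page.
        counts[0] += row.infobox_found
        counts[1] += row.databox_found
        counts[2] += row.listeria_found
    return [
        {
            "month": month,
            "wikipedia": wikipedia,
            "total_infobox_pages": counts[0],
            "total_databox_pages": counts[1],
            "total_listeria_pages": counts[2],
        }
        for (month, wikipedia), counts in sorted(totals.items())
    ]


# Sample data is invented for illustration only.
sample = [
    FlagRow(date(2025, 1, 1), "dewiki", 1, True, False, False),
    FlagRow(date(2025, 1, 1), "dewiki", 2, True, False, True),
    FlagRow(date(2025, 1, 1), "enwiki", 7, False, True, False),
]
usage = total_usage(sample)
```

Keeping the per-page flags in their own table means the totals can be recomputed, or new aggregates added, without rescanning the source tables.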

Note: This MR also removes leftover instances of VariableProperties from similar DAGs. I realized that they remained from a method we previously used to derive the path to wmf_raw tables.

Test outputs

Destination table summary
  • wmde.wikipedia_template_flags_monthly

    month  wikipedia  page_id  infobox_found  databox_found  listeria_found
    DATE   STRING     BIGINT   BOOLEAN        BOOLEAN        BOOLEAN

  • wmde.wikipedia_template_usage_monthly

    month  wikipedia  total_infobox_pages  total_databox_pages  total_listeria_pages
    DATE   STRING     BIGINT               BIGINT               BIGINT

Test screenshots
  • DAG_ID

SCREENSHOT_OF_EACH_COMPLETED_DAG_GRAPH

Edited by Andrew McAllister (WMDE)
