nethahussain/lingualibre-ml-wikt-bot

LinguaLibre → Malayalam Wiktionary Pronunciation Bot

A standalone Python bot that automatically transfers audio pronunciation recordings from LinguaLibre (hosted on Wikimedia Commons) to Malayalam Wiktionary pages.

Based on the LinguaLibre Bot project by Wikimedia France / LinguaLibre contributors.

What it does

  1. Queries the Wikimedia Commons API for all Malayalam (LL-Q36236 (mal)) pronunciation recordings
  2. Checks each corresponding Malayalam Wiktionary page
  3. Adds the audio template under the ==ഉച്ചാരണം== (Pronunciation) section, but only if the page does not already contain an audio file
  4. Creates the pronunciation section as the first section if it doesn't exist yet
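
As a rough illustration of step 1, the sketch below builds the parameters for a MediaWiki `list=allimages` query restricted to the Malayalam file prefix and splits a returned file name into speaker and word. The names `Recording`, `allimages_params` and `parse_filename` are hypothetical and not taken from the bot's source. The parsing assumes that the speaker name contains no hyphen, which may not hold for every LinguaLibre contributor.

```python
from typing import NamedTuple, Optional

# File prefix documented in this README for Malayalam recordings.
FILE_PREFIX = "LL-Q36236 (mal)-"


class Recording(NamedTuple):
    """Hypothetical container for one LinguaLibre recording."""
    filename: str
    speaker: str
    word: str


def allimages_params(prefix: str = FILE_PREFIX, limit: int = 500) -> dict:
    """Build query parameters for the MediaWiki `allimages` list module."""
    return {
        "action": "query",
        "format": "json",
        "list": "allimages",
        "aiprefix": prefix,  # title prefix, without the "File:" namespace
        "ailimit": limit,
    }


def parse_filename(filename: str) -> Optional[Recording]:
    """Split 'LL-Q36236 (mal)-<speaker>-<word>.<ext>' into its parts.

    Assumption: the speaker name has no hyphen, so the first hyphen after
    the prefix separates speaker from word.
    """
    name = filename[len("File:"):] if filename.startswith("File:") else filename
    if not name.startswith(FILE_PREFIX):
        return None
    stem, dot, _ext = name[len(FILE_PREFIX):].rpartition(".")
    if not dot:
        return None
    speaker, sep, word = stem.partition("-")
    if not sep or not speaker or not word:
        return None
    return Recording(name, speaker, word)
```

The parameters could be sent to `https://commons.wikimedia.org/w/api.php` with `requests.get(url, params=allimages_params())`. Continuation handling is left out of this sketch.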

Example edit

Before:

==നിരുക്തം==
Etymology content...

==നാമം==
Definition here...

After:

==ഉച്ചാരണം==
* ശബ്ദം: {{audio|LL-Q36236 (mal)-Vis M-അമ്മ.wav}}

==നിരുക്തം==
Etymology content...

==നാമം==
Definition here...
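
The edit shown above can be approximated with a small pure function. This is a minimal sketch and not the bot's actual code: `add_pronunciation` and `audio_line` are hypothetical names. It treats any `{{audio|` template as existing audio, and it appends the section at the end when a page has no level-2 headings at all. Both behaviours are assumptions.

```python
import re

SECTION = "==ഉച്ചാരണം=="  # Pronunciation heading
# Assumption: an existing audio file is always embedded via {{audio|...}}.
AUDIO_RE = re.compile(r"\{\{\s*audio\s*\|", re.IGNORECASE)
# Matches level-2 headings only (==Title==), not ===Subheadings===.
HEADING_RE = re.compile(r"^==[^=].*==\s*$")


def audio_line(filename: str) -> str:
    """Format the list item that embeds the recording."""
    return "* ശബ്ദം: {{audio|" + filename + "}}"


def add_pronunciation(wikitext: str, filename: str) -> str:
    """Return wikitext with the audio line added, or unchanged if audio exists."""
    if AUDIO_RE.search(wikitext):
        return wikitext  # safety rule: never add a second audio file

    lines = wikitext.split("\n")
    line = audio_line(filename)

    # Case 1: the pronunciation section exists, so add the line right below it.
    for i, text in enumerate(lines):
        if text.strip() == SECTION:
            lines.insert(i + 1, line)
            return "\n".join(lines)

    # Case 2: create the section in front of the first level-2 heading.
    for i, text in enumerate(lines):
        if HEADING_RE.match(text):
            lines[i:i] = [SECTION, line, ""]
            return "\n".join(lines)

    # Case 3 (assumption): no headings at all, so append at the end.
    return wikitext.rstrip("\n") + "\n\n" + SECTION + "\n" + line + "\n"
```

Because the function returns its input unchanged once an audio template is present, applying it twice to the same page gives the same result as applying it once.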

Setup

1. Install dependencies

pip install pywikibot requests

2. Configure credentials

Copy the sample config files and fill in your Wikimedia credentials:

cp user-config.py.sample user-config.py
cp user-password.py.sample user-password.py

Edit user-config.py to set your username, then edit user-password.py to set your bot name and bot password.

To get a bot password, go to: https://ml.wiktionary.org/wiki/Special:BotPasswords

3. Get bot approval

Before running in live mode, you need bot approval from the Malayalam Wiktionary community.

Usage

Dry-run mode (default — no edits made)

python lingualibre_ml_wikt_bot.py

This queries all Malayalam recordings and shows which edits would be made, without touching any pages.

Live mode

python lingualibre_ml_wikt_bot.py --live

You'll be asked to type yes to confirm before any edits begin.
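
A confirmation gate of this kind might look like the sketch below. `confirm_live` is a hypothetical name, and both the prompt wording and the case-insensitive comparison are assumptions rather than the bot's documented behaviour.

```python
from typing import Callable


def confirm_live(ask: Callable[[str], str] = input) -> bool:
    """Return True only if the operator explicitly types 'yes'.

    `ask` is injectable so the prompt can be tested without a terminal.
    """
    answer = ask("Live mode will edit real pages. Type 'yes' to continue: ")
    return answer.strip().lower() == "yes"
```

Anything other than an explicit yes, including an empty answer, leaves the bot out of live mode.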

Process specific words only

python lingualibre_ml_wikt_bot.py --words അമ്മ പശു ഇന്ത്യ

Limit batch size

python lingualibre_ml_wikt_bot.py --limit 500

Filter by speaker

python lingualibre_ml_wikt_bot.py --speaker "Vis M"

All options

--live            Enable live editing (default: dry-run)
--words W [W ..]  Process only these specific Malayalam words
--limit N         Maximum number of recordings to process
--speaker NAME    Filter recordings by speaker name
--source {sparql,commons}   Data source (default: commons)
--edit-delay N    Seconds between edits (default: 10)
--verbose         Enable debug logging
--log-file FILE   Write logs to this file
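
For readers who want to see how these options fit together, the following `argparse` sketch mirrors the table above. It is a reconstruction from the option list only, not the script's real parser. The function name `build_parser` and the choice of `float` for `--edit-delay` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented command-line options."""
    p = argparse.ArgumentParser(prog="lingualibre_ml_wikt_bot.py")
    p.add_argument("--live", action="store_true",
                   help="Enable live editing (default: dry-run)")
    p.add_argument("--words", nargs="+", metavar="W",
                   help="Process only these specific Malayalam words")
    p.add_argument("--limit", type=int, metavar="N",
                   help="Maximum number of recordings to process")
    p.add_argument("--speaker", metavar="NAME",
                   help="Filter recordings by speaker name")
    p.add_argument("--source", choices=["sparql", "commons"], default="commons",
                   help="Data source (default: commons)")
    p.add_argument("--edit-delay", type=float, default=10, metavar="N",
                   help="Seconds between edits (default: 10)")
    p.add_argument("--verbose", action="store_true",
                   help="Enable debug logging")
    p.add_argument("--log-file", metavar="FILE",
                   help="Write logs to this file")
    return p


if __name__ == "__main__":
    print(build_parser().parse_args())
```

With no arguments the parsed namespace has `live=False`, so the dry-run default described above follows directly from `store_true`.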

Running tests

pip install pytest
pytest test_bot_logic.py -v

Files

File                        Description
lingualibre_ml_wikt_bot.py  Main bot script
user-config.py.sample       Pywikibot configuration template (copy to user-config.py)
user-password.py.sample     Password file template (copy to user-password.py)
test_bot_logic.py           Unit tests for wikitext manipulation logic
.gitignore                  Excludes credentials and logs from version control
LICENSE                     GPL-3.0 license

Technical details

  • Language code: mal (ISO 639-3), ml (Wikimedia), Q36236 (Wikidata)
  • File prefix: LL-Q36236 (mal)-
  • Target section: ==ഉച്ചാരണം==
  • Audio template: * ശബ്ദം: {{audio|FILENAME}}
  • Edit summary: FILENAME, LinguaLibre-യിൽ നിന്ന് ഉച്ചാരണം ചേർക്കുന്നു.
  • Safety: Only adds audio if none exists on the page; dry-run by default; rate-limited to 1 edit per 10 seconds
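
The edit summary and the rate limit listed above could be expressed roughly as follows. This is an illustrative sketch: `edit_summary` and `EditThrottle` are hypothetical names, and the real bot may instead rely on Pywikibot's own throttling.

```python
import time
from typing import Callable, Optional


def edit_summary(filename: str) -> str:
    """Build the edit summary in the format documented above."""
    return f"{filename}, LinguaLibre-യിൽ നിന്ന് ഉച്ചാരണം ചേർക്കുന്നു."


class EditThrottle:
    """Enforce a minimum delay between consecutive edits."""

    def __init__(self, delay: float = 10.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous edit

    def wait(self) -> None:
        """Block until at least `delay` seconds have passed since the last edit."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Calling `throttle.wait()` immediately before each page save keeps the edit rate at or below one edit per `delay` seconds. The first call returns without sleeping.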

Attribution

This bot was created in response to the LinguaLibre Bot Request for Malayalam Wiktionary.

Licence

This project is released into the public domain under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication.

You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.
