We’re looking for freelance technologists to import personal profiles for politically exposed people (PEPs) into Wikidata. Specifically, we’re seeking help with these tasks:
- Scrape membership lists for the national parliaments in the following countries: Costa Rica, El Salvador, Guatemala, Honduras, Mexico, Nicaragua.
- For Mexico, we’d further like to explore covering state parliaments and governments.
- For all of these countries, we’d like to explore whether the governments publish PEP datasets.
- Match the resulting profiles against Wikidata to find which politicians already have existing items (Q-IDs), and create missing items where needed.
- Upload the following information for all members, if available from an official source:
  - a) name, b) date of birth, c) citizenship, d) gender, e) political position held (i.e. member of legislature, Wikidata P39).
  - Optional properties: occupation, place of birth, family members, party affiliation.
- Each of the resulting statements should be sourced to a government website using a Wikidata statement reference.
- Follow conventions established here: https://www.wikidata.org/wiki/Wikidata:WikiProject_every_politician/Political_data_model
- This is a test run for future similar projects, and we’d like to gather and document best practices (esp. for how to match published government data and Wikidata items).
- We expect that importing the data into Wikidata will be a more challenging task than getting the membership lists.
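
To illustrate the matching and sourcing steps above, here is a minimal Python sketch, not a prescribed implementation. It assumes the standard Wikibase JSON layout for statements, with P854 (reference URL) and P813 (retrieved) as reference properties. The helper names `normalize_name`, `search_params` and `position_held_claim` are hypothetical, and the code only builds request data; it does not call the API or edit Wikidata.

```python
import unicodedata
from datetime import date

# Public Wikidata API endpoint; the parameters below are for its wbsearchentities module.
WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def normalize_name(name: str) -> str:
    """Lowercase, strip accents and collapse whitespace so names can be compared."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(without_accents.lower().split())


def search_params(name: str, language: str = "es") -> dict:
    """Query parameters for looking up candidate items by label (hypothetical helper)."""
    return {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "limit": 10,
        "format": "json",
    }


def position_held_claim(position_qid: str, source_url: str, retrieved: date) -> dict:
    """Build a P39 (position held) statement with a reference to an official source."""
    return {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {
            "snaktype": "value",
            "property": "P39",
            "datavalue": {
                "type": "wikibase-entityid",
                "value": {
                    "entity-type": "item",
                    "id": position_qid,
                    "numeric-id": int(position_qid[1:]),
                },
            },
        },
        "references": [
            {
                "snaks": {
                    # P854: reference URL (the government page the data came from)
                    "P854": [
                        {
                            "snaktype": "value",
                            "property": "P854",
                            "datavalue": {"type": "string", "value": source_url},
                        }
                    ],
                    # P813: retrieved (date the page was scraped), day precision
                    "P813": [
                        {
                            "snaktype": "value",
                            "property": "P813",
                            "datavalue": {
                                "type": "time",
                                "value": {
                                    "time": f"+{retrieved.isoformat()}T00:00:00Z",
                                    "timezone": 0,
                                    "before": 0,
                                    "after": 0,
                                    "precision": 11,
                                    "calendarmodel": "http://www.wikidata.org/entity/Q1985727",
                                },
                            },
                        }
                    ],
                }
            }
        ],
    }
```

A name match alone is weak evidence that two records describe the same person, so in practice candidates returned by the search would likely also be checked against date of birth or country before an existing item is reused.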
Technologies:
- Python (preferred), Node.js (possible)
- Desirable: implement in a way that is repeatable, e.g. executable via GitHub Actions
Terms:
- Ideally conducted within the next 4-6 weeks (Feb/Mar 2023).
- Freelance service contract, paid per day or (if preferred) hourly. Rate based on experience.
- Code to be published as open source, under the MIT license.
Contact:
- [email protected]
- Please include samples of prior work in scraping and/or using the Wikibase API
- Ideally include a time/cost estimate
Notes from feedback to this ad: