Description
There are cases where ORCID URLs and Handles are not valid URIs, which breaks attempts to parse JSON-LD as RDF. These happen in about 10-20 records in a sample of 5000 that I am working with. Not super common, but enough to break things.
URLs sometimes lack the http prefix, e.g. the personal page for https://orcid.org/0000-0003-1802-2649. This breaks RDF, but also the ORCID web page: the personal page for Andrey I. Khalaim is given as https://orcid.org/www.zin.ru/labs/insects/hymenopt/personalia/khalaim/ instead of https://www.zin.ru/labs/insects/hymenopt/personalia/khalaim/
Ideally, a simple regular expression that checks users have actually entered an absolute URL (with a scheme) would catch these.
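To illustrate the kind of check I mean, here is a minimal sketch in Python. The pattern and the function name `looks_like_absolute_url` are my own assumptions, not anything from the ORCID codebase, and the pattern is deliberately loose: it only requires an explicit http/https scheme and a host containing a dot.

```python
import re

# Hypothetical pattern (an assumption, not ORCID's actual validation):
# an explicit http/https scheme, a host with at least one dot,
# then an optional path, query, or fragment without whitespace.
ABSOLUTE_URL = re.compile(
    r"https?://[^\s/?#]+\.[^\s/?#]+(?:[/?#]\S*)?",
    re.IGNORECASE,
)


def looks_like_absolute_url(value: str) -> bool:
    """Return True if the whole string looks like an absolute http(s) URL."""
    return ABSOLUTE_URL.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    for candidate in (
        "www.zin.ru/labs/insects/hymenopt/personalia/khalaim/",
        "https://www.zin.ru/labs/insects/hymenopt/personalia/khalaim/",
    ):
        print(candidate, looks_like_absolute_url(candidate))
```

A check like this would reject the scheme-less value on input, so it could never be resolved relative to https://orcid.org/.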
For Handles there are some very bad examples at https://orcid.org/0000-0003-2573-1371 such as:
2018 | Dissertation/Thesis
SOURCE-WORK-ID: cv-prod-id-513032
HANDLE: Cecchetti, Arianna. "Effects of tourism operations on the bahavioural patterns of dolphin populations off the Azores with particular emphasis on the common dolphin (Delphinus delphis)". 2018. 112 p.. (Dissertação de Mestrado em Biologia). Ponta Delgada: U
HANDLE: http://hdl.handle.net/10400.3/4982
OTHER-ID: 101606494
CONTRIBUTORS: Cecchetti, Arianna
Note that the first Handle is rendered as http://hdl.handle.net/cecchetti,%20arianna.%20%22effects%20of%20tourism%20operations%20on%20the%20bahavioural%20patterns%20of%20dolphin%20populations%20off%20the%20azores%20with%20particular%20emphasis%20on%20the%20common%20dolphin%20(delphinus%20delphis)%22.%202018.%20112%20p..%20(disserta%C3%A7%C3%A3o%20de%20mestrado%20em%20biologia).%20ponta%20delgada:%20u
This is probably a trivial error in the user-supplied content, but ideally this would be caught on input. I realise that dealing with user-supplied content can be a bit of a nightmare.
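As a rough illustration of catching this on input, here is a sketch of a Handle check in Python. It assumes a Handle has a numeric, dot-separated prefix (such as 10400.3), a slash, and a suffix without whitespace; that is a simplification of the Handle syntax, and the name `normalise_handle` is hypothetical.

```python
import re
from typing import Optional

# Assumed shape (a simplification): optional hdl.handle.net resolver prefix,
# then a numeric dot-separated prefix, a slash, and a whitespace-free suffix.
HANDLE = re.compile(
    r"(?:https?://hdl\.handle\.net/)?(\d+(?:\.\d+)*)/(\S+)",
    re.IGNORECASE,
)


def normalise_handle(value: str) -> Optional[str]:
    """Return a resolver URL for a plausible Handle, or None if it is not one."""
    match = HANDLE.fullmatch(value.strip())
    if match is None:
        return None
    prefix, suffix = match.groups()
    return f"http://hdl.handle.net/{prefix}/{suffix}"


if __name__ == "__main__":
    print(normalise_handle("http://hdl.handle.net/10400.3/4982"))
    print(normalise_handle('Cecchetti, Arianna. "Effects of tourism operations"'))
```

With a check along these lines, the citation text pasted into the Handle field would be rejected, while the valid Handle on the same record would pass through unchanged.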