Comments on GrandIR Blog: Badly-coded affiliations: a too long-standing curse

2014-04-26T08:59:05.176-07:00

This comment has been removed by a blog administrator.


2013-07-10T15:02:47.455-07:00

Thanks for your comment, Muriel.
I recently came across the following paragraph by David Palmer, University of Hong Kong, in his paper "The benefits of authority management in an IR; more than name disambiguation" (http://hub.hku.hk/handle/10722/184124), on correcting identifier mistakes for HKU institutional authors:

"(...) Our work with Scopus corrected 10,000s of egregious Scopus errors on HKU data, again exacerbated by the use of Romanization for Hanzi names. These errors include one author with two or more Scopus AU-ID profiles, one author’s papers distributed amongst two or more profiles of orthographically dissimilar people, two or more homonymous individuals erroneously shown as one Scopus profile, erroneous affiliations, and more".
My point is that the post addresses a well-known, very frequent issue in author affiliation identification, and that the reference to EuropePMC is rather incidental here. By "badly coded" I mean, of course, impossible for the RJ Broker to process, as the next paragraph makes clear. No offence is meant to EuropePMC, whose data are in any case often delivered by the publishers themselves, who in turn collected them from the authors at manuscript submission time. Hopefully the ORCID affiliation feature, to be provided shortly by Ringgold/ISNI, will gradually alleviate this burden.


2013-07-09T01:54:07.511-07:00

I’m glad you enjoyed the seminar on the RJ Broker. However, I would like to add some background to the figures you highlight from my slide. The slide was RJ Broker centric and concerned what the broker can do with the data. The RJ Broker identified an organisation for ~22,500 records, leaving ~36,000 for which we couldn’t identify one. You jumped to the wrong conclusion that these were all due to “badly coded” metadata. This isn’t the case. There are several reasons why the RJ Broker can’t identify an organisation. Yes, some records had metadata issues: missing, incorrect or incomplete values. A typical example is an organisation field stating ‘Medical Institute’, which, although correct, is not specific enough. To identify organisations, the RJ Broker relies on the Organisation and Repository Identification (ORI) tool (ori.edina.ac.uk), which holds information on over 24,000 organisations worldwide. However, we know ORI does not hold all organisations. For example, we recently discovered that the British Antarctic Survey wasn’t in ORI because it is not listed in any of the sources ORI harvests data from. So some of these records will have a valid organisation field that the RJ Broker nevertheless cannot identify.

It would be interesting to analyse these ~36,000 records and sort them according to why an organisation cannot be identified. However, this is a time-consuming process which we have not yet been able to perform.
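
As a rough illustration of what such an analysis might look like, the Python sketch below buckets records by the reasons mentioned above (missing field, name not specific enough, organisation absent from the registry). Everything in it is an assumption made for illustration: the record layout, the `KNOWN_ORGS` and `GENERIC_NAMES` sets, and the `classify` and `summarise` functions are hypothetical, and nothing here reflects the ORI tool's actual interface or the RJ Broker's matching logic.

```python
from collections import Counter

# Hypothetical stand-in for a registry lookup; the real ORI tool is not queried here.
KNOWN_ORGS = {"university of edinburgh", "university of hong kong"}

# Hypothetical examples of names that are correct but too generic to resolve.
GENERIC_NAMES = {"medical institute", "department of medicine"}


def classify(record):
    """Return a label saying whether, or why not, an organisation was identified."""
    name = (record.get("organisation") or "").strip().lower()
    if not name:
        return "missing"
    if name in GENERIC_NAMES:
        return "not specific enough"
    if name in KNOWN_ORGS:
        return "identified"
    # Covers both valid organisations absent from the registry and misspelt names;
    # telling these apart would still need manual review.
    return "not in registry"


def summarise(records):
    """Count how many records fall into each category."""
    return Counter(classify(r) for r in records)


# Made-up sample records, not real broker data.
records = [
    {"organisation": "University of Edinburgh"},
    {"organisation": "Medical Institute"},
    {"organisation": "British Antarctic Survey"},
    {"organisation": ""},
    {},
]
summary = summarise(records)
for reason, count in summary.most_common():
    print(f"{reason}: {count}")
```

The "not in registry" bucket is deliberately coarse: separating a genuinely valid organisation that the registry lacks from an incorrect or misspelt name is the part that would remain time-consuming.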