GrandIR Blog

Sunday, 27 April 2014

JGIC'2014: Discussing Research Information Management in Catalonia

The third edition of the Workshop on Research Information Management (Jornades de Gestio de la Informacio Cientifica, JGIC'2014) was held at the Institute for Catalan Studies (IEC) in Barcelona on 24-25 April. The event brought together over 150 attendees from Catalan universities, research centres, funders and policymakers to discuss areas of research information management such as research data management, research impact and evaluation, global standards for system interoperability, and new features being implemented in institutional research information management systems at universities.

JGIC'2014 placed emphasis on topics such as the need to find new models and indicators for research assessment, emerging best practices in research data management at specific universities, the gradual uptake of ORCID through an intensive advocacy campaign at institutional level, and the positive impact of international collaborations on the quality of the research performed in Catalonia. The new Futur institutional research portal developed at the Polytechnical University of Catalonia (UPC) was also presented. Futur, which is modelled on the Dutch Narcis initiative, aims to increase the online visibility and impact of UPC research results, raise the profile of the University’s researchers worldwide, and facilitate access to scientific communication.

Up-to-date information was also delivered at the event on the Catalan Research Portal (PRC) project, which was first presented last year at the previous JGIC edition. PRC is an initiative by the Consortium for University Services in Catalonia (CSUC) to aggregate research information from different Catalan universities by collecting data from their institutional CRISs. Now that a simplified CERIF-compliant data model has been defined, PRC is working to ensure that appropriate mechanisms are available for the various CRISs (none of which is CERIF-compliant at the moment) to provide a common CERIF-XML datastream into the Portal. PRC will thus become a Catalan CRIS with an emphasis on people, publications and projects, while at the same time providing the means for all institutional CRISs at Catalan universities to export CERIF-compliant datasets.
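The common-datastream idea can be illustrated with a small sketch. It assumes two hypothetical institutional CRISs that expose people records under different field names and maps them onto one shared XML shape. The element names (cfPers, cfPersId, cfFamilyNames, cfFirstNames) are borrowed from CERIF entity and attribute names, but the output is a simplified illustration and is not claimed to be schema-valid CERIF-XML. The field mappings and sample records are invented for the example and do not describe how PRC actually works.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-institution field mappings (assumptions for illustration):
# each CRIS names the same concepts differently.
FIELD_MAPS = {
    "cris_a": {"id": "person_id", "family": "surname", "given": "forename"},
    "cris_b": {"id": "staffNo", "family": "lastName", "given": "firstName"},
}


def to_common_person(source: str, record: dict) -> ET.Element:
    """Map one local person record onto a shared, CERIF-inspired element."""
    fields = FIELD_MAPS[source]
    person = ET.Element("cfPers")
    # Prefix the local id with the source so ids stay unique in the portal.
    ET.SubElement(person, "cfPersId").text = f"{source}:{record[fields['id']]}"
    ET.SubElement(person, "cfFamilyNames").text = record[fields["family"]]
    ET.SubElement(person, "cfFirstNames").text = record[fields["given"]]
    return person


def build_datastream(batches: dict) -> str:
    """Serialise records from several CRISs into one XML document."""
    root = ET.Element("CERIF")
    for source, records in batches.items():
        for record in records:
            root.append(to_common_person(source, record))
    return ET.tostring(root, encoding="unicode")


# Invented sample records, one per hypothetical CRIS.
sample = {
    "cris_a": [{"person_id": 17, "surname": "Puig", "forename": "Anna"}],
    "cris_b": [{"staffNo": "x42", "lastName": "Soler", "firstName": "Marc"}],
}
xml_text = build_datastream(sample)
print(xml_text)
```

The point of the design is that each institution only has to maintain its own field mapping, while the portal consumes a single format.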

Wednesday, 19 February 2014

Integrating ORCID iDs into repositories: two tips

Emails keep arriving these days from colleagues in different countries asking about the steps to follow in order to integrate ORCID identifiers into their institutional repositories. My regular answer to them is two-fold: “please make sure you have all your institutional ORCID iDs available before moving on to the integration process” and “please be patient and wait for a standard for technical integration to arrive from institutions already working in this area”. While the first recommendation seems quite straightforward, I am often asked to explain why I'm suggesting that people wait a bit instead of encouraging them to go ahead with their technical work. So here's the explanation:

The questions I regularly receive usually deal with the best way to code the ORCID iD into a repository metadata set (usually for DSpace). Colleagues correctly guess that the ORCID iD will need to be linked to the author's name, but the ways they propose to actually do this are not always identical. This would be no major issue if the objective were just to feature the ORCID iDs as an additional piece of data in the repository items, but ORCID can offer institutions and their scholars much more than this repository-based functionality. The real objective of integrating ORCID iDs into repositories is achieving interoperability with researchers' ORCID profiles, enabling two-way syncing between both systems so that publications hosted in the repository are automatically displayed on the author's ORCID profile and vice versa. This way, researchers will be spared the tedious work of manually updating their “low-profile” publications (those which won't be automatically retrieved from Scopus, the WoS or CrossRef) by having them automatically delivered by the repository, where Open Access to the full text will usually be offered on top of that.
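Whatever form the platform-level integration eventually takes, one step institutions can already take while collecting their iDs is checking that each one is well formed. The sketch below validates the 16-character ORCID iD using its check character, which ORCID documents as ISO 7064 MOD 11-2, and then pairs the iD with an author name. The `attach_orcid` helper and its `name`/`orcid` keys are hypothetical and do not correspond to any DSpace metadata field.

```python
import re

# Four hyphen-separated groups; the last character may be a digit or "X".
ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True when the iD has the right shape and a matching check character."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_char(digits[:15]) == digits[15]


def attach_orcid(author_name: str, orcid: str) -> dict:
    """Pair an author name with a validated iD (hypothetical record layout)."""
    if not is_valid_orcid(orcid):
        raise ValueError(f"not a valid ORCID iD: {orcid!r}")
    return {"name": author_name, "orcid": orcid}


# 0000-0002-1825-0097 is a widely used sample iD.
print(attach_orcid("Josiah Carberry", "0000-0002-1825-0097"))
```

Rejecting mistyped iDs at collection time keeps bad identifiers out of the repository before any syncing with ORCID profiles is attempted.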

In order to design and implement an effective ORCID-repository handshake, there should ideally be one integration standard for each main repository platform, and this should be delivered by the repository platforms themselves – in the same way that, if we expected a feature for integrating ORCID iDs into Open Journal Systems, we would expect it to be delivered by PKP, not by single institutions working independently from the system provider. This will make things much easier for ORCID – who will need to technically support just one integration process per platform and version – and also for institutions, who will be able to benefit from a tested integration mechanism already validated by colleagues at pioneering HEIs.

An effort in this regard has been under way for a few months as a result of the funding provided by the Sloan Foundation to a set of ORCID integration projects in the US, some of which are dealing with the integration of ORCID iDs into DSpace repositories. As stated in the post, "grantees (...) will share a demo of their prototype integration at the Spring 2014 ORCID Outreach Meeting to be held in Chicago on May 21-22". It is therefore a matter of three months until the technical means for integrating ORCID iDs into repositories in a harmonised way become freely available. When the most innovative colleagues reply that three months is a long time, I suggest they contact the University of Missouri MOSpace Repository Team directly for more info – and especially that they try to make sure they'll have all their ORCID iDs ready when the technical solution becomes available.

Tuesday, 18 February 2014

Building pioneering functionality around ORCID integration: FCT and Portugal

Last week I was kindly invited by the Portuguese Fundação para a Ciência e a Tecnologia (FCT) to deliver an ORCID presentation at the Jornadas FCT-FCCN, the annual workshop on technical issues that FCT held in Évora for HEIs in Portugal. The talk was scheduled within a PT-CRIS session dealing with the converging worklines in research information management (RIM) through which FCT plans to build a strong national RIM infrastructure, with a CERIF-compliant National Research Information System or PT-CRIS at the top and ORCID playing a key role in ensuring interoperability among the different systems involved.

Save for the Sloan-funded ORCID integration projects being presently carried out in the U.S., Portugal is providing the most innovative approach to ORCID exploitation one is aware of to date. Now that the research funder (FCT) has secured a very significant ORCID uptake by researchers in a remarkably short time – collecting 40,000 registrations in three weeks – it is planning the strategy to effectively put these identifiers to work by integrating them into the different RIM systems that are run at national level. These include National Open Access platforms such as RCAAP, with links to institutional repositories and OJS-managed Open Access journals – an area the Sloan-funded projects are also covering to some extent – but also national CV platforms like DeGóis or systems like Authenticus for automatic publication retrieval for institutions and researchers in the whole country. And all of this is being done on a shoestring budget, in keeping with the difficult economic situation Southern European countries are presently undergoing.

The ORCID presentation provided a brief analysis of the FCT-driven process for getting researchers to register with ORCID, along with some available examples of current ORCID integration projects for funders, publishers and institutions. The work the FCT-driven Working Group will be carrying out over the coming months will build on these best practices to develop pioneering functionality. At this point one cannot help but again praise the way some small countries seem regularly able to coordinate relevant RIM stakeholders at national level in an efficient fashion.

Sunday, 22 September 2013

An attempt to provide new services to the repository network in the UK: the UK RepositoryNet+ Project

A while ago I was asked to write a brief note on the RepNet project for the 'ThinkEPI Notes', a Spanish series of short updates on recent developments in the area of libraries and technology. Since it's a rather long text with a significant number of hyperlinks in it, I have chosen to offer it online from this blog as well so that readers may find it easier to read than via a message on a mailing list. The note was originally written in Spanish for that series.

An attempt to develop services for repositories in the United Kingdom: the UK RepositoryNet+ project

The text of this note is also available on the ThinkEPI website.

After the already belated Alhambra Declaration (May 2010), news keeps arriving from Spain these days about new institutional declarations in support of Open Access. Although not entirely without use – especially if they result in better technical and human resources for the teams trying to implement the goals stated in those texts – such declarations are pointless if they are merely expressions of support for an initiative that is about to turn ten years old [1]. Now that the work of many professionals in university and research-centre libraries around the world has consolidated the network of Open Access repositories to a point of no return, the next step is to venture into developing services on top of that infrastructure layer that meet the needs of academics, researchers and their institutions. This is the spirit that has guided the UK RepositoryNet+ project in the United Kingdom [2], which describes itself as an initiative to create a socio-technical infrastructure supporting the deposit, curation and Open Access dissemination of research literature.

Much has been said over the past year about the "mistaken bet by the British Government on an unsustainable 'Gold' Open Access model funded through article processing charges taken from the meagre budgets available for research". Without claiming that this statement is completely wrong, one must also bear in mind the substantial investment (seven-figure sums in pounds sterling) made at the same time, through this RepNet project, in research without parallel in Europe [3] on ways to consolidate the Green route and Open Access repositories. By contrast, RepNet is hardly ever mentioned in the heated "Gold vs Green" discussions that have been taking place for some time on the discipline's mailing lists.

This is mainly because, compared with the simplicity of a specific Open Access policy, which is easy to judge and either approve or condemn, analysing a project as complex as RepNet requires in-depth knowledge of the technical challenges posed by the various repository services and of the approaches taken to solve them by the teams in charge of their development. Thus, although practically absent from the – often byzantine – discussions among Open Access advocates, RepNet has on the other hand been widely discussed and debated by the community of repository managers in the UK, which is the one responsible for implementing the often changing, if not contradictory, policies issued by the various administrative bodies at institutional, regional or national level.

As presented on the project's home page, the development of services on top of the repository layer rests on a prior analysis of the needs of the different stakeholders (institutions, funding agencies, researchers...) and on the definition of a set of work areas where new functionality is urgently needed to ensure the continuity of Open Access repositories. This comes at a time when the requirements for reporting research information to the Research Excellence Framework (REF) – the research assessment exercise that will take place in the UK in 2014 – have led many institutions to purchase and implement CRIS systems that often threaten to replace Open Access repositories, despite being based on an approach far more focused on research information management than on Open Access as such [4].

RepNet's areas of activity in identifying, designing, developing and implementing repository services are the following:

1. Content Aggregation. In this area, RepNet proposes building a content aggregator for the country's whole repository network. Unlike many other countries where this functionality has existed for some time, none of the various initiatives that developed prototypes for content aggregation has become established in the UK. This infrastructure disadvantage has an upside: a platform built now can offer much more advanced functionality than platforms developed earlier, such as text mining on the full text of archived documents with automatic assignment of descriptors, detection and merging of duplicates using a similar full-text analysis strategy, and detection of metadata-only records (with no associated full text) even when they contain a default PDF file or 'default dummy file' to indicate that the full text is not available. Given that adoption of the DRIVER guidelines has been very low in the UK (which has in turn led to unusually low levels of compliance with the OpenAIRE standards), an aggregation can offer novel functionality for validating metadata schemas, applying very advanced criteria such as detecting the versions of archived articles or aggregating funding information for the works.

ITIL workflow for service incubation in RepNet

2. Reporting and Platform Benchmarking. In the reporting area, RepNet has been running the IRUS-UK project [5] following a common model for incubating outsourced services in line with the ITIL methodology [6]. IRUS-UK is a project developed at the MIMAS Data Centre at the University of Manchester to collect usage statistics from multiple repositories, harmonised according to the COUNTER standard. As of mid-September 2013, IRUS-UK collects and aggregates data from 40 institutional repositories – roughly a third of the national network – and continues to extend its coverage, limited for now to EPrints (29) and DSpace (11) while the development team works on the data exchange module for Fedora and other platforms. Besides enabling comparison across platforms and document types, the aim of IRUS-UK is to estimate aggregated usage statistics for the whole network, in the belief that overall usage levels will be a convincing argument for ensuring its continued use by authors and institutions.

3. Automated Content Deposit. The Repository Junction Broker (RJB) project is an initiative developed at the EDINA National Data Centre for the automated transfer of content to the repository network via the SWORD protocol. After several years of work, the RJB project was included among the services to be delivered by RepNet, and it is under this umbrella that it has been running as a pilot service since the middle of this year [7]. RJB aims to build a base of providers of content, mainly journal articles, which can be distributed, either as metadata-only records or as metadata plus full text, to the institutional repositories matching the affiliations of the authors of each article. Initially, the RJ Broker has signed agreements with the subject repository EuropePMC and with Nature Publishing Group to distribute content from both providers as a pilot (the former under the metadata-only model and the latter transferring metadata plus full text, which requires an explicit commitment from receiving repositories not to release the full texts before the embargo date). A key aspect of how this service operates is its international nature by default: since articles often have international authors, institutional repositories eligible to receive content only need to be registered with the service in order to receive it automatically (once SWORD is installed), regardless of the country where they are located.

RJB service for automated content distribution

4. Metadata Enhancement. The Metadata Enhancement area is possibly the broadest of those addressed by the RepNet project. Earlier research into the needs of the different stakeholders revealed that some metadata assignment strategies were being put into practice by isolated repositories (for example in content preservation) without being shared with the rest of the network. Given the need for the whole network to develop in step, the RIOXX initiative [8] was launched to develop and implement an application profile allowing the joint inclusion of metadata on funding (something OpenAIRE already covered for FP7 projects), on specific access-related aspects and on identifiers such as ORCID. The preliminary initiatives for adding these advanced metadata to repositories have recently started to be disseminated [9] so that they can gradually be adopted jointly by the whole network.

5. Repository Registry. The two main existing repository directories, OpenDOAR and ROAR, maintained by the universities of Nottingham and Southampton respectively, provide more than acceptable information on the worldwide repository network. However, neither of them offers complete coverage of the network. For this reason, and also to update the profile the directories provide of the platforms they index, the Open Access Repository Registry (OARR) project [10] has been launched as part of RepNet. This project aims to update the information in OpenDOAR by covering repository features in greater detail, at a time when both the widespread implementation of CRIS systems and the growing number of research data repositories are bringing significant changes to the sector. The new directory, a project led by the CRC-SHERPA team at the University of Nottingham, will eventually be hosted on the RepNet servers alongside other SHERPA services such as RoMEO, JULIET or, more recently, FACT. In fact, one of the lines for designing new repository services involves exploiting the synergies between these applications when they are managed in an integrated way.

6. Discoverability. One of the most problematic issues with repositories is the low visibility of their content on the web. Along with the creation of sufficiently comprehensive metadata schemas that can serve the purposes of discoverability, the workline aimed at improving content visibility seeks above all to optimise the indexing ratios of materials archived in the UK repository network by search engines such as Google Scholar or Microsoft Academic Search. Whether through identifying best practices at individual repository level or through mass indexing of a content aggregation [11], the visibility of repository content on the web needs to improve, and its provenance needs to be identified so that end users can recognise and value the work done by these platforms.

7. Preservation/Continuity of Access. Without entering directly into the area of digital preservation, covered by other Jisc programmes and projects such as SPRUCE [12], the RepNet project did set out to offer some kind of service for the repository network to ensure continuity of access to the content archived in it. To this end, RepNet is working on extending to Open Access materials the LOCKSS model, already used successfully by libraries to manage continuity of access to subscription-based materials [13]. This model is based on the periodic archiving of content in a network of distributed servers (the 'LOCKSS Boxes') run by the institutions.
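The full-text duplicate detection mentioned under item 1 (content aggregation) can be sketched in a few lines. The note does not say which technique RepNet uses, so this example swaps in a common one: splitting each text into overlapping three-word "shingles" and comparing the resulting sets with Jaccard similarity. The record identifiers, sample texts and the 0.8 threshold are assumptions made for illustration.

```python
import re
from itertools import combinations


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams in a lower-cased text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: shared shingles divided by all distinct shingles."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_duplicates(docs: dict, threshold: float = 0.8) -> list:
    """Return pairs of record ids whose full texts are near-identical."""
    sets = {doc_id: shingles(text) for doc_id, text in docs.items()}
    pairs = []
    for left, right in combinations(sorted(sets), 2):
        if jaccard(sets[left], sets[right]) >= threshold:
            pairs.append((left, right))
    return pairs


# Invented records: the first two differ only in case and punctuation.
docs = {
    "repo_a:1": "Open access repositories increase the visibility of research outputs worldwide.",
    "repo_b:9": "Open Access repositories increase the visibility of research outputs worldwide",
    "repo_c:4": "Usage statistics are harmonised according to a shared standard.",
}
print(find_duplicates(docs))
```

Comparing every pair is quadratic in the number of records, so an aggregator covering a national network would need an indexing step on top of this; the sketch only shows the comparison itself.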

Newly created services

Besides the emphasis on integrating and further developing existing repository services, the RepNet project also aims to design, develop and implement a set of new services. To do so, RepNet adopts the model for building a data-driven (service) infrastructure [14] that makes it possible to launch new services long demanded by the community, such as tools for monitoring compliance with Open Access mandates. New services are created by establishing partnerships with specific institutions that allow pilot developments to be trialled and tested. Thus, the STARS initiative [15], carried out in collaboration with the University of St Andrews and the Scottish Digital Library Consortium (SDLC), has been set up as a pilot for implementing the set of services that an initiative like RepNet can offer to a specific institution and repository.

References

[1] The Berlin Declaration, published by the Max Planck Society in October 2003, can reasonably be considered the starting gun of the Open Access movement, given the option it offered academic and research organisations to sign it institutionally. In fact, Open Access Week is held every October to commemorate the publication of this Declaration.

[2] UK RepositoryNet+ project (commonly known as "RepNet"), http://repositorynet.ac.uk/

[3] Only the European OpenAIRE project sets out objectives of similar breadth and ambition to RepNet's in terms of services to be developed on top of the existing Open Access repository network.

[4] On the impact on institutions of the research information gathering exercise for REF2014, see the excellent presentation 'I am turning enterprisey' given by Chris Keene (repository manager at the University of Sussex) at the recent Repository Fringe 2013 conference held in Edinburgh last August.

[5] Institutional Repository Usage Statistics (IRUS-UK), http://irus.mimas.ac.uk/

[6] See the reference to ITIL in the RepNet FAQ section, http://www.repositorynet.ac.uk/?q=content/faq

[7] “RJ Broker delivers its first test transfers”, http://bit.ly/16dsJmq

[8] "RIOXX: Developing Repository Metadata Guidelines", http://bit.ly/18hlzQW

[9] Nixon, W.J., Ashworth, S., and McCutcheon, V. (2013) “Enlighten: Research and APC funding workflows at the University of Glasgow”. Insights: the UKSG journal, 26 (2). pp. 159-167. ISSN 2048-7754 (doi:10.1629/2048-7754.80), http://eprints.gla.ac.uk/83882/

[10] Open Access Repository Registry (OARR), http://bit.ly/LeXGjp

[11] Kenning Arlitsch, Patrick S. O'Brien, (2012) "Invisible institutional repositories: Addressing the low indexing ratios of IRs in Google Scholar", Library Hi Tech, Vol. 30 Iss: 1 pp. 60-81, DOI: 10.1108/07378831211213210

[12] Sustainable Preservation Using Community Engagement (SPRUCE), http://bit.ly/1aXY1Vd

[13] UK LOCKSS Alliance Case Studies Now Available, http://bit.ly/18Gkw9j

[14] Report “Preparing for Data-driven Infrastructure”, http://bit.ly/1a8SuXe

[15] Pablo de Castro, Jackie Proven, “The STARS Shared Initiative: Delivering Repository Services in an Advanced CRIS/IR Environment”. Presentation at Repository Fringe 2013, http://slidesha.re/1eXH2nI

Monday, 3 June 2013

Badly-coded affiliations: a too long-standing curse

A webinar on the Repository Junction Broker (RJB) Project currently being carried out at EDINA National Data Centre in Edinburgh was delivered last week by Muriel Mewissen, RJ Broker Project manager. The RJ Broker is a SWORD-based tool for automated content delivery into institutional repositories which identifies target IRs by matching the co-authors' affiliations with their institution's platform (where available).

In the course of this RSP-organised event, Muriel shared some slides with an analysis of the preliminary content transfers the RJ Broker has performed so far. The first RJB-mediated transfer test involved processing in excess of 60,000 Europe PubMed Central articles and delivering them into the (mock) worldwide repository network.

EuropePMC is a solid disciplinary platform for the biosciences, whose content is often delivered straight from publishers. As a result, the platform's contents usually feature good-quality metadata, and EuropePMC thus provides a good example for testing research article transfer. Moreover, the specific EuropePMC article set selected for this test was remarkably recent. However, the statistical figures Muriel presented for the RJ Broker's ability to resolve authors' affiliations in EuropePMC articles were simply astonishing (see figure below): authors' affiliations were badly coded in the metadata of over half the transferred articles.

This is a well-known issue the PEER project also had to deal with at the time. Institutions have been telling their authors for ages to try to harmonise their affiliation when signing their papers, but it's still very common to find affiliations such as "Department of Psychology, Compton Rd" or "Radiology Unit, Hearts Lane", which are literally impossible for the RJ Broker to process since they lack their main affiliation node, that is, the name of the institution itself.
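A toy example makes the problem concrete. The sketch below is not how the RJ Broker works internally; it simply assumes a hypothetical lookup table of institution names and repository identifiers, splits each affiliation string on commas, and looks for a part that names a known institution. Strings that carry only a department and a street address cannot be resolved, whatever the matching strategy.

```python
# Hypothetical lookup table: institution name -> placeholder repository id.
KNOWN_INSTITUTIONS = {
    "university of edinburgh": "edinburgh-ir",
    "university of glasgow": "glasgow-ir",
}


def resolve_affiliation(affiliation: str):
    """Return (institution, repository id), or None if no institution is named."""
    for part in affiliation.split(","):
        key = " ".join(part.lower().split())  # normalise case and whitespace
        if key in KNOWN_INSTITUTIONS:
            return key, KNOWN_INSTITUTIONS[key]
    return None


def target_repositories(affiliations: list) -> set:
    """Collect the repositories an article could be routed to."""
    matches = (resolve_affiliation(a) for a in affiliations)
    return {match[1] for match in matches if match is not None}


def unresolved_rate(affiliations: list) -> float:
    """Share of affiliation strings that could not be resolved."""
    if not affiliations:
        return 0.0
    misses = sum(1 for a in affiliations if resolve_affiliation(a) is None)
    return misses / len(affiliations)


# The first two strings are the examples quoted above; the rest are invented.
sample = [
    "Department of Psychology, Compton Rd",
    "Radiology Unit, Hearts Lane",
    "School of Informatics, University of Edinburgh, UK",
    "Department of Chemistry,  University of Glasgow",
]
print(target_repositories(sample), unresolved_rate(sample))
```

A real resolver would need fuzzy matching and name variants, but no amount of matching recovers an institution that was never written down.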

A large collective effort needs to be made in order to provide the means for tackling this long-standing issue once and for all, and ORCID looks like a very promising initiative in this regard. If it were somehow possible to have authors' affiliations coded into their ORCID iDs – something ORCID is actually aiming to do – the rate of miscoded affiliations could be expected to drop rapidly as a result.

Very much like the author identification, this is of course a huge challenge no-one has so far been able to tackle, and ORCID faces a lot of hard work in order to find a way to attack the miscoded affiliation issue. But there is currently much talk in the community about organisational IDs and having some system put in place that will hopefully provide the means to start solving this seemingly unsolvable difficulty. The research information management community badly needs ORCID to succeed in this challenge if it is to be able to ever start building the eagerly awaited service layer on top of the infrastructure one.

Friday, 24 May 2013

It takes two to tango: a few post-ORCID Outreach meeting reflections

After listening to the presentations delivered at the ORCID Outreach meeting held yesterday at St Anne's College in Oxford, the impression remains that this promising initiative resembles a ball game played by publishers (and the like) on one side of the pitch and researchers and institutions on the other one. In order to enjoy a reasonably amusing game, you need the two sides to be sufficiently balanced. But this is not the case for ORCID - or at least it's not the case so far.

The 'publisher side' - for simplicity's sake - features not just publishers, but also large commercial stakeholders such as Thomson Reuters, CRIS vendors and a wide range of third-party companies. This side is delivering an excellent performance so far by solving all the (otherwise not too complicated) technical challenges posed by the use of ORCID for populating submission systems or CRISs. However, the other side is not doing so well at the moment. One could expect an ORCID deluge to arrive from researchers and institutions interested in becoming ORCID members in order to provide iDs to all their staff. But this is not happening, or at least not as quickly as the other side is progressing. This leaves us with fully prepared technical systems and no incoming stream of ORCID iDs to test them and prove their benefits to the research community.

It is true that there are over 140,000 registered authors in the ORCID database as of today. How many of those registered themselves and proceeded to populate their publications into their ORCID account it is impossible to know thus far. But after listening to Paul Peters's presentation on the huge advocacy campaign carried out by a 10-strong team at Hindawi HQs, it's easy to see that many ORCIDs out there are the result of the 'publisher side' work too (oh but wait, we may have some approximate stats on the provenance of ORCID accounts based on the highly-correlated number of visits to the ORCID website).

The argument held by patient observers (which one shares to some extent) says the game re-balancing will eventually happen, but institutions need more time to react - and once they start, the contribution from their side will become unstoppable. Institutions need to figure out their business models and their mechanisms for involving their researchers and all their relevant units in the process for creating and populating ORCID accounts.

Other critical observers - the institutions themselves - seem however not to completely share this approach. In their view, ORCID should be made available as a free service to them, since they're the ones expected to do the hard work anyway. A significant number of stakeholders argue that ORCID needn't become an overcomplicated platform aiming to achieve too many goals at the same time, but should rather focus on the basic functionality, namely providing a unique identifier for researchers. Then again, it may not be that simple at all: features like researcher affiliation pose a huge challenge themselves, and one that must be dealt with if the ORCID iDs are to offer really useful information.

It becomes evident at some point that best practices are badly needed in ORCID implementation at institutional level so that researchers can start to perceive the advantages of having their ORCID iDs institutionally created (and even maintained). So it's just about getting a critical mass of member institutions in different countries that will pioneer the adoption process - and hopefully receive some credit for it from the community, since as early adopters they are dealing with the issues in a much harder way than those institutions that will follow suit.

Thursday, 9 May 2013

Are publishers "the enemy"?

This interesting issue came up (again) at the Author ID Tutorial delivered within the 4th COAR Annual Meeting in Istanbul - and it might be useful to devote a couple of reflections to it here. This Author ID Tutorial was jointly delivered on May 8th by Titia van der Werf from OCLC and myself as part of an attractive set of four tutorials at the COAR event - with the selected topics for the tutorials being as good a hint of the way things are evolving around repositories as the workshops themselves.

ORCID was big at the Author ID tutorial, to the extent that the schedule for the activity had to be updated on the spot in order to make room for the large number of questions and reflections prompted by the ORCID presentation. I'd like to address one of these questions more thoroughly here, namely the reluctant attitude some very qualified colleagues show towards ORCID due to the fact that the initiative seems very much publisher-driven, which probably makes it less appealing to the scholarly community.

This is again about the antagonism between publishers and academia, and about whether both communities may at some point overcome such antagonism - real or perceived, it does not make much difference - in order to work jointly towards a common benefit. This discussion is certainly interesting since it goes to the heart of a critical issue that has traditionally prevented a deeper implementation of Open Access, namely the fact that publishers and the Open Access community see each other as "the enemy". Mike Taylor - to mention just one inspiring example - regularly writes in an eloquent fashion about the reasons why the scholarly community may consider publishers to be the enemy of knowledge dissemination. However, in the same way that a certain degree of (informal) agreement was reached at the COAR event that the fight between advocates of Green and Gold OA is a pointless diversion of energy and will only harm their common objective, it could very much be argued that emphasising the differences and the misbehaviours over the good practices in collaboration may end up blocking win-win cooperation opportunities.

It is true that publishers such as Elsevier and databases such as the TR Web of Science or Scopus are a big driver behind ORCID - although the fact that over 130,000 researchers worldwide have chosen to individually register their ORCIDs as of May 3rd should not be overlooked either. It is evident too that a widely implemented, successful persistent author identifier scheme will benefit publishers very much - but it will benefit institutions and especially authors even more. There was again an agreement at the author ID tutorial that this is something that needs to be done, and when examining the wide range of previous attempts to achieve the goal of author and work identification and disambiguation, it becomes clear that having publishers involved in the initiative gives it a significantly better chance of succeeding.

I have repeatedly written here about the encouraging effort the EC-funded PEER project made in bringing together publishers and Open Access repositories and how advisable it would be to try to further explore opportunities for collaboration - some of which are indeed being exploited, see for instance Wiley's direct involvement in the JISC-funded PREPARDE project for research data publishing. ORCID is certainly one of these opportunities and, with all due respect to constructive dissent, it would be an exercise in shortsightedness to let it slip away.