The I-WIRE Project - A Repository Enhancement Project
Uniquely identifying an author, so that publication lists are complete and correct, is not straightforward. Names and email addresses change in real life, for example on marriage, so an email address alone is not robust enough as the unique identifier for a person: a search on it could return an incomplete list.
Add other factors, such as institutional policies on re-using leavers' email addresses after a quarantine period, and you get another scenario where search results can be incorrect, this time because they span more than one person.
There are ways around this, for example, using or introducing a genuinely unique identifier that will - given the right supporting processes - survive name and email address changes, and even cope with people who have left the Institution and may return at a later date.
However, this approach does introduce challenges of its own:
Generally, such a unique identifier will be internal to the institution's systems and processes, and not something that is recognisable to the person it identifies, let alone anyone else. We may not even want to expose such an identifier to the outside world. Therefore, we need to continue capturing the user's preferred search parameter such as email address, and translate it to the unique identifier behind the scenes before running a search.
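As a rough sketch of that behind-the-scenes translation, here it is in Python. The lookup table, the identifier format (`P-0001`) and the addresses are all made up for illustration. They stand in for whatever the institution's systems actually hold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    title: str
    author_id: str     # internal identifier, never shown to the searcher
    author_email: str  # address recorded when the item was deposited


# Hypothetical lookup: every address a person has held -> their internal ID.
EMAIL_TO_AUTHOR_ID = {
    "k.jones@example.ac.uk": "P-0001",
    "k.smith@example.ac.uk": "P-0001",  # same person after a change of name
    "a.patel@example.ac.uk": "P-0002",
}

# Hypothetical repository contents.
PUBLICATIONS = [
    Publication("Paper A", "P-0001", "k.jones@example.ac.uk"),
    Publication("Paper B", "P-0001", "k.smith@example.ac.uk"),
    Publication("Paper C", "P-0002", "a.patel@example.ac.uk"),
]


def search_by_email(email):
    """Translate the email to the internal ID, then search on the ID."""
    author_id = EMAIL_TO_AUTHOR_ID.get(email.strip().lower())
    if author_id is None:
        return []
    return [p for p in PUBLICATIONS if p.author_id == author_id]
```

The searcher only ever types an email address, and the identifier stays internal to the lookup and the filter.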
For the scenarios where an email address has changed, this approach copes nicely: the search will return all the publications associated with that person, regardless of what their email address was at the time. The drawback, when you consider what the user sees in their search results, is that they may be confused when the email address they entered doesn't appear in some of the results. And of course, this approach requires us to populate the unique identifier against all publication records in the repository, which may call for a retrospective population exercise.
For the scenarios where email addresses are re-used after a quarantine period (we don't want to be allocating JonesK947...), this approach presents a bigger challenge. When the search returns more than one unique identifier against the email address, the user is going to have to select the right one, and we may need to present additional data to help them make this decision, such as the associated school. However, a machine interface is unlikely to be able to process this additional interim step unless we make the selection process part of a very well-defined interface.
I'd be interested to hear from anyone who has tackled this area.
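One possible way to soften that confusion is to say so in the results whenever a record was deposited under a different address. This is only a sketch: the `Publication` record, the `describe_results` helper and the wording of the note are my own assumptions, not something the repository does today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Publication:
    title: str
    author_email: str  # address recorded when the item was deposited


def describe_results(searched_email, results):
    """Return one display line per result, flagging address mismatches."""
    searched = searched_email.strip().lower()
    lines = []
    for pub in results:
        if pub.author_email.lower() == searched:
            lines.append(pub.title)
        else:
            # Explain why an address the user did not type is showing up.
            lines.append(
                f"{pub.title} (deposited under {pub.author_email}, "
                "another address for the same author)"
            )
    return lines
```

For example, a search on `k.smith@example.ac.uk` that also finds a paper deposited under `k.jones@example.ac.uk` would list that paper with the note attached, and the matching paper without one.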
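To make the "well-defined interface" idea a little more concrete, here is one possible shape for it, sketched in Python. The three status values, the `school` parameter and all the sample data are assumptions of mine for illustration. The point is only that an ambiguous match becomes an explicit, structured response that a machine client can act on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    author_id: str     # internal identifier, not included in responses
    display_name: str
    school: str


# Hypothetical data: one address re-issued to a second person after quarantine.
EMAIL_HOLDERS = {
    "k.jones@example.ac.uk": [
        Person("P-0001", "K. Jones", "School of Chemistry"),
        Person("P-0417", "K. Jones", "School of Law"),
    ],
    "a.patel@example.ac.uk": [
        Person("P-0002", "A. Patel", "School of Physics"),
    ],
}

PUBLICATIONS = {
    "P-0001": ["Catalysis in Ionic Liquids"],
    "P-0417": ["Contract Law after 2010"],
    "P-0002": ["Thin-Film Optics"],
}


def search(email, school=None):
    """Search by email; `school` optionally narrows an ambiguous match."""
    holders = EMAIL_HOLDERS.get(email.strip().lower(), [])
    if school is not None:
        holders = [p for p in holders if p.school == school]
    if not holders:
        return {"status": "not_found", "publications": [], "candidates": []}
    if len(holders) > 1:
        # More than one person has held this address: ask the caller to choose.
        candidates = [
            {"name": p.display_name, "school": p.school} for p in holders
        ]
        return {"status": "ambiguous", "publications": [], "candidates": candidates}
    person = holders[0]
    return {
        "status": "resolved",
        "publications": list(PUBLICATIONS.get(person.author_id, [])),
        "candidates": [],
    }
```

A person would see the candidates as a pick-list. A machine client would check for `"ambiguous"` and repeat the call with a `school` taken from the candidates. Note that two holders of the same address in the same school would still be ambiguous, so a real interface would probably need a further distinguishing field.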
© Tracey Andrews. Powered by Apache Roller 4.0.1-dev.