People Australia started out as a project to create an accurate, canonical listing of all the published authors in Australia. It is part of the smarts behind the National Library of Australia’s advanced search, Trove. It is fantastic! Now it is starting to morph into something bigger, where every researcher in Australia will be listed on it.
People Australia started as a way to provide information about people in Australia (in much the same way that Picture Australia provides access to pictures by, of or about Australia). At its core is the Name Authority File, which tries to provide clear information about all Australian authors, so that librarians can catalogue books.
Besides the Name Authority File, People Australia pulls in data from several other encyclopedic resources:
- Australia Dancing
  - Biographical information about Australian dancers, artistic directors, etc.
- Australian Dictionary of Biography Online
  - Over 10,000 scholarly biographies of dead people who were significant in Australian history.
- Australian Mines Atlas
  - I have no idea why this is in there.* It contains authoritative information about mining and minerals in Australia.
- Australian Women’s Register
  - Biographical data about Australian women and their organisations.
- Collections Australia Network
  - Public gateway to museums and art galleries across Australia, including the small to medium regional institutions.
- Dictionary of Australian Artists Online
  - Almost 8,000 biographies of Australian artists.
- Libraries Australia
  - The home of the Australian National Bibliographic Database (ANBD), which records the location details of over 42 million items held in most Australian academic, research, national, state, public and special libraries. It depends upon the Name Authority File, as mentioned above.
- Music Australia
  - Almost 5,000 detailed biographies of Australian musicians, performers, composers, groups and ensembles, festivals and organisations.
* Actually, I do know why the Mines Atlas and the Collections Australia Network are in there. It is because they hold canonical information about organisations, and organisations are collections of people with a common purpose.
Those playing along at home may have noticed that we aren’t talking about many people yet. Trove can provide information about 880,000 people (give or take a few) over the whole of recorded history in Australia.
But I think that it is going to get much bigger than that.
One of the big collections missing from People Australia is Australian Research Online. Australian Research Online provides access to 400,000 Australian research outputs, including theses, preprints, postprints, journal articles, book chapters, music recordings and pictures. Basil Dewhurst is now working on the Australian Research Data Commons Party Infrastructure Project, which aims to adapt the People Australia infrastructure to enable authority control for research outputs and for data.
It is a logical extension to add this data to People Australia. The Name Authority File lists people who have written and published books. Australian Research Online lists people who have written and published scholarly articles.
While the data probably gets a little bit dirtier with each data collection that is added, it also gets much more comprehensive. And the tools for disambiguating, merging and splitting entries get better as they are run across different collections.
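To make the merging idea concrete, here is a minimal sketch of one naive way to cluster entries from different collections. It is not how People Australia actually does it. It assumes that two entries describe the same person when they share a normalised name and a birth year, and the records, field names and both functions are invented for illustration.

```python
from collections import defaultdict
import unicodedata


def normalise(name):
    """Reduce a name to a crude matching key: no accents, case, punctuation or word order."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    kept = "".join(c if c.isalnum() else " " for c in ascii_only.lower())
    return " ".join(sorted(kept.split()))


def merge_records(records):
    """Group records that share a normalised name and a birth year (an assumed rule)."""
    clusters = defaultdict(list)
    for record in records:
        key = (normalise(record["name"]), record.get("born"))
        clusters[key].append(record)
    return dict(clusters)


# Invented sample entries; only the collection names come from the post.
records = [
    {"name": "Smith, Jane", "born": 1901, "source": "Name Authority File"},
    {"name": "Jane Smith", "born": 1901, "source": "Australian Women's Register"},
    {"name": "Jane Smith", "born": 1962, "source": "Australian Research Online"},
]

for (name, born), group in merge_records(records).items():
    print(name, born, sorted(r["source"] for r in group))
```

The two 1901 entries end up in one cluster and the 1962 entry stays on its own. A rule this simple would wrongly merge two different people with the same name and birth year, and would fail to merge entries with no birth year at all, which is roughly why real disambiguation tools need to improve with every collection they meet.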
It might take a little while, because Australian Research Online currently doesn’t disambiguate very well (which is why they are doing it, I guess). And it doesn’t have much historical information. In fact, most of the people in Australian Research Online are alive, which will probably change the balance of People Australia a bit in that regard.
And that is where I’m starting to get an itchy feeling up my spine.
If they are amalgamated, every published academic in Australia will be listed in People Australia.
Lots of little bits of information published in a wide variety of places are pretty harmless. An aggregated collection of published information in one place is a much more powerful beast. That is, after all, why we use it.
From a privacy point of view, it is important to remember that all of the information in People Australia is published information. Nothing private is included.
So, technically, there is no privacy issue here because all the data is already published. However, it would be good to try to embed the national privacy principles (public sector and private sector) in the ethos of the collection. And I’m pretty sure that they are, as far as that is practical.
The amalgamation may not happen. There may be very sound policy reasons for keeping the two collections separate. But People Australia, at its heart, is an aggregator and the drive around the world is to aggregate where possible.
After that, what else might go into People Australia?
From an institutional point of view, almost any authoritative collection of published information could be added to People Australia. Those collections will come from any group of people who have set up an archive or museum, codified their information and put it online in an ordered way. So, for example, I can easily imagine a “Cricket Australia”, since cricket is famous for collecting biographical statistics about players. But just about any museum or archive will fit: “Racing Australia”, “Film Australia”, “Jewish Australia”, “Chinese Australia”, “Politicians Australia”, etc.
The big leap will come when commercial organisations understand the value of this sort of rigour. I would love to see “Journalists Australia”, for a start. And then I would like to see those journalists link the people in their stories back to People Australia, where relevant. That way, their newspaper articles could be harvested and collected with some degree of accuracy.
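As a rough sketch of what that harvesting could look like, the snippet below scans article HTML for links that point at a person identifier and builds an index from each person to the articles that mention them. The `https://example.org/people/` prefix, the identifiers and the sample articles are all made up for illustration. They are not real People Australia addresses.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical identifier prefix, used purely for illustration.
PERSON_PREFIX = "https://example.org/people/"


class PersonLinkParser(HTMLParser):
    """Collect identifiers from <a href> values that start with PERSON_PREFIX."""

    def __init__(self):
        super().__init__()
        self.person_ids = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.startswith(PERSON_PREFIX):
            self.person_ids.add(href[len(PERSON_PREFIX):])


def index_articles(articles):
    """Map each person identifier to the sorted titles of articles linking to it."""
    index = defaultdict(set)
    for title, html in articles.items():
        parser = PersonLinkParser()
        parser.feed(html)
        parser.close()
        for person_id in parser.person_ids:
            index[person_id].add(title)
    return {pid: sorted(titles) for pid, titles in index.items()}


# Invented sample articles keyed by title.
articles = {
    "Council votes": '<p><a href="https://example.org/people/p-001">Jane Smith</a> spoke.</p>',
    "Book launch": '<p><a href="https://example.org/people/p-001">J. Smith</a> and '
                   '<a href="https://example.org/people/p-002">Ann Lee</a> attended.</p>',
    "Weather": '<p>Rain. See <a href="https://example.org/forecast">forecast</a>.</p>',
}

print(index_articles(articles))
```

The point of the design is that the harvester never has to guess from the text whether “Jane Smith” and “J. Smith” are the same person. The link carries the identifier, so the match is exact.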
Aside from that, I’ve talked previously about how German biographies on Wikipedia are being cross-linked to the Deutsche Nationalbibliothek’s Personennamendatei (PND). I think that is a great thing, and would like to see it replicated here.
In that same post, I mentioned how people like Tim Sherratt are starting to build ways to access the data if you are not an institution. Wragge’s Identity Browser is designed for finding just one name and the link that will point to it. This is handy if you are writing a blog post about someone, for example, and want to link to them. Tim has described how it all works in “I link therefore I am”.