
Genealogy and open data – Finland, Europe and the US

November 30th, 2011

Flo Apps’ Nomen est omen, which shows you what Finnish public databases tell about your family name, has attracted a steady stream of visitors since its launch. It started as a trial in 2009 to show how public online data can be combined and used in novel, informative ways; since then it has been regularly updated and has won several prizes, both nationally and internationally.

Earlier this year we investigated how Nomen est omen could use other (European) sources. A search in the Netherlands showed that, with sources such as the Dutch Civil Register and the Corpus of Dutch family names, the service could be extended to other countries too.

I recently met several members of the Van Harlingen Historical Society in Montgomery Township in New Jersey, a group of volunteers who work to preserve the heritage of early Dutch and British settlers in the tri-state region of New Jersey, New York and Pennsylvania. Talking about Nomen est omen made me curious about what kind of public information is available in the United States and, especially, how “open” this information would be, keeping in mind that the US was, and is, one of the trailblazers in the open data scene.

I took two routes in my search: one through genealogy websites and the other through the central website of the US government for open data sets.

Genealogy sites are currently booming in the US, which is not surprising given the incredible diversity of people who have arrived over the past several centuries. Specialized commercial sites provide all the support you can imagine (albeit against payment), and the Mormons’ (free) sites and the (partially free) Ellis Island/Port of New York register also provide a wealth of information about family names and backgrounds. However, none of these can be considered particularly open in terms of fetching the information and linking it with other databases of your own choice, as was done with Nomen est omen.

This leads us to the sources on which these databases are built: where do these sites get their data from? In the case of the US, this can be private, crowdsourced and purchased information, as well as data from the US federal government and the 50 states. For the latter two, the US Census Bureau is the place to go for much of the information, and its services include helpful tools such as DataFerrett, an analysis and extraction tool that lets you fetch the data you want to use for your own purposes. While these datasets offer a multitude of possibilities, they can occasionally raise privacy concerns, as was shown with the California Birth Index, where identity theft has been an issue. Open data and privacy walk a fine line.


