Genealogical database based on the records of parish registers

The European diocesan archives hold an enormous treasure of information: an almost complete record of the population's descent and family relationships since registration began in the 14th/15th century. Sifting through it and securing it is a major task for society in the coming decades.

The ceaseless race against the steady natural decay of the storage media forces us to act quickly and to use both modern data storage techniques and substantial staff resources, so that we stay ahead of irrecoverable loss by continually copying and reorganizing the data stock. In former times this was done by hand; today and in the future, electronic copying onto the best medium currently available replaces manual backup.

However, the first, hard step remains transforming the handwritten records into an electronic representation. Because of the poor quality of the source material, automated methods such as OCR have no realistic chance of success. There is therefore no alternative to typing the entries in by hand.

The overall reward for the effort is a massive improvement in the opportunities for inquiry and research, while at the same time the mechanical wear on the precious books is significantly reduced.

Project aims

In cooperation with the archive of the Diocese of Passau (Map of parishes), we developed a software system for systematically building up a population database from the church registers stored there (Excerpt, 100 kB).

We concentrated on the following aspects:

  • Highly efficient input assistance, because altogether there are about 6000 church register books with over 4 million entries.

  • A flexibly rearrangeable user interface, in order to register the strongly varying data records comfortably and efficiently.

  • Person linkage, i.e. identifying individual appearances of the same person at different places in the registers (possibly with slightly different name spellings or dates of birth) and thereby establishing a logical correlation between the entries. This is an important offline task (unfortunately not yet finished, but now under construction).

As a result of the person linkage, an almost fully automatic compilation of family trees is in sight. Arbitrary queries over the data pool are possible. The archivists already benefit from the recorded entries, because for the first time they can search for persons without knowing the region in which they lived.
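The idea behind person linkage can be illustrated with a minimal sketch. It is not the project's actual algorithm: the class `PersonLinkageSketch`, the record `Appearance` with its fields, and the two thresholds are all hypothetical and chosen only for illustration. The sketch treats two register appearances as candidates for the same person when both names differ by only a few characters (Levenshtein edit distance) and the recorded birth dates lie close together.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from the project itself.
class PersonLinkageSketch {

    // One appearance of a person in a register entry (assumed minimal fields).
    record Appearance(int entryId, String givenName, String surname, LocalDate birthDate) {}

    // Classic Levenshtein edit distance between two strings, ignoring case.
    static int editDistance(String a, String b) {
        a = a.toLowerCase();
        b = b.toLowerCase();
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // reuse the two rows
        }
        return prev[b.length()];
    }

    // Two appearances are link candidates if each name differs by at most
    // maxNameEdits characters and the birth dates lie within maxDayGap days.
    static boolean isCandidate(Appearance x, Appearance y, int maxNameEdits, long maxDayGap) {
        long gap = Math.abs(ChronoUnit.DAYS.between(x.birthDate(), y.birthDate()));
        return editDistance(x.givenName(), y.givenName()) <= maxNameEdits
            && editDistance(x.surname(), y.surname()) <= maxNameEdits
            && gap <= maxDayGap;
    }

    // Compares every pair; acceptable for a sketch, too slow for millions of entries.
    static List<int[]> candidatePairs(List<Appearance> all, int maxNameEdits, long maxDayGap) {
        List<int[]> pairs = new ArrayList<>();
        for (int i = 0; i < all.size(); i++)
            for (int j = i + 1; j < all.size(); j++)
                if (isCandidate(all.get(i), all.get(j), maxNameEdits, maxDayGap))
                    pairs.add(new int[] {all.get(i).entryId(), all.get(j).entryId()});
        return pairs;
    }

    public static void main(String[] args) {
        // Invented sample data, not taken from the registers.
        List<Appearance> sample = List.of(
            new Appearance(1, "Johann", "Maier", LocalDate.of(1790, 3, 14)),
            new Appearance(2, "Johann", "Mayr", LocalDate.of(1790, 3, 16)),
            new Appearance(3, "Anna", "Huber", LocalDate.of(1802, 7, 1)));
        for (int[] p : candidatePairs(sample, 2, 7))
            System.out.println("possible same person: entries " + p[0] + " and " + p[1]);
    }
}
```

With the invented sample data, only entries 1 and 2 are reported as a candidate pair. A real linkage over several million entries would presumably need some form of blocking (for example comparing only entries with similar surnames) instead of comparing every pair, and a human check of the proposed links.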

Technical details

The program is almost platform independent because it is implemented exclusively in the Java programming language (so far tested under UNIX and Windows NT/2000/XP). It can be operated in a network or on single-user systems. The only requirement is access to a fully operational ANSI-compliant database system.

The Passau archivists use a UNIX network (Solaris 2.6) with around 10 clients in operation, five of which are used exclusively for data input. At present they run an Oracle 9i database server. The system can meanwhile also run on top of PostgreSQL databases.

Project state

In the first phase of the project, the marriages from the church registers were recorded. The input of the complete register entries has also reached an impressive volume, but progress has currently come to a standstill due to limited staff resources.

Instead, the archive staff is concentrating on digitizing the sources themselves by making photographic scans with an industrial-size book scanner. Future rounds of input will then be done from those scans. In this way, the retrieval and reshelving of the delicate books can be reduced considerably, which eases their long-term conservation.

The database content has already been used for medical research. To track cases of the early-onset form of Alzheimer's disease (supposed to be of genetic etiology), the content of the population database was evaluated for relationships between persons and for generations of family trees. These investigations were possible because the second patient of Dr. Alzheimer and many of his descendants lived in the former Diocese of Passau. As a result, further deaths suspected to be related to Alzheimer's disease (AD) could be found within that family tree.

Project future

The time necessary to enter all registers into the database cannot easily be estimated. However, we expect it to take at least 10 years of working time for an entire team of archivists.

We hope to widen the project to cover other dioceses as well (including some in Austria). However, there are no concrete plans to do so yet. We do not plan to offer the data over the internet, because neither the legal nor the financial situation allows us to do so.


To revive the input progress, which has stalled, we are working hard on a decentralized version of the software system, i.e. a version that makes it possible to start data input from an empty PostgreSQL database which can later be merged into the total stock. This piece of software will be shipped to a customer together with a set of scans on a removable medium. The main aim is to offer interested persons and organizations the possibility of taking part in the input process. In return, they may keep their results for further research and use free of charge.
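How entries recorded in such a separate database might later be merged into the total stock can be sketched as follows. This is only an illustration under stated assumptions, not the project's merge procedure: the class `MergeSketch`, the record `Entry`, and the idea that entries carry locally assigned numeric ids and optional references to other entries are all hypothetical. The point of the sketch is that ids assigned independently in an empty database will collide with ids in the central stock, so they are replaced by fresh ids and internal references are remapped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the record layout and all names are assumptions.
class MergeSketch {

    // A register entry with a locally assigned id and an optional reference
    // to another entry of the same database (null if there is none).
    record Entry(int id, String text, Integer refersTo) {}

    // Returns a new list holding the central entries followed by the satellite
    // entries. Satellite ids are replaced by fresh central ids, and references
    // between satellite entries are remapped to the new ids. A reference to an
    // id that does not exist in the satellite list becomes null.
    static List<Entry> merge(List<Entry> central, List<Entry> satellite) {
        int nextId = central.stream().mapToInt(Entry::id).max().orElse(0) + 1;
        Map<Integer, Integer> idMap = new HashMap<>();
        for (Entry e : satellite) idMap.put(e.id(), nextId++);

        List<Entry> merged = new ArrayList<>(central);
        for (Entry e : satellite) {
            Integer ref = e.refersTo() == null ? null : idMap.get(e.refersTo());
            merged.add(new Entry(idMap.get(e.id()), e.text(), ref));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Invented sample data, not taken from the registers.
        List<Entry> central = List.of(
            new Entry(1, "marriage, parish A", null),
            new Entry(2, "baptism, parish A", 1));
        List<Entry> satellite = List.of(
            new Entry(1, "marriage, parish B", null),
            new Entry(2, "baptism, parish B", 1));
        merge(central, satellite).forEach(System.out::println);
    }
}
```

In a real database the same remapping would have to be applied to every table and foreign key, and duplicate input of the same register page by two contributors would need separate handling.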

Project staff

Project partners