Posted by Elisabeth Heigl on 4. May 2020

Generic Models and what they do

Transkribus in practice/Models & Training

Release 1.10.1

In a previous post we talked about the differences between special models and generic models. Special models should always be the first choice if your material includes a limited number of writers. If your material is very diverse – for example, if the writer changes frequently in a bundle of handwritings – it makes sense to train a generic model.

The following articles are based on our experiences with the training of a generic model for the Responsa of the Greifswald Law Faculty, in which about 1000 different writer’s hands were trained.

But first: What should a generic HTR model be able to do? The most important point has already been said: It should be able to handle a variety of different writer’s hands. But it should also be able to “read” different fonts (alphabets) and languages and be able to interpret abbreviations. Below are a few typical examples of such challenges from our collection.

Different writer’s hands in one script:

Abbreviations:

Different languages in one script: