Category Archives


Posted by Anna Brandt on

Transcribing without layout analysis?

Release 1.10.1

We have emphasized in previous posts how important layout analysis (LA) is. Without it, an HTR model, no matter how good it is, has no chance of transcribing a text properly. The steps of automatic LA (or a P2PaLA model) and HTR are usually started separately. We have now noticed that when an HTR model runs over a completely new or unedited page, the program automatically executes an LA first.

This automatic LA runs with the default settings of the CITlab Advanced LA. On such untouched pages, fewer lines have to be merged, and sometimes more than one text region is recognized.

But it also means that only horizontal text is recognized. We had the same problem with our P2PaLA models: everything that is slanted or vertical cannot be recognized this way. To capture such text, the LA must be started manually, with the setting ‘Text Orientation’ set to ‘Heterogeneous’.

Interestingly, the HTR results with this method are better than those of an HTR run over a corrected layout analysis. We have calculated the CER for some pages to show this.
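
For anyone who wants to check such numbers themselves: the CER is the character-level edit distance (insertions, deletions, substitutions) between the HTR output and the GT, divided by the length of the GT. A minimal sketch in Python (our own illustration, not Transkribus code):

```python
# Minimal illustration of a CER calculation (our own sketch, not
# Transkribus code): edit distance between GT and HTR text, divided
# by the length of the GT reference.

def levenshtein(a: str, b: str) -> int:
    """Number of insertions, deletions and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]

def cer(gt: str, htr: str) -> float:
    return levenshtein(gt, htr) / len(gt)

print(f"{cer('Spruchakten', 'Spruchacten'):.2%}")  # one wrong character -> 9.09%
```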

This method is thus a very good alternative, especially for pages with an uncomplicated layout: you save time, because you only have to start one process, and in the end you get a better result.

Posted by Elisabeth Heigl on

Advanced Compare

Release 1.10.1

In contrast to the visualization of errors in the tool “Compare Text Versions”, the standard “Compare” presents the same validation results as numerical values.

In addition to the word error rate, we also get the somewhat more conclusive character error rate (CER). Furthermore, in the “Advanced Compare” we can have these results calculated for the whole document or for specific pages in it – always provided that the selected pages have a GT version, because in the Advanced Compare the GT is automatically set as the reference.

So select the model to be validated and start the calculation. The result gives you not only the average value for the whole document, but also the corresponding values for each individual page. That makes the Advanced Compare the most important validation tool for systematic analysis when developing models.

In our rather complex model training for the Spruchakten (over 1,000 writers’ hands from more than 150 years) we worked with separate small test sets. On these we could validate our new models again and again via the Advanced Compare and analyse the results thoroughly. In this way, not only could average improvements or deteriorations be traced in detail; we were also able to identify particular exceptions, such as individual concept hands or particularly “smeared” ones, which worsened an otherwise good overall result. In addition, we were able to create many graphics from the numerical material, which helped us – and now you – to better understand certain phenomena and developments.

Tips & Tools
You can also download the validation results of the Advanced Compare as an Excel spreadsheet. To do so, select a folder under the result display where the document should be saved, then click the button “Download XLS”. Do not just press Enter – otherwise you will have to start all over again.
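
Once downloaded, the spreadsheet lends itself to further analysis, for example with pandas. A small sketch (the file name and column headers here are assumptions – check the headers in your actual export and adjust):

```python
# Sketch of a follow-up analysis with pandas. The file name and the
# column headers ("Page", "CER") are assumptions -- check the headers
# in your actual export and adjust. Reading .xls files also requires
# the xlrd package.
import pandas as pd

df = pd.read_excel("advanced_compare.xls")
print(df.sort_values("CER", ascending=False).head(10))  # the ten worst pages
print(f"average CER: {df['CER'].mean():.2f}%")
```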

Posted by Elisabeth Heigl on

Compare Text Versions

Release 1.10.1

So, a new HTR model has run over a page and you want a first overview of how the model has read? Go to the tool option “Compute Accuracy”, enter the corresponding reference (GT) and hypothesis (HTR text) and take a look at the validation tool “Compare Text Versions”:

The Text Compare visualizes the comparison of the HTR and GT versions directly in the text. A word with an error appears marked red and crossed out; behind it, you see the correct version from the GT in green. The Text Compare thus basically shows the word error rate (WER). But above all it allows us to quickly recognize exactly which mistakes were made. We can also see, for example, that many of the errors are actually minor mistakes, which don’t really bother us when reading and searching for words. In our example here we see a WER of 15%.
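
The mechanics of such a word-level comparison can be sketched with Python’s difflib (Transkribus uses its own implementation; this is just a rough idea):

```python
# Rough idea of a word-level text comparison, sketched with difflib.
# Wrong HTR words are "crossed out" in [-...-], the GT correction
# follows in {+...+} -- mimicking the red/green display.
import difflib

gt  = "the quick brown fox jumps over the lazy dog".split()
htr = "the quick brown fox jumps ouer the lasy dog".split()

matcher = difflib.SequenceMatcher(None, htr, gt)
for op, i1, i2, j1, j2 in matcher.get_opcodes():
    if op == "equal":
        print(" ".join(htr[i1:i2]), end=" ")
    else:  # cross out the HTR reading, show the GT correction
        print(f"[-{' '.join(htr[i1:i2])}-] {{+{' '.join(gt[j1:j2])}+}}", end=" ")
print()

# word error rate: changed words relative to the GT word count
changed = sum(max(i2 - i1, j2 - j1)
              for op, i1, i2, j1, j2 in matcher.get_opcodes() if op != "equal")
print(f"WER: {changed / len(gt):.0%}")  # 2 of 9 words wrong -> 22%
```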

Posted by Dirk Alvermann on

Use Case: Extend and improve existing HTR-Models

Release 1.10.1

In the last post we described that a base model can pass on everything it has “learned” to the new HTR model. With additional ground truth, the new model can then extend and improve its capabilities.

Here is a typical use case: In our subproject on the Assessor Relations of the Wismar Tribunal we train a model with eight different writers. The train set contains 150,000 words; the CER was 4.09% in the last training. However, the average CER for some writers was much higher than for others.

So we decided to do an experiment. We added 10,000 words of new GT for two of the conspicuous writers (Balthasar and Engelbrecht) and used the base model as well as its Train and Validation Set for the new training.

As a result, the new model had an average CER of 3.82% – it had improved. What is remarkable, though, is that it was not just the CER of the two writers for whom we had added new GT that improved – in both cases by up to 1%. The performance of the model on the other writers did not suffer either; their CER was reduced as well.

Posted by Dirk Alvermann on

On the Shoulders of Giants: Training with Base Models

Release 1.10.1

If you want to develop generic HTR models, there is no way around working with base models. When training with base models, each training session for a model is based on an existing model, i.e. a base model. This is usually the last HTR model that was trained in the corresponding project.

Base models “remember” what they have already “learned”. Each new training session therefore (theoretically) improves the quality of the model: the new model learns from its predecessor and thus becomes better and better. This makes training with base models particularly suitable for large generic models that are continuously developed over a long period of time.

To carry out training with a base model, you simply select a specific base model in the training tool – in addition to the usual settings. Then, from the HTR Model Data tab, insert the Train Set and the Validation Set (called Test Set in earlier Transkribus versions) of the base model, as well as the new Train and Validation Set. Additionally, you can add more new ground truth and then start the training.
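
Transkribus handles all of this in the GUI, but the underlying idea is simply to initialise the new training from the old model’s weights instead of starting from scratch. A minimal, purely conceptual PyTorch sketch (not the code Transkribus runs, with a toy stand-in network):

```python
# Conceptual illustration only -- Transkribus does this internally.
# "Training with a base model" means: initialise the new training
# from the weights of an existing model instead of from scratch.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)                         # toy stand-in for an HTR network
torch.save(model.state_dict(), "base_model.pt")  # the existing "base model"

new_model = nn.Linear(10, 2)
new_model.load_state_dict(torch.load("base_model.pt"))  # inherit what was learned

# ... then continue training new_model on the old train set plus the new GT.
optimizer = torch.optim.Adam(new_model.parameters(), lr=1e-4)
x, y = torch.randn(32, 10), torch.randn(32, 2)   # placeholder for real data
loss = nn.functional.mse_loss(new_model(x), y)
loss.backward()
optimizer.step()
```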

Posted by Elisabeth Heigl on

Validation possibilities

Release 1.10.1

There are different ways to measure the accuracy of our HTR models in Transkribus. Three Compare tools calculate the results and present them in different ways. In all three cases, the hypothesis (HTR version) of a text is compared with a corresponding reference (the correct version, i.e. the GT) of the same text.

The first tool, which shows the most immediate result, is “Compare Text Versions”. It visualizes the validation for the currently opened page in the text itself. Here we can see exactly which mistakes the HTR made at which points.

The standard “Compare” gives us the same validation results as numerical values. Among other things, it calculates the average word error rate (WER), the character error rate (CER) and the respective accuracy rates. (If anyone knows what the bag tokens are about, feel free to write us a comment.) From the “Compare” we can also run the “Advanced Compare”, which allows us to perform the corresponding calculations for the whole document or only for certain pages.
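
If you want to reproduce such numbers outside Transkribus, ready-made WER/CER functions exist, for example in the Python package jiwer (our assumption: a recent version, which provides both wer() and cer()):

```python
# Quick cross-check outside Transkribus (assumption: the Python
# package "jiwer" in a recent version, providing wer() and cer()).
import jiwer

gt  = "Wir transkribieren nur was wirklich auf dem Papier steht"
htr = "Wir transcribieren nur was wirklik auf dem Papier steht"

print(f"WER: {jiwer.wer(gt, htr):.2%}")  # word error rate
print(f"CER: {jiwer.cer(gt, htr):.2%}")  # character error rate
```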

We have already presented the validation tool “Compare Sample” briefly in another post, showing how to create test samples. The actual Sample Compare then predicts how a model will potentially read on a test sample created for this purpose.

Posted by Dirk Alvermann on

Generic Models and what they do

Release 1.10.1

In a previous post we talked about the differences between special models and generic models. Special models should always be the first choice if your material includes a limited number of writers. If your material is very diverse – for example, if the writer changes frequently within a bundle of manuscripts – it makes sense to train a generic model.

The following articles are based on our experiences with the training of a generic model for the Responsa of the Greifswald Law Faculty, in which about 1,000 different writers’ hands were trained.

But first: What should a generic HTR model be able to do? The most important point has already been made: it should be able to handle a variety of different writers’ hands. But it should also be able to “read” different fonts (alphabets) and languages, and be able to interpret abbreviations. Below are a few typical examples of such challenges from our collection.

Different writers’ hands in one script:

Abbreviations:

Different languages in one script:

Posted by Dirk Alvermann on

Breaking the rules – the problem with concept writings

Release 1.10.1

Concept scripts were used when a scribe quickly created a draft that was later made into a fair copy. In the case of the Spruchakten, these are mainly the drafts of the judgments that were later sent off. Concept scripts were usually written very quickly and “sloppily”; letters are often omitted or word endings “swallowed”. Even for humans, concept scripts are not easy to decipher – for the machine they are a particular challenge.

To train an HTR model to read concept scripts, you proceed in a similar way to training a model that is to interpret abbreviations. In both cases, the HTR model must be enabled to read something that is not really there – namely missing letters and syllables. To achieve this, we must break our first rule: “We transcribe as ground truth only what is really written on paper.” Instead, we have to include all skipped letters, missing word endings etc. in our transcription. Otherwise we will not get a sensible and searchable HTR result in the end.

In our experiments with concept scripts we first tried to train special HTR models for them. The success was rather small. Finally, we decided to train concept scripts – similar to abbreviations – directly within our generic model. In doing so, we checked again and again whether the “wrong ground truth” we produced in the process worsened the overall result of our HTR model. Surprisingly, breaking the transcription rule had no negative effect on the quality of the model. But this may also be due to the sheer amount of ground truth used in our case (about 400,000 words).

HTR models are therefore able to distinguish concept writings from fair copies and interpret them accordingly – within certain limits. Below you can see a comparison of the HTR result with the GT for a typical concept script from our material.

Posted by Elisabeth Heigl on

Language Models

Release 1.10.1

We talked about the use of dictionaries in an earlier post and mentioned that the better an HTR model is (CER below 7%), the less useful a dictionary is for the HTR result.
This is different when using Language Models, which have been available in Transkribus since December 2019. Like dictionaries, Language Models are generated from the ground truth used in each HTR training. Unlike dictionaries, however, Language Models do not aim at identifying individual words. Instead, they determine the probability of a word sequence, i.e. of frequent combinations of words and expressions in a particular context.
Unlike dictionaries, the use of Language Models always leads to much better HTR results. In our tests, the average CER improved by as much as 1% compared to the HTR result without the Language Model – consistently, on all test sets.
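
To give a rough idea of the principle: a word-level n-gram model estimates how probable a word is given its predecessor, based on the training ground truth. A toy sketch (our illustration; the Language Models in Transkribus are more sophisticated):

```python
# Toy sketch of the idea behind a word-level language model: estimate
# P(word | previous word) from the training ground truth.
from collections import Counter

ground_truth = "the court decides that the court costs are to be paid"
words = ground_truth.split()

bigrams = Counter(zip(words, words[1:]))
unigrams = Counter(words)

def p_next(prev: str, word: str) -> float:
    """P(word | prev), estimated from the training text."""
    return bigrams[(prev, word)] / unigrams[prev] if unigrams[prev] else 0.0

# "court" is far more likely after "the" than "costs" is:
print(p_next("the", "court"))  # 1.0 in this toy corpus
print(p_next("the", "costs"))  # 0.0
```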

Tips & Tools: The Language Model can be selected when configuring the HTR. Unlike dictionaries, Language Models and HTR models cannot be freely combined: each HTR model has its own uniquely generated Language Model, and only that one can be used.

Posted by Anna Brandt on

Tools in the Layout tab

Release 1.10.1

The Layout tab has two more tools, which we did not mention in our last post. They are especially useful for correcting the layout and save you from tedious detail work.
The first one corrects the Reading Order (RO). If one or more text regions (TRs) are selected, this tool automatically arranges the child shapes – in this case the lines and baselines (BLs) – according to their position in the coordinate system of the page: the reading order starts at the top left and counts on from there to the bottom right. In the example below, a TR was split but the RO of the marginal notes got mixed up. This tool now saves you from renumbering each BL individually.
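
The sorting principle can be pictured like this (our reconstruction of the idea, not Transkribus code): baselines are ordered by the position of their start points, top to bottom and left to right:

```python
# Sketch of the sorting principle behind the Reading Order tool
# (our reconstruction): order baselines by their start point,
# top-to-bottom first, then left-to-right.
baselines = [
    {"id": "line_3", "points": [(120, 410), (580, 412)]},
    {"id": "line_1", "points": [(118, 130), (575, 133)]},
    {"id": "line_2", "points": [(121, 270), (579, 272)]},
]

def reading_order_key(baseline):
    x, y = baseline["points"][0]   # first point of the baseline
    return (y, x)                  # top-left first, bottom-right last

for i, bl in enumerate(sorted(baselines, key=reading_order_key), 1):
    print(i, bl["id"])   # line_1, line_2, line_3
```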

The second tool (“assign child shapes”) helps to assign BLs to the correct TR. Misassignments can happen when cutting text regions, or with baselines that run through multiple TRs; each BL would otherwise have to be marked in the Layout tab and moved to the correct TR by hand. To assign them automatically, you just select the TR to which the BLs belong and start the tool.
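
The logic behind it can be sketched roughly like this (again our own reconstruction): each baseline is given to the text region whose area contains it, simplified here to bounding boxes and baseline midpoints:

```python
# Rough sketch of the assignment logic (our reconstruction): give each
# baseline to the text region whose bounding box contains its midpoint.
regions = {
    "tr_main":   (100, 100, 600, 800),   # (x_min, y_min, x_max, y_max)
    "tr_margin": (620, 100, 800, 800),
}
baselines = {"bl_1": (350, 400), "bl_2": (700, 250)}  # midpoints

def assign(point, regions):
    x, y = point
    for name, (x0, y0, x1, y1) in regions.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None

for bl, mid in baselines.items():
    print(bl, "->", assign(mid, regions))  # bl_1 -> tr_main, bl_2 -> tr_margin
```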