Posted by Anna Brandt on

Baselines

Release 1.7.1

The baseline is the most important reference point for text recognition. In most cases, the segmentation of a text into lines can be done automatically with CITlab Advanced LA. However, there may be cases where you decide to draw the baselines manually from the start, or at least want to correct them by hand. Here are a few practical tips:
The baseline should always be positioned exactly under the “middle band” of the line, i.e. where letters such as “a”, “o”, “m” and “v” touch the base.
If you add the baseline manually, which you can do very quickly with a little practice, you should never stray too far from the bottom of the characters in either direction (no further than one or two line widths of the writing). The baseline consists of individual points that you set yourself when adding it manually; you complete it with a double-click or by pressing Enter on the last point. Baselines can also be drawn vertically. Within an image, and even within a single text region, you can combine different line directions (e.g. the typical “postcard layout”).
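
To make the idea of a baseline as a sequence of points more concrete, here is a minimal sketch in Python. It assumes the baseline is stored as a points string of the form "x1,y1 x2,y2 ...", as in the PAGE XML format; the function names parse_points and direction are hypothetical helpers for illustration, not part of Transkribus. Image coordinates are assumed to have y growing downward.

```python
def parse_points(points: str) -> list[tuple[int, ...]]:
    """Turn a 'x1,y1 x2,y2 ...' string into a list of (x, y) tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def direction(baseline: list[tuple[int, ...]]) -> str:
    """Classify a baseline by comparing its first and last point.

    Assumption: y grows downward, as is usual for image coordinates.
    """
    (x0, y0), (x1, y1) = baseline[0], baseline[-1]
    dx, dy = x1 - x0, y1 - y0
    if abs(dx) >= abs(dy):
        return "left-to-right" if dx >= 0 else "right-to-left"
    return "top-to-bottom" if dy >= 0 else "bottom-to-top"


horizontal = parse_points("10,100 120,101 200,102")
vertical = parse_points("50,400 51,210 52,20")
print(direction(horizontal))
print(direction(vertical))
```

A page with a “postcard layout” would simply contain baselines of both kinds, each classified on its own.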

Problems with automatic line detection occur frequently when the word spacing varies significantly or becomes particularly large, or when the line orientation changes (curved lines). In such cases, the Baseline may be split into subsections containing individual words. This has no consequences for the text recognition, and thus for the later full-text search, because the entire text can still be captured. However, those who want the layout of their full text to reflect the original faithfully must correct this. Correcting the Baselines is not always necessary, but you do have to pay attention to the Reading Order; otherwise uncertainties may arise in the later transcript. Such “torn” Baselines can easily be merged again with the Merge-Tool.
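
What merging “torn” baselines amounts to can be sketched as follows. This is only an illustration under stated assumptions: the fragments belong to one roughly horizontal, left-to-right line, each fragment is a list of (x, y) points, and merge_fragments is a hypothetical helper. How the Merge-Tool works internally is not described here.

```python
Point = tuple[int, int]


def merge_fragments(fragments: list[list[Point]]) -> list[Point]:
    """Join baseline fragments of one line into a single baseline.

    Assumption: the line runs left to right, so sorting the fragments
    by the x-coordinate of their first point restores the reading order.
    """
    ordered = sorted(fragments, key=lambda frag: frag[0][0])
    merged: list[Point] = []
    for frag in ordered:
        merged.extend(frag)
    return merged


# Three word-level fragments, listed out of reading order.
torn = [
    [(300, 101), (420, 103)],
    [(10, 100), (120, 100)],
    [(150, 100), (260, 101)],
]
print(merge_fragments(torn))
```

The sorting step is the part that corresponds to checking the Reading Order: if the fragments are joined or read in the wrong order, the words of the line end up in the wrong sequence.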

 

Tips & Tools
What if the text is upside down?
CITlab Advanced LA cannot correctly capture the Baseline of an upside-down line. Baselines always run in the reading direction. If you want to detect upside-down lines or set them manually, you either have to rotate the image or draw the Baseline at the top of the middle band (against the reading direction) from right to left. In both cases, Transkribus will rotate the image into the readable direction during transcription.
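
The geometry behind this tip can be checked with a small sketch. It shows only the coordinate arithmetic of a 180° rotation, not how Transkribus implements it; rotate_180 is a hypothetical helper, and the image size and points are made-up example values.

```python
Point = tuple[int, int]


def rotate_180(points: list[Point], width: int, height: int) -> list[Point]:
    """Map pixel coordinates to their position after rotating the image by 180°.

    The order of the points is kept, so the drawing direction flips
    together with the image.
    """
    return [(width - 1 - x, height - 1 - y) for x, y in points]


# Baseline of an upside-down line, drawn from right to left
# in a hypothetical 1000 x 600 pixel image.
drawn = [(900, 200), (100, 198)]
rotated = rotate_180(drawn, width=1000, height=600)
print(rotated)  # [(99, 399), (899, 401)]
```

After the rotation the first point lies on the left and the last on the right, so a Baseline drawn from right to left on the upside-down line runs in the normal reading direction once the image is turned.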