Anna Brandt


Posted by Anna Brandt on

Structural Tagging

How exactly structural tagging is done is explained in this Wiki. In contrast to “textual” tagging, you can tag all structures, for example text regions, baselines or tables. In our case, only the text regions (TRs) are tagged, because we use structural tagging to train a P2PaLA model.

When you create your training material and decide where to position the specific structural elements, you should stick to your choices. For example: for us a “paragraph” is always the TR at the top in the middle, the core so to speak; “marginalia” are all the notes on the left side of the image, separated from the “paragraph”. This lets you divide the images into ‘types’, i.e. groups of images in which all TRs with the same tag always lie in a certain coordinate area of the page.
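To make the idea of image ‘types’ more concrete, here is a minimal sketch in Python. It is not part of Transkribus or P2PaLA; the data layout (a tag plus a bounding box in relative page coordinates) and the function name `tag_zones` are assumptions made only for illustration. It computes, per tag, the area that all regions with that tag occupy across a set of pages.

```python
# Hypothetical training pages: each region is (tag, (left, top, right, bottom)),
# with coordinates given as fractions of the page width and height.
pages = [
    [("marginalia", (0.02, 0.10, 0.20, 0.80)), ("paragraph", (0.25, 0.08, 0.95, 0.90))],
    [("marginalia", (0.03, 0.15, 0.22, 0.60)), ("paragraph", (0.24, 0.10, 0.96, 0.88))],
]


def tag_zones(pages):
    """Return, per tag, the bounding box enclosing every region with that tag."""
    zones = {}
    for regions in pages:
        for tag, (left, top, right, bottom) in regions:
            if tag not in zones:
                zones[tag] = (left, top, right, bottom)
            else:
                l, t, r, b = zones[tag]
                zones[tag] = (min(l, left), min(t, top), max(r, right), max(b, bottom))
    return zones


zones = tag_zones(pages)
# In a consistent 'type', the marginalia zone ends before the paragraph zone begins.
consistent = zones["marginalia"][2] <= zones["paragraph"][0]
print(zones, consistent)
```

If the zones of two tags overlap strongly, that can be a hint that the pages do not all belong to the same ‘type’.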

Tips & Tools
There are three ways to set the corresponding tag. The first is to right-click on the marked area and assign a tag via “assign structure type”. The second is to open the area “Structural” in the tab “Metadata”, where the existing structure types are displayed. The third is a keyboard shortcut, which you can define there for tags that you use a lot: click on the button “Customize” and enter a number from one to nine in the column “Shortcut”. The shortcut is then displayed in the tab; it is always Ctrl+Alt+Number.


Layout tab

Release 1.7.1

If you correct the automatic layout analysis, you can do this directly in the image or navigate via the layout tab on the left side. All shapes, such as text regions and baselines, are displayed there with their position in the image and their structural tags. It is possible to delete or move shapes. In the image you can always see which element is currently marked, and thus what you can change.

If you want to merge two baselines, just mark them in the layout tab instead of trying to hit the thin line in the image.

The navigation in the tab is especially useful if you want to see the complete image in the right window. This way you keep a better overview, because the image and the tab are updated at the same time.

Tips & Tools
You can change the reading order of the baselines in the layout tab either by moving the lines or by clicking and changing the number in the column “Reading Order”.


Feedback

The blog “Rechtsgeschiedenis” (Otto Vervaart/Utrecht) has published a detailed discussion of the project ‘Rechtsprechung im Ostseeraum’ and our blog. It describes our work with Transkribus, the project itself, the page where we present the results, and the blog: a good overview from a user’s perspective.


Collaboration – Version Management

Release 1.7.1

The second important element for organized collaboration is the version management of Transkribus. In the toolbar it seems rather inconspicuous, but it is enormously important. Transkribus stores a version of the currently edited page each time it is saved. It contains the current status of the layout work and content processing.

These versions are given an “edit status” so that they can be distinguished more easily. A newly uploaded Document contains only pages with the edit status “new”. As soon as you edit a page, the edit status automatically changes to “in progress”. The three other status options – “done”, “final” and “Ground Truth” – can only be set manually.

The logical time to set such a “higher” status depends on the agreements within the team. We use version management mostly during the production of training material, i.e. Ground Truth (GT). All pages that have a finished layout analysis are set to “done” so that the transcribers and editors know that they can now finish the page. This status is not changed until the transcription of the page is 100% reliable. Then it is set to “Ground Truth” or “final”. All pages with the status “GT” will later be used as training material for HTR models, while the pages with the edit status “final” will be used to create the test sets.
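The convention described above can be sketched in a few lines of Python. This is only an illustration of our team agreement, not a Transkribus function: the page records, the status strings and the name `split_by_status` are assumptions made for the example.

```python
# Hypothetical page records: page number plus the edit status of its latest version.
pages = [
    {"page": 1, "status": "GT"},
    {"page": 2, "status": "GT"},
    {"page": 3, "status": "final"},
    {"page": 4, "status": "done"},
    {"page": 5, "status": "in progress"},
    {"page": 6, "status": "GT"},
]


def split_by_status(pages):
    """Pages marked GT go to the training set, pages marked final to the test set."""
    training = [p["page"] for p in pages if p["status"] == "GT"]
    test = [p["page"] for p in pages if p["status"] == "final"]
    return training, test


training_pages, test_pages = split_by_status(pages)
print("training:", training_pages)
print("test:", test_pages)
```

Pages that are still “done” or “in progress” end up in neither set, which matches the rule that only fully checked transcriptions are used.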

Each collaborator can access and edit or delete all versions of a page at any time. The edit status helps them to find the desired version faster. In addition to the edit status, the last editor and the save time are displayed for each version. If the version was created by an automatic process (layout analysis or HTR), this is also noted. Thus, the processing steps are traceable in detail.

Tips & Tools
You can have multiple versions with the same status.
You can set any version to any other status – except to “New”.
You can delete single or multiple versions – except final versions, which cannot be deleted.


Train sets & test sets (for Beginners)

Release 1.7.1

When we train an HTR model, we create training sets and test sets, all based on Ground Truth. In the next posts on this topic you will learn more about it, especially that the two sets must not be mixed. But what exactly is the difference between the two and what are they used for?

Training and test sets are very similar in the choice of material they contain. The material in both sets should come from the same hands and have the same status (GT). The difference is how Transkribus uses it to create a new model: the program learns the training set in a hundred (or more) rounds (epochs). Imagine writing a test a hundred times, for practice purposes, so to speak. Every time you write the test, after going through all the pages, you get the solution and can look at your mistakes. Then you start again with the same exercise. Of course you’ll get better and better. In the same way, Transkribus learns a bit more with each pass.

After each round in the training set, the learned skills are checked on the test set. Imagine your test again. This time you write the test, get the grade, but they don’t tell you what you did wrong. So Transkribus goes through the same pages many times, but can never see the right solution. The model has to fall back on the previously learned training and you can see how well it has studied.

So if there were the same pages in the test set as in training, Transkribus could “cheat”. It would already know the pages, have practised on them a hundred times and seen the solution a hundred times. This is the reason why the CER (Character Error Rate) in the training set is almost always lower than in the test set. This is best seen in the “learning curve” of a model.
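For readers who want to see what a Character Error Rate measures, here is a minimal sketch in Python. It uses the common definition (edit distance between the recognized text and the correct text, divided by the length of the correct text); how Transkribus computes its CER in detail is not described here, so take this as an assumption. The function names are chosen for illustration only.

```python
def levenshtein(reference, hypothesis):
    """Number of character insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing in the hypothesis
                current[j - 1] + 1,      # extra character in the hypothesis
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def cer(reference, hypothesis):
    """Character Error Rate relative to the length of the reference text."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# One wrong character in a four-character line gives a CER of 25 %.
print(cer("abcd", "abed"))
```

Computed on the training set, this value describes pages the model has already practised on; computed on the test set, it describes pages the model has never seen the solution for.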


Toolbar – the most important tools and how to use them #2

Release 1.7.1

Correcting layouts

If the basic text regions are drawn, they can be edited. If you select one of the text regions, the other tools on the toolbar will be enabled.

With 1 you can add one or more points to the selected shape (TR or BL!). All shapes consist of dots and straight lines connecting them. You can edit the shape by moving the dots. You can use this tool to turn the basic text region into a polygon, whatever best fits the text block. Press 2 to remove a dot from the selected shape. This tool is especially useful for correcting or shortening baselines.

This is especially useful if you have split elements. With 3, 4 and 5 it is possible to cut a selected shape. This works for both text regions and baselines: 3 cuts horizontally, 4 vertically. With 5 you draw your own line, which does not necessarily have to be horizontal or vertical.

The last important tool (red circle) is the Merge tool. This is especially important if the automatic LA has split baselines in the image. You can use Merge to reassemble shapes of the same kind: baselines with baselines and text regions with text regions. To do this you have to mark the corresponding shapes, which you can do directly in the image or in the layout tab.

 

Tips & Tools
When splitting, note that the TR and BL can only be cut where they have lines. It is not possible to cut through the dots.
Be aware that when you split a shape Transkribus will automatically change the Reading Order. For example, if two TRs are made from one, a new reading order is started in each TR.
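Since all shapes consist of dots and connecting lines, a vertical cut through a baseline can be pictured as follows. This is a minimal sketch in Python, not the Transkribus implementation: the function name `split_baseline_at_x` is hypothetical, and it assumes a baseline whose points run strictly from left to right.

```python
def split_baseline_at_x(points, cut_x):
    """Split a left-to-right baseline at the vertical line x = cut_x.

    The cut must cross one of the connecting lines; a cut that runs
    exactly through a dot (or outside the baseline) is rejected.
    """
    for i in range(len(points) - 1):
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        if x0 < cut_x < x1:
            # Interpolate the height of the baseline at the cut position.
            t = (cut_x - x0) / (x1 - x0)
            cut_point = (cut_x, y0 + t * (y1 - y0))
            return points[: i + 1] + [cut_point], [cut_point] + points[i + 1:]
    raise ValueError("the cut has to cross a line between two dots")


baseline = [(0, 100), (100, 100), (200, 110)]
left_part, right_part = split_baseline_at_x(baseline, 150)
print(left_part)
print(right_part)
```

Both halves receive the new cut point, so each of them is again a complete shape made of dots and lines.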


Reading Order

Release 1.7.1

The Reading Order displays the order in which the HTR will read the lines in an image. This RO is created automatically during the layout analysis, but can also be changed manually later. With the automatic LA, the RO is determined by the coordinates of the lines in the image: the top line, which is furthest to the left, is number one, and so on.
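A coordinate-based order like this can be sketched in Python as a simple sort. The exact rule used by the layout analysis is not documented here, so this is an assumption for illustration: lines are sorted by the height of their first point (top first) and, for equal heights, from left to right. The line records and the name `reading_order` are hypothetical.

```python
# Hypothetical baselines: an id and a list of (x, y) points; y grows downward.
lines = [
    {"id": "b", "points": [(120, 300), (900, 300)]},
    {"id": "a", "points": [(100, 100), (900, 100)]},
    {"id": "c", "points": [(100, 500), (900, 500)]},
    {"id": "margin", "points": [(20, 300), (90, 300)]},
]


def reading_order(lines):
    """Return the line ids sorted top to bottom, then left to right."""
    ordered = sorted(
        lines,
        key=lambda line: (line["points"][0][1], line["points"][0][0]),
    )
    return [line["id"] for line in ordered]


print(reading_order(lines))
```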

If the writing in the image is not completely horizontal or if baselines are split, this can cause errors in the Reading Order. If you correct the LA, you should always look at the RO again, otherwise the transcribed text gets mixed up and makes little sense. To change the RO you can either click on the circles at the lines where the line numbers appear and correct them directly, or select the corresponding line in the layout tab and move it with the mouse. If the later full text is to make sense at first glance, such corrections are essential. After all, the RO determines the context of the content. If the HTR result of the document is only to be used for a full text search and is not to be displayed as structured full text, the RO is less relevant.

 

Tips & Tools
If you want to move a line forward or backward, the numbers of the following lines will change automatically. Sometimes it is necessary to calculate a bit beforehand which number will be the correct one.
Very important: When the writer’s lines rise from left to right (which happens very, very often) and the baseline is split by the LA, the second half of the split BL will have the smaller number. If you want to merge these baselines with the Merge tool, you have to look at the RO first. If the RO is wrong, Transkribus will merge the parts into a baseline that contains a loop. The HTR can no longer interpret such a baseline.
Edit: This problem was solved with the version 1.8.0. The problem now only occurs with vertically recognized lines.
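The automatic renumbering mentioned in the first tip can be pictured with a short Python sketch. It only illustrates the arithmetic; the function name `move_line` and the list representation of the Reading Order are assumptions, not Transkribus internals.

```python
def move_line(order, line_id, new_number):
    """Move `line_id` to position `new_number` (counting from 1);
    all other lines keep their relative order and are renumbered."""
    remaining = [item for item in order if item != line_id]
    remaining.insert(new_number - 1, line_id)
    return remaining


order = ["a", "b", "c", "d"]          # a = 1, b = 2, c = 3, d = 4
new_order = move_line(order, "d", 2)  # move line d to number 2
for number, line_id in enumerate(new_order, start=1):
    print(number, line_id)
```

Here the lines b and c slide from numbers 2 and 3 to numbers 3 and 4, which is why it sometimes helps to work out the target number before you move a line.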


Toolbar – the most important tools and how to use them #1

Release 1.7.1

Creating Layouts

This is what the toolbar looks like with a new image. After you have run the CITlab Advanced LA, the other tools will be enabled. If the layout is to be done manually, the two tools in the upper circles are particularly important. TR means text region. This is the first layout element that has to be created for a page. It defines which areas of the image have text and which do not. If the text does not fit correctly into a text region, you first roughly draw the TR and adjust it later. Then you can draw the baselines with “BL”. Among the lower tools, only the green, semicircular arrow is important. This is the “undo” function; as the name suggests, it is used to undo actions.

Tips & Tools
“Item visibility” is a function that makes the structure of the document more transparent for you. If it is enabled, a box appears in which you can select which items should be visible in the current image. Textregions and Baselines are the most important elements, not only while editing the layout, but also during the later transcription. These two boxes are always checked in the default setting. If the display of the Baselines annoys you, you have to deactivate it manually. Another important feature for correcting the layout is the Lines Reading Order, i.e. the order in which the lines are read later by the HTR. When the Reading Order is displayed, you can easily see whether the layout analysis has worked reliably. However, this display is mostly distracting while transcribing. In this case you should hide it again.


Baselines

Release 1.7.1

The Baseline is the most important reference point for text recognition. The segmentation of a text into lines can in most cases be done automatically with the help of CITlab Advanced LA. However, there might be cases where you either immediately decide to draw the baselines manually or at least want to make manual corrections. Here are a few practical tips:
The baseline should always be positioned exactly under the “middle band” of the line, i.e. where “a” “o” “m” “v” etc. touch the base.
If you add the baseline manually, which you can do very quickly with a little practice, you should never move too far from the bottom of the characters (not further than one or two stroke widths of the writing), no matter in which direction. The baseline consists of individual points that you set yourself when adding it manually; you complete it with a double-click or Enter on the last point. Baselines can also be drawn vertically. In an image and even within a text region, you can also combine different line directions (e.g. the typical “postcard layout”).

Problems with automatic line detection occur frequently when the word spacing varies significantly or becomes particularly large, or when the line orientation changes (curved lines). In such cases, the Baseline may be split into subsections containing individual words. This has no consequences for the text recognition and thus for the later full text search, because the entire text can still be captured. However, those who value a perfect layout of their full text that reflects the original text must correct this. The correction of the Baselines is not always necessary, but you have to pay attention to the Reading Order, otherwise the later transcript may become ambiguous. Such “torn” Baselines can easily be merged again with the Merge tool.
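Why the order of the fragments matters when merging can be shown with a small Python sketch. It does not reproduce what the Merge tool does internally; it is a simplified picture that assumes horizontal baselines running from left to right, and the function name `merge_baselines` is hypothetical.

```python
def merge_baselines(fragments):
    """Join baseline fragments into one baseline, ordered from left to right."""
    ordered = sorted(fragments, key=lambda points: points[0][0])
    merged = []
    for points in ordered:
        merged.extend(points)
    return merged


# A rising line that was torn in two; the right half sits higher on the page
# and is therefore listed first.
fragments = [
    [(600, 500), (900, 495)],
    [(100, 520), (500, 505)],
]

merged = merge_baselines(fragments)
print(merged)

# Joining the fragments in the listed order instead would jump back to the left.
naive = fragments[0] + fragments[1]
print(naive)
```

In the naive result the x values run 600, 900, 100, 500, so the line doubles back on itself instead of following the writing.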

 

Tips & Tools
What if the text is upside down?
The CITlab Advanced LA cannot correctly capture the Baseline of an upside down line. Baselines always work in the reading direction. If you want to detect upside-down lines or set them manually, you either have to rotate the image or draw the Baseline at the top of the middle band (against the reading direction) from right to left. In both cases, Transkribus will rotate the image during transcription in the readable direction.
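Because a baseline follows the reading direction, the direction can be read off from the order of its points. The following Python sketch illustrates this; it is an assumption-based example, not how Transkribus decides about rotation. The function names and the threshold of 45 degrees around 180 are chosen freely.

```python
import math


def baseline_angle(points):
    """Direction from the first to the last point in degrees:
    0 = left to right, 90 = bottom to top, 180 = right to left."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    # Image coordinates grow downward, so the vertical difference is negated.
    return math.degrees(math.atan2(-(y1 - y0), x1 - x0)) % 360


def is_upside_down(points):
    """Treat a baseline as upside down if it runs roughly from right to left."""
    return 135 < baseline_angle(points) < 225


normal = [(100, 500), (900, 500)]       # drawn from left to right
upside_down = [(900, 200), (100, 200)]  # drawn from right to left
vertical = [(300, 900), (300, 100)]     # drawn from bottom to top

print(baseline_angle(normal), is_upside_down(normal))
print(baseline_angle(upside_down), is_upside_down(upside_down))
print(baseline_angle(vertical), is_upside_down(vertical))
```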


What you should know about Collections & Documents

Release 1.7.1

Collections and Documents are the two most important categories in which you can organize and manage material in Transkribus. A Collection is nothing more than a kind of directory in which you store Documents that belong together. It is important to know that some tools that Transkribus provides do not work beyond the boundaries of a Collection. This includes the tag search, which is an important tool for those who want to tag their HTR results.
Documents are parts of the Collection, e.g. a bundle of letters or a record or even a single piece of writing. In our project a Document is always a record. Documents can therefore contain many pages. They are usually uploaded into Transkribus via private FTP or directly from a local folder. You cannot upload single images, but only images that are contained in a folder.

Once uploaded, the possibility to edit the individual pages of a Document is limited. Using the document manager, you can move or delete individual pages within the Document, you can even add more pages. However, once images are uploaded, they can no longer be edited or rotated. This means: before uploading you should check if the images are aligned correctly and if the Document is complete.
Thus in our project, Documents are only compiled and uploaded once they have been edited in the Goobi metadata editor, checked for completeness and received structure and metadata. This ensures that when the HTR results are re-imported to Goobi later, they are actually transferred to an identical document structure.
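Part of such a check before uploading can be automated. The following Python sketch counts the images in a folder and compares the number with the expected page count; whether the images are aligned correctly still has to be checked by eye. The list of file suffixes and the function names are assumptions for this example and say nothing about which formats Transkribus accepts.

```python
from pathlib import Path

# Assumption: the suffixes we treat as images in this example.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def select_images(filenames):
    """Return the image file names in sorted order, ignoring other files."""
    return sorted(
        name for name in filenames
        if Path(name).suffix.lower() in IMAGE_SUFFIXES
    )


def check_folder(folder, expected_pages):
    """Compare the number of images in `folder` with the expected page count."""
    names = [path.name for path in Path(folder).iterdir() if path.is_file()]
    images = select_images(names)
    return len(images) == expected_pages, images


print(select_images(["0002.jpg", "notes.txt", "0001.JPG"]))
```

A call such as `check_folder("record_0815", 240)` (folder name and page count are made up) would return whether the count matches, together with the sorted list of image names.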

 

Tips & Tools
Documents can be distributed between different Collections at any time. This is done by linking or duplicating. In the first scenario, each change to the Document, no matter in which Collection it is made, is transferred to all Collections it is linked to. The second scenario actually creates two separate Documents that can be edited independently of each other.
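The difference between linking and duplicating resembles the difference between a reference and a copy in programming. The following Python sketch is only an analogy; the dictionaries and variable names are invented for the example and have nothing to do with how Transkribus stores Documents.

```python
import copy

document = {"title": "Record 1", "pages": ["page 1", "page 2"]}

collection_a = [document]                      # the original Collection
collection_b = [document]                      # linked: the very same Document
collection_c = [copy.deepcopy(document)]       # duplicated: an independent copy

# A change made via Collection B ...
collection_b[0]["pages"].append("page 3")

# ... is visible in Collection A, but not in the duplicate in Collection C.
print(len(collection_a[0]["pages"]))
print(len(collection_c[0]["pages"]))
```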