Posted by Anna Brandt on

Tagging Tools

Release 1.11.0

In a previous post we described our experiences with structure tagging and the tools that go with it. For most users (e.g. in edition projects), however, enriching texts with additional content information is even more important. To add tags to a transcription, use the tagging tools in the “Metadata”/“Textual” tab in Transkribus.

Here you can see the available tags as well as those that have already been applied to the text of the page. With the Customize button you can create your own tags or assign shortcuts to existing tags, just as with structure tagging. Shortcuts make tagging in the transcript easier and faster. Without shortcuts, you have to select the relevant words in the text (not in the image) and choose the desired tag from the right-click menu. A word can, of course, carry several tags at once.

These tags should not be confused with the so-called TextStyles (for example, crossed-out or superscript words). TextStyles are not listed below the tags; they are applied via the toolbar at the bottom of the text window.
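For readers who later process exported transcriptions, the remark that a word can carry several tags can be pictured as overlapping character ranges over a line of text. The following is a minimal sketch, assuming each tag is stored as a name with an offset and a length, in the style of the `custom` attribute found in Transkribus PAGE XML exports. The sample string, the sample text, and the helper names `parse_tags` and `tags_at` are made up for illustration; the exact format should be checked against your own export.

```python
import re

# Hypothetical tag string for one line, modelled on the "custom" attribute
# style (tag name followed by {key:value;} pairs). This is an assumption,
# not data from the project.
CUSTOM = ("readingOrder {index:0;} person {offset:0; length:10;} "
          "unclear {offset:5; length:5;} place {offset:14; length:6;}")
TEXT = "Hans Meyer zu Wismar"

TAG_RE = re.compile(r"(\w+)\s*\{([^}]*)\}")

def parse_tags(custom):
    """Return (name, offset, length) for every entry that carries a text range."""
    tags = []
    for name, body in TAG_RE.findall(custom):
        props = {}
        for pair in body.split(";"):
            if ":" in pair:
                key, value = pair.split(":", 1)
                props[key.strip()] = value.strip()
        # Entries without offset/length (here: readingOrder) are not text tags.
        if "offset" in props and "length" in props:
            tags.append((name, int(props["offset"]), int(props["length"])))
    return tags

def tags_at(tags, position):
    """Names of all tags whose range covers the character at `position`."""
    return [name for name, off, length in tags if off <= position < off + length]

tags = parse_tags(CUSTOM)
for name, off, length in tags:
    print(name, repr(TEXT[off:off + length]))
```

In this made-up line, the surname is covered by both `person` and `unclear`, so `tags_at(tags, 6)` returns both names, while the first name carries only `person`.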

Posted by Dirk Alvermann on

Tagging: what for? – when and why tagging makes sense

In addition to the content indexing provided by HTR, tagging allows the text to be indexed systematically for later users. Unlike an HTR model, which does its work independently, tagging has to be done mostly by hand and therefore requires a lot of effort. A realistic estimate of that effort should be made before drawing up far-reaching tagging plans.

Because of the amount of material processed in our project, we primarily use tagging where it helps us in the practical work on the text. This is the case with structure tagging, where the tags and the P2PaLA models built on them improve the layout analysis, and also with the tagging of TextStyles for deletions and blacked-out passages. These are the areas where we use tagging across the board. The “unclear” tag, used for passages the transcriber cannot read reliably, is also a fixed part of our transcription rules. Here the tag serves mainly for communication within the team.

For the systematic preparation of texts that have already been run through HTR, we are experimenting with the “person” and “place” tags in order to offer systematic indexing, at least in this limited form.