What you should know about Collections & Documents

Release 1.7.1

Collections and Documents are the two most important categories for organizing and managing material in Transkribus. A Collection is simply a kind of directory in which you store Documents that belong together. It is important to know that some Transkribus tools do not work across the boundaries of a Collection. This includes the tag search, an important tool for anyone who wants to tag their HTR results.
Documents are parts of a Collection, e.g. a bundle of letters, a record, or even a single piece of writing. In our project a Document is always a record, so a Document can contain many pages. Documents are usually uploaded into Transkribus via private FTP or directly from a local folder. You cannot upload single images, only images that are contained in a folder.

Once a Document is uploaded, the options for editing its individual pages are limited. Using the document manager, you can move or delete individual pages within the Document, and you can even add more pages. However, once images are uploaded, they can no longer be edited or rotated. Before uploading, you should therefore check that the images are aligned correctly and that the Document is complete.
In our project, Documents are therefore compiled and uploaded only after they have been edited in the Goobi metadata editor, checked for completeness, and given their structure and metadata. This ensures that when the HTR results are later re-imported into Goobi, they are transferred to an identical document structure.
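Part of the completeness check before upload can be automated. The following is a minimal sketch in Python, not a Transkribus or Goobi tool. The function name check_upload_folder, the list of image suffixes and the idea of comparing against an expected page count are assumptions made for illustration. The sketch only counts files and does not check whether the images are rotated correctly.

```python
from pathlib import Path

# Assumed set of image file endings; adjust it to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def check_upload_folder(folder, expected_pages):
    """Return a list of problems found; an empty list means the folder looks complete."""
    folder = Path(folder)
    if not folder.is_dir():
        return [f"{folder} is not a folder"]

    # Only look at files directly inside the folder, in a stable order.
    files = sorted(p for p in folder.iterdir() if p.is_file())
    images = [p for p in files if p.suffix.lower() in IMAGE_SUFFIXES]
    others = [p for p in files if p.suffix.lower() not in IMAGE_SUFFIXES]

    problems = []
    if len(images) != expected_pages:
        problems.append(f"expected {expected_pages} images, found {len(images)}")
    for p in others:
        problems.append(f"unexpected file: {p.name}")
    return problems
```

Calling check_upload_folder("record_0042", 120) on a hypothetical folder returns an empty list if exactly 120 image files and nothing else are present. Any other result should be resolved before uploading, because pages can be moved, deleted or added afterwards but images can no longer be edited.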


Tips & Tools
Documents can be distributed between different Collections at any time, either by linking or by duplicating. When a Document is linked, each change to it, no matter in which Collection it is made, is transferred to all Collections it is linked to. Duplicating, by contrast, creates two separate Documents that can be edited independently of each other.
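The difference between linking and duplicating can be pictured as the difference between sharing one object and copying it. The following Python sketch is a conceptual model only. The classes Document and Collection and the methods link and duplicate are hypothetical names chosen for illustration, and they do not describe how Transkribus is implemented or how its API looks.

```python
import copy


class Document:
    """Hypothetical stand-in for a Document: a title and a list of pages."""

    def __init__(self, title, pages):
        self.title = title
        self.pages = list(pages)


class Collection:
    """Hypothetical stand-in for a Collection: a named list of Documents."""

    def __init__(self, name):
        self.name = name
        self.documents = []

    def link(self, document):
        # Linking stores the very same object, so edits are visible
        # in every Collection that holds it.
        self.documents.append(document)
        return document

    def duplicate(self, document):
        # Duplicating stores an independent deep copy, so later edits
        # to the original do not reach the copy.
        clone = copy.deepcopy(document)
        self.documents.append(clone)
        return clone


record = Document("Record 1", ["page 1", "page 2"])
first, second, third = Collection("A"), Collection("B"), Collection("C")

first.link(record)
linked = second.link(record)
copied = third.duplicate(record)

# Edit the Document after it has been linked and duplicated.
record.pages.append("page 3")

print(len(linked.pages))  # 3, the linked Document follows the change
print(len(copied.pages))  # 2, the duplicate stays as it was
```

In this model, linking suits a Document that several Collections should keep in sync, while duplicating suits a Document that is meant to develop separately.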