P2PaLA or structure training

Release 1.9.1

Page-to-page layout analysis (P2PaLA) is a form of layout analysis that can be trained for indivudual models- similar to the HTR. You can train structure models either to recognize textregions only or textregions with baselines. They therefore fulfill the same functions as the standard layout analysis (CITlabAdvanced). The P2PaLA is particularly suitable if a document has many pages with mixed layouts. The standard layout analysis usually recognizes just one TR – and this can lead to problems with the reading order of the text.

With the help of a structure training, the layout analysis can learn where and how many TRs it should recognize.

The CITlab Advanced LA often had problems to identify the text regions in a differentiated way on our material. That’s why we experimented with P2PaLA early on in our project. First, we tried out structural models that exclusively set text regions (Main text, marginal notes, footnotes etc.). Within the TRs thus generated, we worked with the usual line detection. However, the results were not always satisfactory.

The BLs were often too short (at the beginning or the end of the line) or were torn many times – even on pages with simple layouts. Therefore we trained another one with an included recognition of the BLs, based on our already working P2PaLA model. Our newest model recognizes all ‘simple’ pages almost without any errors. For pages with very complex layouts, the results still have to be corrected, but with much less effort than before.