D4.2 – Interactive Text Mining Environment (v.1)
This document presents the first version of InTaVia's interactive visual environment for text mining. It targets the comparative analysis of annotation data generated by machine learning algorithms for named entity recognition on a large corpus of short prosopographical texts. The data take the form of classification tags on text segments. Two interactive tools are presented for their visual analysis: the previously published AnnoXplorer and the newly developed Performancer. The tools are integrated in a single interface, in which AnnoXplorer serves as a detail view of the data from token to text level and Performancer serves as an overview from text to corpus level. Both tools are agnostic about the source of the annotations (human or algorithmic) and about the meaning or purpose of the classification tags, which leaves room for further application scenarios. Both support the visual comparison of annotations from multiple sources and are designed to handle long texts and large corpora.
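To make the data form concrete, the following is a minimal sketch of classification tags on text segments coming from two sources, together with a simple exact-match comparison. It is an illustration only and does not reflect the actual data model of AnnoXplorer or Performancer: the `Annotation` class, the `compare_sources` function, the source names `model_a` and `model_b`, the tag labels and the example sentence are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record: one classification tag on one text segment."""
    start: int   # character offset of the segment, inclusive
    end: int     # character offset of the segment, exclusive
    tag: str     # classification tag; its meaning is left open
    source: str  # who produced the annotation (human or algorithm)


def compare_sources(annotations, source_a, source_b):
    """Split (start, end, tag) triples into shared and source-specific sets."""
    a = {(x.start, x.end, x.tag) for x in annotations if x.source == source_a}
    b = {(x.start, x.end, x.tag) for x in annotations if x.source == source_b}
    return {
        "both": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
    }


# Invented example text and annotations (not taken from the project corpus).
text = "Maria Theresia was born in Vienna."
annotations = [
    Annotation(0, 14, "PER", "model_a"),
    Annotation(27, 33, "LOC", "model_a"),
    Annotation(0, 14, "PER", "model_b"),
    Annotation(27, 33, "ORG", "model_b"),
]

result = compare_sources(annotations, "model_a", "model_b")
for group, items in result.items():
    for start, end, tag in items:
        print(group, repr(text[start:end]), tag)
```

In this sketch the two sources agree on the person segment and disagree on the tag for the place segment. Exact matching of offsets and tags is the simplest possible comparison; partial overlaps between segments would need separate handling.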
This deliverable is currently classified as confidential and is not available for download.