
TextGrid Roadmap for Version 1.0

This roadmap shows the intended steps towards a stable and productive Version 1.0. Furthermore, it provides an overview of the features of TextGrid and includes development priorities and use cases.

Table of Contents:

  1. TextGrid Version 1.0
  2. TextGrid as a Virtual Research Environment
  3. TextGrid Laboratory
  4. TextGrid Repository
  5. Use Cases
  6. Support, Tutorials and Educational Opportunities
  7. Outlook, Perspectives
  8. Timetable



1. TextGrid Version 1.0

After five years of development, the TextGrid consortium will publish TextGrid version 1.0 in June 2011. The following conceptual considerations and development work have shaped this version:

  • application scenarios from the humanities disciplines have guided the range of functions and services,
  • extensive evaluations of user needs and feedback from the user community have supplemented and extended the concepts and serve as an effective corrective,
  • intensive beta tests allow bugs to be fixed and further development to be targeted,
  • organisational and research-policy considerations provide for sustainability,
  • current developments in grid technology are being implemented.

Version 1.0 is an important milestone for the project: TextGrid will leave beta status behind and enter stable productive operation. Users who plan to use TextGrid in their research projects will receive an extensively tested and well-engineered version that reliably accomplishes defined tasks. Development work within the project continues nevertheless. Existing functions and services are being optimised in cooperation with users, and new functions are being implemented and integrated. In addition, as an open platform TextGrid makes it possible to integrate bespoke services, so that projects can adapt and extend its functionality to their specific needs and thereby take part in its development. At the end of the project in May 2012, TextGrid will move to a community-based business model with a corresponding legal form, so that it can continue to be supported, developed and extended.

The following sections present the planned scope of services of version 1.0, concrete fields of application and use cases, and an overall outlook.





2. TextGrid as a Virtual Research Environment

TextGrid is supported by researchers from infrastructure facilities, universities and research institutions who contribute their subject expertise and IT knowledge to the project in equal measure. In coordination with other academic communities, TextGrid is working towards an independent legal form. This form should, on the one hand, represent the interests of the arts and humanities within the research community with regard to access to infrastructure and Virtual Research Environments and, on the other hand, provide a sustainable organisational structure from a research-policy point of view.

TextGrid is aimed at users who need tools and services for the itemisation, description, annotation, exploitation, evaluation and publication of cultural artefacts, especially texts, images, manuscripts, music and other objects. It sets great store by the long-term archiving and sustainable use of research data.

TextGrid is establishing a Virtual Research Environment for the arts and humanities based on grid technologies, an approach that is breaking new ground in Germany and internationally. This sets the TextGrid initiative clearly apart from other tools and research environments that already exist or are being developed. Another important point is the compatibility and sustainability that this approach achieves, both technologically and in terms of organisation and research policy.

TextGrid has an integrated approach: The entire research process is mapped virtually as TextGrid includes work organisation, communication, tools, access to data and content, standards, representation of interests and research policy.

The user accesses the Virtual Research Environment using two main components:

TextGrid Laboratory

  • Access point to the Virtual Research Environment
  • Portable software, open source, comprehensive documentation for users and developers
  • Educational opportunities, demos and tutorials
  • Provides existing and new tools and services within intuitively operable software
  • Is enhanced continuously
  • Its conceptual design permits the integration of tools and services via open interfaces (SOAP/REST)

TextGrid Repository

  • Long-term archive for research data in the humanities, embedded in the grid infrastructure
  • Guarantees the long-term availability, accessibility and usability of research data
  • Cooperates with WissGrid on concepts for the long-term archiving of research data, together with other academic grid communities
  • Is enhanced continuously





3. TextGrid Laboratory

The TextGridLab is software that allows users to access the Virtual Research Environment from their own computer. Because the program is provided as an executable file, no installation (with changes to the registry and the user profile) is necessary; the Lab can be stored on, and even started from, a removable medium (e.g. a USB flash drive) on any computer. The only technical prerequisites for using the portable software are an installation of Java version 6 on the computer and access to the internet.

The TextGridLab performs two tasks. On the one hand, it organises access to the administrative Infrastructure so that users can work cooperatively across different sites over the grid. On the other hand, it contains a basic set of Tools that allow text researchers to work in the digital medium.

Administrative Infrastructure

Efficient rights management is a basic prerequisite for collaborative work in a Virtual Research Environment. It controls who is allowed to access projects and files, taking into account, for example, whether a document is visible to all users or whether a single user has worked on a certain document during a particular period of time. The Lab’s User and Project Management provides project-specific roles (e.g. manager, editor, observer) that determine the user’s access rights. A Versioning Function permits a document “freeze” at any time and continuously keeps a record of the project’s progress. In addition, it makes it possible to revert to earlier stages of the document.
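
The role model described above can be illustrated with a minimal Python sketch. Only the role names (manager, editor, observer) come from this roadmap; the permission sets, the class names and the method names are assumptions made for illustration and do not reflect the actual TextGrid implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    MANAGER = "manager"
    EDITOR = "editor"
    OBSERVER = "observer"


# Assumed mapping of roles to permitted actions (illustrative only,
# not taken from the TextGrid rights management).
PERMISSIONS = {
    Role.MANAGER: {"read", "write", "delegate"},
    Role.EDITOR: {"read", "write"},
    Role.OBSERVER: {"read"},
}


@dataclass
class Project:
    name: str
    members: dict = field(default_factory=dict)  # user name -> Role

    def assign(self, user: str, role: Role) -> None:
        """Give a user a project-specific role."""
        self.members[user] = role

    def is_allowed(self, user: str, action: str) -> bool:
        """Users without a role in the project have no access at all."""
        role = self.members.get(user)
        return role is not None and action in PERMISSIONS[role]
```

In this sketch the access rights follow from the role alone, so changing a user's role in a project changes everything that user may do there.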

A possible starting point for a user is the Project Browser, a Navigator that gives an overview of the research network, grants access to project materials and permits their clearly arranged administration.

TextGrid provides an internal Search function for a particular project or the entire network. Besides full-text search, the search interface can include metadata and search within structurally encoded texts. TextGrid follows the standards of the Text Encoding Initiative (TEI): via an adapter, texts encoded in XML can be converted to a TEI-conformant baseline encoding developed by TextGrid. This enables cross-project searches in semantically tagged text segments (e.g. lemmas in dictionaries, addressees in letters or stage directions in a drama).
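
The idea of searching only within semantically tagged segments can be sketched in a few lines of Python. The example uses plain TEI markup (the `stage` element for stage directions) as a stand-in, because the TextGrid baseline encoding itself is not reproduced here. The sample fragment is simplified (it has no TEI header), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Simplified, invented sample; a complete TEI file also needs a teiHeader.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <sp><speaker>FAUST</speaker>
      <stage>He opens the book.</stage>
      <l>A line of verse that also mentions the book.</l>
    </sp>
  </body></text>
</TEI>"""


def find_in_element(xml_text, tag, term):
    """Return the text of every TEI element `tag` that contains `term`."""
    root = ET.fromstring(xml_text)
    hits = []
    for element in root.iterfind(f".//tei:{tag}", TEI_NS):
        content = "".join(element.itertext()).strip()
        if term.lower() in content.lower():
            hits.append(content)
    return hits
```

Calling `find_in_element(SAMPLE, "stage", "book")` only considers stage directions, whereas a plain full-text search for "book" would also match the verse line.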

A browser-based portal serves as an additional access point to the data stored in the TextGrid repositories (see section 4, TextGrid Repository).

To manage large quantities of project materials of different types (text documents, images, programs etc.), the TextGridLab provides a powerful tool for administering Metadata. Besides the core set of descriptive data predetermined by TextGrid, it allows the creation of a user-defined, configurable and extensible project-specific metadata profile.

While common operating systems organise documents in directories, the TextGridLab organises them in so-called Aggregations. These are virtual directories whose functionality has been adapted to the demands of a cooperatively used Virtual Research Environment.

Tools in the Lab

As TextGrid is committed to the non-proprietary and media-independent data format XML for scholarly text processing, the XML Editor is the centrepiece of the TextGridLab. The editor permits the interactive editing and validation of XML documents. The other tools have been selected to cover, within the Virtual Research Environment, the working procedures of the central disciplines in text research, namely edition philology and linguistics.
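
Before an XML document can be validated against a schema, it has to be well-formed. The following Python sketch shows only this first check, using the standard library. It is not the validation performed by the XML Editor, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return (True, message) if the text parses as XML, else (False, error)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        # ParseError.position holds the (line, column) of the problem.
        line, column = err.position
        return False, f"line {line}, column {column}: {err}"
    return True, "well-formed"
```

An interactive editor would run a check of this kind while the user types and then apply schema validation (for example against a TEI schema) as a second step.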

The Text Image Link Editor links text segments of a transcription to the corresponding image sections of, e.g., a digitised manuscript. This makes it possible to prepare digital editions that display transcription and manuscript synoptically, aligned section by section.
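
A link between a text segment and an image section can be modelled as a small data structure. The Python sketch below assumes rectangular regions in pixel coordinates and segment identifiers such as XML IDs. Both assumptions, as well as all names, are illustrative and say nothing about the storage format actually used by the Text Image Link Editor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Rectangular image section in pixel coordinates (an assumption)."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


@dataclass(frozen=True)
class Link:
    segment_id: str  # e.g. the XML ID of a line in the transcription
    image: str       # identifier of the digitised page
    region: Region


def segments_at(links, image, px, py):
    """Return the IDs of all text segments linked to the given image point."""
    return [link.segment_id for link in links
            if link.image == image and link.region.contains(px, py)]
```

A synoptic view could use such a lookup to highlight the transcribed line when the reader points at the corresponding place in the manuscript image.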

The Dictionary Search Tool serves as an interface for meta queries across the reference works in the “Trierer Wörterbuchnetz” (Trier Dictionary Network), including the Grimm Dictionary, the Middle High German Dictionary, the Goethe Dictionary and numerous dialect dictionaries. The incorporation of the Dictionary Network as a web service is a good example of the integration of existing web sites or software solutions into the Lab.

The Workflow Editor is designed for the automatic analysis and processing of larger amounts of text. It provides a set of standard workflows, including a Lemmatising function and several Sorting options, which support numerous linguistic tasks. For more complex and more specific tasks, projects can define their own workflows. External programs can also be accessed with the Workflow Manager (via a web-service interface), as is already the case with the text-processing software TUSTEP.

As XML is the Lab’s standard data format, a component for transforming XML files with the established transformation language XSLT is indispensable.

The “virtual tool box” is completed by a publication tool: The Web Publisher formats XML texts to be displayed in a web browser and makes them available for all users on a TextGrid server. In this way results can be made accessible to a wide community continuously during the scientific process.

After the release of version 1.0, the tools of the TextGridLab will be enhanced continuously, and additional components will be added step by step. The range of functions will be extended for the core disciplines of edition philology and linguistics, and new specific tools are being developed for musicology, art history and classical philology (gloss edition).

Even in its basic version, the TextGridLab offers a large range of functions for numerous standard tasks and is a suitable entry point into the world of digital text processing and analysis.

For many projects this basic tool box will have to be extended by more specific tools. For this reason the TextGrid project explicitly decided to develop open source software, so that the source code of the Lab and its technical documentation can be downloaded freely. In this context, project-specific customisation can mean upgrading particular tools or infrastructure components as well as integrating new programs or linking to external web sites. As a community-based project, TextGrid counts on a self-perpetuating dynamic in which project-specific upgrades are made available to the entire community for use and further development.





4. TextGrid Repository

The TextGridRep is the second main component of the Virtual Research Environment and permits the long-term storage and sustainable usage of research data of the arts and humanities.

During import into the TextGridRep, objects are semantically indexed and metadata are automatically attached to them. With the aid of the TextGridLab, the data stored in the grid can be edited, administered, assigned to projects and provided with further content-related metadata. Using the detailed rights management (see above), users decide for themselves with whom and in what way they want to share their data. In addition, they can publish their results and research data in the repository.

Starting with version 1.0, archives and institutions can import large amounts of data into the repository via a special interface, which validates the metadata automatically at the same time.

Published data is available via a browser-based portal that offers a fast search for published research data. Besides detailed search options, there will be several modes of visualisation. In addition, an open interface for external or individual portal solutions is planned.

To ensure the long-term availability of research data, TextGrid safeguards data with redundant storage and regular backups for a period of ten years (as recommended in the guidelines of the German Research Foundation [DFG]). In the future, longer-term data storage and higher levels of data protection, such as redundant disk space at distributed locations or tape backups, will also be offered, at higher cost. Published and long-term archived data remain sustainably addressable via persistent identifiers whose long-term availability is guaranteed by the European Persistent Identifier Consortium (EPIC).

Higher-quality long-term archiving services, which are being developed by the research association WissGrid, can presumably be offered in 2012. The services being developed within that project (e.g. format validation and metadata extraction) and its recommendations for the long-term archiving of research data will be adapted to the specific needs of the arts and humanities and integrated into the Virtual Research Environment TextGrid. The corresponding enhancement of the current repository infrastructure is planned for the period after the release of version 1.0, from summer 2011 onwards.





5. Use Cases

From its conception, the TextGrid project has attached great importance to exchanging ideas with the academic communities. For example, one of the work packages is responsible for training courses, workshops and the preparation of online tutorials in order to keep users continuously informed about the current state of development and to familiarise them with the new system. These platforms are also useful for receiving criticism and suggestions for improvement from the communities, which can be incorporated as part of the iterative design cycle.

As a result of this strategy, several research projects have already taken TextGrid into account in their conceptual planning and in their formal applications. As soon as version 1.0 is released, these projects can immediately start to work in the Virtual Research Environment. The project "Blumenbach-Online", an undertaking of the "Akademie der Wissenschaften" (Academy of Sciences and Humanities) in Göttingen and the Institute for History of Science and Humanities of the University of Göttingen, will create a new edition of the naturalist's works. In this process Blumenbach's collection of natural history objects is going to be reconstructed and linked to the writings and letters in an online portal, a task that can only be realised with a highly specific and substantial metadata description.

The project "Archaeo 18", which plans to reconstruct the founding of archaeology as a scientific discipline with the help of Christian Gottlob Heyne's lectures, is also based at the University of Göttingen. The edition of numerous students' notes and the reconstruction of their overall context will put the object administration in TextGrid to the test. In addition, the digital medium will show its potential for describing and evaluating references among the texts in the form of hypertext.

In close collaboration with the project "Historisch-kritische Edition von Goethes Faust" (historical-critical edition of Goethe's Faust), which is conducted by the Freies Deutsches Hochstift Frankfurt, the Klassik Stiftung Weimar and the University of Würzburg, a requirements profile has been defined for the Text Image Link Editor. In version 1.0 all of these requirements will be met, so that this project, too, can start working effectively in the TextGridLab.

Version 1.0 and its upgrades will be available to the academic community.





6. Support, Tutorials and Educational Opportunities

It is very important to TextGrid that users can handle the TextGridLab easily. A demo movie available on the TextGrid website gives a first overview of the tools the TextGridLab provides. Online tutorials illustrate the tools’ functionality in more detail; currently these explanations are offered in text and video format for the basic functions. By the release workshop in July 2011, further tutorials for the individual tools will be prepared, and the help function of the TextGridLab will be made more comprehensive.

To be able to respond promptly to users' questions, an e-mail address (support(at)textgrid.de) and a feedback form have been set up, which allow users to contact the TextGrid support staff directly with their problems and requests. If required, a consultation by phone can also be offered by prior arrangement, in which an experienced staff member guides the user online through the desired tools of the TextGridLab.

Beyond that, training courses whose content is tailored to the particular needs of the participants are offered in the context of conferences. Such tailored interactive workshops can also be offered, e.g., to colleagues at a research institute. For this purpose, the host institution simply has to provide a room for the training course, e.g. a computer lab with enough space and computers for the participants and a data projector.





7. Outlook, Perspectives

With the release of version 1.0, the Virtual Research Environment TextGrid will be ready for productive use. Further activities in the project will concentrate on enhancing and improving the established infrastructure and on completing the tool kit offered in the TextGridLab.

Even once these aims have been achieved at the end of the second project phase in summer 2012, TextGrid will not have reached a final, completed state. One essential task for the remaining project duration will therefore be to transfer TextGrid as smoothly as possible from project status to a permanent organisational form. The tasks of this organisation will then be to ensure the availability of the data and the technical infrastructure, to organise project-specific further development of the TextGridLab and its tools, and to introduce new users to working in the Virtual Research Environment. The foundation of this new organisational form will be a legal status that makes it possible to oversee the use of the TextGrid infrastructure by different user groups and to secure the continuity of the required administrative and technical staff.

TextGridLab

Version 2.0, which is scheduled for spring 2012, will be enhanced by subject-specific tools and services. For example, an XML Editor for entering and displaying music notation in MEI is currently being developed for musicologists, and classical philologists will be provided with a gloss editor. With Digilib, an already established tool for providing and annotating image data will be integrated into the TextGrid environment for art historians. With LEXUS and COSMAS, the tools for linguists will be supplemented by two large databases. Furthermore, a collation tool will be provided to compare two or more XML documents. A Dictionary Link Editor extends the functionality of the Dictionary Service by enabling lemmas in different dictionaries to be linked.

An enhanced version of the OCR tool OCRopus for the automatic character recognition of blackletter type will also be integrated in TextGrid 2.0. In addition, the TextGridLab will receive a Bibliography Tool for including existing data sets and for editing and administering bibliographic data. A Text Text Link Editor serves as an input aid for links in XML files and links elements in TextGrid documents via their URI fragments. Furthermore, the second TextGrid version will contain a typesetting program developed by the XML Print project, which is funded by the German Research Foundation (DFG). It prints texts with complex layout requirements from XML data to the PDF/A format, which is important for archiving purposes.

TextGridRep

One of TextGrid's most important missions is the sustainability of research data in the TextGridRep. Besides the use of open standards and established formats, this objective is supported by the integration of the WissGrid service framework for long-term archiving mentioned above. In addition, the modular architecture of the TextGrid infrastructure permits the implementation of alternative solutions for sustainable and future-proof access to the resources. Interoperability is given high priority so that the TextGridRep can be linked to other repositories as part of a network or digital ecosystem of research repositories in Germany and Europe. Furthermore, rights management is to be enhanced by a component that allows licence-based authorisation. This ensures that resources that are not subject to an open access licence can be used as well.





8. Timetable

Version 1.0 and its further development follow this schedule.

February 15, 2011    Feature Freeze: Start of the test phase involving the user community
June 1, 2011         TextGrid version 1.0
July 12-13, 2011     TextGrid-Tage 2011 (TextGrid Days): Release workshop in Göttingen
Autumn 2011          Transfer of TextGrid to a sustainable legal form
January 2012         Integration of the WissGrid service framework for long-term archiving of research data (see above)
Spring 2012          TextGrid version 2.0
May 2012             End of the TextGrid project: Community-based continuation in sustainable legal form

If you are interested in using TextGrid for your work, you can obtain more information at our website: www.textgrid.de
You can contact us via e-mail: info(at)textgrid.de