Outcomes of the OpenMinTeD Interoperability Workshop
On 12 November, OpenMinTeD’s specification Working Groups (WP5; task 5.2) met in person for the first time. This one-day workshop was attended by 30 participants, both project-internal and invited external experts, with wide-ranging expertise in the many faces of TDM interoperability. The aim of the workshop was to discuss interoperability issues for OpenMinTeD’s TDM workflows.
The morning was devoted to a general overview of all working groups in T5.2 and more detailed presentations about scope and issues from each of the four working groups:
Language and Knowledge Resources
IPR and Copyright
Annotations and Workflow
In the afternoon, the TDM scenarios, which had been prepared as concrete cases capturing at least the most important T5.2 interoperability issues, were discussed in two separate break-out sessions. The discussions took place in a pleasant and cooperative atmosphere, and every participant’s expertise was drawn on. As a result, open questions were highlighted and next steps were recommended. Of course, the scenarios had their own shortcomings, which did not always make them easy to understand and at times drove the experts to distraction. All in all, though, working through these shortcomings was an integral part of what turned out to be a constructive discussion.
Each working group had its own specific issues to discuss, but there was enough overlap to enable effective breakouts combining more than one WG.
My own working group, WG2, concentrates on tackling content within the TDM context. Its aim is to arrive at a practically motivated interoperability specification for OpenMinTeD TDM workflows with respect to the operationalisation of external knowledge resource content. Knowledge/content is either contained within resources (e.g. lexicons, term banks, ontologies, thesauri, dictionaries and annotated corpora) or produced by text mining tools/services (e.g. POS tags and dependency relations).
Harmonization of resource-specific knowledge into standardised data categories and an interchange format is the way to ensure that OpenMinTeD can make full use of this information in its TDM workflows.
In our break-out we tried to get closer to this notion of harmonization. As a result of our discussions we identified a bottom-up approach that starts with simple lists of descriptors for linguistic/terminological/ontological knowledge. Incremental extension will take place within the specification phase based on practical needs, adding complexity where needed.
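To make the bottom-up idea a little more concrete, here is a minimal sketch in Python. It is purely illustrative and is not an OpenMinTeD specification: the descriptor list, the source names `tagger_a` and `lexicon_b`, their labels, and the functions `harmonize` and `extend` are all assumptions made up for this example.

```python
# Illustrative sketch only: every name and label below is hypothetical.

# Start with a simple, flat list of shared descriptors.
SHARED_DESCRIPTORS = ["noun", "verb", "adjective"]

# Map resource-specific labels onto the shared descriptors.
MAPPINGS = {
    "tagger_a": {"NN": "noun", "VB": "verb", "JJ": "adjective"},
    "lexicon_b": {"n": "noun", "v": "verb", "adj": "adjective"},
}


def harmonize(source, label):
    """Return the shared descriptor for a resource-specific label.

    Returns None when the source or the label has no mapping yet.
    """
    return MAPPINGS.get(source, {}).get(label)


def extend(descriptor, source, label):
    """Incrementally add a descriptor and a mapping when a practical need arises."""
    if descriptor not in SHARED_DESCRIPTORS:
        SHARED_DESCRIPTORS.append(descriptor)
    MAPPINGS.setdefault(source, {})[label] = descriptor
```

In this sketch, a label without a mapping simply returns `None`, which marks the point where the descriptor list would need to be extended, for example with `extend("adverb", "tagger_a", "RB")`.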
This will involve further discussions between OpenMinTeD partners and external experts, which will continue over the next few months. Our next face-to-face meeting will be held in conjunction with LREC 2016.
Maybe we’ll see you there!
Dr. Wim Peters is a Research Fellow at the Natural Language Processing group of the Department of Computer Science at the University of Sheffield. He is the convenor of OpenMinTeD’s T5.2 expert group on Language and Knowledge Resources. For more information, or to become part of the expert group, you can contact him by email.