Master Thesis: Step I: Selection of Annotation Methods to be investigated

In the blog “Way of Working”, I set out the steps I’m going to take. As can be seen in Fig. 1 Way of Working, the first step is “Selection of Annotation Methods to be investigated”. As input for this step I take the FHIES article by Van Gorp et al (2012) and the already mentioned Skype meeting I had with Dr. Van Gorp. The output of this step is the set of methods in 2 categories: the desktop tools GEM Cutter II and EuGENia executed via SHARE, and the Web Annotation tools, like Diigo.

In this blog, I will summarize the FHIES article and my Skype meeting.

Summary of FHIES article

The actual title of the article is ‘MDE support for process-oriented health information systems: from theory to practice’. Van Gorp, Vanderfeesten, Dalinghuis, Mengerink, van der Sanden and Kubben wrote this article in 2012 and presented it at the Foundations of Health Information Engineering and Systems (FHIES) International Symposium of 2012. “The paper leverages model-driven engineering techniques to improve (the use of) health information systems that are process-oriented” (Van Gorp et al (2012)).

In the paper Van Gorp et al discuss Model Driven Engineering (MDE) technology and its potential to improve workflow management and decision support in the healthcare sector. MDE is defined as “a software engineering method to generate or configure powerful tools with the use of explicit modelling language and model transformation definitions” (Van Gorp et al (2012)).

The focus of the paper is specifically on tool-supported derivation of the formal models from unstructured text. This means Van Gorp et al have evaluated the tool support for systematically deriving clinical guideline models from medical literature. MDE tools are chosen because:

“They are known to excel in the linkage of models at various levels of abstraction.

They are known to support the co-evolution of conceptual models with derived software systems” (Van Gorp et al (2012)).

Related tool support has been developed by the Yale Center for Medical Informatics; Van Gorp et al have therefore investigated its GEM Cutter II. This tool has not been used in the project because it does not fit the aims set for the annotation tool:

An annotation-based model extraction infrastructure based on Eclipse ECore and EMF (industry-standards for metamodeling in MDE)
Support of visual models (e.g. flowcharts)
Adaptability of metamodels (useful for clinical guidelines, but also for CPRs, CPAs, etc).

As the last point already hints, the paper takes clinical guidelines as an example to show the relevance of annotation-based model extraction. However, the last point also mentions the adaptability of metamodels: another aim is to make the tool support useful for other process-oriented medical texts as well. Because there are many different types of texts, Van Gorp et al have made a classification of all process-oriented models in the Health Informatics literature. They have identified two dimensions:

patient scope: 1 individual patient, multiple patients of 1 care group, and any type of patient;
provider scope: 1 organization or multiple organizations working as 1 (virtual) organization/network, and multiple organizations.

In addition, they have identified 6 main types of process-oriented descriptions used in the literature:
(1) Clinical Guidelines (CGs), (2) Clinical Protocols (CPRs), (3) Care Pathways (CPAs), (4) Individual Care Pathways (ICPs), (5) Assigned Pathways (APs), and (6) Reference Pathways (RPAs).

These dimensions and their categories are visualized in Fig. 1 ‘2D Classification of Process Oriented Care Descriptions’.

Next, Van Gorp et al state that they will investigate “metamodel-based language support for each class of process descriptions” (in the future) and they discuss their first practical results: “the application of model-driven engineering techniques for managing better the relation between clinical guidelines, clinical protocols and their derived applications” (Van Gorp et al (2012)). They state that this relation requires not only computerized transformations, but also consensus building among various medical specialists. Flowcharts are often used for documenting guidelines, but their modelling basis is not well founded; strengthening it opens up new opportunities.

MDE Support for Clinical Guidelines

The 2D classification is extended with time as a next step to show the need for formally interconnecting the models from the 2D space (Fig. 2). In today’s HIS architectures the theoretical instantiation, specialization, and update-of links are, in general, largely implicit. Explicit links could help to better analyse the relation between the different documents in practice: i.e. “how evidence-based descriptions of optimal medical care (CGs) relate to nurse management systems (CPRs) and patient logistic systems (CPAs and ICPs) and planning systems (APs)”. Therefore, metamodels that enable the formal representation of all artefacts in this space are needed. After this, the links between the models and their individual elements can be provided conveniently.

From this Van Gorp et al state the following hypothesis: “MDE techniques can primarily contribute to the better management of related artefacts over time.”

Fundamental steps that are needed to enable MDE support for any cell of Fig. 2:

Analyse which modelling languages are relevant to the problem at hand. This means a metamodel (i.e. the abstract syntax that defines the language) has to be made.
Put MDE techniques into action.

The case for Clinical Guidelines:

1. CGs tend to be documented in plain text, but in many cases they are also formalized in flowchart notation. These flowcharts, however, tend to be published only as images, and often only a subset of the flowchart notation is used. This revealed a need to enrich the model elements with additional metadata.
--> The Eclipse Epsilon suite has been used to define the abstract and concrete syntax of the newly developed flowchart-based language for CG modelling.

- The metamodel structure/abstract syntax is defined by classes and associations.
  - There are 2 types of nodes: the classes action and decision.
  - An attribute has been added that makes it possible to associate one or more medical papers with a flowchart (useful for traceability).

- The concrete syntax is defined by means of annotations.

2. A) Generation of a special purpose flowchart editor based on the metamodel definition
--> the chosen format is Eclipse Modeling Framework (EMF)

B) Development of a prototypical code generator for translating the flowchart models into source code files for mobile Android devices
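To make the metamodel structure described above more concrete for myself, here is a minimal sketch in Python. It is only an analogy: the real metamodel is defined with Eclipse tooling, and all class and attribute names below (Node, Action, Decision, Flowchart, papers) are my own assumptions, not taken from the article.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Common base for flowchart nodes (hypothetical name)."""
    label: str


@dataclass
class Action(Node):
    """An action recommended by the guideline."""


@dataclass
class Decision(Node):
    """A yes/no question; each branch points to the next node, if any."""
    yes: Optional[Node] = None
    no: Optional[Node] = None


@dataclass
class Flowchart:
    title: str
    nodes: List[Node] = field(default_factory=list)
    # Attribute linking the flowchart to one or more medical papers (traceability).
    papers: List[str] = field(default_factory=list)


# Placeholder content, not taken from any real guideline.
treat = Action("Action A")
wait = Action("Action B")
check = Decision("Question 1?", yes=treat, no=wait)
chart = Flowchart("Example guideline", nodes=[check, treat, wait],
                  papers=["<reference to source paper>"])
```

The two node types mirror the classes action and decision, and the papers list plays the role of the traceability attribute.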

The special purpose, metamodel based editor (described above) provides a promising basis for storing CGs in a shared repository. “However, one crucial step has been overlooked so far: the primary publication artefact of a clinical guideline is still plain text.” Unfortunately, no generic MDE tool infrastructure to ease the development of the text annotation component has been found. Therefore, Van Gorp et al implemented an ad-hoc Java Swing application for annotating the text. However, “generic support for building interactive text to model derivation tools is needed.”

Van Gorp et al emphasize that grammar-based automatic text-to-model transformation tools are largely irrelevant in this model mining context, since the input texts do not adhere to grammatical rules. Grammar-based tools rely on a fixed sentence structure to derive a model from the text. Medical texts are not always built up in the same way, and their sentences do not share one structure, so grammar-based tools are not useful in this case.

Model-Driven, Evidence-based, Development of CDS Apps

As mentioned before, Van Gorp et al (2012) have developed a prototypical tool-chain to show how MDE techniques can help to transform plain text clinical guidelines (CGs) into flowcharts and a Clinical Decision Support (CDS) app, which can be used by medical practitioners.

Steps of the chain:

Medical specialists annotate scientific CGs. Context: in their continuous learning process or during a regularly planned literature review cycle within a hospital. Annotations are stored in a computer-interpretable form.

Guideline annotations are transformed into a flowchart skeleton model.

The flowchart is manually refined.

The flowchart is transformed into a CDS app.

Fig. 5 illustrates the steps.

The annotation tool currently in use colour-codes the following elements:

Title: Green

Observations: Yellow

Actions/Treatments: Red

Explanatory elements: Blue

By clicking “compile”, the annotated representation is translated into the EMF flowchart format used by the flowchart editor.
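The compile step can be pictured with a small Python sketch. This is my own illustration, not the code of the actual Java tool: the names Annotation, NODE_KIND and compile_skeleton are hypothetical, and the mapping of observations to decision nodes and actions to action nodes is an assumption on my part. The result has no edges, in line with the remark that edges are still created manually.

```python
from dataclasses import dataclass

# Annotation categories come from the text above; the mapping to node
# kinds is an assumption made for this sketch.
NODE_KIND = {"observation": "decision", "action": "action"}


@dataclass
class Annotation:
    category: str  # "title", "observation", "action" or "explanation"
    text: str      # the annotated text fragment


def compile_skeleton(annotations):
    """Build an edge-less flowchart skeleton from a list of annotations."""
    skeleton = {"title": None, "nodes": [], "explanations": [], "edges": []}
    for ann in annotations:
        if ann.category == "title":
            skeleton["title"] = ann.text
        elif ann.category in NODE_KIND:
            skeleton["nodes"].append((NODE_KIND[ann.category], ann.text))
        elif ann.category == "explanation":
            skeleton["explanations"].append(ann.text)
        else:
            raise ValueError(f"unknown annotation category: {ann.category}")
    return skeleton


# Placeholder annotations, not taken from a real guideline.
example = compile_skeleton([
    Annotation("title", "Example guideline"),
    Annotation("observation", "Observation 1"),
    Annotation("action", "Action 1"),
    Annotation("explanation", "Background text"),
])
```

Explanations are collected separately rather than turned into nodes, since they are not shown in the flowchart.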

Final flowchart:

Edges in the figure are still created manually

Explanations are not shown but these could be added

App:

Initially, it represents a searchable list of guidelines (title of CG)

After selecting a CG, the App shows the question at the root node of the decision tree. The user can answer with yes (“on”) or no (“off”)

The next questions follow until there is evidence for a certain action.

The medical specialist can get more background information by pressing “Info”.
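The behaviour of the app described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the real app is generated for Android, and the dictionary layout and the function names search_titles and recommend are hypothetical.

```python
def search_titles(titles, query):
    """Return guideline titles containing the query (case-insensitive)."""
    return [t for t in titles if query.lower() in t.lower()]


def recommend(root, answers):
    """Follow yes/no answers from the root question until an action is reached.

    Returns the action and its optional background information.
    """
    node = root
    for answer in answers:
        if "action" in node:
            break
        node = node["yes"] if answer else node["no"]
    if "action" not in node:
        raise ValueError("not enough answers to reach an action")
    return node["action"], node.get("info", "")


# Placeholder decision tree, not taken from a real guideline.
tree = {
    "question": "Question 1?",
    "yes": {"action": "Action A", "info": "Background for action A"},
    "no": {
        "question": "Question 2?",
        "yes": {"action": "Action B"},
        "no": {"action": "Action C"},
    },
}

result = recommend(tree, [False, True])  # answers: no ("off"), then yes ("on")
```

Each answer moves one level down the tree; the info field stands in for what the “Info” button would show.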

Observations

Van Gorp et al conclude that “the implementation of the prototype has confirmed the confidence in the potential of MDE techniques for the development of better process-oriented health information systems.” Furthermore, the following observations were made:

“MDE technology was particularly strong for generating a specialized editor: the Eugenia component of the Eclipse Epsilon suite has supported best so far.

The use of advanced features of MDE code generators such as Acceleo would have slowed the project down rather than facilitated it.

The MDE community has overlooked the support for extracting models from annotated texts.”

Annotation Tool Recommendations

With regard to annotation tools, Van Gorp et al remark that an extension of the Epsilon platform seems a good idea: the text annotation tool would then be generated from a metamodel. Currently, Eugenia transforms a metamodel specification into a visual model editor. Van Gorp et al propose a new functionality for the Epsilon platform: a Generated Annotation Editor. This editor should contain “a palette with buttons for creating specific annotations in the text”, with the text shown next to the palette. By mapping the buttons directly to element attributes of the corresponding model, it becomes possible to annotate the metamodel definitions in such a way that Eugenia can automatically generate the buttons’ behaviour.

Van Gorp et al leave it open whether the annotation editor is a separate tool or is integrated with the model editor. Either way, they propose to store the complete texts and the begin and end indices of annotations inside the EMF based output model.
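The idea of storing the complete text together with the begin and end indices of each annotation can be sketched as follows. This is a minimal Python illustration, not the EMF model itself: the names Span and AnnotatedDocument are hypothetical, and treating the end index as exclusive is my own assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    begin: int    # index of the first annotated character
    end: int      # index one past the last annotated character (assumption)
    element: str  # name of the model element the annotation maps to


@dataclass
class AnnotatedDocument:
    text: str  # the complete, unmodified source text
    spans: List[Span] = field(default_factory=list)

    def annotate(self, begin, end, element):
        """Record an annotation; the text itself is left untouched."""
        if not 0 <= begin < end <= len(self.text):
            raise ValueError("annotation indices out of range")
        span = Span(begin, end, element)
        self.spans.append(span)
        return span

    def fragment(self, span):
        """Recover the annotated fragment from the stored indices."""
        return self.text[span.begin:span.end]


# Placeholder sentence, not taken from a real guideline.
doc = AnnotatedDocument("If the symptom persists, start the treatment.")
phrase = "the symptom persists"
start = doc.text.index(phrase)
span = doc.annotate(start, start + len(phrase), "Decision")
```

Because only indices are stored, the annotated fragment can always be traced back to its exact place in the original text.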

Conclusions

Based on a thorough literature study, this paper presented a clarification and novel classification of the existing process-like descriptions in the healthcare domain in order to derive support for these processes through model-driven engineering techniques.

2 dimensions to distinguish types of descriptions: # organizations, # patients.

Tool chain developed for CG to illustrate how MDE techniques can enable the stepwise development of mobile clinical decision support apps.

This paper motivates a previously undocumented need for tools to extract models interactively from annotated texts. --> Extension of Epsilon (state-of-the-art MDE toolsuite) so that an annotation-based extraction tool can be used for formalisms other than flowcharts.

Skype Meeting with Dr. Van Gorp

After reading the article I had a Skype meeting with Dr. Van Gorp. He answered my questions concerning the above summarized article and we talked about the direction of my literature study.

Concerning the direction of my literature study (as mentioned in the blog ‘Back on Track’), I will investigate which annotation method is most useful in light of the research described in the FHIES article. The candidates are: GEM Cutter II, EuGENia, SHARE (a cloud environment), and Web Annotation Tools.

Topics discussed:

GEM Cutter II: this annotation method needs lots of specific parameters. The environment is too complex and detailed for the project (described in FHIES). GEM Cutter II is specifically designed for CGs, so not for other process-oriented care documents. This is mentioned in the article with the phrase: based on one metamodel. A method which is accessible/adjustable to any modelling language is wanted. (Important to understand: investigate (e.g. article about study with GEM and CGs).)

ECore and EMF: these are technologies which are used to work with metamodels. It is not necessary to understand these completely, because EuGENia hides a lot of their functionality.

EuGENia: this method works on ECore and EMF. A new version has just been released. As mentioned in the article, it might be a good idea to enrich the metamodel with annotation options linked to visual elements. This indicates that an editor will have to be designed for this. It is unknown whether an annotation tool can be added to the EuGENia program next to the model editor tool. (Important to understand: investigate thoroughly (e.g. tutorial, minor project).)

SHARE: this is a cloud environment. It can be found on the platform of Van Gorp. (Important to investigate thoroughly (e.g. articles, online environment).)

Web Annotation programs: these programs work completely online. This makes them different from desktop programs, like Qiqqa (mentioned in another blog). (Important to investigate and get full understanding of term (e.g. websites).)

Metamodel: model of a model or a modelling language. This means the structure, semantics (what the models and programs written in the language mean and how they behave) and constraints for a family of models are described (Mellor et al (2004)). The MDA describes a metamodel as abstract syntax and a data model to store, manipulate and interchange models.

Overview of current annotation tools: it is important to make an overview of the currently existing annotation tools with their properties. Input and Output needed and provided are important factors to keep track of.

Also focus on different process-oriented medical texts: in the end it is important that the annotation tool can be used for different types of medical texts, i.e. not only for Clinical Guidelines.

Annotation tool criteria: the criteria on which to compare the different annotation tools will be mainly based on recommendations/wishes of dr. Kubben. It is an option to ask his colleagues, but this is to be decided upon later.

Concluding

In the literature study I will investigate 4 different annotation methods: GEM Cutter II, EuGENia, SHARE, and Web Annotation tools. These methods will be compared and eventually the method which is most useful as an annotation tool in the tool chain described in the article (from medical text to CDS app) will be recommended.


Sunday, 2 December 2012
