NedBox: iMinds’ Language-Enriching Technology Offers Extra Learning Support

Foreign language speakers now have a new tool available to independently practice their knowledge of Dutch: NedBox. Thanks to newspaper articles, exercises and TV fragments from Thuis and Het Journaal, the learning platform offers a rich experience which allows users to set their own rhythm and preferences. NedBox also features automatic language-enriching technology – developed by iMinds researchers at ITEC - KU Leuven and MMLab - Ghent University – which enables language learners to seek support in the form of highlighted word meanings, photos and example sentences.

Additional Info Tailored to Foreign Language Speakers

There is no shortage of online tools and e-courses for people who want to learn Dutch by themselves. Often, however, these lack additional information on specific words and names. As a result, language learners go looking for more information online and often end up on websites whose language is poorly suited to learners, so they do not get the help they need.
NedBox, by contrast, lets learners find context-sensitive information in Dutch within the same training platform. “Our ‘enrichment pipeline’ can automatically show relevant images, word explanations, synonyms and example sentences. The tool was built by iMinds researchers and uses lots of pictures and very clear language, fully tailored to the target group of foreign language speakers,” says Mariet Schiepers of CTO (the KU Leuven center for language and education) and NedBox project coordinator.

Structuring and Data-Linking Electronic Texts

Before electronic texts can be automatically enriched with added information, they have to be structured. Scientists from two iMinds research groups joined forces to combine linguistic and semantic structuring.

Researchers at iMinds - ITEC - KU Leuven were responsible for the linguistic aspect. “We developed a technology which assigns the right type of word (annotation) and identifies its dictionary form (lemmatization),” Hans Paulussen (iMinds - ITEC - KU Leuven) explains. For example, the Dutch word “was” is annotated as either a “verb” or a “noun”; in the first case, the tool also identifies its lemma, “zijn” (to be).
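
The idea of annotation plus lemmatization can be illustrated with a minimal Python sketch. This is not the iMinds technology: the small lexicon, the `annotate` function and the single context rule (prefer the noun reading after an article) are all assumptions made up for this illustration.

```python
# Toy lexicon: surface form -> possible (word type, lemma) readings.
# All entries are illustrative assumptions, not NedBox data.
LEXICON = {
    "was": [("verb", "zijn"), ("noun", "was")],
    "de": [("article", "de")],
    "hij": [("pronoun", "hij")],
    "thuis": [("adverb", "thuis")],
}


def annotate(tokens):
    """Return a (token, word type, lemma) triple for every token."""
    result = []
    previous_type = None
    for token in tokens:
        key = token.lower()
        readings = LEXICON.get(key, [("unknown", key)])
        # Hypothetical rule: right after an article, prefer the noun reading;
        # otherwise prefer any reading that is not a noun.
        if previous_type == "article":
            preferred = [r for r in readings if r[0] == "noun"]
        else:
            preferred = [r for r in readings if r[0] != "noun"]
        # Fall back to the first listed reading when the rule matches nothing.
        word_type, lemma = (preferred or readings)[0]
        result.append((token, word_type, lemma))
        previous_type = word_type
    return result
```

Called on `["Hij", "was", "thuis"]` (“He was home”), the sketch picks the verb reading of “was” with lemma “zijn”; called on `["de", "was"]` (“the laundry”), it picks the noun reading. A real tagger would rely on a statistical model rather than one hand-written rule.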

The linguistic structuring paved the way for semantic annotation by iMinds - MMLab - Ghent University. “Words are first assessed within their context and then given their appropriate meaning,” Gerald Haesendonck (iMinds - MMLab - Ghent University) explains. When confronted with the name ‘Frank Vandenbroucke’ for example, the tool examines whether the text is talking about the sportsman or the politician. “Our software then matches this meaning with a number of ‘linked data’ sources. Think of those as some kind of Wikipedia for computers. The system understands the meaning of the words and presents relevant additional data.”
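
A rough Python sketch of this kind of context-based disambiguation follows. It is only an illustration under stated assumptions: the candidate identifiers (with a made-up `example:` prefix), the Dutch keyword sets and the `disambiguate` function are hypothetical and do not reflect how the MMLab software or any linked data source actually works.

```python
import re

# Hypothetical candidate meanings for one name, each with a few context
# keywords. Identifiers and keywords are invented for this illustration.
CANDIDATES = {
    "Frank Vandenbroucke": [
        {
            "id": "example:Frank_Vandenbroucke_(cyclist)",
            "keywords": {"wielrenner", "koers", "ploeg", "tour"},
        },
        {
            "id": "example:Frank_Vandenbroucke_(politician)",
            "keywords": {"minister", "partij", "regering", "verkiezingen"},
        },
    ]
}


def disambiguate(name, text):
    """Pick the candidate whose keywords overlap most with the text.

    Returns the candidate identifier, or None when the name is unknown
    or the text offers no matching context word.
    """
    words = set(re.findall(r"\w+", text.lower()))
    candidates = CANDIDATES.get(name, [])
    if not candidates:
        return None
    best = max(candidates, key=lambda c: len(c["keywords"] & words))
    if not best["keywords"] & words:
        return None
    return best["id"]
```

For a sentence mentioning “minister” and “regering”, the sketch selects the politician; for one mentioning “wielrenner” and “koers”, the cyclist. Once an identifier has been chosen, it could serve as the key for looking up additional data in a linked data source.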


Training Dictionary for Foreign Language Speakers

The text enrichments offered by NedBox are thus gathered from a number of data sources. “We included DBpedia and Cornetto, but also GeoNames, a data source with geographical data useful for information on specific place names,” Gerald Haesendonck continues.
Another important source was ‘Thematische woordenschat Nederlands voor anderstaligen’ by Intertaal. Hans Paulussen: “This information is ideally suited for language learners. We again structured the data and converted everything to a ‘linked data’ set.”


Attention for User Experience

With NedBox, the user is in control: he can choose the topics and exercises based on his interests and needs. He can also decide when to ask for support in the form of subtitles, hints or highlighted word meanings.

The website has an accessible, intuitive interface. NedBox is aimed at the broadest possible group of foreign language speaking adults: both high- and low-skilled, both digitally literate and illiterate. Over the past few years, the two iMinds research groups have built substantial expertise in automatic text enrichment for various target groups, for example through the cooperative research project iRead+.

“NedBox was extensively tested by 275 test subjects, spread over multiple test groups. The tests showed that the learning experience is more user-driven and able to keep the language learner interested for a long time. They also showed that the platform is easily used by the different target groups,” Mariet Schiepers concludes.