Dandelion plugin for Elasticsearch

The purpose of the Dandelion Analysis Plugin is to allow Elasticsearch users to take advantage of the Dandelion entity extraction service, which performs very well even on short texts.

It provides an analyzer that enables semantic searches based on the entities extracted from the input texts, simplifying the search for specific concepts.

For example, if you are looking for documents about Leonardo Da Vinci and you search for “the Italian painter Leonardo”, you will also find all the documents that mention only “Da Vinci”. The system recognizes the same entity in both expressions, even though the terms are completely different.

The Dandelion analyzer can also match many expressions with their acronyms (e.g. National Football League ↔︎ NFL), but its most powerful feature is probably the multi-language option.

This option lets you quickly find mentions of specific concepts in all the supported languages (about 50, some still in beta) by annotating the documents at index time. For example, if you query for “The Mona Lisa” you will also retrieve documents mentioning “La Gioconda” (Italian), “La Joconde” (French), etc.
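The idea of annotating documents with entities at index time can be illustrated with a small, self-contained sketch. This is not the plugin's code and it does not call Dandelion or Elasticsearch: the lookup table `TOY_ENTITIES`, the entity identifiers, and the functions `extract_entities`, `build_index` and `search` are all hypothetical stand-ins, made up only to show why a query and a document with no terms in common can still match.

```python
from collections import defaultdict

# Hypothetical lookup table standing in for the entity extraction service:
# surface form -> entity identifier. The entries are made up for illustration.
TOY_ENTITIES = {
    "the mona lisa": "Mona_Lisa",
    "la gioconda": "Mona_Lisa",
    "la joconde": "Mona_Lisa",
    "national football league": "National_Football_League",
    "nfl": "National_Football_League",
}


def extract_entities(text):
    """Return the set of entity ids whose surface form occurs in the text."""
    lowered = text.lower()
    return {entity for form, entity in TOY_ENTITIES.items() if form in lowered}


def build_index(docs):
    """Annotate each document at index time: entity id -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for entity in extract_entities(text):
            index[entity].add(doc_id)
    return index


def search(index, query):
    """Extract entities from the query and return the ids of matching documents."""
    hits = set()
    for entity in extract_entities(query):
        hits |= index.get(entity, set())
    return sorted(hits)


docs = {
    1: "La Gioconda è esposta al Louvre.",
    2: "La Joconde attire des millions de visiteurs.",
    3: "The NFL season starts in September.",
}
index = build_index(docs)

print(search(index, "Where is The Mona Lisa?"))            # [1, 2]
print(search(index, "National Football League schedule"))  # [3]
```

Because both the documents and the query are reduced to entity identifiers, the match no longer depends on the language or the exact wording. In the real plugin this role is played by the analyzer and the Dandelion service rather than by a hard-coded table.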

This feature relies on Wikipedia services, so the languages supported for each entity also depend on Wikipedia availability.

To use this plugin you will need a Dandelion authorization token; you can sign up to get a free trial token.

For more details, including installation and usage information, visit https://github.com/ZarHenry96/elasticsearch-dandelion-plugin
