Language Detection API reference

The Language Detection API is a simple language identification service. Identifying the language is often useful when working with text, so we decided to open it to all our users. It currently supports 96 languages.

Endpoint

https://api.dandelion.eu/datatxt/li/v1

We support both GET and POST methods to query the API.
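As a minimal sketch of the two methods, the snippet below builds (without sending) one GET and one POST request with Python's standard library. The form-encoded POST body is an assumption, since this page does not state which body encoding is expected, and `build_get`, `build_post`, and the token value are illustrative names only.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://api.dandelion.eu/datatxt/li/v1"


def build_get(params):
    """Return a GET request carrying the parameters in the query string."""
    return Request(ENDPOINT + "?" + urlencode(params))


def build_post(params):
    """Return a POST request carrying the parameters in the body.

    Assumption: the API accepts a form-encoded body.
    """
    body = urlencode(params).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(ENDPOINT, data=body, headers=headers)


# Placeholder values; replace YOUR_TOKEN with a real token.
params = {"text": "I am a mighty pirate", "token": "YOUR_TOKEN"}
get_request = build_get(params)
post_request = build_post(params)
```

Either object could then be passed to `urllib.request.urlopen`. POST is probably the safer choice for long inputs such as whole HTML documents, which may not fit comfortably in a URL.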

Parameters

Remember to authenticate by specifying the token parameter (or the legacy $app_id and $app_key pair). See the API documentation on authentication for details.

text|html|html_fragment required

These parameters define how you send the Language Detection API the text whose language you want detected. Only one of them can be used in each request, following these guidelines:

use "text" when you have plain text that doesn't need any pre-processing;
use "html" when you have an HTML document and you want the Language Detection API to work on its main content. It will use an AI algorithm to extract the relevant part of the document to work on; in this case, the main content will also be returned by the API to allow you to properly use the annotation offsets;
use "html_fragment" when you have an HTML snippet and you want the Language Detection API to work on its content. It will remove all HTML tags before analyzing it.

Type	string

clean optional

Set this parameter to true if you want URLs, email addresses, hashtags, and similar noise stripped from the text before it is processed.

Type	boolean
Default value	false
Accepted values	true | false
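The parameter rules above can be sketched as a small helper that rejects requests with zero or several input parameters and serializes `clean` as a lowercase string. `build_params` is a hypothetical name, not part of any official client.

```python
def build_params(token, *, text=None, html=None, html_fragment=None, clean=False):
    """Build the request parameters, enforcing exactly one input parameter."""
    inputs = {"text": text, "html": html, "html_fragment": html_fragment}
    provided = {name: value for name, value in inputs.items() if value is not None}
    if len(provided) != 1:
        raise ValueError("pass exactly one of text, html, html_fragment")

    params = dict(provided)
    params["token"] = token
    if clean:
        # Accepted values are lowercase "true" / "false"; Python's str(True)
        # would give "True", so the string is written out explicitly.
        params["clean"] = "true"
    return params
```

For example, `build_params("YOUR_TOKEN", html_fragment="<p>Ciao</p>", clean=True)` yields a dictionary ready to be URL-encoded, while passing both `text` and `html` raises `ValueError` before any request is made.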

Hint: Keep-alive
If you need to send many requests to the API server, we suggest using keep-alive to avoid the connection overhead. To learn how to enable it, refer to your HTTP client documentation (e.g. python-requests, ruby, php).
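One way to reuse a connection, sketched here with Python's standard `http.client` rather than any of the clients named above: a single `HTTPSConnection` object sends several requests in sequence, and HTTP/1.1 keeps the socket open between them unless the server closes it. `detect_many` is an illustrative helper, and error handling is omitted.

```python
import http.client
import json
from urllib.parse import urlencode


def detect_many(conn, texts, token):
    """Send one request per text over the same connection.

    Returns the detectedLangs list of each response, in order.
    """
    results = []
    for text in texts:
        query = urlencode({"text": text, "token": token})
        conn.request("GET", "/datatxt/li/v1?" + query)
        response = conn.getresponse()
        # The body must be read fully before the connection can be reused.
        payload = json.loads(response.read())
        results.append(payload.get("detectedLangs", []))
    return results


# Creating the object does not open the socket; that happens on the first request.
connection = http.client.HTTPSConnection("api.dandelion.eu")
```

Calling `detect_many(connection, ["Hello world", "Ciao mondo"], "YOUR_TOKEN")` followed by `connection.close()` would then pay the TLS handshake once rather than once per text.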

Response

The response is structured in JSON as follows:

{
  "timestamp": "Date and time of the response generation process",
  "time": "Time elapsed for generating the response (milliseconds)",
  "text": "The annotated text. Present only if the 'html' parameter has been used",
  "detectedLangs": [
    {
      "lang": "ISO 639-1 code of the detected language",
      "confidence": "Accuracy of the language detection"
    }
  ]
}

For more information about status codes and error handling, refer to the generic Dandelion API documentation. The cost of each request is reported in the response headers (see the X-DL-units headers in the example below).
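To illustrate reading the request cost, the sketch below parses the three `X-DL-units*` headers that appear in the example response on this page. The date format string is inferred from that single example, so treat it as an assumption; `parse_units` is an illustrative name.

```python
from datetime import datetime


def parse_units(headers):
    """Extract cost information from a mapping of response headers."""
    # Header names are case-insensitive, so normalize before looking them up.
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "cost": float(lowered["x-dl-units"]),
        "left": float(lowered["x-dl-units-left"]),
        # Format inferred from the example value "2015-10-22 00:00:00 +0000".
        "reset": datetime.strptime(lowered["x-dl-units-reset"], "%Y-%m-%d %H:%M:%S %z"),
    }


units = parse_units({
    "X-DL-units": "0.1",
    "X-DL-units-left": "999.9",
    "X-DL-units-reset": "2015-10-22 00:00:00 +0000",
})
```

A client could compare `units["left"]` against a threshold and pause until `units["reset"]` when the remaining budget runs low.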

Example

Request

https://api.dandelion.eu/datatxt/li/v1/?text=I%20am%20a%20mighty%20pirate%20von%20Deutschland&token=<YOUR_TOKEN>

Response

Connection: keep-alive
Content-Length: 2748
Content-Type: application/json;charset=UTF-8
Date: Wed, 21 Oct 2015 16:29:37 GMT
Server: Apache-Coyote/1.1
X-DL-units: 0.1
X-DL-units-left: 999.9
X-DL-units-reset: 2015-10-22 00:00:00 +0000

{
  "timestamp": "2015-10-21T16:29:37",
  "time": 1,
  "text": "The annotated text. Present only if the 'html' parameter has been used",
  "detectedLangs": [
    {
      "lang": "de",
      "confidence": 0.5714284059202707
    },
    {
      "lang": "en",
      "confidence": 0.42857127625587477
    }
  ]
}
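A response like the one above can be reduced to a single answer by taking the entry with the highest confidence. This is only a sketch: `best_language` and its `min_confidence` threshold are hypothetical, not part of the API.

```python
import json


def best_language(payload, min_confidence=0.0):
    """Return (lang, confidence) for the most confident detection, or None."""
    candidates = payload.get("detectedLangs", [])
    if not candidates:
        return None
    top = max(candidates, key=lambda item: item["confidence"])
    if top["confidence"] < min_confidence:
        return None
    return top["lang"], top["confidence"]


# The detectedLangs values from the example response above.
sample = json.loads("""
{
  "timestamp": "2015-10-21T16:29:37",
  "time": 1,
  "detectedLangs": [
    {"lang": "de", "confidence": 0.5714284059202707},
    {"lang": "en", "confidence": 0.42857127625587477}
  ]
}
""")

best = best_language(sample)  # ("de", 0.5714284059202707)
```

For mixed-language input such as the example sentence, the two confidences are close, so a caller might pass a `min_confidence` value and treat a `None` result as "undetermined".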
