
Text Classification API reference

This API classifies short documents into a set of user-defined classes. It is highly customizable, and defining your own model takes only a couple of minutes. Read below to learn how to call the Text Classification API, or discover how to define custom models.

Pay attention
Don't forget that the Text Classification API is optimized for short texts! Need to use it on long texts? Write to us.

Endpoint

https://api.dandelion.eu/datatxt/cl/v1

We support both GET and POST methods to query the API.
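As a rough sketch of the two call styles, the snippet below builds (but does not send) a GET and a POST request using only Python's standard library. The helper name `build_request` is hypothetical, and sending POST parameters form-encoded in the body is an assumption, since this reference does not state the expected POST encoding.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://api.dandelion.eu/datatxt/cl/v1"


def build_request(params, method="GET"):
    """Build (without sending) a request for the endpoint above.

    Assumption: POST parameters are sent form-encoded in the body.
    """
    query = urlencode(params)
    if method == "GET":
        # Parameters travel in the query string.
        return Request(f"{ENDPOINT}?{query}", method="GET")
    if method == "POST":
        # Parameters travel in the request body.
        return Request(
            ENDPOINT,
            data=query.encode("utf-8"),
            headers={"Content-Type": "application/x-www-form-urlencoded"},
            method="POST",
        )
    raise ValueError("method must be GET or POST")


params = {
    "text": "Sunderland boss Gus Poyet",
    "model": "news_eng",
    "token": "<YOUR_TOKEN>",
}
get_req = build_request(params, "GET")
post_req = build_request(params, "POST")
```

Passing either object to `urllib.request.urlopen` would actually send it. POST is likely the more practical choice when the input is too long to fit comfortably in a URL.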

Parameters

Remember to authenticate by specifying the token parameter (or the legacy $app_id and $app_key pair). See the API documentation on authentication if you have any questions.

text|url|html|html_fragment required
These parameters define how you send the text you want to classify to the Text Classification API. Use exactly one of them in each request, following these guidelines:
  • use "text" when you have plain text that doesn't need any pre-processing;
  • use "url" when you have a URL and you want the Text Classification API to work on its main content; it will fetch the URL for you and use an AI algorithm to extract the relevant part of the document; in this case, the main content is also returned by the API so that you can properly use the annotation offsets;
  • use "html" when you have an HTML document and you want the Text Classification API to work on its main content, similarly to what the "url" parameter does;
  • use "html_fragment" when you have an HTML snippet and you want the Text Classification API to work on its content; it will remove all HTML tags before analyzing it.
Type string
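The "only one of them" rule can be checked on the client before a request is sent. This is a minimal sketch; the helper name `single_input` is hypothetical and not part of the API.

```python
INPUT_PARAMS = ("text", "url", "html", "html_fragment")


def single_input(params):
    """Return the name of the one input parameter present in params.

    Raises ValueError when none, or more than one, of the four
    mutually exclusive input parameters is set.
    """
    given = [name for name in INPUT_PARAMS if params.get(name) is not None]
    if len(given) != 1:
        raise ValueError(
            f"expected exactly one of {INPUT_PARAMS}, got {given or 'none'}"
        )
    return given[0]


chosen = single_input({"text": "Some plain text", "model": "news_eng"})
```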
model required
The unique ID of the model you want to use. If you want to learn how to manage your custom models, please refer to User-defined classifiers.
min_score optional
Return only those categories whose score is above this threshold. No single value works for every model, and the right one also depends on your use case. Start experimenting with 0.25 and raise or lower it depending on the results.
Default value 0.0
Accepted values 0.0 .. 1.0
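One way to experiment with thresholds without re-querying the API is to request everything (min_score=0.0) once and apply candidate thresholds locally. The sketch below does that on made-up categories; `filter_categories` is a hypothetical helper, and treating the threshold as strictly exclusive is an assumption based on the word "above".

```python
def filter_categories(categories, min_score=0.0):
    """Keep categories whose score is strictly above min_score.

    Assumption: the boundary is exclusive; the API may treat it differently.
    """
    return [c for c in categories if c["score"] > min_score]


# Made-up categories, shaped like the "categories" array of a response.
sample = [
    {"name": "sport", "score": 0.558},
    {"name": "economy", "score": 0.21},
    {"name": "politics", "score": 0.05},
]

for threshold in (0.0, 0.25, 0.5):
    kept = [c["name"] for c in filter_categories(sample, threshold)]
    print(threshold, kept)
```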

Did you know?
You can also pass the Entity Extraction API's parameters by prefixing them with "nex." (e.g. nex.min_confidence).
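A small sketch of the prefixing convention: the hypothetical helper `with_nex_prefix` renames Entity Extraction parameters before they are merged into the request parameters. The value 0.6 is arbitrary and only for illustration.

```python
def with_nex_prefix(nex_params):
    """Prefix every Entity Extraction parameter name with 'nex.'."""
    return {f"nex.{name}": value for name, value in nex_params.items()}


params = {
    "text": "Some plain text",
    "model": "news_eng",
    "token": "<YOUR_TOKEN>",
}
# Arbitrary illustrative value for the Entity Extraction parameter.
params.update(with_nex_prefix({"min_confidence": 0.6}))
```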

Advanced parameters


max_annotations optional
The Text Classification API uses the Entity Extraction API under the hood. With this parameter you can limit the number of annotations to be used for classifying the text, using only the top-most entities by their confidence.
Default value +inf
Accepted values 1 .. +inf
include optional
Returns more information about the classification process:
  • "score_details": intended for debugging; for each entity in the model categories, it outputs a weight representing how much that entity influenced the overall score of its category. For each category, the weights sum to 1.
    Pay attention
    This parameter can be used only by the model owner. You can share your models with other users simply by sending them the model ID, but they won't be able to use include=score_details.
Default value <empty string>
Accepted values score_details
Example include=score_details
Hint: Keep-alive
If you need to send many requests to the API server, we suggest using keep-alive to avoid the connection overhead. To learn how to enable it, please refer to your HTTP client's documentation (e.g. python-requests, ruby, php).
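As an illustration of connection reuse with Python's standard library only, the sketch below sends several requests over one `http.client.HTTPSConnection`. `http.client` speaks HTTP/1.1, where connections are persistent unless one side closes them. The helper names `open_connection` and `classify_many` are hypothetical, and the sketch does no error handling or reconnection.

```python
import http.client
import json
from urllib.parse import urlencode


def open_connection():
    """Create a connection object; the socket is opened on first use."""
    return http.client.HTTPSConnection("api.dandelion.eu")


def classify_many(conn, texts, model, token):
    """Send one GET per text over a single, reused connection."""
    results = []
    for text in texts:
        query = urlencode({"text": text, "model": model, "token": token})
        conn.request("GET", f"/datatxt/cl/v1?{query}")
        response = conn.getresponse()
        # The body must be read fully before the connection can be reused.
        results.append(json.loads(response.read()))
    return results


# Usage (performs real network calls, so it is left commented out):
# conn = open_connection()
# try:
#     results = classify_many(conn, ["first text", "second text"],
#                             "news_eng", "<YOUR_TOKEN>")
# finally:
#     conn.close()
```

With python-requests, a `Session` object serves the same purpose.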

Response

The response is structured in JSON as follows:

{
  "timestamp": "Date and time of the response generation process",
  "time": "Time elapsed for generating the response (milliseconds)",
  "lang": "The language used to classify the input text (defined in the model)",
  "text": "The annotated text. Present only if the 'url' or 'html' parameters have been used",
  "url": "The actual URL from which the text has been extracted. Present only if the 'url' parameter has been used",
  "categories": [
    {
      "name": "The name of the category",
      "score": "The score of the category",
      "scoreDetails": [
        {
          "entity": "URI of the entity. Only if 'include' parameter contains 'score_details'",
          "weight": "Weight of the entity. Only if 'include' parameter contains 'score_details'"
        }
      ]
    }
  ]
}

For more information about status codes and error handling, please refer to the general Dandelion API documentation. The cost of each request can be found in the response headers, as described here.
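To make the response structure concrete, here is a minimal sketch that walks a response body and reports, for each category, its name, its score, and the sum of its entity weights. It assumes "scoreDetails" is a list of entity/weight objects, as in the example response in this document; the helper name `summarize` and the sample body (entities and weights) are made up for illustration.

```python
import json


def summarize(body):
    """Return (name, score, weight_sum) for each category in a response body."""
    data = json.loads(body)
    rows = []
    for category in data.get("categories", []):
        # scoreDetails is only present when include=score_details was requested.
        details = category.get("scoreDetails", [])
        weight_sum = sum(d["weight"] for d in details)
        rows.append((category["name"], category["score"], weight_sum))
    return rows


# Hypothetical response body with made-up weights.
body = json.dumps({
    "timestamp": "2015-10-21T16:29:37",
    "time": 2,
    "lang": "en",
    "categories": [{
        "name": "sport",
        "score": 0.56,
        "scoreDetails": [
            {"entity": "http://en.wikipedia.org/wiki/Chelsea_F.C.", "weight": 0.75},
            {"entity": "http://en.wikipedia.org/wiki/Liverpool_F.C.", "weight": 0.25},
        ],
    }],
})
rows = summarize(body)
```

When score details are requested, a weight sum close to 1 for each category is a quick sanity check on the parsed data.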

Example

Request

https://api.dandelion.eu/datatxt/cl/v1/?text=Sunderland%20boss%20Gus%20Poyet%20urges%20his%20players%20to%20use%20their%20spirited%20Capital%20One%20Cup%20final%20display%20as%20a%20springboard%20to%20stay%20up&model=news_eng&min_score=0.2&include=score_details&token=<YOUR_TOKEN>

Response

Connection: keep-alive
Content-Length: 2748
Content-Type: application/json;charset=UTF-8
Date: Wed, 21 Oct 2015 16:29:37 GMT
Server: Apache-Coyote/1.1
X-DL-units: 2
X-DL-units-left: 998
X-DL-units-reset: 2015-10-22 00:00:00 +0000
{
  "timestamp": "2015-10-21T16:29:37",
  "time": 2,
  "lang": "en",
  "text": "The annotated text. Present only if the 'url' or 'html' parameters have been used",
  "url": "The actual URL from which the text has been extracted. Present only if the 'url' parameter has been used",
  "categories": [
    {
      "name": "sport",
      "score": 0.55816597,
      "scoreDetails": [
        {
          "entity": "http://en.wikipedia.org/wiki/Blackburn_Rovers_F.C.",
          "weight": 0.090656884
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Football_League_Cup",
          "weight": 0.08817835
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Wolverhampton_Wanderers_F.C.",
          "weight": 0.084903054
        },
        {
          "entity": "http://en.wikipedia.org/wiki/2011%E2%80%9312_Premier_League",
          "weight": 0.078713804
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Chelsea_F.C.",
          "weight": 0.07867668
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Newcastle_United_F.C.",
          "weight": 0.07666723
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Boleyn_Ground",
          "weight": 0.07623016
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Doncaster_Rovers_F.C.",
          "weight": 0.07176816
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Tottenham_Hotspur_F.C.",
          "weight": 0.07080989
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Luton_Town_F.C.",
          "weight": 0.07074433
        },
        {
          "entity": "http://en.wikipedia.org/wiki/2004%E2%80%9305_in_English_football",
          "weight": 0.07039765
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Liverpool_F.C.",
          "weight": 0.06989409
        },
        {
          "entity": "http://en.wikipedia.org/wiki/Scunthorpe_United_F.C.",
          "weight": 0.06799107
        },
        {
          "entity": "http://en.wikipedia.org/wiki/2014_Winter_Olympics",
          "weight": 0.0043686396
        },
        {
          "entity": "http://en.wikipedia.org/wiki/2022_Winter_Olympics",
          "weight": 0.0
        }
      ]
    }
  ]
}
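The X-DL-* headers in the example response can be read programmatically to track usage. The sketch below is an assumption-laden illustration: `read_usage` is a hypothetical helper, the meaning of each header is inferred from its name (units consumed, units remaining, time of the next reset), and the date format is inferred from the single example value shown above.

```python
from datetime import datetime


def read_usage(headers):
    """Extract unit accounting from the X-DL-* response headers.

    Assumptions: header meanings are inferred from their names, and the
    reset time always follows the format seen in the example response.
    """
    return {
        "cost": float(headers["X-DL-units"]),
        "left": float(headers["X-DL-units-left"]),
        "reset": datetime.strptime(
            headers["X-DL-units-reset"], "%Y-%m-%d %H:%M:%S %z"
        ),
    }


# Header values copied from the example response.
headers = {
    "X-DL-units": "2",
    "X-DL-units-left": "998",
    "X-DL-units-reset": "2015-10-22 00:00:00 +0000",
}
usage = read_usage(headers)
```

A plain dict is case-sensitive, whereas the header containers returned by most HTTP clients are not, so the exact spelling of the keys matters only in this simplified example.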