What is the best language detection library or web API available? [even paid]

First of all, I have a lot of text available: say 10,000 characters per attempt. The script is written in PHP, but I can use whatever I want. C++, Java, no problem.

I cannot use Google's language API: its usage limits are too low.

I have spent six hours trying to come up with something better, but nothing so far. Can someone point me to my best option?

+4
7 answers

There is a language detection API that provides both free and premium services.

It accepts text via GET or POST and returns JSON results with scores.
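With about 10,000 characters per attempt, as in the question, POST is probably the safer choice, since long texts may not fit in a URL. Here is a minimal Python sketch of picking between the two. The endpoint `https://api.example.com/detect`, the parameter name `q`, and the 1,000-character cutoff are all placeholders I made up for illustration, not the real API's values:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and parameter name, for illustration only.
DETECT_URL = "https://api.example.com/detect"
MAX_GET_CHARS = 1000  # arbitrary cutoff chosen for this sketch


def build_detect_request(text: str) -> Request:
    """Build a GET request for short text and a POST request for long text."""
    params = urlencode({"q": text})
    if len(text) <= MAX_GET_CHARS:
        # Short input: put the text in the query string.
        return Request(f"{DETECT_URL}?{params}")
    # Long input: send the form-encoded text as the body.
    # Supplying a body makes urllib issue a POST.
    return Request(DETECT_URL, data=params.encode("ascii"))
```

To actually send it, you would pass the returned object to `urllib.request.urlopen` and decode the JSON body, after swapping in the real endpoint and parameter names from the service's documentation.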

+6

Java based tools:

Apache Tika: it does not ship with every language profile, but you can add your own.

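    import org.apache.tika.language.LanguageIdentifier;

    // SystemException is the application's own exception type, not part of Tika.
    public String detectLangTika(String text) throws SystemException {
        LanguageIdentifier li = new LanguageIdentifier(text);
        if (li.isReasonablyCertain()) {
            return li.getLanguage();
        }
        throw new SystemException("Tika lang detection not reasonably certain");
    }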

language-detection: it comes with a lot of language profiles and works great for me.

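    import java.io.File;
    import com.cybozu.labs.langdetect.Detector;
    import com.cybozu.labs.langdetect.DetectorFactory;
    import com.cybozu.labs.langdetect.LangDetectException;

    // Load the language profiles once, before creating any detector.
    DetectorFactory.loadProfile(new File(LangDetector.class.getClassLoader().getResource("profiles").toURI()));

    public String detectLangLD(String text) throws SystemException {
        try {
            Detector detector = DetectorFactory.create();
            detector.append(text);
            return detector.detect();
        } catch (LangDetectException e) {
            throw new SystemException("LangDetector Failure", e);
        }
    }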

The most accurate tool was Google's language detection API, which was discontinued and replaced by the paid Google Translate API.

+7

A little late, but I wrote this library (and I am implementing a free API service without any restrictions).

https://github.com/crodas/LanguageDetector

+1

If you are willing to give Python a go, try NLTK. And I hope you have already gone through this.

0

You can use Rosoka. It detects 230 different languages. You can try it through the AWS Marketplace as Rosoka Cloud.

You pay for the time you use.

-1

There is another freemium API here: language detection API

You can easily check the endpoints on this page.

It accepts GET and POST requests (POST for longer input) and returns a JSON response with this structure:

 { language: "eng", isReliable: "true", confidence: "0.9979894639898946" } 
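Note that in the structure above `isReliable` and `confidence` come back as strings, so you will probably want to convert them before comparing anything. A minimal Python sketch, assuming the real response is valid JSON with quoted keys (the function name `parse_detection` and the output key names are my own, not part of the API):

```python
import json


def parse_detection(payload: str) -> dict:
    """Convert the string-typed fields of a detection response to native types."""
    data = json.loads(payload)
    return {
        "language": data["language"],
        # Accepts both the string "true" and a real JSON boolean.
        "is_reliable": str(data["isReliable"]).lower() == "true",
        "confidence": float(data["confidence"]),
    }


# Sample payload modeled on the structure shown above (keys quoted by assumption).
sample = '{"language": "eng", "isReliable": "true", "confidence": "0.9979894639898946"}'
result = parse_detection(sample)
if result["is_reliable"] and result["confidence"] > 0.9:
    print("Detected language:", result["language"])
```

The 0.9 threshold is just an example value; pick whatever cutoff suits your data.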

Disclaimer: I provide this API.

-1

I would recommend languagelayer.com. They offer a free RESTful JSON API web service that can detect about 170 languages. Batch requests are also supported.

The GET API request (POST is also accepted) looks something like this:

 https://apilayer.net/api/detect?access_key=YOUR_ACCESS_KEY&query=I like apples and oranges

And here is the JSON answer:

 { "success": true, "results": [ { "language_code": "en", "language_name": "English", "probability": 83.896703655741, "percentage": 100, "reliable_result": true } ] } 
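For anyone who wants to script this, here is a rough Python sketch that builds the request URL with proper encoding and picks the most probable language from a response shaped like the one above. The endpoint and field names are taken from this answer; the helper names `build_detect_url` and `best_language` are my own, and the error handling is only a guess at what you might need:

```python
import json
from urllib.parse import urlencode

API_URL = "https://apilayer.net/api/detect"  # endpoint as shown in this answer


def build_detect_url(access_key: str, query: str) -> str:
    """Return the GET URL with the key and text safely URL-encoded."""
    return f"{API_URL}?{urlencode({'access_key': access_key, 'query': query})}"


def best_language(response_text: str):
    """Return the language_code with the highest probability, or None."""
    data = json.loads(response_text)
    if not data.get("success") or not data.get("results"):
        return None
    top = max(data["results"], key=lambda r: r["probability"])
    return top["language_code"]
```

Fetching is then a matter of passing the URL to `urllib.request.urlopen` and handing the decoded body to `best_language`.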

5,000 requests per month are free. If you need more (like I did), the cheapest subscription is $4.99 per month for 50,000 requests. (More info here.)

-1
