How to Detect Language for a String in PHP

I quite often have to check whether a text string is in English, or to detect which language it is written in. Most of the algorithms for this are based on the probability of certain letter sequences appearing. For example, the sequence “the” is more frequent in English than in French. However, there are not many implementations of such NLP algorithms in PHP. One of the options is the Text_LanguageDetect PEAR package. It can be used directly if it is installed as a PEAR package, or it can be downloaded and used as a separate library.
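To make the letter-sequence idea concrete, here is a toy sketch in plain PHP. It is not the algorithm used by Text_LanguageDetect or any other library; the function names (`trigramCounts`, `profileScore`, `guessLanguage`) and the two tiny reference sentences are made up for illustration, and a real detector would build its profiles from far larger samples.

```php
<?php
// Toy illustration of n-gram based language guessing.
// All names and sample sentences here are hypothetical.

/** Count overlapping three-letter sequences (trigrams) in a text. */
function trigramCounts(string $text): array
{
    // Lowercase, then collapse every run of non-letters into one space.
    $clean = trim(preg_replace('/[^a-z]+/', ' ', strtolower($text)));
    $counts = [];
    for ($i = 0; $i + 3 <= strlen($clean); $i++) {
        $gram = substr($clean, $i, 3);
        $counts[$gram] = ($counts[$gram] ?? 0) + 1;
    }
    return $counts;
}

/** Share of the text's trigrams that also occur in a reference profile. */
function profileScore(array $textCounts, array $profileCounts): float
{
    $total = array_sum($textCounts);
    if ($total === 0) {
        return 0.0;
    }
    $shared = 0;
    foreach ($textCounts as $gram => $n) {
        if (isset($profileCounts[$gram])) {
            $shared += $n;
        }
    }
    return $shared / $total;
}

/** Score a text against every profile, best match first. */
function guessLanguage(string $text, array $profiles): array
{
    $counts = trigramCounts($text);
    $scores = [];
    foreach ($profiles as $language => $profile) {
        $scores[$language] = profileScore($counts, $profile);
    }
    arsort($scores); // highest score first, keys preserved
    return $scores;
}

// Deliberately tiny reference samples (ASCII only, to keep the sketch simple).
$profiles = [
    'english' => trigramCounts('the quick brown fox jumps over the lazy dog and then the cat'),
    'french'  => trigramCounts('le renard brun saute par dessus le chien paresseux et puis le chat'),
];

print_r(guessLanguage('the dog and the fox', $profiles));
```

The sketch only shows why a sequence such as “the” pulls a text towards English. With profiles this small it is easy to fool.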

It’s very easy to use:
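A minimal sketch of a typical call, assuming the package is installed so that `Text/LanguageDetect.php` is on the include path. The wrapper function `detectLanguages()` and the sample sentence are hypothetical; `detect()` is the package's own method, called here with the text and the number of candidates to return.

```php
<?php
/**
 * Hypothetical helper: returns the detect() result for $text,
 * or null when the package cannot be found on the include path.
 */
function detectLanguages(string $text, int $limit = 4)
{
    // Skip gracefully if the PEAR package is not available.
    if (stream_resolve_include_path('Text/LanguageDetect.php') === false) {
        return null;
    }
    require_once 'Text/LanguageDetect.php';

    $detector = new Text_LanguageDetect();

    // Ask for the $limit most likely languages for this text.
    return $detector->detect($text, $limit);
}

$result = detectLanguages('This is a short sample sentence written in English.');
print_r($result);
```

If you only need the single best guess, the package also provides `detectSimple()`, which returns just the language name.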

This will return the following array of probabilities. Note that the array is sorted, so the first element represents the most probable language ($result[0]):
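A small sketch of reading the best guess from such a sorted result. It assumes the array is keyed by language name with the score as the value, in which case the first element is reached through its key rather than through index 0. The scores and the 0.4 threshold are made up for illustration and are not output from the library.

```php
<?php
// Hypothetical result: language => score, already sorted best first.
// These numbers are invented for the example.
$result = [
    'english' => 0.52,
    'german'  => 0.31,
    'dutch'   => 0.27,
];

// The array is sorted, so the first key is the most probable language.
$bestLanguage = array_key_first($result); // requires PHP 7.3+
$bestScore    = $result[$bestLanguage];

// Validate "is this text English?" with an arbitrary, caller-chosen threshold.
$isEnglish = ($bestLanguage === 'english' && $bestScore >= 0.4);

var_dump($bestLanguage, $isEnglish);
```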

If you have to use it in an environment where it is not available as a PEAR package, download it from the link mentioned above and unzip it in the location from which the script is run. If you want to put the files in a separate directory instead of leaving them in the root folder of the application, you need to change the LanguageDetect.php file accordingly.
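As an alternative to editing the library, one option is to add the new directory to PHP's include path from your own script. This is only a sketch under an assumed layout (the archive unpacked so that `lib/languagedetect/Text/LanguageDetect.php` exists next to your script); it is not the modification described in this post, and depending on the package version the location of its language data file may still need adjusting.

```php
<?php
// Assumed layout (hypothetical):
//   <app root>/lib/languagedetect/Text/LanguageDetect.php
$libDir = __DIR__ . '/lib/languagedetect';

// Prepend the directory so that 'Text/LanguageDetect.php', and the
// files it requires in turn, can be resolved from there.
set_include_path($libDir . PATH_SEPARATOR . get_include_path());

if (stream_resolve_include_path('Text/LanguageDetect.php') !== false) {
    require_once 'Text/LanguageDetect.php';
    $detector = new Text_LanguageDetect();
} else {
    echo 'Text/LanguageDetect.php was not found under ', $libDir, PHP_EOL;
}
```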

In the original file:

Make the following modifications (assuming the new location is “lib/languagedetect”):
