
I have an array of strings:

[
   'Something in English and something in Russian',
   'Something in English',
   'Something in English'
]

How can I check these strings and keep only the ones that contain nothing but English?

Is it possible to solve this problem using regular expressions?

Please indicate the programming language in the question's tags.

MaxU 2022-02-14 08:52:23

In general, most likely not. With some effort you can come up with phrases that look identical, or almost identical, in several languages at once.

andreymal 2022-02-14 09:42:01

Possible duplicate of the question: Determining the language of the text

Wiktor Stribiżew 2022-02-14 12:09:25
  • Answer # 1

    In Python, you can do this:

    # requires the third-party langdetect package
    from langdetect import detect
    lang = detect("Ein, zwei, drei, vier")
    print(lang)
    # output: de
    

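    Or you can do this:

    # requires the third-party textblob package; detect_language() calls an
    # online translation service and is deprecated in newer TextBlob releases
    from textblob import TextBlob
    b = TextBlob("bonjour")
    print(b.detect_language())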
    

    Oh, thanks for the reminder. These libraries used to be the first thing I thought of, but when a similar question about languages came up recently I forgot about them and ended up reinventing the wheel.

    CrazyElf 2022-02-14 10:03:06

    Why are you writing in Python 2?

    Эникейщик 2022-02-14 13:06:07
  • Answer # 3

    In PHP, you can do this:

    $array = [
        'Something in English and something in Russian',
        'Something in English',
        'Something in English'
    ];
    foreach ($array as $str) {
        // the pattern matches any character that is NOT an ASCII letter, digit, or space
        if (!preg_match('/[^A-Za-z0-9 ]/', $str)) {
            echo $str . PHP_EOL; // string contains only English letters, numbers, and spaces
        }
    }
    

    That checks for Latin letters only, not for English :) Italian, for example, uses the same letters too, just fewer of them. Also, your regular expression does not match the comment at all: you have ^ in it, which is negation.

    Roman Grinyov 2022-02-14 08:11:27
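
The distinction raised in this comment can be seen in a small Python sketch (Python is used only for consistency with the first answer). The sample strings and the helper name `is_basic_latin` are hypothetical and not part of the original thread. A character-class check tells you which characters a string uses, not which language it is written in:

```python
import re

# Matches any character that is NOT an ASCII letter, digit, or space
# (the same character class as in the PHP answer).
NON_BASIC_LATIN = re.compile(r"[^A-Za-z0-9 ]")


def is_basic_latin(text: str) -> bool:
    """Return True if text has only ASCII letters, digits, and spaces."""
    return NON_BASIC_LATIN.search(text) is None


# Hypothetical sample strings, chosen for illustration only.
samples = [
    "Hello world",         # English
    "Hello мир",           # English mixed with Russian (Cyrillic)
    "Buongiorno a tutti",  # Italian, but still plain Latin letters
]

for s in samples:
    print(f"{s!r}: {is_basic_latin(s)}")
```

The Italian phrase passes the check even though it is not English, while an English sentence containing punctuation such as a comma or an apostrophe would be rejected by this character class.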

    @RomanGrinyov, negation is fine.

    Qwertiy 2022-02-14 09:46:51

    @Qwertiy why? Say the string contains none of A-Za-z0-9: preg_match will return false, the ! in the condition will turn that into true, and the branch commented "the string contains only English letters and numbers" will run?..

    Roman Grinyov 2022-02-14 12:51:20

    Actually, no, everything is correct: the pattern matches if there is at least one character that is NOT a Latin letter, a number, or a space, so the negated condition is true only when "the string contains only English letters and numbers".

    Roman Grinyov 2022-02-14 13:20:36

    Following my own logic, I would have written this check as the regular expression /^[A-Za-z0-9 ]+$/.

    Roman Grinyov 2022-02-14 13:22:54
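
The two formulations discussed in these comments are nearly interchangeable, and a short Python sketch can make the remaining difference visible. The function names and sample strings below are hypothetical, and `re.fullmatch` is used as a close stand-in for the anchored PHP pattern rather than an exact translation:

```python
import re

ALLOWED = r"A-Za-z0-9 "


def no_forbidden_chars(text: str) -> bool:
    # Counterpart of !preg_match('/[^A-Za-z0-9 ]/', $str):
    # true when no character outside the allowed set is found.
    return re.search(f"[^{ALLOWED}]", text) is None


def only_allowed_chars(text: str) -> bool:
    # Close counterpart of /^[A-Za-z0-9 ]+$/: the whole string must consist
    # of one or more allowed characters. Note that fullmatch also rejects a
    # trailing newline, which `$` alone would tolerate.
    return re.fullmatch(f"[{ALLOWED}]+", text) is not None


# Hypothetical sample strings, chosen for illustration only.
for s in ["Hello world", "Hello мир", ""]:
    print(f"{s!r}: negated={no_forbidden_chars(s)}, anchored={only_allowed_chars(s)}")
```

Both checks agree on non-empty strings. They differ on the empty string: the negated form accepts it because it contains no forbidden character, while the anchored form with `+` requires at least one allowed character and rejects it.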