I did not find a good solution for this, so I worked around it as follows:
1) Filter by the lang attribute equal to "en".
2) I found that tweets in several other languages are still tagged as English. So I downloaded Spanish, Dutch, and Indonesian word lists and counted the number of non-English words in each tweet. If a tweet contains more than one, I classify it as non-English.
3) I think I need to filter Portuguese as well, but I still need to research this.
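A minimal sketch of steps 1 and 2 in Python, for illustration only. It assumes each tweet is a dict with "lang" and "text" keys. The tiny inline word sets stand in for the downloaded word lists, and the names is_english, count_non_english, and max_hits are hypothetical, not part of any library.

```python
import re

# Stand-in word lists (assumption): in practice these would be loaded
# from the downloaded Spanish, Dutch, and Indonesian word list files.
NON_ENGLISH_WORDS = {
    "es": {"hola", "gracias", "porque", "mañana"},
    "nl": {"goedemorgen", "alsjeblieft", "waarom", "niet"},
    "id": {"selamat", "terima", "kasih", "tidak"},
}
ALL_NON_ENGLISH = set().union(*NON_ENGLISH_WORDS.values())

# Match runs of letters only (Unicode-aware), skipping digits and underscores.
TOKEN_RE = re.compile(r"[^\W\d_]+")


def count_non_english(text):
    """Count tokens in the text that appear in any non-English word list."""
    tokens = TOKEN_RE.findall(text.lower())
    return sum(1 for token in tokens if token in ALL_NON_ENGLISH)


def is_english(tweet, max_hits=1):
    """Step 1: require lang == "en". Step 2: reject if more than max_hits
    non-English words are found (more than 1 by default)."""
    if tweet.get("lang") != "en":
        return False
    return count_non_english(tweet.get("text", "")) <= max_hits


# Hypothetical sample data for illustration.
tweets = [
    {"lang": "en", "text": "Good morning everyone"},
    {"lang": "en", "text": "hola amigos, gracias por todo"},
    {"lang": "es", "text": "Good morning"},
    {"lang": "en", "text": "I said hola to her"},
]
english_tweets = [t for t in tweets if is_english(t)]
```

One caveat with this approach: real word lists usually contain words that are also valid English (for example "is" and "in" in Dutch), so it may help to remove any words that also appear in an English word list before counting. Adding Portuguese would only require one more entry in NON_ENGLISH_WORDS.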
Drew