Faculty and Evolution of Language

Chomsky and the Pirahã Language

Statistical (or neural) Language Models

In Syntactic Structures (1957), Noam Chomsky makes a strong argument that statistical language models are inherently limited in their ability to model grammar:

It is natural to understand "possible" as meaning "highly probable" and to assume that the linguist's sharp distinction between grammatical and ungrammatical is motivated by a feeling that since the "reality" of language is too complex to be described completely, he must content himself with a schematized version replacing "zero probability, and all extremely low probabilities, by impossible, and all higher probabilities by possible." We see, however, that this idea is quite incorrect, and that a structural analysis cannot be understood as a schematic summary developed by sharpening the blurred edges in the full statistical picture. If we rank the sequences of a given length in order of statistical approximation to English, we will find both grammatical and ungrammatical sequences scattered throughout the list; there appears to be no particular relation between order of approximation and grammaticalness. Despite the undeniable interest and importance of semantic and statistical studies of language, they appear to have no direct relevance to the problem of determining or characterizing the set of grammatical utterances. I think that we are forced to conclude that grammar is autonomous and independent of meaning, and that probabilistic models give no particular insight into some of the basic problems of syntactic structure.

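The ranking experiment described in the passage can be made concrete with a minimal Python sketch. It is not anything from Syntactic Structures itself: it assumes an add-one smoothed bigram model as a stand-in for "statistical approximation to English", and the toy corpus, the candidate sequences, the grammaticality labels, and the function names (train_bigram, log_prob, rank) are all invented for illustration.

```python
import math
from collections import Counter

# Toy training corpus, invented for illustration only.
CORPUS = [
    "the dog chased the cat",
    "the cat chased the mouse",
    "the mouse saw the dog",
    "a dog saw a cat",
]

# Five-word candidate sequences with hand-assigned grammaticality labels.
CANDIDATES = {
    "the dog chased the cat": True,    # grammatical, seen in the corpus
    "a mouse chased a dog": True,      # grammatical, mostly unseen bigrams
    "the cat chased the the": False,   # ungrammatical, mostly seen bigrams
    "cat the chased dog the": False,   # ungrammatical, no seen bigrams
}


def train_bigram(sentences):
    """Count unigrams and bigrams, padding each sentence with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(sentence, unigrams, bigrams):
    """Log-probability under an add-one smoothed bigram model.

    Assumes every word of the sentence occurs in the training corpus.
    """
    vocab_size = len(unigrams) - 1  # every type except <s> can be predicted
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


def rank(candidates, unigrams, bigrams):
    """Sort candidate sequences from most to least probable."""
    return sorted(candidates, key=lambda s: log_prob(s, unigrams, bigrams), reverse=True)


if __name__ == "__main__":
    unigrams, bigrams = train_bigram(CORPUS)
    for sentence in rank(CANDIDATES, unigrams, bigrams):
        label = "grammatical" if CANDIDATES[sentence] else "ungrammatical"
        print(f"{log_prob(sentence, unigrams, bigrams):8.2f}  {label:13s}  {sentence}")
```

On this toy corpus the ranking alternates: the seen grammatical sentence comes first, the ungrammatical "the cat chased the the" second, the grammatical but unseen "a mouse chased a dog" third, and the scrambled sequence last. A four-sentence corpus proves nothing about English, but it shows the kind of interleaving of grammatical and ungrammatical sequences that Chomsky describes.
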
There are strong counter-arguments to this:

An empirical counter-argument is the grammatical fluency of the sentences generated by modern deep neural language models:

