The present paper is a reply to the article Perspektivy korpusové lingvistiky: deskripce, nebo explanace by František Štícha (2015), which is a critique of recent studies by Radek Čech (2014) and Jan Chromý (2014). It is shown that Štícha’s argumentation is based on an inaccurate reading of the two criticized studies. Moreover, Štícha’s conception of corpus linguistics as a discipline that aims to capture the morphological and syntactic norm of well-educated people is rather limited. This narrow-minded view seems to be another reason for Štícha’s misunderstanding of the criticized papers.
This article engages in a polemic with two papers on the status and prospects of corpus linguistics that were recently published by two Czech linguists in the journal Naše řeč (Our Language). These linguists claim that corpus linguistics in general relies too heavily on description and does not provide sufficiently rigorous explanations. In contrast, the present author argues that working with large corpora (billions of tokens) does not necessarily lead to mere descriptions of language phenomena. Rather, descriptions based on large corpora facilitate rigorous explanations of grammatical phenomena. In addition, the author argues that until data-based descriptions became an integral part of work in the natural sciences, philosophically based explanations did not fully succeed in enabling us to understand the physical world. Language is a part of the natural world, and satisfactory grammatical explanations of natural languages require much more empirical evidence than could be obtained in the past without electronic corpora. Several examples of empirical evidence and their critical relevance to linguistic analysis are cited.