Summary


This course is an introduction to the statistical properties of languages and their origins, which are often related to the need to reduce the cognitive effort of speakers or listeners. The course is primarily relevant to anybody interested in what languages (and animal communication) are like and why. In this course, students will learn about a myriad of statistical laws of language that lie beyond the scope of traditional courses on information retrieval, and about how to analyze them. Crucially, they will also learn about the potential explanations of these laws and how they may result from general principles of human cognition. Students will learn about the mathematical and computational models that have been developed to explain these regularities. During this journey, students will enrich their current knowledge with concepts and tools from linguistics, biology, cognitive science, information theory and multidisciplinary physics. The course is also relevant for researchers interested in evaluating or adapting algorithms, machine learning methods,…based on the real statistical properties of language. As these regularities are often the result of reducing the cognitive effort of language users, the course is also relevant to researchers interested in developing resources or systems that are easier for humans to use or understand, or in developing language processing tools that exploit the real constraints of the human brain.