## 2021

@article{Ferrer2020a,
  title = {Bounds of the variation of the sum of edge lengths in linear arrangements of trees},
  author = {R Ferrer-i-Cancho and C Gómez-Rodríguez and J L Esteban},
  url = {https://arxiv.org/abs/2006.14069},
  doi = {10.1088/1742-5468/abd4d7},
  year = {2021},
  journal = {Journal of Statistical Mechanics},
  pages = {023403},
  abstract = {A fundamental problem in network science is the normalization of the topological or physical distance between vertices, that requires understanding the range of variation of the unnormalized distances. Here we investigate the limits of the variation of the physical distance in linear arrangements of the vertices of trees. In particular, we investigate various problems on the sum of edge lengths in trees of a fixed size: the minimum and the maximum value of the sum for specific trees, the minimum and the maximum in classes of trees (bistar trees and caterpillar trees) and finally the minimum and the maximum for any tree. We establish some foundations for research on optimality scores for spatial networks in one dimension.},
  keywords = {network science},
}
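The central quantity in this entry, the sum of edge lengths of a linear arrangement, is easy to probe by brute force on small trees. A minimal sketch (illustrative only; the paper derives closed-form bounds rather than enumerating arrangements) that recovers the minimum and maximum of the sum for two 4-vertex trees:

```python
from itertools import permutations

def edge_length_sum(edges, pos):
    # pos[v] = position of vertex v in the linear arrangement
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

def min_max_sum(n, edges):
    # Exhaustive search over all n! arrangements (only viable for tiny n).
    sums = [edge_length_sum(edges, p) for p in permutations(range(n))]
    return min(sums), max(sums)

star = [(0, 1), (0, 2), (0, 3)]  # star tree on 4 vertices, hub 0
path = [(0, 1), (1, 2), (2, 3)]  # linear tree (path) on 4 vertices

print(min_max_sum(4, star))  # → (4, 6)
print(min_max_sum(4, path))  # → (3, 7)
```

The star attains its minimum with the hub in an inner position and its maximum with the hub at an end; the path attains its minimum (n − 1) in the identity arrangement.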

@article{Alemany2021a,
  title = {Minimum projective linearizations of trees in linear time},
  author = {L Alemany-Puig and J L Esteban and R Ferrer-i-Cancho},
  url = {https://arxiv.org/abs/2102.03277},
  year = {2021},
  note = {under review},
  abstract = {The minimum linear arrangement problem (MLA) consists of finding a mapping π from vertices of a graph to integers that minimizes the sum of dependency distances. For trees, various algorithms are available to solve the problem in polynomial time; the best known runs in subquadratic time in n=|V|. There exist variants of the MLA in which the arrangements are constrained to certain classes of projectivity. Iordanskii, and later Hochberg and Stallmann (HS), put forward O(n)-time algorithms that solve the problem when arrangements are constrained to be planar. We also consider linear arrangements of rooted trees that are constrained to be projective. Gildea and Temperley (GT) sketched an algorithm for the projectivity constraint which, as they claimed, runs in O(n) but did not provide any justification of its cost. In contrast, Park and Levy claimed that GT's algorithm runs in O(n log d_max), where d_max is the maximum degree, but did not provide sufficient detail. Here we correct an error in HS's algorithm for the planar case, show its relationship with the projective case, and derive an algorithm for the projective case that runs undoubtedly in O(n) time.},
  keywords = {network science, word order},
}

@article{Ferrer2019a,
  title = {Anti dependency distance minimization in short sequences. A graph theoretic approach},
  author = {R Ferrer-i-Cancho and C Gómez-Rodríguez},
  url = {https://arxiv.org/abs/1906.05765},
  doi = {10.1080/09296174.2019.1645547},
  year = {2021},
  journal = {Journal of Quantitative Linguistics},
  volume = {28},
  number = {1},
  pages = {50--76},
  abstract = {Dependency distance minimization (DDm) is a word order principle favouring the placement of syntactically related words close to each other in sentences. Massive evidence of the principle has been reported for more than a decade with the help of syntactic dependency treebanks where long sentences abound. However, it has been predicted theoretically that the principle is more likely to be beaten in short sequences by the principle of surprisal minimization (predictability maximization). Here we introduce a simple binomial test to verify such a hypothesis. In short sentences, we find anti-DDm for some languages from different families. Our analysis of the syntactic dependency structures suggests that anti-DDm is produced by star trees.},
  note = {published online in 2019},
  keywords = {word order},
}
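The "simple binomial test" mentioned in the abstract can be sketched in a few lines: if distance minimization and its opposite were equally likely, the number of sentences whose summed dependency distance exceeds the chance level would follow a Binomial(N, 1/2) distribution. A hypothetical example (the counts below are made up; the paper's actual data and test details differ):

```python
from math import comb

def binomial_upper_tail(k, n, p=0.5):
    # P(X >= k) for X ~ Binomial(n, p), computed exactly
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical: 60 of 100 short sentences have a dependency distance sum
# above the random-arrangement expectation. Is that more than chance?
p_value = binomial_upper_tail(60, 100)
print(round(p_value, 4))
```

A small p-value in the upper tail would indicate anti-DDm (distances longer than chance) more often than a fair coin would produce.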

## 2020

@article{Ferrer2020b,
  title = {The optimality of syntactic dependency distances},
  author = {R Ferrer-i-Cancho and C Gómez-Rodríguez and J L Esteban and L Alemany-Puig},
  url = {https://arxiv.org/abs/2007.15342},
  year = {2020},
  note = {under review},
  abstract = {It is often stated that human languages, as other biological systems, are shaped by cost-cutting pressures but, to what extent? Attempts to quantify the degree of optimality of languages by means of an optimality score have been scarce and focused mostly on English. Here we recast the problem of the optimality of the word order of a sentence as an optimization problem on a spatial network where the vertices are words, arcs indicate syntactic dependencies and the space is defined by the linear order of the words in the sentence. We introduce a new score to quantify the cognitive pressure to reduce the distance between linked words in a sentence. The analysis of sentences from 93 languages representing 19 linguistic families reveals that half of the languages are optimized to 70% or more. The score indicates that distances are not significantly reduced in a few languages and confirms two theoretical predictions, i.e. that longer sentences are more optimized and that distances are more likely to be longer than expected by chance in short sentences. We present a new hierarchical ranking of languages by their degree of optimization. The statistical advantages of the new score call for a reevaluation of the evolution of dependency distance over time in languages as well as the relationship between dependency distance and linguistic competence. Finally, the principles behind the design of the score can be extended to develop more powerful normalizations of topological distances or physical distances in more dimensions.},
  keywords = {network science, word order},
}

@article{Gomez2019a,
  title = {Memory limitations are hidden in grammar},
  author = {C Gómez-Rodríguez and M H Christiansen and R Ferrer-i-Cancho},
  url = {https://arxiv.org/abs/1908.06629},
  year = {2020},
  note = {under review},
  abstract = {The ability to produce and understand an unlimited number of different sentences is a hallmark of human language. Linguists have sought to define the essence of this generative capacity using formal grammars that describe the syntactic dependencies between constituents, independent of the computational limitations of the human brain. Here, we evaluate this independence assumption by sampling sentences uniformly from the space of possible syntactic structures. We find that the average dependency distance between syntactically related words, a proxy for memory limitations, is less than expected by chance in a collection of state-of-the-art classes of dependency grammars. Our findings indicate that memory limitations have permeated grammatical descriptions, suggesting that it may be impossible to build a parsimonious theory of human linguistic productivity independent of non-linguistic cognitive constraints.},
  keywords = {network science, word order},
}

@article{Alemany2019b,
  title = {Fast calculation of the variance of edge crossings in random linear arrangements},
  author = {L Alemany-Puig and R Ferrer-i-Cancho},
  url = {https://arxiv.org/abs/2003.03258},
  year = {2020},
  note = {under review},
  abstract = {The interest in spatial networks where vertices are embedded in a one-dimensional space is growing. Remarkable examples of these networks are syntactic dependency trees and RNA structures. In this setup, the vertices of the network are arranged linearly and then edges may cross when drawn above the sequence of vertices. Recently, two aspects of the distribution of the number of crossings in uniformly random linear arrangements have been investigated: the expectation and the variance. While the computation of the expectation is straightforward, that of the variance is not. Here we present fast algorithms to calculate that variance in arbitrary graphs and forests. As for the latter, the algorithm calculates variance in linear time with respect to the number of vertices. This paves the way for many applications that rely on an exact but fast calculation of that variance. These algorithms are based on novel arithmetic expressions for the calculation of the variance that we develop from previous theoretical work.},
}

@article{Alemany2018a,
  title = {Edge crossings in random linear arrangements},
  author = {L Alemany-Puig and R Ferrer-i-Cancho},
  doi = {10.1088/1742-5468/ab6845},
  year = {2020},
  journal = {Journal of Statistical Mechanics},
  number = {2},
  pages = {023403},
  abstract = {In spatial networks vertices are arranged in some space and edges may cross. When arranging vertices in a 1D lattice edges may cross when drawn above the vertex sequence as it happens in linguistic and biological networks. Here we investigate the general problem of the distribution of edge crossings in random arrangements of the vertices. We generalize the existing formula for the expectation of this number in random linear arrangements of trees to any network and derive an expression for the variance of the number of crossings in an arbitrary layout relying on a novel characterization of the algebraic structure of that variance in an arbitrary space. We provide compact formulae for the expectation and the variance in complete graphs, complete bipartite graphs, cycle graphs, one-regular graphs and various kinds of trees (star trees, quasi-star trees and linear trees). In these networks, the scaling of expectation and variance as a function of network size is asymptotically power-law-like in random linear arrangements. Our work paves the way for further research and applications in one dimension, or for investigating the distribution of the number of crossings in lattices of higher dimension or other embeddings.},
  keywords = {network science, word order},
}
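For a graph with q pairs of independent (vertex-disjoint) edges, the expectation of the number of crossings C in a uniformly random linear arrangement is E[C] = q/3, since two independent edges cross in exactly one of the three ways their four endpoints can be paired by relative order. A small exhaustive sanity check of that formula (illustrative only; the paper treats arbitrary graphs analytically, including the variance):

```python
from itertools import combinations, permutations

def crossings(edges, pos):
    # Two edges cross iff they share no vertex and their endpoints
    # interleave in the linear order (edges drawn as arcs above the line).
    c = 0
    for (u, v), (x, y) in combinations(edges, 2):
        if len({u, v, x, y}) < 4:
            continue  # edges sharing a vertex never cross
        a, b = sorted((pos[u], pos[v]))
        s, t = sorted((pos[x], pos[y]))
        if a < s < b < t or s < a < t < b:
            c += 1
    return c

def mean_crossings(n, edges):
    # Exact expectation by enumerating all n! arrangements (tiny n only).
    perms = list(permutations(range(n)))
    return sum(crossings(edges, p) for p in perms) / len(perms)

cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]  # q = 2, so E[C] = 2/3
print(mean_crossings(4, cycle4))  # → 0.666...
```

The enumeration matches q/3 exactly because each independent pair crosses in exactly 8 of the 24 arrangements of its four endpoints' relative orders.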

@article{Alemany2018b,
  title = {Reappraising the distribution of the number of edge crossings of graphs on a sphere},
  author = {L Alemany-Puig and M Mora and R Ferrer-i-Cancho},
  url = {https://arxiv.org/abs/2003.03353},
  doi = {10.1088/1742-5468/aba0ab},
  year = {2020},
  journal = {Journal of Statistical Mechanics},
  pages = {083401},
  abstract = {Many real transportation and mobility networks have their vertices placed on the surface of the Earth. In such embeddings, the edges laid on that surface may cross. In his pioneering research, Moon analyzed the distribution of the number of crossings on complete graphs and complete bipartite graphs whose vertices are located uniformly at random on the surface of a sphere assuming that vertex placements are independent from each other. Here we revise his derivation of that variance in the light of recent theoretical developments on the variance of crossings and computer simulations. We show that Moon's formulae are inaccurate in predicting the true variance and provide exact formulae.},
  keywords = {network science},
}

@article{Corral2019a,
  title = {Distinct flavors of Zipf's law and its maximum likelihood fitting: Rank-size and size-distribution representations},
  author = {A Corral and I Serra and R Ferrer-i-Cancho},
  url = {https://arxiv.org/abs/1908.01398},
  doi = {10.1103/PhysRevE.102.052113},
  year = {2020},
  journal = {Physical Review E},
  pages = {052113},
  abstract = {In the last years, researchers have realized the difficulties of fitting power-law distributions properly. These difficulties are higher in Zipf's systems, due to the discreteness of the variables and to the existence of two representations for these systems, i.e., two versions about which is the random variable to fit. The discreteness implies that a power law in one of the representations is not a power law in the other, and vice versa. We generate synthetic power laws in both representations and apply a state-of-the-art fitting method (based on maximum-likelihood plus a goodness-of-fit test) for each of the two random variables. It is important to stress that the method does not fit the whole distribution, but the tail, understood as the part of a distribution above a cut-off that separates non-power-law behavior from power-law behavior. We find that, no matter which random variable is power-law distributed, the rank-size representation is not adequate for fitting, whereas the representation in terms of the distribution of sizes leads to the recovery of the simulated exponents, may be with some bias.},
  keywords = {Zipf's law for word frequencies},
}

## 2019

@article{Ferrer2019c,
  title = {Optimal coding and the origins of Zipfian laws},
  author = {R Ferrer-i-Cancho and C Bentz and C Seguin},
  url = {https://arxiv.org/abs/1906.01545},
  doi = {10.1080/09296174.2020.1778387},
  year = {2019},
  journal = {Journal of Quantitative Linguistics},
  note = {in press},
  abstract = {The problem of compression in standard information theory consists of assigning codes as short as possible to numbers. Here we consider the problem of optimal coding -- under an arbitrary coding scheme -- and show that it predicts Zipf's law of abbreviation, namely a tendency in natural languages for more frequent words to be shorter. We apply this result to investigate optimal coding also under so-called non-singular coding, a scheme where unique segmentation is not warranted but codes stand for a distinct number. Optimal non-singular coding predicts that the length of a word should grow approximately as the logarithm of its frequency rank, which is again consistent with Zipf's law of abbreviation. Optimal non-singular coding in combination with the maximum entropy principle also predicts Zipf's rank-frequency distribution. Furthermore, our findings on optimal non-singular coding challenge common beliefs about random typing. It turns out that random typing is in fact an optimal coding process, in stark contrast with the common assumption that it is detached from cost cutting considerations. Finally, we discuss the implications of optimal coding for the construction of a compact theory of Zipfian laws and other linguistic laws.},
  keywords = {information theory, Zipf's law for word frequencies, Zipf's law of abbreviation},
}
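Optimal non-singular coding has a simple constructive reading: give the most frequent item the shortest available string, the next item the next shortest, and so on, so that the code length of the item of frequency rank r grows roughly as the logarithm of r. A sketch over a binary alphabet (illustrative only, not the paper's formal derivation):

```python
from itertools import count, product

def nonsingular_codes(num_codes, alphabet="01"):
    # Assign distinct strings, in order of increasing length, to items
    # ranked by decreasing frequency (rank 1 = most frequent). The code is
    # non-singular (all codes distinct) but, unlike a prefix code,
    # concatenations need not be uniquely decodable.
    codes = []
    for length in count(1):
        for tup in product(alphabet, repeat=length):
            codes.append("".join(tup))
            if len(codes) == num_codes:
                return codes

codes = nonsingular_codes(10)
print(codes[:4])                 # → ['0', '1', '00', '01']
print([len(c) for c in codes])   # → [1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
```

The staircase of lengths grows by one each time the rank passes a power of the alphabet size, i.e. logarithmically in the rank, which is the abstract's connection to Zipf's law of abbreviation.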

@inproceedings{Ferrer2019e,
  title = {SyntaxFest 2019 Invited talk - Dependency distance minimization: facts, theory and predictions},
  author = {R Ferrer-i-Cancho},
  url = {https://www.aclweb.org/anthology/W19-7901},
  doi = {10.18653/v1/W19-7901},
  year = {2019},
  booktitle = {Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)},
  pages = {1--1},
  publisher = {Association for Computational Linguistics},
  address = {Paris, France},
  keywords = {network science, word order},
}


@book{Hernandez2019a,
  title = {Lingüística cuantitativa. La estadística de las palabras},
  author = {A Hernández and R Ferrer-i-Cancho},
  year = {2019},
  publisher = {EMSE EDAPP y Prisanoticas Colecciones},
  series = {Grandes ideas de las matemáticas},
  note = {English title: Quantitative linguistics. The statistics of words},
}

@article{Casas2019b,
  title = {Polysemy and brevity versus frequency in language},
  author = {B Casas and A Hernández-Fernández and N Català and R Ferrer-i-Cancho and J Baixeries},
  doi = {10.1016/j.csl.2019.03.007},
  year = {2019},
  journal = {Computer Speech and Language},
  volume = {58},
  pages = {19--50},
  abstract = {The pioneering research of G. K. Zipf on the relationship between word frequency and other word features led to the formulation of various linguistic laws. The most popular is Zipf's law for word frequencies. Here we focus on two laws that have been studied less intensively: the meaning-frequency law, i.e. the tendency of more frequent words to be more polysemous, and the law of abbreviation, i.e. the tendency of more frequent words to be shorter. In a previous work, we tested the robustness of these Zipfian laws for English, roughly measuring word length in number of characters and distinguishing adult from child speech. In the present article, we extend our study to other languages (Dutch and Spanish) and introduce two additional measures of length: syllabic length and phonemic length. Our correlation analysis indicates that both the meaning-frequency law and the law of abbreviation hold overall in all the analyzed languages.},
}

Heesen, R; Hobaiter, C; Ferrer-i-Cancho, R; Semple, S. Linguistic laws in chimpanzee gestural communication. Journal Article. Proceedings of the Royal Society B: Biological Sciences, 286, pp. 20182900, 2019.

@article{Heesen2019a,
  title = {Linguistic laws in chimpanzee gestural communication},
  author = {R Heesen and C Hobaiter and R Ferrer-i-Cancho and S Semple},
  doi = {10.1098/rspb.2018.2900},
  year = {2019},
  date = {2019-01-01},
  journal = {Proceedings of the Royal Society B: Biological Sciences},
  volume = {286},
  pages = {20182900},
  abstract = {Studies testing linguistic laws outside language have provided important insights into the organization of biological systems. For example, patterns consistent with Zipf's law of abbreviation (which predicts a negative relationship between word length and frequency of use) have been found in the vocal and non-vocal behaviour of a range of animals, and patterns consistent with Menzerath's law (according to which longer sequences are made up of shorter constituents) have been found in primate vocal sequences, and in genes, proteins and genomes. Both laws have been linked to compression, the information theoretic principle of minimizing code length. Here, we present the first test of these laws in animal gestural communication. We initially did not find the negative relationship between gesture duration and frequency of use predicted by Zipf's law of abbreviation, but this relationship was seen in specific subsets of the repertoire. Furthermore, a pattern opposite to that predicted was seen in one subset of gestures: whole body signals. We found a negative correlation between number and mean duration of gestures in sequences, in line with Menzerath's law. These results provide the first evidence that compression underpins animal gestural communication, and highlight an important commonality between primate gesturing and language.},
  keywords = {Menzerath's law, Zipf's law of abbreviation},
  pubstate = {published},
  tppubtype = {article}
}

Ferrer-i-Cancho, R. The sum of edge lengths in random linear arrangements. Journal Article. Journal of Statistical Mechanics, pp. 053401, 2019.

@article{Ferrer2018a,
  title = {The sum of edge lengths in random linear arrangements},
  author = {R Ferrer-i-Cancho},
  doi = {10.1088/1742-5468/ab11e2},
  year = {2019},
  date = {2019-01-01},
  journal = {Journal of Statistical Mechanics},
  pages = {053401},
  abstract = {Spatial networks are networks where nodes are located in a space equipped with a metric. Typically, the space is two-dimensional and until recently and traditionally, the metric that was usually considered was the Euclidean distance. In spatial networks, the cost of a link depends on the edge length, i.e. the distance between the nodes that define the edge. Hypothesizing that there is pressure to reduce the length of the edges of a network requires a null model, e.g. a random layout of the vertices of the network. Here we investigate the properties of the distribution of the sum of edge lengths in random linear arrangements of vertices, which has many applications in different fields. A random linear arrangement consists of an ordering of the nodes of a network in which all possible orderings are equally likely. The distance between two vertices is one plus the number of intermediate vertices in the ordering. Compact formulae for the 1st and 2nd moments about zero as well as the variance of the sum of edge lengths are obtained for arbitrary graphs and trees. We also analyze the evolution of that variance in Erdős–Rényi graphs and its scaling in uniformly random trees. Various developments and applications for future research are suggested.},
  keywords = {network science},
  pubstate = {published},
  tppubtype = {article}
}
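The null model in this entry has a compact first moment: in a uniformly random linear arrangement of n vertices, the expected length of any single edge is (n + 1)/3, so the expected sum over the m edges of a graph is m(n + 1)/3. A minimal sketch that checks this by exhaustive enumeration on a small path tree (the graph and helper names are illustrative, not taken from the paper):

```python
from itertools import permutations

def sum_edge_lengths(order, edges):
    # pos[v] = position of vertex v in the linear arrangement (0-indexed);
    # adjacent positions are at distance 1, matching the paper's convention
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

# Path tree on n = 4 vertices, hence m = 3 edges
edges = [(0, 1), (1, 2), (2, 3)]
n, m = 4, len(edges)

# Average the sum of edge lengths over all n! linear arrangements
totals = [sum_edge_lengths(p, edges) for p in permutations(range(n))]
mean = sum(totals) / len(totals)

# Expected value under the null model: m * (n + 1) / 3 = 3 * 5 / 3 = 5
assert abs(mean - m * (n + 1) / 3) < 1e-9
```

By linearity of expectation the result is independent of the tree's shape, which is why only n and m enter the first moment; the variance studied in the paper does depend on the structure.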

## 2018

Casas, B; Català, N; Ferrer-i-Cancho, R; Hernández-Fernández, A; Baixeries, J. The polysemy of the words that children learn over time. Journal Article. Interaction Studies, 19, pp. 389–426, 2018.

@article{Casas2019a,
  title = {The polysemy of the words that children learn over time},
  author = {B Casas and N Català and R Ferrer-i-Cancho and A Hernández-Fernández and J Baixeries},
  doi = {10.1075/is.16036.cas},
  year = {2018},
  date = {2018-01-01},
  journal = {Interaction Studies},
  volume = {19},
  pages = {389--426},
  abstract = {Here we study polysemy as a potential learning bias in vocabulary learning in children. Words of low polysemy could be preferred as they reduce the disambiguation effort for the listener. However, such preference could be a side-effect of another bias: the preference of children for nouns in combination with the lower polysemy of nouns with respect to other part-of-speech categories. Our results show that mean polysemy in children increases over time in two phases, i.e. a fast growth till the 31st month followed by a slower tendency towards adult speech. In contrast, this evolution is not found in adults interacting with children. This suggests that children have a preference for non-polysemous words in their early stages of vocabulary acquisition. Interestingly, the evolutionary pattern described above weakens when controlling for syntactic category (noun, verb, adjective or adverb) but it does not disappear completely, suggesting that it could result from a combination of a standalone bias for low polysemy and a preference for nouns.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}

Chen, X; Gómez-Rodríguez, C; Ferrer-i-Cancho, R. A dependency look at the reality of constituency. Journal Article. Glottometrics, 40, pp. 104-106, 2018.

@article{Chen2018a,
  title = {A dependency look at the reality of constituency},
  author = {X Chen and C {Gómez-Rodríguez} and R Ferrer-i-Cancho},
  url = {http://hdl.handle.net/2117/117466},
  year = {2018},
  date = {2018-01-01},
  journal = {Glottometrics},
  volume = {40},
  pages = {104-106},
  abstract = {A comment on "Neurophysiological dynamics of phrase-structure building during sentence processing" by Nelson et al (2017), Proceedings of the National Academy of Sciences USA 114(18), E3669-E3678.},
  keywords = {word order},
  pubstate = {published},
  tppubtype = {article}
}

Ferrer-i-Cancho, R; Gómez-Rodríguez, C; Esteban, J L. Are crossing dependencies really scarce? Journal Article. Physica A, 493, pp. 311-329, 2018.

@article{Ferrer2017a,
  title = {Are crossing dependencies really scarce?},
  author = {R Ferrer-i-Cancho and C Gómez-Rodríguez and J L Esteban},
  doi = {10.1016/j.physa.2017.10.048},
  year = {2018},
  date = {2018-01-01},
  journal = {Physica A},
  volume = {493},
  pages = {311-329},
  abstract = {The syntactic structure of a sentence can be modelled as a tree, where vertices correspond to words and edges indicate syntactic dependencies. It has been claimed recurrently that the number of edge crossings in real sentences is small. However, a baseline or null hypothesis has been lacking. Here we quantify the amount of crossings of real sentences and compare it to the predictions of a series of baselines. We conclude that crossings are really scarce in real sentences. Their scarcity is unexpected by the hubiness of the trees. Indeed, real sentences are close to linear trees, where the potential number of crossings is maximized.},
  keywords = {network science, word order},
  pubstate = {published},
  tppubtype = {article}
}

Ferrer-i-Cancho, R; Vitevitch, M. The origins of Zipf's meaning-frequency law. Journal Article. Journal of the Association for Information Science and Technology, 69, pp. 1369–1379, 2018.

@article{Ferrer2017b,
  title = {The origins of Zipf's meaning-frequency law},
  author = {R Ferrer-i-Cancho and M Vitevitch},
  doi = {10.1002/jasist.24057},
  year = {2018},
  date = {2018-01-01},
  journal = {Journal of the Association for Information Science and Technology},
  volume = {69},
  pages = {1369--1379},
  abstract = {In his pioneering research, G.K. Zipf observed that more frequent words tend to have more meanings, and showed that the number of meanings of a word grows as the square root of its frequency. He derived this relationship from two assumptions: that words follow Zipf's law for word frequencies (a power law dependency between frequency and rank) and Zipf's law of meaning distribution (a power law dependency between number of meanings and rank). Here we show that a single assumption on the joint probability of a word and a meaning suffices to infer Zipf's meaning-frequency law or relaxed versions. Interestingly, this assumption can be justified as the outcome of a biased random walk in the process of mental exploration.},
  keywords = {information theory, network science, Zipf's meaning-frequency law},
  pubstate = {published},
  tppubtype = {article}
}
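Zipf's original derivation summarized in this abstract is a one-line exponent calculation: if frequency scales with rank as f ∝ r^(-α) and the number of meanings as μ ∝ r^(-β), eliminating r gives μ ∝ f^(β/α); with Zipf's classical exponents (α ≈ 1, β ≈ 1/2, assumed here purely for illustration) the square-root meaning-frequency law follows. A quick numeric check of the elimination step:

```python
import math

alpha, beta = 1.0, 0.5  # illustrative Zipf exponents, not fitted values
ranks = [1, 10, 100, 1000]
freq = [r ** -alpha for r in ranks]      # Zipf's law for word frequencies
meanings = [r ** -beta for r in ranks]   # Zipf's law of meaning distribution

# For pure power laws, the log-log slope of meanings against frequency
# is exactly beta / alpha, i.e. 1/2: the square-root law
slope = (math.log(meanings[-1]) - math.log(meanings[0])) / (
    math.log(freq[-1]) - math.log(freq[0])
)
assert abs(slope - beta / alpha) < 1e-12
```

The paper's point is that this two-law detour is unnecessary: a single assumption on the joint word-meaning probability already yields the same scaling.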

Ferrer-i-Cancho, R. Optimization models of natural communication. Journal Article. Journal of Quantitative Linguistics, 25, pp. 207-237, 2018.

@article{Ferrer2015b,
  title = {Optimization models of natural communication},
  author = {R Ferrer-i-Cancho},
  doi = {10.1080/09296174.2017.1366095},
  year = {2018},
  date = {2018-01-01},
  journal = {Journal of Quantitative Linguistics},
  volume = {25},
  pages = {207-237},
  abstract = {A family of information theoretic models of communication was introduced more than a decade ago to explain the origins of Zipf's law for word frequencies. The family is based on a combination of two information theoretic principles: maximization of mutual information between forms and meanings and minimization of form entropy. The family also sheds light on the origins of three other patterns: the principle of contrast; a related vocabulary learning bias; and the meaning-frequency law. Here two important components of the family, namely the information theoretic principles and the energy function that combines them linearly, are reviewed from the perspective of psycholinguistics, language learning, information theory and synergetic linguistics. The minimization of this linear function is linked to the problem of compression of standard information theory and might be tuned by self-organization.},
  keywords = {information theory, Zipf's law for word frequencies},
  pubstate = {published},
  tppubtype = {article}
}

## 2017

Elvevåg, B; Foltz, P W; Rosenstein, M; Ferrer-i-Cancho, R; De Deyne, S; Mizraji, E; Cohen, A. Thoughts About Disordered Thinking: Measuring and Quantifying the Laws of Order and Disorder. Journal Article. Schizophrenia Bulletin, 43 (3), pp. 509-513, 2017.

@article{Elvevaag2017a,
  title = {Thoughts About Disordered Thinking: Measuring and Quantifying the Laws of Order and Disorder},
  author = {B Elvevåg and P W Foltz and M Rosenstein and R {Ferrer-i-Cancho} and S De Deyne and E Mizraji and A Cohen},
  doi = {10.1093/schbul/sbx040},
  year = {2017},
  date = {2017-01-01},
  journal = {Schizophrenia Bulletin},
  volume = {43},
  number = {3},
  pages = {509-513},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}

Bentz, C; Alikaniotis, D; Cysouw, M; Ferrer-i-Cancho, R. The entropy of words - Learnability and expressivity across more than 1000 languages. Journal Article. Entropy, 19 (6), 2017.

@article{Bentz2017a,
  title = {The entropy of words - Learnability and expressivity across more than 1000 languages},
  author = {C Bentz and D Alikaniotis and M Cysouw and R {Ferrer-i-Cancho}},
  doi = {10.3390/e19060275},
  year = {2017},
  date = {2017-01-01},
  journal = {Entropy},
  volume = {19},
  number = {6},
  abstract = {The choice associated with words is a fundamental property of natural languages. It lies at the heart of quantitative linguistics, computational linguistics and language sciences more generally. Information theory gives us tools at hand to measure precisely the average amount of choice associated with words: the word entropy. Here, we use three parallel corpora, encompassing ca. 450 million words in 1916 texts and 1259 languages, to tackle some of the major conceptual and practical problems of word entropy estimation: dependence on text size, register, style and estimation method, as well as non-independence of words in co-text. We present two main findings: Firstly, word entropies display relatively narrow, unimodal distributions. There is no language in our sample with a unigram entropy of less than six bits/word. We argue that this is in line with information-theoretic models of communication. Languages are held in a narrow range by two fundamental pressures: word learnability and word expressivity, with a potential bias towards expressivity. Secondly, there is a strong linear relationship between unigram entropies and entropy rates. The entropy difference between words with and without co-textual information is narrowly distributed around ca. three bits/word. In other words, knowing the preceding text reduces the uncertainty of words by roughly the same amount across languages of the world.},
  keywords = {information theory},
  pubstate = {published},
  tppubtype = {article}
}

Ferrer-i-Cancho, R. A commentary on "The now-or-never bottleneck: a fundamental constraint on language", by Christiansen and Chater (2016). Journal Article. Glottometrics, 38, pp. 107-111, 2017.

@article{Ferrer2017d,
  title = {A commentary on ``The now-or-never bottleneck: a fundamental constraint on language'', by Christiansen and Chater (2016)},
  author = {R Ferrer-i-Cancho},
  url = {http://hdl.handle.net/2117/107857},
  year = {2017},
  date = {2017-01-01},
  journal = {Glottometrics},
  volume = {38},
  pages = {107-111},
  abstract = {In a recent article, Christiansen and Chater (2016) present a fundamental constraint on language, i.e. a now-or-never bottleneck that arises from our fleeting memory, and explore its implications, e.g., chunk-and-pass processing, outlining a framework that promises to unify different areas of research. Here we explore additional support for this constraint and suggest further connections from quantitative linguistics and information theory.},
  keywords = {network science},
  pubstate = {published},
  tppubtype = {article}
}

Gómez-Rodríguez, C; Ferrer-i-Cancho, R. Scarcity of crossing dependencies: a direct outcome of a specific constraint? Journal Article. Physical Review E, 96, pp. 062304, 2017.

@article{Gomez2016a,
  title = {Scarcity of crossing dependencies: a direct outcome of a specific constraint?},
  author = {C Gómez-Rodríguez and R Ferrer-i-Cancho},
  doi = {10.1103/PhysRevE.96.062304},
  year = {2017},
  date = {2017-01-01},
  journal = {Physical Review E},
  volume = {96},
  pages = {062304},
  abstract = {The structure of a sentence can be represented as a network where vertices are words and edges indicate syntactic dependencies. Interestingly, crossing syntactic dependencies have been observed to be infrequent in human languages. This leads to the question of whether the scarcity of crossings in languages arises from an independent and specific constraint on crossings. We provide statistical evidence suggesting that this is not the case, as the proportion of dependency crossings of sentences from a wide range of languages can be accurately estimated by a simple predictor based on a null hypothesis on the local probability that two dependencies cross given their lengths. The relative error of this predictor never exceeds 5% on average, whereas the error of a baseline predictor assuming a random ordering of the words of a sentence is at least six times greater. Our results suggest that the low frequency of crossings in natural languages is neither originated by hidden knowledge of language nor by the undesirability of crossings per se, but as a mere side effect of the principle of dependency length minimization.},
  keywords = {network science, word order},
  pubstate = {published},
  tppubtype = {article}
}

Esteban, J L; Ferrer-i-Cancho, R. A correction on Shiloach's algorithm for minimum linear arrangement of trees. Journal Article. SIAM Journal on Computing, 46, pp. 1146-1151, 2017.

@article{Esteban2015a,
  title = {A correction on Shiloach's algorithm for minimum linear arrangement of trees},
  author = {J L Esteban and R Ferrer-i-Cancho},
  doi = {10.1137/15M1046289},
  year = {2017},
  date = {2017-01-01},
  journal = {SIAM Journal on Computing},
  volume = {46},
  pages = {1146-1151},
  abstract = {More than 30 years ago, Shiloach published an algorithm to solve the minimum linear arrangement problem for undirected trees. Here we fix a small error in the original version of the algorithm and discuss its effect on subsequent literature. We also improve some aspects of the notation.},
  keywords = {},
  pubstate = {published},
  tppubtype = {article}
}
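The problem Shiloach's algorithm solves in polynomial time can be stated, and brute-forced, in a few lines: find the vertex ordering that minimizes the sum of edge lengths. The sketch below is plain exhaustive search over permutations on a toy tree; it illustrates the problem only, not Shiloach's algorithm, and the function name is my own:

```python
from itertools import permutations

def min_linear_arrangement(n, edges):
    # Exhaustive search: try every ordering of the n vertices and
    # keep the smallest sum of |pos[u] - pos[v]| over the edges.
    best = None
    for order in permutations(range(n)):
        pos = {v: i for i, v in enumerate(order)}
        cost = sum(abs(pos[u] - pos[v]) for u, v in edges)
        if best is None or cost < best:
            best = cost
    return best

# Star tree: hub 0 with leaves 1, 2, 3. The hub has only two line
# neighbours, so one edge must have length 2: the optimum is 1+1+2 = 4.
assert min_linear_arrangement(4, [(0, 1), (0, 2), (0, 3)]) == 4
```

The factorial search space is exactly why a polynomial-time algorithm for trees, and the correction to it, matters.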

Ferrer-i-Cancho, R. Random crossings in dependency trees. Journal Article. Glottometrics, 37, pp. 1-12, 2017.

@article{Ferrer2013d,
  title = {Random crossings in dependency trees},
  author = {R Ferrer-i-Cancho},
  url = {http://hdl.handle.net/2117/106079},
  year = {2017},
  date = {2017-01-01},
  journal = {Glottometrics},
  volume = {37},
  pages = {1-12},
  abstract = {It has been hypothesized that the rather small number of crossings in real syntactic dependency trees is a side-effect of pressure for dependency length minimization. Here we answer a related important research question: what would be the expected number of crossings if the natural order of a sentence was lost and replaced by a random ordering? We show that this number depends only on the number of vertices of the dependency tree (the sentence length) and the second moment about zero of vertex degrees. The expected number of crossings is minimum for a star tree (crossings are impossible) and maximum for a linear tree (the number of crossings is of the order of the square of the sequence length).},
  keywords = {network science, word order},
  pubstate = {published},
  tppubtype = {article}
}
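The dependence on the degree second moment has an intuitive source: two edges that share a vertex can never cross, while two vertex-disjoint edges cross with probability 1/3 in a uniformly random ordering, so the expected number of crossings is q/3, where q counts the vertex-disjoint edge pairs. A small exhaustive check on a toy tree (graph and names illustrative):

```python
from itertools import combinations, permutations

def crossings(order, edges):
    # Two edges cross iff exactly one endpoint of the second edge
    # lies strictly between the positions of the first edge's endpoints.
    pos = {v: i for i, v in enumerate(order)}
    c = 0
    for (a, b), (u, v) in combinations(edges, 2):
        if len({a, b, u, v}) == 4:  # only vertex-disjoint pairs can cross
            lo, hi = sorted((pos[a], pos[b]))
            c += (lo < pos[u] < hi) != (lo < pos[v] < hi)
    return c

# Path tree 0-1-2-3: the only vertex-disjoint edge pair is (0,1) vs (2,3),
# so q = 1 and the expected number of crossings is q / 3 = 1/3.
edges = [(0, 1), (1, 2), (2, 3)]
perms = list(permutations(range(4)))
mean = sum(crossings(p, edges) for p in perms) / len(perms)
assert abs(mean - 1 / 3) < 1e-9
```

For a star tree every edge pair shares the hub, so q = 0 and crossings are impossible, matching the minimum case stated in the abstract.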

Ferrer-i-Cancho, R. The placement of the head that maximizes predictability. An information theoretic approach. Journal Article. Glottometrics, 39, pp. 38-71, 2017.

@article{Ferrer2013f,
  title = {The placement of the head that maximizes predictability. An information theoretic approach},
  author = {R Ferrer-i-Cancho},
  url = {http://hdl.handle.net/2117/108830},
  year = {2017},
  date = {2017-01-01},
  journal = {Glottometrics},
  volume = {39},
  pages = {38-71},
  abstract = {The minimization of the length of syntactic dependencies is a well-established principle of word order and the basis of a mathematical theory of word order. Here we complete that theory from the perspective of information theory, adding a competing word order principle: the maximization of predictability of a target element. These two principles are in conflict: to maximize the predictability of the head, the head should appear last, which maximizes the costs with respect to dependency length minimization. The implications of such a broad theoretical framework to understand the optimality, diversity and evolution of the six possible orderings of subject, object and verb, are reviewed.},
  keywords = {information theory, word order},
  pubstate = {published},
  tppubtype = {article}
}

## 2016

Ferrer-i-Cancho, R. The optimality of attaching unlinked labels to unlinked meanings. Journal Article. Glottometrics, 36, pp. 1-16, 2016.

@article{Ferrer2013g,
  title = {The optimality of attaching unlinked labels to unlinked meanings},
  author = {R Ferrer-i-Cancho},
  url = {http://hdl.handle.net/2117/102539},
  year = {2016},
  date = {2016-01-01},
  journal = {Glottometrics},
  volume = {36},
  pages = {1-16},
  abstract = {Vocabulary learning by children can be characterized by many biases. When encountering a new word, children as well as adults are biased towards assuming that it means something totally different from the words that they already know. To the best of our knowledge, the 1st mathematical proof of the optimality of this bias is presented here. First, it is shown that this bias is a particular case of the maximization of mutual information between words and meanings. Second, the optimality is proven within a more general information theoretic framework where mutual information maximization competes with other information theoretic principles. The bias is a prediction from modern information theory. The relationship between information theoretic principles and the principles of contrast and mutual exclusivity is also shown.},
  keywords = {information theory, vocabulary learning},
  pubstate = {published},
  tppubtype = {article}
}

@incollection{Lozano2016a,
  title = {Fast calculation of entropy with Zhang's estimator},
  author = {A Lozano and B Casas and C Bentz and R Ferrer-i-Cancho},
  editor = {E Kelih and R Knight and J Macutek and A Wilson},
  url = {http://hdl.handle.net/2117/100157},
  year = {2016},
  date = {2016-01-01},
  booktitle = {Issues in Quantitative Linguistics 4. Dedicated to Reinhard Köhler on the occasion of his 65th birthday},
  pages = {273-285},
  publisher = {RAM-Verlag},
  address = {Lüdenscheid},
  abstract = {Entropy is a fundamental property of a repertoire. Here, we present an efficient algorithm to estimate the entropy of types with the help of Zhang's estimator. The algorithm takes advantage of the fact that the number of different frequencies in a text is in general much smaller than the number of types. We justify the convenience of the algorithm by means of an analysis of the statistical properties of texts from more than 1000 languages. Our work opens up various possibilities for future research.},
  note = {No. 23 of the series ``Studies in Quantitative Linguistics''},
  keywords = {information theory},
  pubstate = {published},
  tppubtype = {incollection}
}
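The speed-up described in the abstract above comes from iterating over distinct frequencies rather than over types. A minimal sketch of that idea, using the plain plug-in (maximum likelihood) estimator rather than Zhang's actual formula, with a hypothetical function name:

```python
from collections import Counter
from math import log

def plugin_entropy(tokens):
    """Plug-in entropy of types, summing over distinct frequencies.

    Uses H = log N - (1/N) * sum_f n_f * f * log f, where n_f is the
    number of types occurring exactly f times. The sum runs over the
    distinct frequencies, which are typically far fewer than the types.
    """
    freqs = Counter(tokens)                 # type -> frequency
    freq_of_freq = Counter(freqs.values())  # frequency -> number of types
    n = sum(f * m for f, m in freq_of_freq.items())  # total tokens N
    return log(n) - sum(m * f * log(f) for f, m in freq_of_freq.items()) / n
```

For four equiprobable types the estimate is log 4, as expected for a uniform repertoire.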

@inproceedings{Bentz2016a,
  title = {Zipf's law of abbreviation as a language universal},
  author = {C Bentz and R {Ferrer-i-Cancho}},
  editor = {Christian Bentz and Gerhard Jäger and Igor Yanovich},
  url = {https://publikationen.uni-tuebingen.de/xmlui/handle/10900/68639?locale-attribute=en},
  year = {2016},
  date = {2016-01-01},
  booktitle = {Proceedings of the Leiden Workshop on Capturing Phylogenetic Algorithms for Linguistics},
  publisher = {University of Tübingen},
  abstract = {Words that are used more frequently tend to be shorter. This statement is known as Zipf’s law of abbreviation. Here we perform the widest investigation of the presence of the law to date. In a sample of 1262 texts and 986 different languages - about 13% of the world’s language diversity - a negative correlation between word frequency and word length is found in all cases. In line with Zipf’s original proposal, we argue that this universal trend is likely to derive from fundamental principles of information processing and transfer.},
  keywords = {Zipf's law of abbreviation},
  pubstate = {published},
  tppubtype = {inproceedings}
}
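The negative frequency-length correlation reported in the abstract above can be checked on any tokenised text. A sketch using the Pearson correlation for simplicity (the paper's exact correlation statistic may differ, and the function name is made up):

```python
from collections import Counter

def freq_length_correlation(tokens):
    """Pearson correlation between type frequency and type length.

    Zipf's law of abbreviation predicts a negative value: more
    frequent words tend to be shorter.
    """
    pairs = [(f, len(w)) for w, f in Counter(tokens).items()]
    n = len(pairs)
    mx = sum(f for f, _ in pairs) / n
    my = sum(l for _, l in pairs) / n
    cov = sum((f - mx) * (l - my) for f, l in pairs)
    sx = sum((f - mx) ** 2 for f, _ in pairs) ** 0.5
    sy = sum((l - my) ** 2 for _, l in pairs) ** 0.5
    return cov / (sx * sy)
```

On a toy sample where the shortest word is the most frequent, the correlation comes out clearly negative.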

@article{Kershenbaum2016a,
  title = {Acoustic sequences in non-human animals: a tutorial review and prospectus},
  author = {Arik Kershenbaum and Daniel T Blumstein and Marie A Roch and Çağlar Akçay and Gregory Backus and Mark A Bee and Kirsten Bohn and Yan Cao and Gerald Carter and Cristiane Cäsar and Michael Coen and Stacy L DeRuiter and Laurance Doyle and Shimon Edelman and Ramon Ferrer-i-Cancho and Todd M Freeberg and Ellen C Garland and Morgan Gustison and Heidi E Harley and Chloé Huetz and Melissa Hughes and Julia Hyland Bruno and Amiyaal Ilany and Dezhe Z Jin and Michael Johnson and Chenghui Ju and Jeremy Karnowski and Bernard Lohr and Marta B Manser and Brenda McCowan and Eduardo Mercado and Peter M Narins and Alex Piel and Megan Rice and Roberta Salmi and Kazutoshi Sasahara and Laela Sayigh and Yu Shiu and Charles Taylor and Edgar E Vallejo and Sara Waller and Veronica Zamora-Gutierrez},
  doi = {10.1111/brv.12160},
  year = {2016},
  date = {2016-01-01},
  journal = {Biological Reviews},
  volume = {91},
  number = {1},
  pages = {13--52},
  abstract = {Animal acoustic communication often takes the form of complex sequences, made up of multiple distinct acoustic units. Apart from the well‐known example of birdsong, other animals such as insects, amphibians, and mammals (including bats, rodents, primates, and cetaceans) also generate complex acoustic sequences. Occasionally, such as with birdsong, the adaptive role of these sequences seems clear (e.g. mate attraction and territorial defence). More often, however, researchers have only begun to characterise – let alone understand – the significance and meaning of acoustic sequences. Hypotheses abound, but there is little agreement as to how sequences should be defined and analysed. Our review aims to outline suitable methods for testing these hypotheses, and to describe the major limitations to our current and near‐future knowledge on questions of acoustic sequences. This review and prospectus is the result of a collaborative effort between 43 scientists from the fields of animal behaviour, ecology and evolution, signal processing, machine learning, quantitative linguistics, and information theory, who gathered for a 2013 workshop entitled, ‘Analysing vocal sequences in animals’. Our goal is to present not just a review of the state of the art, but to propose a methodological framework that summarises what we suggest are the best practices for research in this field, across taxa and across disciplines. We also provide a tutorial‐style introduction to some of the most promising algorithmic approaches for analysing sequences. We divide our review into three sections: identifying the distinct units of an acoustic sequence, describing the different ways that information can be contained within a sequence, and analysing the structure of that sequence. Each of these sections is further subdivided to address the key questions and approaches in that area. We propose a uniform, systematic, and comprehensive approach to studying sequences, with the goal of clarifying research terms used in different fields, and facilitating collaboration and comparative studies. Allowing greater interdisciplinary collaboration will facilitate the investigation of many important questions in the evolution of communication and sociality.},
  keywords = {word order},
  pubstate = {published},
  tppubtype = {article}
}

@article{Hernandez2016b,
  title = {The infochemical core},
  author = {A Hernández-Fernández and R Ferrer-i-Cancho},
  doi = {10.1080/09296174.2016.1142323},
  year = {2016},
  date = {2016-01-01},
  journal = {Journal of Quantitative Linguistics},
  volume = {23},
  pages = {133-153},
  abstract = {Vocalizations, and less often gestures, have been the object of linguistic research for decades. However, the development of a general theory of communication with human language as a particular case requires a clear understanding of the organization of communication through other means. Infochemicals are chemical compounds that carry information and are employed by small organisms that cannot emit acoustic signals of an optimal frequency to achieve successful communication. Here, we investigate the distribution of infochemicals across species when they are ranked by their degree or the number of species with which they are associated (because they produce them or are sensitive to them). We evaluate the quality of the fit of different functions to the dependency between degree and rank by means of a penalty for the number of parameters of the function. Surprisingly, a double Zipf (a Zipf distribution with two regimes, each with a different exponent) is the model yielding the best fit although it is the function with the largest number of parameters. This suggests that the worldwide repertoire of infochemicals contains a core which is shared by many species and is reminiscent of the core vocabularies found for human language in dictionaries or large corpora.},
  keywords = {chemical communication},
  pubstate = {published},
  tppubtype = {article}
}

@inproceedings{Hernandez2016a,
  title = {Testing the robustness of laws of polysemy and brevity versus frequency},
  author = {A Hernández-Fernández and B Casas and R Ferrer-i-Cancho and J Baixeries},
  editor = {P Král and C Martín-Vide},
  doi = {10.1007/978-3-319-45925-7_2},
  year = {2016},
  date = {2016-01-01},
  booktitle = {4th International Conference on Statistical Language and Speech Processing (SLSP 2016). Lecture Notes in Computer Science 9918},
  pages = {19--29},
  abstract = {The pioneering research of G.K. Zipf on the relationship between word frequency and other word features led to the formulation of various linguistic laws. Here we focus on a couple of them: the meaning-frequency law, i.e. the tendency of more frequent words to be more polysemous, and the law of abbreviation, i.e. the tendency of more frequent words to be shorter. Here we evaluate the robustness of these laws in contexts where they have not been explored yet to our knowledge. The recovery of the laws again in new conditions provides support for the hypothesis that they originate from abstract mechanisms.},
  keywords = {child language, Zipf's law of abbreviation, Zipf's meaning-frequency law},
  pubstate = {published},
  tppubtype = {inproceedings}
}

@incollection{Ferrer2016d,
  title = {Non-crossing dependencies: least effort, not grammar},
  author = {R Ferrer-i-Cancho},
  editor = {A Mehler and A Lücking and S Banisch and P Blanchard and B Job},
  doi = {10.1007/978-3-662-47238-5_10},
  year = {2016},
  date = {2016-01-01},
  booktitle = {Towards a theoretical framework for analyzing complex linguistic networks},
  pages = {203-234},
  publisher = {Springer},
  address = {Berlin},
  abstract = {The use of null hypotheses (in a statistical sense) is common in hard sciences but not in theoretical linguistics. Here the null hypothesis that the low frequency of syntactic dependency crossings is expected by an arbitrary ordering of words is rejected. It is shown that this would require star dependency structures, which are both unrealistic and too restrictive. The hypothesis of the limited resources of the human brain is revisited. Stronger null hypotheses taking into account actual dependency lengths for the likelihood of crossings are presented. Those hypotheses suggest that crossings are likely to reduce when dependencies are shortened. A hypothesis based on pressure to reduce dependency lengths is more parsimonious than a principle of minimization of crossings or a grammatical ban that is totally dissociated from the general and non-linguistic principle of economy.},
  keywords = {network science, word order},
  pubstate = {published},
  tppubtype = {incollection}
}

@inproceedings{Ferrer2016c,
  title = {Kauffman's adjacent possible in word order evolution},
  author = {R Ferrer-i-Cancho},
  year = {2016},
  date = {2016-01-01},
  booktitle = {The evolution of language: Proceedings of the 11th International Conference (EVOLANG11)},
  keywords = {},
  pubstate = {published},
  tppubtype = {inproceedings}
}

@article{Gustison2016a,
  title = {Gelada vocal sequences follow Menzerath's linguistic law},
  author = {M L Gustison and S Semple and R Ferrer-i-Cancho and T Bergman},
  doi = {10.1073/pnas.1522072113},
  year = {2016},
  date = {2016-01-01},
  journal = {Proceedings of the National Academy of Sciences USA},
  volume = {113},
  pages = {E2750--E2758},
  abstract = {Identifying universal principles underpinning diverse natural systems is a key goal of the life sciences. A powerful approach in addressing this goal has been to test whether patterns consistent with linguistic laws are found in nonhuman animals. Menzerath’s law is a linguistic law that states that, the larger the construct, the smaller the size of its constituents. Here, to our knowledge, we present the first evidence that Menzerath’s law holds in the vocal communication of a nonhuman species. We show that, in vocal sequences of wild male geladas (Theropithecus gelada), construct size (sequence size in number of calls) is negatively correlated with constituent size (duration of calls). Call duration does not vary significantly with position in the sequence, but call sequence composition does change with sequence size and most call types are abbreviated in larger sequences. We also find that intercall intervals follow the same relationship with sequence size as do calls. Finally, we provide formal mathematical support for the idea that Menzerath’s law reflects compression—the principle of minimizing the expected length of a code. Our findings suggest that a common principle underpins human and gelada vocal communication, highlighting the value of exploring the applicability of linguistic laws in vocal systems outside the realm of language.},
  keywords = {information theory, Menzerath's law},
  pubstate = {published},
  tppubtype = {article}
}

@article{Ferrer2016a,
  title = {Liberating language research from dogmas of the 20th century},
  author = {R Ferrer-i-Cancho and C Gómez-Rodríguez},
  url = {http://hdl.handle.net/2117/85273},
  year = {2016},
  date = {2016-01-01},
  journal = {Glottometrics},
  volume = {33},
  pages = {33-34},
  abstract = {A commentary on the article "Large-scale evidence of dependency length minimization in 37 languages" by Futrell, Mahowald & Gibson (PNAS 2015 112 (33) 10336-10341).},
  keywords = {word order},
  pubstate = {published},
  tppubtype = {article}
}

@article{Ferrer2016b,
  title = {Compression and the origins of Zipf's law for word frequencies},
  author = {R Ferrer-i-Cancho},
  doi = {10.1002/cplx.21820},
  year = {2016},
  date = {2016-01-01},
  journal = {Complexity},
  volume = {21},
  pages = {409-411},
  abstract = {Here we sketch a new derivation of Zipf's law for word frequencies based on optimal coding. The structure of the derivation is reminiscent of Mandelbrot's random typing model but it has multiple advantages over random typing: (1) it starts from realistic cognitive pressures, (2) it does not require fine tuning of parameters, and (3) it sheds light on the origins of other statistical laws of language and thus can lead to a compact theory of linguistic laws. Our findings suggest that the recurrence of Zipf's law in human languages could originate from pressure for easy and fast communication.},
  keywords = {information theory, Zipf's law for word frequencies},
  pubstate = {published},
  tppubtype = {article}
}
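Zipf's law for word frequencies, which the entry above derives from optimal coding, predicts frequency proportional to rank to the power of minus alpha with alpha near 1. A rough sketch of estimating the exponent by least squares on the log-log rank-frequency curve (the paper does not prescribe this estimator, and maximum-likelihood fits are preferred in careful work; the function name is made up):

```python
from collections import Counter
from math import log

def zipf_exponent(tokens):
    """Least-squares slope of log frequency against log rank.

    Zipf's law predicts frequency proportional to rank^(-alpha),
    so the estimate of alpha is minus the fitted slope.
    """
    counts = sorted(Counter(tokens).values(), reverse=True)
    xs = [log(r) for r in range(1, len(counts) + 1)]
    ys = [log(c) for c in counts]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return -slope
```

On a toy sample whose type frequencies are exactly 12/r for ranks 1 to 4, the fit recovers alpha = 1.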

@article{Esteban2016a,
  title = {The scaling of the minimum sum of edge lengths in uniformly random trees},
  author = {J L Esteban and R Ferrer-i-Cancho and C Gómez-Rodríguez},
  doi = {10.1088/1742-5468/2016/06/063401},
  year = {2016},
  date = {2016-01-01},
  journal = {Journal of Statistical Mechanics},
  pages = {063401},
  abstract = {The minimum linear arrangement problem on a network consists of finding the minimum sum of edge lengths that can be achieved when the vertices are arranged linearly. Although there are algorithms to solve this problem on trees in polynomial time, they have remained theoretical and have not been implemented in practical contexts to our knowledge. Here we use one of those algorithms to investigate the growth of this sum as a function of the size of the tree in uniformly random trees. We show that this sum is bounded above by its value in a star tree. We also show that the mean edge length grows logarithmically in optimal linear arrangements, in stark contrast to the linear growth that is expected on optimal arrangements of star trees or on random linear arrangements.},
  keywords = {network science},
  pubstate = {published},
  tppubtype = {article}
}
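The quantity studied in the entry above, the minimum sum of edge lengths over linear arrangements, is easy to state in code. A brute-force sketch for tiny trees (exponential in the number of vertices; it does not reproduce the paper's polynomial-time algorithm, and the function names are made up):

```python
from itertools import permutations

def sum_edge_lengths(edges, arrangement):
    """Sum of |pos(u) - pos(v)| when vertices are placed on a line."""
    pos = {v: i for i, v in enumerate(arrangement)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

def min_linear_arrangement(vertices, edges):
    """Minimum sum of edge lengths over all linear arrangements.

    Brute force over every permutation, so only viable for tiny trees.
    """
    return min(sum_edge_lengths(edges, p) for p in permutations(vertices))
```

On a star tree the minimum is attained by placing the hub in the middle: for a 5-vertex star the minimum sum is 6, and for a 4-vertex path it is 3.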

@article{Ferrer2014d,
  title = {The meaning-frequency law in Zipfian optimization models of communication},
  author = {R Ferrer-i-Cancho},
  url = {http://hdl.handle.net/2117/95871},
  year = {2016},
  date = {2016-01-01},
  journal = {Glottometrics},
  volume = {35},
  pages = {28-37},
  abstract = {According to Zipf's meaning-frequency law, words that are more frequent tend to have more meanings. Here it is shown that a linear dependency between the frequency of a form and its number of meanings is found in a family of models of Zipf's law for word frequencies. This is evidence for a weak version of the meaning-frequency law. Interestingly, that weak law (a) is not an inevitable property of the assumptions of the family and (b) is found at least in the narrow regime where those models exhibit Zipf's law for word frequencies.},
  keywords = {network science},
  pubstate = {published},
  tppubtype = {article}
}

## 2015

@inproceedings{Bentz2015a,
  title = {Zipf's law of abbreviation as a language universal},
  author = {C Bentz and R Ferrer-i-Cancho},
  year = {2015},
  date = {2015-10-01},
  booktitle = {Capturing Phylogenetic Algorithms for Linguistics, Lorentz Center Workshop},
  address = {Leiden},
  keywords = {Zipf's law of abbreviation},
  pubstate = {published},
  tppubtype = {inproceedings}
}

@inproceedings{Semple2015a,
  title = {Linguistic laws in primate vocal communication},
  author = {S Semple and R Ferrer-i-Cancho and T Bergman and M Hsu and G Agoramoorthy and M Gustison},
  doi = {10.1159/000435825},
  year = {2015},
  date = {2015-01-01},
  booktitle = {Proceedings of the 6th European Federation for Primatology Meeting, XXII Italian Association of Primatology Congress, Rome, Italy, August 25-28. Folia Primatologica 86, 357},
  keywords = {Menzerath's law, Zipf's law of abbreviation},
  pubstate = {published},
  tppubtype = {inproceedings}
}

@article{Corral2015a,
  title    = {Zipf's law for word frequencies: word forms versus lemmas in long texts},
  author   = {A Corral and G Boleda and R Ferrer-i-Cancho},
  doi      = {10.1371/journal.pone.0129031},
  year     = {2015},
  journal  = {PLoS ONE},
  volume   = {10},
  pages    = {e0129031},
  abstract = {Zipf’s law is a fundamental paradigm in the statistics of written and spoken natural language as well as in other communication systems. We raise the question of the elementary units for which Zipf’s law should hold in the most natural way, studying its validity for plain word forms and for the corresponding lemma forms. We analyze several long literary texts comprising four languages, with different levels of morphological complexity. In all cases Zipf’s law is fulfilled, in the sense that a power-law distribution of word or lemma frequencies is valid for several orders of magnitude. We investigate the extent to which the word-lemma transformation preserves two parameters of Zipf’s law: the exponent and the low-frequency cut-off. We are not able to demonstrate a strict invariance of the tail, as for a few texts both exponents deviate significantly, but we conclude that the exponents are very similar, despite the remarkable transformation that going from words to lemmas represents, considerably affecting all ranges of frequencies. In contrast, the low-frequency cut-offs are less stable, tending to increase substantially after the transformation.},
  keywords = {Zipf's law for word frequencies}
}
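The rank-frequency fitting this abstract discusses can be illustrated with a minimal sketch. This is our own illustrative code, not the paper's method (the paper fits a power-law distribution with a low-frequency cut-off; a least-squares line in log-log space is a crude but common first pass), and `zipf_exponent` is a name of our invention:

```python
import math
from collections import Counter

def zipf_exponent(tokens):
    """Crude least-squares estimate of the Zipf exponent from the
    rank-frequency curve in log-log space."""
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

# synthetic corpus whose frequencies follow f(r) = 1000/r almost exactly,
# so the estimated exponent should be close to 1
tokens = [f"w{r}" for r in range(1, 51) for _ in range(round(1000 / r))]
assert 0.9 < zipf_exponent(tokens) < 1.1
```

Running the same estimator on word forms and on lemmas of the same text mirrors the comparison the paper performs.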

@article{Ferrer2015a,
  title    = {Compression and the origins of Zipf's law of abbreviation},
  author   = {R Ferrer-i-Cancho and C Bentz and C Seguin},
  year     = {2015},
  journal  = {arXiv},
  url      = {http://arxiv.org/abs/1504.04884},
  keywords = {information theory, Zipf's law of abbreviation}
}

@article{Ferrer2015c,
  title    = {Crossings as a side effect of dependency lengths},
  author   = {R Ferrer-i-Cancho and C Gómez-Rodríguez},
  doi      = {10.1002/cplx.21810},
  year     = {2015},
  journal  = {Complexity},
  volume   = {21},
  pages    = {320-328},
  abstract = {The syntactic structure of sentences exhibits a striking regularity: dependencies tend to not cross when drawn above the sentence. We investigate two competing explanations. The traditional hypothesis is that this trend arises from an independent principle of syntax that reduces crossings practically to zero. An alternative to this view is the hypothesis that crossings are a side effect of dependency lengths, that is, sentences with shorter dependency lengths should tend to have fewer crossings. We are able to reject the traditional view in the majority of languages considered. The alternative hypothesis can lead to a more parsimonious theory of language.},
  keywords = {network science, word order}
}

@article{Ferrer2014e,
  title    = {Reply to the commentary ``Be careful when assuming the obvious'', by P. Alday},
  author   = {R Ferrer-i-Cancho},
  doi      = {10.1163/22105832-00501009},
  year     = {2015},
  journal  = {Language Dynamics and Change},
  volume   = {5},
  pages    = {147-155},
  keywords = {word order}
}

@article{Ferrer2013e,
  title    = {The placement of the head that minimizes online memory. A complex systems approach},
  author   = {R Ferrer-i-Cancho},
  doi      = {10.1163/22105832-00501007},
  year     = {2015},
  journal  = {Language Dynamics and Change},
  volume   = {5},
  pages    = {114-137},
  abstract = {It is well known that the length of a syntactic dependency determines its online memory cost. Thus, the problem of the placement of a head and its dependents (complements or modifiers) that minimizes online memory is equivalent to the problem of the minimum linear arrangement of a star tree. However, how that length is translated into cognitive cost is not known. This study shows that the online memory cost is minimized when the head is placed at the center, regardless of the function that transforms length into cost, provided only that this function is strictly monotonically increasing. Online memory defines a quasi-convex adaptive landscape with a single central minimum if the number of elements is odd and two central minima if that number is even. We discuss various aspects of the dynamics of word order of subject (S), verb (V) and object (O) from a complex systems perspective and suggest that word orders tend to evolve by swapping adjacent constituents from an initial or early SOV configuration that is attracted towards a central word order by online memory minimization. We also suggest that the stability of SVO is due to at least two factors, the quasi-convex shape of the adaptive landscape in the online memory dimension and online memory adaptations that avoid regression to SOV. Although OVS is also optimal for placing the verb at the center, its low frequency is explained by its long distance to the seminal SOV in the permutation space.},
  keywords = {network science, word order}
}
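The star-tree result summarized in this abstract is easy to check numerically. The sketch below is ours, not the paper's code; `online_memory_cost` is a hypothetical name, and any strictly increasing `cost` function can be passed in:

```python
def online_memory_cost(n, head_pos, cost=lambda d: d):
    """Total cost of a star tree with n vertices arranged on a line,
    head at position head_pos (0-indexed); cost may be any strictly
    monotonically increasing function of dependency length."""
    return sum(cost(abs(head_pos - i)) for i in range(n) if i != head_pos)

# odd number of elements: a single central minimum
costs7 = [online_memory_cost(7, p) for p in range(7)]
assert min(costs7) == costs7[3] == 12

# even number of elements: two central minima
costs6 = [online_memory_cost(6, p) for p in range(6)]
assert costs6[2] == costs6[3] == min(costs6)
```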

## 2014

@article{Ferrer2012h,
  title    = {When is Menzerath-Altmann law mathematically trivial? A new approach},
  author   = {R Ferrer-i-Cancho and A Hernández-Fernández and J Baixeries and Ł Dębowski and J Mačutek},
  doi      = {10.1515/sagmb-2013-0034},
  year     = {2014},
  journal  = {Statistical Applications in Genetics and Molecular Biology},
  volume   = {13},
  pages    = {633-644},
  abstract = {Menzerath’s law, the tendency of Z (the mean size of the parts) to decrease as X (the number of parts) increases, is found in language, music and genomes. Recently, it has been argued that the presence of the law in genomes is an inevitable consequence of the fact that Z=Y/X, which would imply that Z scales with X as Z∼1/X. That scaling is a very particular case of Menzerath-Altmann law that has been rejected by means of a correlation test between X and Y in genomes, being X the number of chromosomes of a species, Y its genome size in bases and Z the mean chromosome size. Here we review the statistical foundations of that test and consider three non-parametric tests based upon different correlation metrics and one parametric test to evaluate if Z∼1/X in genomes. The most powerful test is a new non-parametric one based upon the correlation ratio, which is able to reject Z∼1/X in nine out of 11 taxonomic groups and detect a borderline group. Rather than a fact, Z∼1/X is a baseline that real genomes do not meet. The view of Menzerath-Altmann law as inevitable is seriously flawed.},
  keywords = {genomes, Menzerath's law}
}
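The logic of the test above (Z ∼ 1/X with Z = Y/X holds only when Y is independent of X) can be sketched on simulated data. This toy uses a plain Pearson correlation on made-up numbers, not the paper's correlation-ratio test or its genome data; `pearson` and both regimes are our own illustrative constructions:

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Simulated toy data: X = chromosome number, Y = genome size, two regimes.
random.seed(1)
X = [random.randint(2, 40) for _ in range(500)]
Y_indep = [random.gauss(1000, 50) for _ in X]          # Z ~ 1/X baseline holds
Y_coupled = [50 * x + random.gauss(0, 20) for x in X]  # baseline fails

assert abs(pearson(X, Y_indep)) < 0.2
assert pearson(X, Y_coupled) > 0.9
```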

@article{Ferrer2014g,
  title    = {Beyond description. Comment on "Approaching human language with complex networks" by Cong & Liu},
  author   = {R Ferrer-i-Cancho},
  doi      = {10.1016/j.plrev.2014.07.014},
  year     = {2014},
  journal  = {Physics of Life Reviews},
  volume   = {11},
  pages    = {621-623},
  keywords = {network science}
}

@article{Ferrer2017c,
  title    = {Towards a theory of word order. Comment on "Dependency distance: A new perspective on syntactic patterns in natural language" by Haitao Liu et al.},
  author   = {R Ferrer-i-Cancho},
  doi      = {10.1016/j.plrev.2017.06.019},
  year     = {2017},
  journal  = {Physics of Life Reviews},
  volume   = {21},
  pages    = {218-220},
  keywords = {network science, word order}
}

@inproceedings{Ferrer2014a,
  title     = {Why might SOV be initially preferred and then lost or recovered? A theoretical framework},
  author    = {R Ferrer-i-Cancho},
  editor    = {E A Cartmill and S Roberts and H Lyn and H Cornish},
  doi       = {10.1142/9789814603638_0007},
  year      = {2014},
  booktitle = {THE EVOLUTION OF LANGUAGE - Proceedings of the 10th International Conference (EVOLANG10)},
  pages     = {66-73},
  publisher = {Wiley},
  address   = {Vienna, Austria},
  abstract  = {Little is known about why SOV order is initially preferred and then discarded or recovered. Here we present a framework for understanding these and many related word order phenomena: the diversity of dominant orders, the existence of free word orders, the need of alternative word orders and word order reversions and cycles in evolution. Under that framework, word order is regarded as a multiconstraint satisfaction problem in which at least two constraints are in conflict: online memory minimization and maximum predictability.},
  note      = {Evolution of Language Conference (Evolang 2014), April 14-17},
  keywords  = {word order}
}

@inproceedings{Ferrer2014b,
  title     = {What if we are not at the center?},
  author    = {R Ferrer-i-Cancho},
  year      = {2014},
  booktitle = {THE EVOLUTION OF LANGUAGE - Proceedings of the 10th International Conference (EVOLANG10)},
  publisher = {Wiley},
  address   = {Vienna, Austria},
  note      = {Evolution of Language Conference (Evolang 2014), April 14-17}
}

@article{Ferrer2014c,
  title    = {A stronger null hypothesis for crossing dependencies},
  author   = {R Ferrer-i-Cancho},
  doi      = {10.1209/0295-5075/108/58003},
  year     = {2014},
  journal  = {Europhysics Letters},
  volume   = {108},
  pages    = {58003},
  abstract = {The syntactic structure of a sentence can be modeled as a tree where vertices are words and edges indicate syntactic dependencies between words. It is well known that those edges normally do not cross when drawn over the sentence. Here a new null hypothesis for the number of edge crossings of a sentence is presented. That null hypothesis takes into account the length of the pair of edges that may cross and predicts the relative number of crossings in random trees with a small error, suggesting that a ban of crossings or a principle of minimization of crossings are not needed in general to explain the origins of non-crossing dependencies. Our work paves the way for more powerful null hypotheses to investigate the origins of non-crossing dependencies in Nature.},
  keywords = {network science, word order}
}
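Counting edge crossings in a linear arrangement, the quantity the null hypotheses above concern, can be sketched directly. This is a toy re-implementation of the counting, not the paper's null-hypothesis formula; the path-tree example and the `crossings` helper are ours. Two edges with no shared vertex cross under a uniformly random arrangement with probability 1/3, so a path on 6 vertices (6 disjoint edge pairs) has about 2 expected crossings:

```python
import itertools
import random

def crossings(edges, order):
    """Count edge crossings of a tree whose vertices are placed along a
    line in the given order (a permutation of vertex ids)."""
    pos = {v: i for i, v in enumerate(order)}
    total = 0
    for (a, b), (u, v) in itertools.combinations(edges, 2):
        if len({a, b, u, v}) < 4:
            continue  # edges sharing a vertex cannot cross
        lo, hi = sorted((pos[a], pos[b]))
        inside = sum(lo < pos[w] < hi for w in (u, v))
        total += inside == 1  # cross iff exactly one endpoint lies between
    return total

edges = [(i, i + 1) for i in range(5)]        # path tree on 6 vertices
assert crossings(edges, list(range(6))) == 0  # natural order: no crossings

random.seed(0)
mean = sum(crossings(edges, random.sample(range(6), 6))
           for _ in range(2000)) / 2000       # should be close to 2
assert 1.8 < mean < 2.2
```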

@article{Ferrer2013c,
  title    = {The risks of mixing dependency lengths from sequences of different length},
  author   = {R Ferrer-i-Cancho and H Liu},
  doi      = {10.1515/glot-2014-0014},
  year     = {2014},
  journal  = {Glottotheory},
  volume   = {5},
  pages    = {143-155},
  abstract = {Mixing dependency lengths from sequences of different length is a common practice in language research. However, the empirical distribution of dependency lengths of sentences of the same length differs from that of sentences of varying length. The distribution of dependency lengths depends on sentence length for real sentences and also under the null hypothesis that dependencies connect vertices located in random positions of the sequence. This suggests that certain results, such as the distribution of syntactic dependency lengths mixing dependencies from sentences of varying length, could be a mere consequence of that mixing. Furthermore, differences in the global averages of dependency length (mixing lengths from sentences of varying length) for two different languages do not simply imply a priori that one language optimizes dependency lengths better than the other because those differences could be due to differences in the distribution of sentence lengths and other factors.},
  keywords = {word order}
}

## 2013

@article{Ferrer2012g,
  title    = {Erratum to "Random models of Menzerath-Altmann law in genomes" (BioSystems 107 (3), 167-173)},
  author   = {R Ferrer-i-Cancho and J Baixeries and A Hernández-Fernández},
  doi      = {10.1016/j.biosystems.2013.01.004},
  year     = {2013},
  journal  = {Biosystems},
  volume   = {111},
  number   = {3},
  pages    = {216-217},
  keywords = {genomes, Menzerath's law}
}

@article{Baixeries2012b,
  title    = {The parameters of Menzerath-Altmann law in genomes},
  author   = {J Baixeries and A Hernández-Fernández and N Forns and R Ferrer-i-Cancho},
  doi      = {10.1080/09296174.2013.773141},
  year     = {2013},
  journal  = {Journal of Quantitative Linguistics},
  volume   = {20},
  number   = {2},
  pages    = {94-104},
  abstract = {The relationship between the size of the whole and the size of the parts in language and music is known to follow the Menzerath-Altmann law at many levels of description (morphemes, words, sentences,...). Qualitatively, the law states that the larger the whole, the smaller its parts, e.g. the longer a word (in syllables) the shorter its syllables (in letters or phonemes). This patterning has also been found in genomes: the longer a genome (in chromosomes), the shorter its chromosomes (in base pairs). However, it has been argued recently that mean chromosome length is trivially a pure power function of chromosome number with an exponent of -1. The functional dependency between mean chromosome size and chromosome number in groups of organisms from three different kingdoms is studied. The fit of a pure power function yields exponents between -1.6 and 0.1. It is shown that an exponent of -1 is unlikely for fungi, gymnosperm plants, insects, reptiles, ray-finned fishes and amphibians. Even when the exponent is very close to -1, adding an exponential component is able to yield a better fit with regard to a pure power-law in plants, mammals, ray-finned fishes and amphibians. The parameters of the Menzerath-Altmann law in genomes deviate significantly from a power law with a -1 exponent with the exception of birds and cartilaginous fishes.},
  keywords = {genomes, Menzerath's law}
}

@article{Baixeries2012c,
  title    = {The evolution of the exponent of Zipf's law in language ontogeny},
  author   = {J Baixeries and B Elvevåg and R Ferrer-i-Cancho},
  doi      = {10.1371/journal.pone.0053227},
  year     = {2013},
  journal  = {PLoS ONE},
  volume   = {8},
  number   = {3},
  pages    = {e53227},
  abstract = {It is well-known that word frequencies arrange themselves according to Zipf's law. However, little is known about the dependency of the parameters of the law and the complexity of a communication system. Many models of the evolution of language assume that the exponent of the law remains constant as the complexity of a communication system increases. Using longitudinal studies of child language, we analysed the word rank distribution for the speech of children and adults participating in conversations. The adults typically included family members (e.g., parents) or the investigators conducting the research. Our analysis of the evolution of Zipf's law yields two main unexpected results. First, in children the exponent of the law tends to decrease over time while this tendency is weaker in adults, thus suggesting this is not a mere mirror effect of adult speech. Second, although the exponent of the law is more stable in adults, their exponents fall below 1 which is the typical value of the exponent assumed in both children and adults. Our analysis also shows a tendency of the mean length of utterances (MLU), a simple estimate of syntactic complexity, to increase as the exponent decreases. The parallel evolution of the exponent and a simple indicator of syntactic complexity (MLU) supports the hypothesis that the exponent of Zipf's law and linguistic complexity are inter-related. The assumption that Zipf's law for word ranks is a power-law with a constant exponent of one in both adults and children needs to be revised.},
  keywords = {child language, vocabulary learning, Zipf's law for word frequencies}
}

@article{Baronchelli2013a,
  title    = {Networks in cognitive science},
  author   = {A Baronchelli and R Ferrer-i-Cancho and R Pastor-Satorras and N Chater and M H Christiansen},
  doi      = {10.1016/j.tics.2013.04.010},
  year     = {2013},
  journal  = {Trends in Cognitive Sciences},
  volume   = {17},
  pages    = {348-360},
  abstract = {Networks of interconnected nodes have long played a key role in Cognitive Science, from artificial neural networks to spreading activation models of semantic memory. Recently, however, a new Network Science has been developed, providing insights into the emergence of global, system-scale properties in contexts as diverse as the Internet, metabolic reactions, and collaborations among scientists. Today, the inclusion of network theory into Cognitive Sciences, and the expansion of complex-systems science, promises to significantly change the way in which the organization and dynamics of cognitive and behavioral processes are understood. In this paper, we review recent contributions of network theory at different levels and domains within the Cognitive Sciences.},
  keywords = {network science}
}

@article{Ferrer2013a,
  title    = {Constant conditional entropy and related hypotheses},
  author   = {R Ferrer-i-Cancho and Ł Dębowski and F Moscoso del Prado Martín},
  doi      = {10.1088/1742-5468/2013/07/L07001},
  year     = {2013},
  journal  = {Journal of Statistical Mechanics},
  pages    = {L07001},
  abstract = {Constant entropy rate (conditional entropies must remain constant as the sequence length increases) and uniform information density (conditional probabilities must remain constant as the sequence length increases) are two information theoretic principles that are argued to underlie a wide range of linguistic phenomena. Here we revise the predictions of these principles in the light of Hilberg's law on the scaling of conditional entropy in language and related laws. We show that constant entropy rate (CER) and two interpretations for uniform information density (UID), full UID and strong UID, are inconsistent with these laws. Strong UID implies CER but the reverse is not true. Full UID, a particular case of UID, leads to costly uncorrelated sequences that are totally unrealistic. We conclude that CER and its particular cases are incomplete hypotheses about the scaling of conditional entropies.},
  keywords = {information theory}
}

@article{Ferrer2013b,
  title    = {Hubiness, length, crossings and their relationships in dependency trees},
  author   = {R Ferrer-i-Cancho},
  url      = {http://hdl.handle.net/2117/176972},
  year     = {2013},
  journal  = {Glottometrics},
  volume   = {25},
  pages    = {1-21},
  abstract = {Here tree dependency structures are studied from three different perspectives: their degree variance (hubiness), the mean dependency length and the number of dependency crossings. Bounds that reveal pairwise dependencies among these three metrics are derived. Hubiness (the variance of degrees) plays a central role: the mean dependency length is bounded below by hubiness while the number of crossings is bounded above by hubiness. Our findings suggest that the online memory cost of a sentence might be determined not just by the ordering of words but also by the hubiness of the underlying structure. The 2nd moment of degree plays a crucial role that is reminiscent of its role in large complex networks.},
  keywords = {network science}
}

Ferrer-i-Cancho, R; Hernández-Fernández, A The failure of the law of brevity in two New World primates. Statistical caveats Journal Article Glottotheory, 4 (1), 2013. @article{Ferrer2012a, title = {The failure of the law of brevity in two New World primates. Statistical caveats}, author = {R Ferrer-i-Cancho and A Hernández-Fernández}, doi = {10.1524/glot.2013.0004}, year = {2013}, date = {2013-01-01}, journal = {Glottotheory}, volume = {4}, number = {1}, abstract = {Parallels of Zipf's law of brevity, the tendency of more frequent words to be shorter, have been found in bottlenose dolphins and Formosan macaques. Although these findings suggest that behavioral repertoires are shaped by a general principle of compression, common marmosets and golden-backed uakaris do not exhibit the law. However, we argue that the law may be impossible or difficult to detect statistically in a given species if the repertoire is too small, a problem that could be affecting golden-backed uakaris, and show that the law is present in a subset of the repertoire of common marmosets. We suggest that the visibility of the law will depend on the subset of the repertoire under consideration or the repertoire size.}, keywords = {}, pubstate = {published}, tppubtype = {article} }

Ferrer-i-Cancho, R; Hernández-Fernández, A; Lusseau, D; Agoramoorthy, G; Hsu, M J; Semple, S Compression as a universal principle of animal behavior Journal Article Cognitive Science, 37 (8), pp. 1565-1578, 2013. @article{Ferrer2012d, title = {Compression as a universal principle of animal behavior}, author = {R Ferrer-i-Cancho and A Hernández-Fernández and D Lusseau and G Agoramoorthy and M J Hsu and S Semple}, doi = {10.1111/cogs.12061}, year = {2013}, date = {2013-01-01}, journal = {Cognitive Science}, volume = {37}, number = {8}, pages = {1565-1578}, abstract = {A key aim in biology and psychology is to identify fundamental principles underpinning the behavior of animals, including humans. Analyses of human language and the behavior of a range of non-human animal species have provided evidence for a common pattern underlying diverse behavioral phenomena: words follow Zipf's law of brevity (the tendency of more frequently used words to be shorter), and conformity to this general pattern has been seen in the behavior of a number of other animals. It has been argued that the presence of this law is a sign of efficient coding in the information theoretic sense. However, no strong direct connection has been demonstrated between the law and compression, the information theoretic principle of minimizing the expected length of a code. Here, we show that minimizing the expected code length implies that the length of a word cannot increase as its frequency increases. Furthermore, we show that the mean code length or duration is significantly small in human language, and also in the behavior of other species in all cases where agreement with the law of brevity has been found. We argue that compression is a general principle of animal behavior that reflects selection for efficiency of coding.}, keywords = {information theory, Zipf's law of abbreviation}, pubstate = {published}, tppubtype = {article} }
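The core claim that minimizing expected code length forces shorter codes onto more frequent items is an instance of the rearrangement inequality, and can be checked exhaustively on a toy repertoire. A minimal sketch with illustrative numbers of our own choosing:

```python
from itertools import permutations

def expected_length(probs, lengths):
    # mean code length under a given probability/length pairing
    return sum(p * l for p, l in zip(probs, lengths))

# toy repertoire: probabilities sorted most-to-least frequent,
# available code lengths sorted shortest-to-longest (illustrative values)
probs = [0.5, 0.3, 0.2]
lengths = [1, 2, 3]

# brevity-conforming assignment: frequent items get the short codes
brevity = expected_length(probs, lengths)

# no other assignment of the same lengths achieves a smaller mean length
assert all(expected_length(probs, perm) >= brevity
           for perm in permutations(lengths))
```

Here `brevity` is 0.5·1 + 0.3·2 + 0.2·3 = 1.7, and every one of the 3! alternative pairings of the same lengths is at least as costly.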

Ferrer-i-Cancho, R; Forns, N; Hernández-Fernández, A; Bel-Enguix, G; Baixeries, J The challenges of statistical patterns of language: the case of Menzerath's law in genomes Journal Article Complexity, 18 (3), pp. 11-17, 2013. @article{Ferrer2012f, title = {The challenges of statistical patterns of language: the case of Menzerath's law in genomes}, author = {R Ferrer-i-Cancho and N Forns and A Hernández-Fernández and G Bel-Enguix and J Baixeries}, doi = {10.1002/cplx.21429}, year = {2013}, date = {2013-01-01}, journal = {Complexity}, volume = {18}, number = {3}, pages = {11-17}, abstract = {The importance of statistical patterns of language has been debated over decades. Although Zipf's law is perhaps the most popular case, recently Menzerath's law has begun to attract attention as well. Menzerath's law manifests in language, music and genomes as a tendency of the mean size of the parts to decrease as the number of parts increases in many situations. This statistical regularity emerges also in the context of genomes, for instance, as a tendency of species with more chromosomes to have a smaller mean chromosome size. It has been argued that the instantiation of this law in genomes is not indicative of any parallel between language and genomes because (a) the law is inevitable and (b) noncoding DNA dominates genomes. Here mathematical, statistical, and conceptual challenges of these criticisms are discussed. Two major conclusions are drawn: the law is not inevitable and languages also have a correlate of noncoding DNA. However, the wide range of manifestations of the law in and outside genomes suggests that the striking similarities between noncoding DNA and certain linguistic units could be anecdotal for understanding the recurrence of that statistical law.}, keywords = {genomes, Menzerath's law}, pubstate = {published}, tppubtype = {article} }

## 2012

Baixeries, J; Hernández-Fernández, A; Ferrer-i-Cancho, R Random models of Menzerath-Altmann law in genomes Journal Article Biosystems, 107, pp. 167-173, 2012. @article{Baixeries2012a, title = {Random models of Menzerath-Altmann law in genomes}, author = {J Baixeries and A Hernández-Fernández and R Ferrer-i-Cancho}, doi = {10.1016/j.biosystems.2011.11.010}, year = {2012}, date = {2012-01-01}, journal = {Biosystems}, volume = {107}, pages = {167-173}, abstract = {Recently, a random breakage model has been proposed to explain the negative correlation between mean chromosome length and chromosome number that is found in many groups of species and is consistent with Menzerath-Altmann law, a statistical law that defines the dependency between the mean size of the whole and the number of parts in quantitative linguistics. Here, the central assumption of the model, namely that genome size is independent of chromosome number, is reviewed. This assumption is shown to be unrealistic from the perspective of chromosome structure and the statistical analysis of real genomes. A general class of random models, including that random breakage model, is analyzed. For any model within this class, a power law with an exponent of -1 is predicted for the expectation of the mean chromosome size as a function of chromosome number, a functional dependency that is not supported by real genomes. The random breakage model and variants keeping genome size and chromosome number independent raise no serious objection to the relevance of correlations consistent with Menzerath-Altmann law across taxonomic groups and the possibility of a connection between human language and genomes through that law.}, keywords = {genomes, Menzerath's law}, pubstate = {published}, tppubtype = {article} }
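The exponent of -1 in that class of models follows from a simple accounting identity: if genome size G is held independent of chromosome number n, the mean chromosome size is exactly G/n no matter where the random breakpoints fall. A toy simulation of our own (not the authors' code) makes this concrete:

```python
import random

def mean_piece_size(genome_size, n_chromosomes, seed=0):
    # break the interval [0, genome_size] at n-1 uniformly random points
    rng = random.Random(seed)
    cuts = sorted(rng.uniform(0, genome_size) for _ in range(n_chromosomes - 1))
    sizes = [b - a for a, b in zip([0.0] + cuts, cuts + [genome_size])]
    # the pieces always sum to the genome size, so the mean is exactly G/n,
    # i.e. a power law in n with exponent -1, for ANY placement of the cuts
    return sum(sizes) / len(sizes)
```

For example, `mean_piece_size(1.0, 10)` is 0.1 and `mean_piece_size(3.0, 4)` is 0.75, regardless of the random seed, which is why real genomes deviating from the -1 law undermine the independence assumption rather than Menzerath-Altmann law.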

Ferrer-i-Cancho, R; McCowan, B The span of dependencies in dolphin whistle sequences Journal Article Journal of Statistical Mechanics, pp. P06002, 2012. @article{Ferrer2012c, title = {The span of dependencies in dolphin whistle sequences}, author = {R Ferrer-i-Cancho and B McCowan}, doi = {10.1088/1742-5468/2012/06/P06002}, year = {2012}, date = {2012-01-01}, journal = {Journal of Statistical Mechanics}, pages = {P06002}, abstract = {Long-range correlations are found in symbolic sequences from human language, music and DNA. Determining the span of correlations in dolphin whistle sequences is crucial for shedding light on their communicative complexity. Dolphin whistles share various statistical properties with human words, i.e. Zipf's law for word frequencies (namely that the probability of the i-th most frequent word of a text is about i^-a) and a parallel of the tendency of more frequent words to have more meanings. The finding of Zipf's law for word frequencies in dolphin whistles has been the topic of an intense debate on its implications. One of the major arguments against the relevance of Zipf's law in dolphin whistles is that it is not possible to distinguish the outcome of a die-rolling experiment from that of a linguistic or communicative source producing Zipf's law for word frequencies. Here we show that statistically significant whistle-whistle correlations extend back to the second previous whistle in the sequence, using a global randomization test, and to the fourth previous whistle, using a local randomization test. None of these correlations are expected by a die-rolling experiment and other simple explanations of Zipf's law for word frequencies, such as Simon's model, that produce sequences of unpredictable elements.}, keywords = {information theory}, pubstate = {published}, tppubtype = {article} }
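A global randomization test of the kind mentioned in this abstract can be sketched as a permutation test: compute a lag-k dependency statistic on the observed sequence and compare it against its distribution over random shuffles. The statistic and names below are illustrative choices of ours, not the authors' exact measure:

```python
import random

def lag_matches(seq, k):
    # number of positions whose symbol is repeated k steps later
    return sum(a == b for a, b in zip(seq, seq[k:]))

def permutation_p_value(seq, k, n_shuffles=999, seed=0):
    # global randomization test: shuffle the whole sequence and count how
    # often the shuffled statistic reaches the observed one
    rng = random.Random(seed)
    observed = lag_matches(seq, k)
    pool = list(seq)
    exceed = 0
    for _ in range(n_shuffles):
        rng.shuffle(pool)
        exceed += lag_matches(pool, k) >= observed
    return (exceed + 1) / (n_shuffles + 1)   # one-sided p-value
```

On a strongly periodic toy sequence such as `list("ab" * 20)` the lag-2 statistic is far above what shuffles produce, so the p-value is small; on a constant sequence every shuffle ties the observed value and the p-value is 1.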

## 2011

Ferrer-i-Cancho, R; Moscoso del Prado Martín, F Information content versus word length in random typing Journal Article Journal of Statistical Mechanics, pp. L12002, 2011. @article{Ferrer2011c, title = {Information content versus word length in random typing}, author = {R Ferrer-i-Cancho and F Moscoso del Prado Martín}, doi = {10.1088/1742-5468/2011/12/L12002}, year = {2011}, date = {2011-01-01}, journal = {Journal of Statistical Mechanics}, pages = {L12002}, abstract = {Recently, it has been claimed that a linear relationship between a measure of information content and word length is expected from word length optimization and it has been shown that this linearity is supported by a strong correlation between information content and word length in many languages (Piantadosi et al 2011 Proc. Nat. Acad. Sci. 108 3825). Here, we study in detail some connections between this measure and standard information theory. The relationship between the measure and word length is studied for the popular random typing process where a text is constructed by pressing keys at random from a keyboard containing letters and a space behaving as a word delimiter. Although this random process does not optimize word lengths according to information content, it exhibits a linear relationship between information content and word length. The exact slope and intercept are presented for three major variants of the random typing process. A strong correlation between information content and word length can simply arise from the units making a word (e.g., letters) and not necessarily from the interplay between a word and its context as proposed by Piantadosi and co-workers. In itself, the linear relation does not entail the results of any optimization process.}, keywords = {information theory}, pubstate = {published}, tppubtype = {article} }
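The random typing process the article analyzes is easy to simulate: keys are pressed at random and the space bar acts as a word delimiter. The sketch below (our own toy parameters) shows the kind of length-frequency dependence that emerges without any optimization:

```python
import random
from collections import Counter

def random_typing(n_keystrokes, alphabet="abc", p_space=0.3, seed=1):
    # press keys at random; a space ends the current word
    rng = random.Random(seed)
    chars = [" " if rng.random() < p_space else rng.choice(alphabet)
             for _ in range(n_keystrokes)]
    return "".join(chars).split()

words = random_typing(200_000)
freq = Counter(words)

# average token frequency of word types, grouped by word length
by_len = {}
for w, f in freq.items():
    by_len.setdefault(len(w), []).append(f)
mean_freq = {L: sum(fs) / len(fs) for L, fs in by_len.items()}

# shorter word types come out more frequent, with no optimization at all
assert mean_freq[1] > mean_freq[2] > mean_freq[3]
```

Each word type of length L has probability proportional to ((1 - p_space) / |alphabet|)^L under this process, so frequency decays with length purely from the units making up a word, which is the article's point about the correlation not implying optimization.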

Hernández-Fernández, A; Baixeries, J; Forns, N; Ferrer-i-Cancho, R Size of the whole versus number of parts in genomes Journal Article Entropy, 13, pp. 1465-1480, 2011. @article{Hernandez2011a, title = {Size of the whole versus number of parts in genomes}, author = {A Hernández-Fernández and J Baixeries and N Forns and R Ferrer-i-Cancho}, doi = {10.3390/e13081465}, year = {2011}, date = {2011-01-01}, journal = {Entropy}, volume = {13}, pages = {1465-1480}, abstract = {It is known that chromosome number tends to decrease as genome size increases in angiosperm plants. Here the relationship between number of parts (the chromosomes) and size of the whole (the genome) is studied for other groups of organisms from different kingdoms. Two major results are obtained. First, the finding of relationships of the kind "the more parts the smaller the whole" as in angiosperms, but also relationships of the kind "the more parts the larger the whole". Second, these dependencies are not linear in general. The implications of the dependencies between genome size and chromosome number are two-fold. First, they indicate that arguments against the relevance of the finding of negative correlations consistent with Menzerath-Altmann law (a linguistic law that relates the size of the parts with the size of the whole) in genomes are seriously flawed. Second, they unravel the weakness of a recent model of chromosome lengths based upon random breakage that assumes that chromosome number and genome size are independent.}, keywords = {}, pubstate = {published}, tppubtype = {article} }

## 2010

Kello, C T; Brown, G D A; Ferrer-i-Cancho, R; Holden, J G; Linkenkaer-Hansen, K; Rhodes, T; Van Orden, G C Scaling laws in cognitive sciences Journal Article Trends in Cognitive Sciences, 14 (5), pp. 223-232, 2010. @article{Kello2010a, title = {Scaling laws in cognitive sciences}, author = {C T Kello and G D A Brown and R Ferrer-i-Cancho and J G Holden and K Linkenkaer-Hansen and T Rhodes and G C Van Orden}, doi = {10.1016/j.tics.2010.02.005}, year = {2010}, date = {2010-01-01}, journal = {Trends in Cognitive Sciences}, volume = {14}, number = {5}, pages = {223-232}, abstract = {Scaling laws are ubiquitous in nature, and they pervade neural, behavioral and linguistic activities. A scaling law suggests the existence of processes or patterns that are repeated across scales of analysis. Although the variables that express a scaling law can vary from one type of activity to the next, the recurrence of scaling laws across so many different systems has prompted a search for unifying principles. In biological systems, scaling laws can reflect adaptive processes of various types and are often linked to complex systems poised near critical points. The same is true for perception, memory, language and other cognitive phenomena. Findings of scaling laws in cognitive science are indicative of scaling invariance in cognitive mechanisms and multiplicative interactions among interdependent components of cognition.}, keywords = {network science}, pubstate = {published}, tppubtype = {article} }

## 2009

Ferrer-i-Cancho, R; Gavaldà, R The frequency spectrum of finite samples from the intermittent silence process Journal Article Journal of the American Society for Information Science and Technology, 60 (4), pp. 837-843, 2009. @article{Ferrer2009a, title = {The frequency spectrum of finite samples from the intermittent silence process}, author = {R Ferrer-i-Cancho and R Gavaldà}, doi = {10.1002/asi.21033}, year = {2009}, date = {2009-01-01}, journal = {Journal of the American Society for Information Science and Technology}, volume = {60}, number = {4}, pages = {837-843}, keywords = {Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {article} }

Ferrer-i-Cancho, R; Forns, N The self-organization of genomes Journal Article Complexity, 15 (5), pp. 34-36, 2009. @article{Ferrer2009e, title = {The self-organization of genomes}, author = {R Ferrer-i-Cancho and N Forns}, doi = {10.1002/cplx.20296}, year = {2009}, date = {2009-01-01}, journal = {Complexity}, volume = {15}, number = {5}, pages = {34-36}, keywords = {genomes, Menzerath's law}, pubstate = {published}, tppubtype = {article} }

Ferrer-i-Cancho, R; McCowan, B A law of word meaning in dolphin whistle types Journal Article Entropy, 11 (4), pp. 688-701, 2009. @article{Ferrer2009f, title = {A law of word meaning in dolphin whistle types}, author = {R Ferrer-i-Cancho and B McCowan}, doi = {10.3390/e11040688}, year = {2009}, date = {2009-01-01}, journal = {Entropy}, volume = {11}, number = {4}, pages = {688-701}, keywords = {Zipf's meaning-frequency law}, pubstate = {published}, tppubtype = {article} }

Ferrer-i-Cancho, R; Lusseau, D Efficient coding in dolphin surface behavioral patterns Journal Article Complexity, 14 (5), pp. 23-25, 2009. @article{Ferrer2009g, title = {Efficient coding in dolphin surface behavioral patterns}, author = {R Ferrer-i-Cancho and D Lusseau}, doi = {10.1002/cplx.20266}, year = {2009}, date = {2009-01-01}, journal = {Complexity}, volume = {14}, number = {5}, pages = {23-25}, keywords = {Zipf's law of abbreviation}, pubstate = {published}, tppubtype = {article} }

Ferrer-i-Cancho, R; Elvevåg, B Random texts do not exhibit the real Zipf's-law-like rank distribution Journal Article PLoS ONE, 5 (4), pp. e9411, 2009. @article{Ferrer2009b, title = {Random texts do not exhibit the real Zipf's-law-like rank distribution}, author = {R Ferrer-i-Cancho and B Elvevåg}, doi = {10.1371/journal.pone.0009411}, year = {2009}, date = {2009-01-01}, journal = {PLoS ONE}, volume = {5}, number = {4}, pages = {e9411}, keywords = {Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {article} }

## 2008

Ferrer-i-Cancho, R; Lorenzo, G; Longa, V Long-distance dependencies are not uniquely human Incollection Smith, A D M; Smith, K; Ferrer-i-Cancho, R (Ed.): The Evolution of Language: Proceedings of the 7th International Conference (EVOLANG7), World Scientific Press, Singapore, 2008. @incollection{Ferrer2008a, title = {Long-distance dependencies are not uniquely human}, author = {R Ferrer-i-Cancho and G Lorenzo and V Longa}, editor = {A D M Smith and K Smith and R Ferrer-i-Cancho}, doi = {10.1142/9789812776129_0015}, year = {2008}, date = {2008-01-01}, booktitle = {The Evolution of Language: Proceedings of the 7th International Conference (EVOLANG7)}, publisher = {World Scientific Press}, address = {Singapore}, abstract = {It is widely assumed that long-distance dependencies between elements are a unique feature of human language. Here we review recent evidence of long-distance correlations in sequences produced by non-human species and discuss two evolutionary scenarios for the evolution of human language in the light of these findings. Applying their methodological framework, we conclude that some of Hauser, Chomsky and Fitch's central claims on language evolution are put into question to a different degree within each of those scenarios.}, keywords = {}, pubstate = {published}, tppubtype = {incollection} }

Ferrer-i-Cancho, R Information theory Incollection Hogan, P C (Ed.): The Cambridge encyclopedia of the language sciences, Cambridge University Press, 2008. @incollection{Ferrer2008b, title = {Information theory}, author = {R Ferrer-i-Cancho}, editor = {P C Hogan}, year = {2008}, date = {2008-01-01}, booktitle = {The Cambridge encyclopedia of the language sciences}, publisher = {Cambridge University Press}, keywords = {}, pubstate = {published}, tppubtype = {incollection} }

Ferrer-i-Cancho, R Network theory Incollection Hogan, P C (Ed.): The Cambridge encyclopedia of the language sciences, pp. 555-557, Cambridge University Press, 2008. @incollection{Ferrer2008c, title = {Network theory}, author = {R Ferrer-i-Cancho}, editor = {P C Hogan}, year = {2008}, date = {2008-01-01}, booktitle = {The Cambridge encyclopedia of the language sciences}, pages = {555-557}, publisher = {Cambridge University Press}, keywords = {network science}, pubstate = {published}, tppubtype = {incollection} }

Ferrer-i-Cancho, R; Hernández-Fernández, A Power laws and the golden number Incollection Kelih, E; Levickij, V; Altmann, G (Ed.): Problems of text analysis, 2008. @incollection{Ferrer2008d, title = {Power laws and the golden number}, author = {R Ferrer-i-Cancho and A Hernández-Fernández}, editor = {E Kelih and V Levickij and G Altmann}, year = {2008}, date = {2008-01-01}, booktitle = {Problems of text analysis}, keywords = {Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {incollection} }

Ferrer-i-Cancho, R Some word order biases from limited brain resources. A mathematical approach Journal Article Advances in Complex Systems, 11 (3), pp. 393-414, 2008. @article{Ferrer2008e, title = {Some word order biases from limited brain resources. A mathematical approach}, author = {R Ferrer-i-Cancho}, doi = {10.1142/S0219525908001702}, year = {2008}, date = {2008-01-01}, journal = {Advances in Complex Systems}, volume = {11}, number = {3}, pages = {393-414}, keywords = {word order}, pubstate = {published}, tppubtype = {article} }

Ferrer-i-Cancho, R Some limits of standard linguistic typology. The case of Cysouw's models for the frequencies of the six possible orderings of S, V and O Journal Article Advances in Complex Systems, 11 (3), pp. 421-432, 2008. Links | BibTeX | Tags: word order @article{Ferrer2008f, title = {Some limits of standard linguistic typology. The case of Cysouw's models for the frequencies of the six possible orderings of S, V and O}, author = {R Ferrer-i-Cancho}, doi = {10.1142/S0219525908001702}, year = {2008}, date = {2008-01-01}, journal = {Advances in Complex Systems}, volume = {11}, number = {3}, pages = {421-432}, keywords = {word order}, pubstate = {published}, tppubtype = {article} } |

Smith, Andrew D M; Smith, Kenny; Ferrer-i-Cancho, Ramon (Ed.) The Evolution of Language: Proceedings of the 7th International Conference (EVOLANG7) Book World Scientific Press, Singapore, 2008. BibTeX | Tags: evolutionary biology @book{Smith2008a, title = {The Evolution of Language: Proceedings of the 7th International Conference (EVOLANG7)}, editor = {Andrew D M Smith and Kenny Smith and Ramon Ferrer-i-Cancho}, year = {2008}, date = {2008-01-01}, publisher = {World Scientific Press}, address = {Singapore}, keywords = {evolutionary biology}, pubstate = {published}, tppubtype = {book} } |

## 2007 |

Ferrer-i-Cancho, R; Capocci, A; Caldarelli, G Spectral methods cluster words of the same class in a syntactic dependency network Journal Article International Journal of Bifurcation and Chaos, 17 (7), pp. 2453-2463, 2007. Abstract | Links | BibTeX | Tags: network science @article{Ferrer2005a, title = {Spectral methods cluster words of the same class in a syntactic dependency network}, author = {R Ferrer-i-Cancho and A Capocci and G Caldarelli}, doi = {10.1142/S021812740701852X}, year = {2007}, date = {2007-01-01}, journal = {International Journal of Bifurcation and Chaos}, volume = {17}, number = {7}, pages = {2453-2463}, abstract = {We analyze here a particular kind of linguistic network where vertices represent words and edges stand for syntactic relationships between words. The statistical properties of these networks have been recently studied and various features such as the small-world phenomenon and a scale-free distribution of degrees have been found. Our work focuses on four classes of words: verbs, nouns, adverbs and adjectives. Here, we use spectral methods for sorting vertices. We show that the ordering clusters words of the same class. For nouns and verbs, the cluster size distribution clearly follows a power-law distribution that cannot be explained by a null hypothesis. Long-range correlations are found between vertices in the ordering provided by the spectral method. The findings support the use of spectral methods for detecting community structure.}, keywords = {network science}, pubstate = {published}, tppubtype = {article} } We analyze here a particular kind of linguistic network where vertices represent words and edges stand for syntactic relationships between words. The statistical properties of these networks have been recently studied and various features such as the small-world phenomenon and a scale-free distribution of degrees have been found. Our work focuses on four classes of words: verbs, nouns, adverbs and adjectives. 
Here, we use spectral methods for sorting vertices. We show that the ordering clusters words of the same class. For nouns and verbs, the cluster size distribution clearly follows a power-law distribution that cannot be explained by a null hypothesis. Long-range correlations are found between vertices in the ordering provided by the spectral method. The findings support the use of spectral methods for detecting community structure. |

Ferrer-i-Cancho, R On the universality of Zipf's law for word frequencies Incollection Grzybek, P; Köhler, R (Ed.): Exact methods in the study of language and text. To honor Gabriel Altmann, pp. 131-140, Gruyter, Berlin, 2007. Links | BibTeX | Tags: information theory, Zipf's law for word frequencies @incollection{Ferrer2006a, title = {On the universality of Zipf's law for word frequencies}, author = {R Ferrer-i-Cancho}, editor = {P Grzybek and R Köhler}, doi = {10.1515/9783110894219.131}, year = {2007}, date = {2007-01-01}, booktitle = {Exact methods in the study of language and text. To honor Gabriel Altmann}, pages = {131-140}, publisher = {Gruyter}, address = {Berlin}, keywords = {information theory, Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {incollection} } |

Ferrer-i-Cancho, R; Díaz-Guilera, A The global minima of the communicative energy of natural communication systems Journal Article Journal of Statistical Mechanics, pp. P06009, 2007. Abstract | Links | BibTeX | Tags: information theory @article{Ferrer2007a, title = {The global minima of the communicative energy of natural communication systems}, author = {R Ferrer-i-Cancho and A Díaz-Guilera}, doi = {10.1088/1742-5468/2007/06/P06009}, year = {2007}, date = {2007-01-01}, journal = {Journal of Statistical Mechanics}, pages = {P06009}, abstract = {Until recently, models of communication have explicitly or implicitly assumed that the goal of a communication system is just maximizing the information transfer between signals and 'meanings'. Recently, it has been argued that a natural communication system not only has to maximize this quantity but also has to minimize the entropy of signals, which is a measure of the cognitive cost of using a word. The interplay between these two factors, i.e. maximization of the information transfer and minimization of the entropy, has been addressed previously using a Monte Carlo minimization procedure at zero temperature. Here we derive analytically the globally optimal communication systems that result from the interaction between these factors. We discuss the implications of our results for previous studies within this framework. In particular we prove that the emergence of Zipf's law using a Monte Carlo technique at zero temperature in previous studies indicates that the system had not reached the global optimum.}, keywords = {information theory}, pubstate = {published}, tppubtype = {article} } Until recently, models of communication have explicitly or implicitly assumed that the goal of a communication system is just maximizing the information transfer between signals and 'meanings'. 
Recently, it has been argued that a natural communication system not only has to maximize this quantity but also has to minimize the entropy of signals, which is a measure of the cognitive cost of using a word. The interplay between these two factors, i.e. maximization of the information transfer and minimization of the entropy, has been addressed previously using a Monte Carlo minimization procedure at zero temperature. Here we derive analytically the globally optimal communication systems that result from the interaction between these factors. We discuss the implications of our results for previous studies within this framework. In particular we prove that the emergence of Zipf's law using a Monte Carlo technique at zero temperature in previous studies indicates that the system had not reached the global optimum. |

Ferrer-i-Cancho, R; Mehler, A; Pustylnikov, O; Díaz-Guilera, A Correlations in the organization of large-scale syntactic dependency networks Inproceedings Proceedings of the workshop TextGraphs-2: Graph-based Methods for Natural Language Processing at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007), Rochester, New York, pp. 65-72, 2007. Abstract | Links | BibTeX | Tags: network science @inproceedings{Ferrer2007b, title = {Correlations in the organization of large-scale syntactic dependency networks}, author = {R Ferrer-i-Cancho and A Mehler and O Pustylnikov and A Díaz-Guilera}, url = {https://www.aclweb.org/anthology/W07-0210}, year = {2007}, date = {2007-01-01}, booktitle = {Proceedings of the workshop TextGraphs-2: Graph-based Methods for Natural Language Processing at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT 2007), Rochester, New York}, pages = {65-72}, abstract = {We study the correlations in the connectivity patterns of large scale syntactic dependency networks. These networks are induced from treebanks: their vertices denote word forms which occur as nuclei of dependency trees. Their edges connect pairs of vertices if at least two instance nuclei of these vertices are linked in the dependency structure of a sentence. We examine the syntactic dependency networks of seven languages. In all these cases, we consistently obtain three findings. Firstly, clustering, i.e., the probability that two vertices which are linked to a common vertex are linked on their part, is much higher than expected by chance. Secondly, the mean clustering of vertices decreases with their degree - this finding suggests the presence of a hierarchical network organization. 
Thirdly, the mean degree of the nearest neighbors of a vertex x tends to decrease as the degree of x grows - this finding indicates disassortative mixing in the sense that links tend to connect vertices of dissimilar degrees. Our results indicate the existence of common patterns in the large scale organization of syntactic dependency networks.}, keywords = {network science}, pubstate = {published}, tppubtype = {inproceedings} } We study the correlations in the connectivity patterns of large scale syntactic dependency networks. These networks are induced from treebanks: their vertices denote word forms which occur as nuclei of dependency trees. Their edges connect pairs of vertices if at least two instance nuclei of these vertices are linked in the dependency structure of a sentence. We examine the syntactic dependency networks of seven languages. In all these cases, we consistently obtain three findings. Firstly, clustering, i.e., the probability that two vertices which are linked to a common vertex are linked on their part, is much higher than expected by chance. Secondly, the mean clustering of vertices decreases with their degree - this finding suggests the presence of a hierarchical network organization. Thirdly, the mean degree of the nearest neighbors of a vertex x tends to decrease as the degree of x grows - this finding indicates disassortative mixing in the sense that links tend to connect vertices of dissimilar degrees. Our results indicate the existence of common patterns in the large scale organization of syntactic dependency networks. |
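The two diagnostics this abstract relies on, degree-dependent clustering and the mean degree of a vertex's nearest neighbours (the signature of disassortative mixing), can be computed directly from an edge list. A minimal stdlib sketch, not taken from the paper; function and variable names are ours, and vertex labels are assumed hashable:

```python
from collections import defaultdict

def neighbor_stats(edges):
    """Per-vertex clustering coefficient and mean nearest-neighbour
    degree, computed from an undirected edge list."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    clustering, knn = {}, {}
    for v, nbrs in adj.items():
        k = len(nbrs)
        # mean degree of v's neighbours (k_nn): low for hubs in a
        # disassortative network
        knn[v] = sum(len(adj[u]) for u in nbrs) / k
        if k < 2:
            clustering[v] = 0.0
        else:
            # count links among v's neighbours (each pair once)
            pairs = [(u, w) for u in nbrs for w in nbrs if id(u) < id(w) or u < w]
            links = sum(1 for u in nbrs for w in nbrs
                        if u < w and w in adj[u])
            clustering[v] = 2 * links / (k * (k - 1))
    return clustering, knn
```

On a star graph the hub has high degree but only degree-one neighbours, which is exactly the disassortative pattern the abstract reports for syntactic dependency networks.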

## 2006 |

Ferrer-i-Cancho, R When language breaks into pieces. A conflict between communication through isolated signals and language Journal Article Biosystems, 84 , pp. 242-253, 2006. Abstract | Links | BibTeX | Tags: information theory, network science @article{Ferrer2005e, title = {When language breaks into pieces. A conflict between communication through isolated signals and language}, author = {R Ferrer-i-Cancho}, doi = {10.1016/j.biosystems.2005.12.001}, year = {2006}, date = {2006-01-01}, journal = {Biosystems}, volume = {84}, pages = {242-253}, abstract = {Here, we study a communication model where signals associate to stimuli. The model assumes that signals follow Zipf’s law and the exponent of the law depends on a balance between maximizing the information transfer and saving the cost of signal use. We study the effect of tuning that balance on the structure of signal–stimulus associations. The model starts from two recent results. First, the exponent grows as the weight of information transfer increases. Second, a rudimentary form of language is obtained when the network of signal–stimulus associations is almost connected. Here, we show the existence of a sudden destruction of language once a critical balance is crossed. The model shows that maximizing the information transfer through isolated signals and language are in conflict. The model proposes a strong reason for not finding large exponents in complex communication systems: language is in danger. Besides, the findings suggest that human words may need to be ambiguous to keep language alive. Interestingly, the model predicts that large exponents should be associated to decreased synaptic density. It is not surprising that the largest exponents correspond to schizophrenic patients since, according to the spirit of Feinberg’s hypothesis, i.e. decreased synaptic density may lead to schizophrenia. 
Our findings suggest that the exponent of Zipf’s law is intimately related to language and that it could be used to detect anomalous structure and organization of the brain.}, keywords = {information theory, network science}, pubstate = {published}, tppubtype = {article} } Here, we study a communication model where signals associate to stimuli. The model assumes that signals follow Zipf’s law and the exponent of the law depends on a balance between maximizing the information transfer and saving the cost of signal use. We study the effect of tuning that balance on the structure of signal–stimulus associations. The model starts from two recent results. First, the exponent grows as the weight of information transfer increases. Second, a rudimentary form of language is obtained when the network of signal–stimulus associations is almost connected. Here, we show the existence of a sudden destruction of language once a critical balance is crossed. The model shows that maximizing the information transfer through isolated signals and language are in conflict. The model proposes a strong reason for not finding large exponents in complex communication systems: language is in danger. Besides, the findings suggest that human words may need to be ambiguous to keep language alive. Interestingly, the model predicts that large exponents should be associated to decreased synaptic density. It is not surprising that the largest exponents correspond to schizophrenic patients since, according to the spirit of Feinberg’s hypothesis, i.e. decreased synaptic density may lead to schizophrenia. Our findings suggest that the exponent of Zipf’s law is intimately related to language and that it could be used to detect anomalous structure and organization of the brain. |

Ferrer-i-Cancho, R; Lusseau, D Long-term correlations in the surface behavior of dolphins Journal Article Europhysics Letters, 74 (6), pp. 1095-1101, 2006. Abstract | Links | BibTeX | Tags: information theory @article{Ferrer2005h, title = {Long-term correlations in the surface behavior of dolphins}, author = {R Ferrer-i-Cancho and D Lusseau}, doi = {10.1209/epl/i2005-10596-9}, year = {2006}, date = {2006-01-01}, journal = {Europhysics Letters}, volume = {74}, number = {6}, pages = {1095-1101}, abstract = {Here we study the sequences of surface behavioral patterns of dolphins (Tursiops sp.) and find long-term correlations. We show that the long-term correlations are not of a trivial nature, i.e. they cannot be explained by the repetition of the same surface behavior many times in a row. Our findings suggest that dolphins have a long collective memory extending back at least to the 7-th past behavior. As far as we know, this is the first evidence of long-term correlations in the behavior of a non-human species.}, keywords = {information theory}, pubstate = {published}, tppubtype = {article} } Here we study the sequences of surface behavioral patterns of dolphins (Tursiops sp.) and find long-term correlations. We show that the long-term correlations are not of a trivial nature, i.e. they cannot be explained by the repetition of the same surface behavior many times in a row. Our findings suggest that dolphins have a long collective memory extending back at least to the 7-th past behavior. As far as we know, this is the first evidence of long-term correlations in the behavior of a non-human species. |

Ferrer-i-Cancho, R Why do syntactic links not cross? Journal Article Europhysics Letters, 76 (6), pp. 1228-1235, 2006. Abstract | Links | BibTeX | Tags: network science, word order @article{Ferrer2006d, title = {Why do syntactic links not cross?}, author = {R Ferrer-i-Cancho}, doi = {10.1209/epl/i2006-10406-0}, year = {2006}, date = {2006-01-01}, journal = {Europhysics Letters}, volume = {76}, number = {6}, pages = {1228-1235}, abstract = {Here we study the arrangement of vertices of trees in a 1-dimensional Euclidean space when the Euclidean distance between linked vertices is minimized. We conclude that links are unlikely to cross when drawn over the vertex sequence. This finding suggests that the uncommonness of crossings in the trees specifying the syntactic structure of sentences could be a side-effect of minimizing the Euclidean distance between syntactically related words. As far as we know, nobody has provided a successful explanation of such a surprisingly universal feature of languages that was discovered in the 60s of the past century by Hays and Lecerf. On the one hand, support for the role of distance minimization in avoiding edge crossings comes from statistical studies showing that the Euclidean distance between syntactically linked words of real sentences is minimized or constrained to a small value. On the other hand, that distance is considered a measure of the cost of syntactic relationships in various frameworks. By cost, we mean the amount of computational resources needed by the brain. The absence of crossings in syntactic trees may be universal just because all human brains have limited resources.}, keywords = {network science, word order}, pubstate = {published}, tppubtype = {article} } Here we study the arrangement of vertices of trees in a 1-dimensional Euclidean space when the Euclidean distance between linked vertices is minimized. We conclude that links are unlikely to cross when drawn over the vertex sequence. 
This finding suggests that the uncommonness of crossings in the trees specifying the syntactic structure of sentences could be a side-effect of minimizing the Euclidean distance between syntactically related words. As far as we know, nobody has provided a successful explanation of such a surprisingly universal feature of languages that was discovered in the 60s of the past century by Hays and Lecerf. On the one hand, support for the role of distance minimization in avoiding edge crossings comes from statistical studies showing that the Euclidean distance between syntactically linked words of real sentences is minimized or constrained to a small value. On the other hand, that distance is considered a measure of the cost of syntactic relationships in various frameworks. By cost, we mean the amount of computational resources needed by the brain. The absence of crossings in syntactic trees may be universal just because all human brains have limited resources. |
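The two quantities this abstract relates, the sum of positional distances between linked vertices and the number of edge crossings over the vertex sequence, are straightforward to compute for a given linear arrangement. A minimal sketch under our own representation (edge list plus vertex ordering; function names are ours, not the paper's):

```python
from itertools import combinations

def edge_length_sum(edges, order):
    """Sum of |pos(u) - pos(v)| over the edges of a graph whose
    vertices are placed on a line in the given order."""
    pos = {v: i for i, v in enumerate(order)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

def count_crossings(edges, order):
    """Number of edge pairs that cross when edges are drawn as arcs
    above the vertex sequence: spans (a,b) and (c,d) with a<b, c<d
    cross iff one strictly interleaves the other."""
    pos = {v: i for i, v in enumerate(order)}
    spans = [tuple(sorted((pos[u], pos[v]))) for u, v in edges]
    return sum(1 for (a, b), (c, d) in combinations(spans, 2)
               if a < c < b < d or c < a < d < b)
```

For the path 0-1-2-3, the identity ordering gives total length 3 and no crossings, while swapping the two middle vertices raises the total length to 5 and introduces a crossing, a toy instance of the paper's point that minimizing edge lengths tends to suppress crossings.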

## 2005 |

Ferrer-i-Cancho, R Decoding least effort and scaling in signal frequency distributions Journal Article Physica A, 345 , pp. 275-284, 2005. Abstract | Links | BibTeX | Tags: information theory, Zipf's law for word frequencies @article{Ferrer2003c, title = {Decoding least effort and scaling in signal frequency distributions}, author = {R Ferrer-i-Cancho}, doi = {10.1016/j.physa.2004.06.158}, year = {2005}, date = {2005-01-01}, journal = {Physica A}, volume = {345}, pages = {275-284}, abstract = {Here, assuming a general communication model where objects map to signals, a power function for the distribution of signal frequencies is derived. The model relies on the satisfaction of the receiver (hearer) communicative needs when the entropy of the number of objects per signal is maximized. Evidence of power distributions in a linguistic context (some of them with exponents clearly different from the typical $\beta \approx 2$ of Zipf's law) is reviewed and expanded. We support the view that Zipf's law reflects some sort of optimization but following a novel realistic approach where signals (e.g. words) are used according to the objects (e.g. meanings) they are linked to. Our results strongly suggest that many systems in nature use non-trivial strategies for easing the interpretation of a signal. Interestingly, constraining just the number of interpretations of signals does not lead to scaling.}, keywords = {information theory, Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {article} } Here, assuming a general communication model where objects map to signals, a power function for the distribution of signal frequencies is derived. The model relies on the satisfaction of the receiver (hearer) communicative needs when the entropy of the number of objects per signal is maximized. Evidence of power distributions in a linguistic context (some of them with exponents clearly different from the typical $\beta \approx 2$ of Zipf's law) is reviewed and expanded. 
We support the view that Zipf's law reflects some sort of optimization but following a novel realistic approach where signals (e.g. words) are used according to the objects (e.g. meanings) they are linked to. Our results strongly suggest that many systems in nature use non-trivial strategies for easing the interpretation of a signal. Interestingly, constraining just the number of interpretations of signals does not lead to scaling. |

Ferrer-i-Cancho, R The variation of Zipf's law in human language Journal Article European Physical Journal B, 44 , pp. 249-257, 2005. Abstract | Links | BibTeX | Tags: information theory @article{Ferrer2004a, title = {The variation of Zipf's law in human language}, author = {R Ferrer-i-Cancho}, doi = {10.1140/epjb/e2005-00121-8}, year = {2005}, date = {2005-01-01}, journal = {European Physical Journal B}, volume = {44}, pages = {249-257}, abstract = {Words in humans follow the so-called Zipf’s law. More precisely, the word frequency spectrum follows a power function, whose typical exponent is β≈2, but significant variations are found. We hypothesize that the full range of variation reflects our ability to balance the goal of communication, i.e. maximizing the information transfer and the cost of communication, imposed by the limitations of the human brain. We show that the higher the importance of satisfying the goal of communication, the higher the exponent. Here, assuming that words are used according to their meaning we explain why variation in β should be limited to a particular domain. On the one hand, we explain a non-trivial lower bound at about β=1.6 for communication systems neglecting the goal of the communication. On the other hand, we find a sudden divergence of β if a certain critical balance is crossed. At the same time a sharp transition to maximum information transfer and unfortunately, maximum communication cost, is found. Consistently with the upper bound of real exponents, the maximum finite value predicted is about β=2.4. It is convenient for human language not to cross the transition and remain in a domain where maximum information transfer is high but at a reasonable cost. Therefore, only a particular range of exponents should be found in human speakers. 
The exponent β contains information about the balance between cost and communicative efficiency.}, keywords = {information theory}, pubstate = {published}, tppubtype = {article} } Words in humans follow the so-called Zipf’s law. More precisely, the word frequency spectrum follows a power function, whose typical exponent is β≈2, but significant variations are found. We hypothesize that the full range of variation reflects our ability to balance the goal of communication, i.e. maximizing the information transfer and the cost of communication, imposed by the limitations of the human brain. We show that the higher the importance of satisfying the goal of communication, the higher the exponent. Here, assuming that words are used according to their meaning we explain why variation in β should be limited to a particular domain. On the one hand, we explain a non-trivial lower bound at about β=1.6 for communication systems neglecting the goal of the communication. On the other hand, we find a sudden divergence of β if a certain critical balance is crossed. At the same time a sharp transition to maximum information transfer and unfortunately, maximum communication cost, is found. Consistently with the upper bound of real exponents, the maximum finite value predicted is about β=2.4. It is convenient for human language not to cross the transition and remain in a domain where maximum information transfer is high but at a reasonable cost. Therefore, only a particular range of exponents should be found in human speakers. The exponent β contains information about the balance between cost and communicative efficiency. |

Ferrer-i-Cancho, R Zipf's law from a communicative phase transition Journal Article European Physical Journal B, 47 , pp. 449-457, 2005. Abstract | Links | BibTeX | Tags: information theory @article{Ferrer2004e, title = {Zipf's law from a communicative phase transition}, author = {R Ferrer-i-Cancho}, doi = {10.1140/epjb/e2005-00340-y}, year = {2005}, date = {2005-01-01}, journal = {European Physical Journal B}, volume = {47}, pages = {449-457}, abstract = {Here we present a new model for Zipf's law in human word frequencies. The model defines the goal and the cost of communication using information theory. The model shows a continuous phase transition from a no communication to a perfect communication phase. Scaling consistent with Zipf's law is found in the boundary between phases. The exponents are consistent with minimizing the entropy of words. The model differs from a previous model [Ferrer i Cancho, Solé, Proc. Natl. Acad. Sci. USA 100, 788–791 (2003)] in two aspects. First, it assumes that the probability of experiencing a certain stimulus is controlled by the internal structure of the communication system rather than by the probability of experiencing it in the `outside' world, which makes it specially suitable for the speech of schizophrenics. Second, the exponent α predicted for the frequency versus rank distribution is in a range where α>1, which may explain that of some schizophrenics and some children, with α=1.5-1.6. Among the many models for Zipf's law, none explains Zipf's law for that particular range of exponents. In particular, two simplistic models fail to explain that particular range of exponents: intermittent silence and Simon's model. We support that Zipf's law in a communication system may maximize the information transfer under constraints.}, keywords = {information theory}, pubstate = {published}, tppubtype = {article} } Here we present a new model for Zipf's law in human word frequencies. 
The model defines the goal and the cost of communication using information theory. The model shows a continuous phase transition from a no communication to a perfect communication phase. Scaling consistent with Zipf's law is found in the boundary between phases. The exponents are consistent with minimizing the entropy of words. The model differs from a previous model [Ferrer i Cancho, Solé, Proc. Natl. Acad. Sci. USA 100, 788–791 (2003)] in two aspects. First, it assumes that the probability of experiencing a certain stimulus is controlled by the internal structure of the communication system rather than by the probability of experiencing it in the `outside' world, which makes it specially suitable for the speech of schizophrenics. Second, the exponent α predicted for the frequency versus rank distribution is in a range where α>1, which may explain that of some schizophrenics and some children, with α=1.5-1.6. Among the many models for Zipf's law, none explains Zipf's law for that particular range of exponents. In particular, two simplistic models fail to explain that particular range of exponents: intermittent silence and Simon's model. We support that Zipf's law in a communication system may maximize the information transfer under constraints. |

Ferrer-i-Cancho, R; Riordan, O; Bollobás, B The consequences of Zipf's law for syntax and symbolic reference Journal Article Proceedings of the Royal Society of London B, 272 , pp. 561-565, 2005. Abstract | Links | BibTeX | Tags: network science, Zipf's law for word frequencies @article{Ferrer2004f, title = {The consequences of Zipf's law for syntax and symbolic reference}, author = {R Ferrer-i-Cancho and O Riordan and B Bollobás}, doi = {10.1098/rspb.2004.2957}, year = {2005}, date = {2005-01-01}, journal = {Proceedings of the Royal Society of London B}, volume = {272}, pages = {561-565}, abstract = {Although many species possess rudimentary communication systems, humans seem to be unique with regard to making use of syntax and symbolic reference. Recent approaches to the evolution of language formalize why syntax is selectively advantageous compared with isolated signal communication systems, but do not explain how signals naturally combine. Even more recent work has shown that if a communication system maximizes communicative efficiency while minimizing the cost of communication, or if a communication system constrains ambiguity in a non-trivial way while a certain entropy is maximized, signal frequencies will be distributed according to Zipf's law. Here we show that such communication principles give rise not only to signals that have many traits in common with the linking words in real human languages, but also to a rudimentary sort of syntax and symbolic reference.}, keywords = {network science, Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {article} } Although many species possess rudimentary communication systems, humans seem to be unique with regard to making use of syntax and symbolic reference. Recent approaches to the evolution of language formalize why syntax is selectively advantageous compared with isolated signal communication systems, but do not explain how signals naturally combine. 
Even more recent work has shown that if a communication system maximizes communicative efficiency while minimizing the cost of communication, or if a communication system constrains ambiguity in a non-trivial way while a certain entropy is maximized, signal frequencies will be distributed according to Zipf's law. Here we show that such communication principles give rise not only to signals that have many traits in common with the linking words in real human languages, but also to a rudimentary sort of syntax and symbolic reference. |

Ferrer-i-Cancho, R; Servedio, V D P Can simple models explain Zipf's law for all exponents? Journal Article Glottometrics, 11 , pp. 1-8, 2005. Abstract | Links | BibTeX | Tags: Zipf's law for word frequencies @article{Ferrer2005c, title = {Can simple models explain Zipf's law for all exponents?}, author = {R Ferrer-i-Cancho and V D P Servedio}, url = {http://hdl.handle.net/2117/176249}, year = {2005}, date = {2005-01-01}, journal = {Glottometrics}, volume = {11}, pages = {1-8}, abstract = {H. Simon proposed a simple stochastic process for explaining Zipf’s law for word frequencies. Here we introduce two similar generalizations of Simon’s model that cover the same range of exponents as the standard Simon model. The mathematical approach followed minimizes the amount of mathematical background needed for deriving the exponent, compared to previous approaches to the standard Simon’s model. Reviewing what is known from other simple explanations of Zipf’s law, we conclude there is no single radically simple explanation covering the whole range of variation of the exponent of Zipf’s law in humans. The meaningfulness of Zipf’s law for word frequencies remains an open question.}, keywords = {Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {article} } H. Simon proposed a simple stochastic process for explaining Zipf’s law for word frequencies. Here we introduce two similar generalizations of Simon’s model that cover the same range of exponents as the standard Simon model. The mathematical approach followed minimizes the amount of mathematical background needed for deriving the exponent, compared to previous approaches to the standard Simon’s model. Reviewing what is known from other simple explanations of Zipf’s law, we conclude there is no single radically simple explanation covering the whole range of variation of the exponent of Zipf’s law in humans. The meaningfulness of Zipf’s law for word frequencies remains an open question. |
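The standard Simon process this abstract builds on is simple to state: start from one word; at each step, with probability α coin a brand-new word, otherwise repeat an already-used word chosen with probability proportional to its current frequency. A minimal sketch (the function name and parameterization are ours, not the paper's):

```python
import random

def simon_model(num_tokens, alpha, seed=0):
    """Generate a token sequence via Simon's rich-get-richer process.
    Drawing a token uniformly from the text so far is equivalent to
    drawing a word type proportionally to its current frequency."""
    rng = random.Random(seed)
    text = [0]      # start with one occurrence of word 0
    next_word = 1   # words are numbered consecutively as they appear
    for _ in range(num_tokens - 1):
        if rng.random() < alpha:
            text.append(next_word)   # innovate: a new word
            next_word += 1
        else:
            text.append(rng.choice(text))  # imitate: copy a past token
    return text
```

In the long run the frequency spectrum of such a text follows a power law whose exponent is controlled by α, which is why the model covers only part of the empirical range of Zipf exponents, the limitation the paper examines.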

Ferrer-i-Cancho, R Hidden communication aspects inside the exponent of Zipf's law Journal Article Glottometrics, 11 , pp. 96-117, 2005. BibTeX | Tags: information theory @article{Ferrer2005d, title = {Hidden communication aspects inside the exponent of Zipf's law}, author = {R Ferrer-i-Cancho}, year = {2005}, date = {2005-01-01}, journal = {Glottometrics}, volume = {11}, pages = {96-117}, keywords = {information theory}, pubstate = {published}, tppubtype = {article} } |

Ferrer-i-Cancho, R The structure of syntactic dependency networks from recent advances in the study of linguistic networks Incollection Levickij, V; Altmann, G (Ed.): The problems in quantitative linguistics, pp. 60-75, Ruta, Chernivtsi, 2005. Abstract | BibTeX | Tags: network science @incollection{Ferrer2005f, title = {The structure of syntactic dependency networks from recent advances in the study of linguistic networks}, author = {R Ferrer-i-Cancho}, editor = {V Levickij and G Altmann}, year = {2005}, date = {2005-01-01}, booktitle = {The problems in quantitative linguistics}, pages = {60-75}, publisher = {Ruta}, address = {Chernivtsi}, abstract = {Complex networks have received substantial attention from physics recently. Here we review from a physics perspective the different linguistic networks that have been studied. We focus on syntactic dependency networks and summarize some recent new results that suggest new possible ways of understanding the universal properties of world languages.}, keywords = {network science}, pubstate = {published}, tppubtype = {incollection} } Complex networks have received substantial attention from physics recently. Here we review from a physics perspective the different linguistic networks that have been studied. We focus on syntactic dependency networks and summarize some recent new results that suggest new possible ways of understanding the universal properties of world languages. |

## 2004 |

Ferrer-i-Cancho, R; Solé, R V; Köhler, R Patterns in syntactic dependency networks Journal Article Physical Review E, 69 , pp. 051915, 2004. Abstract | Links | BibTeX | Tags: network science @article{Ferrer2003f, title = {Patterns in syntactic dependency networks}, author = {R Ferrer-i-Cancho and R V Solé and R Köhler}, doi = {10.1103/PhysRevE.69.051915}, year = {2004}, date = {2004-01-01}, journal = {Physical Review E}, volume = {69}, pages = {051915}, abstract = {Many languages are spoken on Earth. Despite their diversity, many robust language universals are known to exist. All languages share syntax, i.e., the ability of combining words for forming sentences. The origin of such traits is an issue of open debate. By using recent developments from the statistical physics of complex networks, we show that different syntactic dependency networks (from Czech, German, and Romanian) share many nontrivial statistical patterns such as the small world phenomenon, scaling in the distribution of degrees, and disassortative mixing. Such previously unreported features of syntax organization are not a trivial consequence of the structure of sentences, but an emergent trait at the global scale.}, keywords = {network science}, pubstate = {published}, tppubtype = {article} } Many languages are spoken on Earth. Despite their diversity, many robust language universals are known to exist. All languages share syntax, i.e., the ability of combining words for forming sentences. The origin of such traits is an issue of open debate. By using recent developments from the statistical physics of complex networks, we show that different syntactic dependency networks (from Czech, German, and Romanian) share many nontrivial statistical patterns such as the small world phenomenon, scaling in the distribution of degrees, and disassortative mixing. 
Such previously unreported features of syntax organization are not a trivial consequence of the structure of sentences, but an emergent trait at the global scale. |

Ferrer-i-Cancho, R Euclidean distance between syntactically linked words Journal Article Physical Review E, 70 , pp. 056135, 2004. Abstract | Links | BibTeX | Tags: network science, word order @article{Ferrer2004b, title = {Euclidean distance between syntactically linked words}, author = {R Ferrer-i-Cancho}, doi = {10.1103/PhysRevE.70.056135}, year = {2004}, date = {2004-01-01}, journal = {Physical Review E}, volume = {70}, pages = {056135}, abstract = {We study the Euclidean distance between syntactically linked words in sentences. The average distance is significantly small and is a very slowly growing function of sentence length. We consider two nonexcluding hypotheses: (a) the average distance is minimized and (b) the average distance is constrained. Support for (a) comes from the significantly small average distance real sentences achieve. The strength of the minimization hypothesis decreases with the length of the sentence. Support for (b) comes from the very slow growth of the average distance versus sentence length. Furthermore, (b) predicts, under ideal conditions, an exponential distribution of the distance between linked words, a trend that can be identified in real sentences.}, keywords = {network science, word order}, pubstate = {published}, tppubtype = {article} } We study the Euclidean distance between syntactically linked words in sentences. The average distance is significantly small and is a very slowly growing function of sentence length. We consider two nonexcluding hypotheses: (a) the average distance is minimized and (b) the average distance is constrained. Support for (a) comes from the significantly small average distance real sentences achieve. The strength of the minimization hypothesis decreases with the length of the sentence. Support for (b) comes from the very slow growth of the average distance versus sentence length. 
Furthermore, (b) predicts, under ideal conditions, an exponential distribution of the distance between linked words, a trend that can be identified in real sentences. |
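
The quantity behind the distance hypotheses above is easy to compute: place the words (vertices) on a line and sum the absolute position differences over the syntactic links; the average distance is this sum divided by the number of links. A minimal sketch (the star tree and the two arrangements are toy examples, not data from the paper):

```python
def sum_edge_lengths(edges, arrangement):
    """D = sum over edges (u, v) of |pi(u) - pi(v)|, where pi maps
    each vertex to its position in the linear arrangement."""
    pos = {v: i for i, v in enumerate(arrangement)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)

star = [(0, 1), (0, 2), (0, 3), (0, 4)]         # hub 0 linked to 4 leaves
print(sum_edge_lengths(star, [0, 1, 2, 3, 4]))  # hub at one end -> 10
print(sum_edge_lengths(star, [1, 2, 0, 3, 4]))  # hub centered   -> 6
```

Placing the hub centrally minimizes the sum for a star, a toy instance of the distance-minimization principle that the abstract tests on real sentences.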

## 2003 |

Ferrer-i-Cancho, R; Solé, R V Least effort and the origins of scaling in human language Journal Article Proceedings of the National Academy of Sciences USA, 100 , pp. 788-791, 2003. Abstract | Links | BibTeX | Tags: information theory, Zipf's law for word frequencies @article{Ferrer2002a, title = {Least effort and the origins of scaling in human language}, author = {R Ferrer-i-Cancho and R V Solé}, doi = {10.1073/pnas.0335980100}, year = {2003}, date = {2003-01-01}, journal = {Proceedings of the National Academy of Sciences USA}, volume = {100}, pages = {788-791}, abstract = {The emergence of a complex language is one of the fundamental events of human evolution, and several remarkable features suggest the presence of fundamental principles of organization. These principles seem to be common to all languages. The best known is the so-called Zipf's law, which states that the frequency of a word decays as a (universal) power law of its rank. The possible origins of this law have been controversial, and its meaningfulness is still an open question. In this article, the early hypothesis of Zipf of a principle of least effort for explaining the law is shown to be sound. Simultaneous minimization in the effort of both hearer and speaker is formalized with a simple optimization process operating on a binary matrix of signal–object associations. Zipf's law is found in the transition between referentially useless systems and indexical reference systems. Our finding strongly suggests that Zipf's law is a hallmark of symbolic reference and not a meaningless feature. The implications for the evolution of language are discussed. 
We explain how language evolution can take advantage of a communicative phase transition.}, keywords = {information theory, Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {article} } The emergence of a complex language is one of the fundamental events of human evolution, and several remarkable features suggest the presence of fundamental principles of organization. These principles seem to be common to all languages. The best known is the so-called Zipf's law, which states that the frequency of a word decays as a (universal) power law of its rank. The possible origins of this law have been controversial, and its meaningfulness is still an open question. In this article, the early hypothesis of Zipf of a principle of least effort for explaining the law is shown to be sound. Simultaneous minimization in the effort of both hearer and speaker is formalized with a simple optimization process operating on a binary matrix of signal–object associations. Zipf's law is found in the transition between referentially useless systems and indexical reference systems. Our finding strongly suggests that Zipf's law is a hallmark of symbolic reference and not a meaningless feature. The implications for the evolution of language are discussed. We explain how language evolution can take advantage of a communicative phase transition. |

Ferrer-i-Cancho, R; Solé, R V Optimization in complex networks Incollection Pastor-Satorras, R; Rubí, J M; Díaz-Guilera, A (Ed.): Statistical Mechanics of complex networks, 625 , pp. 114-125, Springer, Berlin, 2003. Abstract | Links | BibTeX | Tags: network science @incollection{Ferrer2003a, title = {Optimization in complex networks}, author = {R Ferrer-i-Cancho and R V Solé}, editor = {R Pastor-Satorras and J M Rubí and A Díaz-Guilera}, doi = {10.1007/b12331}, year = {2003}, date = {2003-01-01}, booktitle = {Statistical Mechanics of complex networks}, volume = {625}, pages = {114-125}, publisher = {Springer}, address = {Berlin}, series = {Lecture Notes in Physics}, abstract = {Many complex systems can be described in terms of networks of interacting units. Recent studies have shown that a wide class of both natural and artificial nets display a surprisingly widespread feature: the presence of highly heterogeneous distributions of links, providing an extraordinary source of robustness against perturbations. Although most theories concerning the origin of these topologies use growing graphs, here we show that a simple optimization process can also account for the observed regularities displayed by most complex nets. Using an evolutionary algorithm involving minimization of link density and average distance, four major types of networks are encountered: (a) sparse exponential-like networks, (b) sparse scale-free networks, (c) star networks and (d) highly dense networks, apparently defining three major phases. These constraints provide a new explanation for the scaling exponent of about -3. The evolutionary consequences of these results are outlined.}, keywords = {network science}, pubstate = {published}, tppubtype = {incollection} } Many complex systems can be described in terms of networks of interacting units. 
Recent studies have shown that a wide class of both natural and artificial nets display a surprisingly widespread feature: the presence of highly heterogeneous distributions of links, providing an extraordinary source of robustness against perturbations. Although most theories concerning the origin of these topologies use growing graphs, here we show that a simple optimization process can also account for the observed regularities displayed by most complex nets. Using an evolutionary algorithm involving minimization of link density and average distance, four major types of networks are encountered: (a) sparse exponential-like networks, (b) sparse scale-free networks, (c) star networks and (d) highly dense networks, apparently defining three major phases. These constraints provide a new explanation for the scaling exponent of about -3. The evolutionary consequences of these results are outlined. |

Ferrer-i-Cancho, R Language: universals, principles and origins PhD Thesis Universitat Politècnica de Catalunya, 2003. Abstract | BibTeX | Tags: information theory, network science @phdthesis{Ferrer2003b, title = {Language: universals, principles and origins}, author = {R Ferrer-i-Cancho}, year = {2003}, date = {2003-01-01}, address = {Barcelona}, school = {Universitat Politècnica de Catalunya}, abstract = {Here, old and new linguistic universals, i.e. properties obeyed by all languages on Earth, are investigated. Basic principles of language predicting linguistic universals are also investigated. More precisely, two principles of reference are investigated: coding least effort and decoding least effort, a reformulation of G. K. Zipf's speaker and hearer least effort principles. Such referential principles predict Zipf's law, a universal of word frequencies, at the maximum tension between coding and decoding needs. Although trivial processes have been proposed for explaining Zipf's law in non-linguistic contexts, the meaningfulness of Zipf's law for human language is supported here. Minimizing the Euclidean distance between syntactically related words in sentences is a principle predicting projectivity, a universal stating that arcs between syntactically linked words in sentences generally do not cross. Besides, such a physical distance minimization successfully predicts (a) an exponential distribution for the distribution of the distance between syntactically related words and (b) subject-verb-object (SVO) order superiority in the actual use of world languages. Previously unreported non-trivial features of real syntactic dependency networks are presented here, i.e. scale-free degree distributions, small-world phenomenon, disassortative mixing and hierarchical organization. Instead of a universal grammar, a single universality class is proposed for world languages. Syntax and symbolic reference are unified under a single topological property, i.e. 
connectedness in the network of signal-object associations of a communication system. Assuming Zipf's law, not only connectedness follows, but also the above properties of real syntactic networks. Therefore, (a) referential principles are the principles of syntax and symbolic reference, (b) syntax is a by-product of simple communication principles and (c) the above properties of syntactic dependency networks must be universal if Zipf's law is universal, which is the case. The transition to language is shown to be akin to a continuous phase transition in physics. Therefore, the transition to human language could not have been gradual. The reduced network morphospace resulting from a combination of a network distance minimization principle and a link density minimization principle is presented as an alternative hypothesis and a promising prospect for linguistic networks subject to fast communication pressures. The present thesis is unique among theories about the origins of language, in the sense that (a) it explains how words or signals naturally glue in order to form complex messages, (b) it validates its predictions with real data, (c) it unifies syntax and symbolic reference and (d) it uses ingredients already present in animal communication systems, in a way no other approaches do. The framework presented is a radical shift in the research of linguistic universals and their origins through the physics of critical phenomena. The principles presented here are not principles of human language, but principles of complex communication. Therefore, such principles suggest new prospects for other information transmission systems in nature.}, keywords = {information theory, network science}, pubstate = {published}, tppubtype = {phdthesis} } Here, old and new linguistic universals, i.e. properties obeyed by all languages on Earth, are investigated. Basic principles of language predicting linguistic universals are also investigated. 
More precisely, two principles of reference are investigated: coding least effort and decoding least effort, a reformulation of G. K. Zipf's speaker and hearer least effort principles. Such referential principles predict Zipf's law, a universal of word frequencies, at the maximum tension between coding and decoding needs. Although trivial processes have been proposed for explaining Zipf's law in non-linguistic contexts, the meaningfulness of Zipf's law for human language is supported here. Minimizing the Euclidean distance between syntactically related words in sentences is a principle predicting projectivity, a universal stating that arcs between syntactically linked words in sentences generally do not cross. Besides, such a physical distance minimization successfully predicts (a) an exponential distribution for the distribution of the distance between syntactically related words and (b) subject-verb-object (SVO) order superiority in the actual use of world languages. Previously unreported non-trivial features of real syntactic dependency networks are presented here, i.e. scale-free degree distributions, small-world phenomenon, disassortative mixing and hierarchical organization. Instead of a universal grammar, a single universality class is proposed for world languages. Syntax and symbolic reference are unified under a single topological property, i.e. connectedness in the network of signal-object associations of a communication system. Assuming Zipf's law, not only connectedness follows, but also the above properties of real syntactic networks. Therefore, (a) referential principles are the principles of syntax and symbolic reference, (b) syntax is a by-product of simple communication principles and (c) the above properties of syntactic dependency networks must be universal if Zipf's law is universal, which is the case. The transition to language is shown to be akin to a continuous phase transition in physics. 
Therefore, the transition to human language could not have been gradual. The reduced network morphospace resulting from a combination of a network distance minimization principle and a link density minimization principle is presented as an alternative hypothesis and a promising prospect for linguistic networks subject to fast communication pressures. The present thesis is unique among theories about the origins of language, in the sense that (a) it explains how words or signals naturally glue in order to form complex messages, (b) it validates its predictions with real data, (c) it unifies syntax and symbolic reference and (d) it uses ingredients already present in animal communication systems, in a way no other approaches do. The framework presented is a radical shift in the research of linguistic universals and their origins through the physics of critical phenomena. The principles presented here are not principles of human language, but principles of complex communication. Therefore, such principles suggest new prospects for other information transmission systems in nature. |

## 2002 |

Ferrer-i-Cancho, R; Solé, R V Zipf's law and random texts Journal Article Advances in Complex Systems, 5 , pp. 1-6, 2002. Abstract | Links | BibTeX | Tags: Zipf's law for word frequencies @article{Ferrer2001c, title = {Zipf's law and random texts}, author = {R Ferrer-i-Cancho and R V Solé}, doi = {10.1142/S0219525902000468}, year = {2002}, date = {2002-01-01}, journal = {Advances in Complex Systems}, volume = {5}, pages = {1-6}, abstract = {Random-text models have been proposed as an explanation for the power law relationship between word frequency and rank, the so-called Zipf's law. They are generally regarded as null hypotheses rather than models in the strict sense. In this context, recent theories of language emergence and evolution assume this law as a priori information with no need of explanation. Here, random texts and real texts are compared through (a) the so-called lexical spectrum and (b) the distribution of words having the same length. It is shown that real texts fill the lexical spectrum much more efficiently and regardless of the word length, suggesting that the meaningfulness of Zipf's law is high.}, keywords = {Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {article} } Random-text models have been proposed as an explanation for the power law relationship between word frequency and rank, the so-called Zipf's law. They are generally regarded as null hypotheses rather than models in the strict sense. In this context, recent theories of language emergence and evolution assume this law as a priori information with no need of explanation. Here, random texts and real texts are compared through (a) the so-called lexical spectrum and (b) the distribution of words having the same length. It is shown that real texts fill the lexical spectrum much more efficiently and regardless of the word length, suggesting that the meaningfulness of Zipf's law is high. |

Ferrer-i-Cancho, R; Reina, F Quantifying the semantic contribution of particles Journal Article Journal of Quantitative Linguistics, 9 , pp. 35-47, 2002. Abstract | Links | BibTeX | Tags: information theory @article{Ferrer2002f, title = {Quantifying the semantic contribution of particles}, author = {R Ferrer-i-Cancho and F Reina}, doi = {10.1076/jqul.9.1.35.8483}, year = {2002}, date = {2002-01-01}, journal = {Journal of Quantitative Linguistics}, volume = {9}, pages = {35-47}, abstract = {Certain word types of natural languages - conjunctions, articles, prepositions and some verbs - have a very low or very grammatically marked semantic contribution. They are usually named functional categories or relational items. Recently, the possibility of considering prepositions as simple parametrical variations of semantic features instead of categorial features or as the irrelevance of such categorial features has been pointed out. The discussion about such particles has been and still is widespread and controversial. Nonetheless, there is no quantitative evidence of such semantic weakness and no satisfactory evidence against the coexistence of categorial requirements and the fragility of the semantic aspects. This study aims to quantify the semantic contribution of particles and presents some corpora-based results for English that suggest that such weakness and its relational uncertainty come from the categorial irrelevance mentioned before.}, keywords = {information theory}, pubstate = {published}, tppubtype = {article} } Certain word types of natural languages - conjunctions, articles, prepositions and some verbs - have a very low or very grammatically marked semantic contribution. They are usually named functional categories or relational items. Recently, the possibility of considering prepositions as simple parametrical variations of semantic features instead of categorial features or as the irrelevance of such categorial features has been pointed out. 
The discussion about such particles has been and still is widespread and controversial. Nonetheless, there is no quantitative evidence of such semantic weakness and no satisfactory evidence against the coexistence of categorial requirements and the fragility of the semantic aspects. This study aims to quantify the semantic contribution of particles and presents some corpora-based results for English that suggest that such weakness and its relational uncertainty come from the categorial irrelevance mentioned before. |

Solé, R V; Ferrer-i-Cancho, R; Montoya, J M; Valverde, S Selection, tinkering and emergence in complex networks Journal Article Complexity, 8 , pp. 20-33, 2002. Links | BibTeX | Tags: network science @article{Sole2000a, title = {Selection, tinkering and emergence in complex networks}, author = {R V Solé and R Ferrer-i-Cancho and J M Montoya and S Valverde}, doi = {10.1002/cplx.10055}, year = {2002}, date = {2002-01-01}, journal = {Complexity}, volume = {8}, pages = {20-33}, keywords = {network science}, pubstate = {published}, tppubtype = {article} } |

Valverde, S; Ferrer-i-Cancho, R; Solé, R V Scale-free networks from optimal design Journal Article Europhysics Letters, 60 (4), pp. 512-517, 2002. Links | BibTeX | Tags: network science @article{Valverde2002, title = {Scale-free networks from optimal design}, author = {S Valverde and R Ferrer-i-Cancho and R V Solé}, doi = {10.1209/epl/i2002-00248-2}, year = {2002}, date = {2002-01-01}, journal = {Europhysics Letters}, volume = {60}, number = {4}, pages = {512-517}, keywords = {network science}, pubstate = {published}, tppubtype = {article} } |

## 2001 |

Ferrer-i-Cancho, R; Solé, R V Two regimes in the frequency of words and the origin of complex lexicons: Zipf's law revisited Journal Article Journal of Quantitative Linguistics, 8 (3), pp. 165-173, 2001. Abstract | Links | BibTeX | Tags: Zipf's law for word frequencies @article{Ferrer2000a, title = {Two regimes in the frequency of words and the origin of complex lexicons: Zipf's law revisited}, author = {R Ferrer-i-Cancho and R V Solé}, doi = {10.1076/jqul.8.3.165.4101}, year = {2001}, date = {2001-01-01}, journal = {Journal of Quantitative Linguistics}, volume = {8}, number = {3}, pages = {165-173}, abstract = {Zipf’s law states that the frequency of a word is a power function of its rank. The exponent of the power is usually accepted to be close to (-)1. Great deviations between the predicted and real number of different words of a text, disagreements between the predicted and real exponent of the probability density function and statistics on a big corpus, make evident that word frequency as a function of the rank follows two different exponents, ~(-)1 for the first regime and ~(-)2 for the second. The implications of the change in exponents for the metrics of texts and for the origins of complex lexicons are analyzed.}, keywords = {Zipf's law for word frequencies}, pubstate = {published}, tppubtype = {article} } Zipf’s law states that the frequency of a word is a power function of its rank. The exponent of the power is usually accepted to be close to (-)1. Great deviations between the predicted and real number of different words of a text, disagreements between the predicted and real exponent of the probability density function and statistics on a big corpus, make evident that word frequency as a function of the rank follows two different exponents, ~(-)1 for the first regime and ~(-)2 for the second. The implications of the change in exponents for the metrics of texts and for the origins of complex lexicons are analyzed. |
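
The two rank regimes claimed above can be illustrated with a toy estimator: a least-squares fit of log f against log r over a chosen rank range. The synthetic frequencies and the crossover at rank 100 below are illustrative assumptions, not the paper's corpus data; in practice the crossover rank would be chosen by inspection:

```python
import math

def zipf_exponent(freqs, r0=1, r1=None):
    """Least-squares slope of log f vs log r over ranks r0..r1
    (1-indexed, frequencies sorted descending); returns gamma, the
    magnitude of the exponent in f(r) ~ r^(-gamma)."""
    r1 = len(freqs) if r1 is None else r1
    xs = [math.log(r) for r in range(r0, r1 + 1)]
    ys = [math.log(freqs[r - 1]) for r in range(r0, r1 + 1)]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

# A synthetic two-regime curve: gamma ~ 1 up to rank 100, gamma ~ 2 after
freqs = [1e6 * r ** -1.0 for r in range(1, 101)]
f100 = freqs[-1]
freqs += [f100 * (r / 100) ** -2.0 for r in range(101, 1001)]
print(round(zipf_exponent(freqs, 1, 100), 2))     # -> 1.0
print(round(zipf_exponent(freqs, 101, 1000), 2))  # -> 2.0
```

On real rank-frequency data the fitted slopes are noisy estimates, but the same two-range fit exposes the ~(-)1 and ~(-)2 regimes the abstract describes.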

Ferrer-i-Cancho, R; Solé, R V The small-world of human language Journal Article Proceedings of the Royal Society of London B, 268 , pp. 2261-2266, 2001. Abstract | Links | BibTeX | Tags: network science @article{Ferrer2001a, title = {The small-world of human language}, author = {R Ferrer-i-Cancho and R V Solé}, doi = {10.1098/rspb.2001.1800}, year = {2001}, date = {2001-01-01}, journal = {Proceedings of the Royal Society of London B}, volume = {268}, pages = {2261-2266}, abstract = {Words in human language interact in sentences in non-random ways, and allow humans to construct an astronomic variety of sentences from a limited number of discrete units. This construction process is extremely fast and robust. The co-occurrence of words in sentences reflects language organization in a subtle manner that can be described in terms of a graph of word interactions. Here, we show that such graphs display two important features recently found in a disparate number of complex systems. (i) The so-called small-world effect. In particular, the average distance between two words, d (i.e. the average minimum number of links to be crossed from an arbitrary word to another), is shown to be d ≈ 2–3, even though the human brain can store many thousands of words. (ii) A scale-free distribution of degrees. The known pronounced effects of disconnecting the most connected vertices in such networks can be identified in some language disorders. These observations indicate some unexpected features of language organization that might reflect the evolutionary and social history of lexicons and the origins of their flexibility and combinatorial nature.}, keywords = {network science}, pubstate = {published}, tppubtype = {article} } Words in human language interact in sentences in non-random ways, and allow humans to construct an astronomic variety of sentences from a limited number of discrete units. This construction process is extremely fast and robust. 
The co-occurrence of words in sentences reflects language organization in a subtle manner that can be described in terms of a graph of word interactions. Here, we show that such graphs display two important features recently found in a disparate number of complex systems. (i) The so-called small-world effect. In particular, the average distance between two words, d (i.e. the average minimum number of links to be crossed from an arbitrary word to another), is shown to be d ≈ 2–3, even though the human brain can store many thousands of words. (ii) A scale-free distribution of degrees. The known pronounced effects of disconnecting the most connected vertices in such networks can be identified in some language disorders. These observations indicate some unexpected features of language organization that might reflect the evolutionary and social history of lexicons and the origins of their flexibility and combinatorial nature. |
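
The small-world claim rests on the average shortest-path length d. On a toy adjacency list it can be computed with plain breadth-first search; this is a generic sketch, not the paper's pipeline, whose graphs are word co-occurrence networks with thousands of vertices:

```python
from collections import deque

def average_path_length(adj):
    """Mean BFS distance over all ordered pairs of distinct vertices
    (assumes the graph is connected and undirected)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:            # standard BFS from source s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

# 4-cycle: every vertex is at distance 1 or 2 from the others
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(average_path_length(ring))  # -> 4/3 ≈ 1.33
```

For large graphs, running BFS from every vertex costs O(n(n + m)); the paper's d ≈ 2–3 would come from exactly this kind of all-pairs average on the word graph.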

Ferrer-i-Cancho, R; Janssen, C; Solé, R V Topology of technology graphs: small world patterns in electronic circuits Journal Article Physical Review E, 64 , pp. 046119, 2001. Abstract | Links | BibTeX | Tags: network science @article{Ferrer2001e, title = {Topology of technology graphs: small world patterns in electronic circuits}, author = {R Ferrer-i-Cancho and C Janssen and R V Solé}, doi = {10.1103/PhysRevE.64.046119}, year = {2001}, date = {2001-01-01}, journal = {Physical Review E}, volume = {64}, pages = {046119}, abstract = {Recent theoretical studies and extensive data analyses have revealed a common feature displayed by biological, social, and technological networks: the presence of small world patterns. Here we analyze this problem by using several graphs obtained from one of the most common technological systems: electronic circuits. It is shown that both analogic and digital circuits exhibit small world behavior. We conjecture that the small world pattern arises from the compact design in which many elements share a small, close physical neighborhood plus the fact that the system must define a single connected component (which requires shortcuts connecting different integrated clusters). The degree distributions displayed are consistent with a conjecture concerning the sharp cutoffs associated to the presence of costly connections [Amaral et al., Proc. Natl. Acad. Sci. USA 97, 11 149 (2000)], thus providing a limit case for the classes of universality of small world patterns from real, artificial networks. The consequences for circuit design are outlined.}, keywords = {network science}, pubstate = {published}, tppubtype = {article} } Recent theoretical studies and extensive data analyses have revealed a common feature displayed by biological, social, and technological networks: the presence of small world patterns. Here we analyze this problem by using several graphs obtained from one of the most common technological systems: electronic circuits. 
It is shown that both analogic and digital circuits exhibit small world behavior. We conjecture that the small world pattern arises from the compact design in which many elements share a small, close physical neighborhood plus the fact that the system must define a single connected component (which requires shortcuts connecting different integrated clusters). The degree distributions displayed are consistent with a conjecture concerning the sharp cutoffs associated to the presence of costly connections [Amaral et al., Proc. Natl. Acad. Sci. USA 97, 11 149 (2000)], thus providing a limit case for the classes of universality of small world patterns from real, artificial networks. The consequences for circuit design are outlined. |

Miralles, R; Ferrer, R; Solé, R V; Moya, A; Elena, S F Multiple infection dynamics has pronounced effects on the fitness of RNA viruses Journal Article Journal of Evolutionary Biology, 14 (4), pp. 654-662, 2001. Links | BibTeX | Tags: evolutionary biology @article{Miralles2001a, title = {Multiple infection dynamics has pronounced effects on the fitness of RNA viruses}, author = {R Miralles and R Ferrer and R V Solé and A Moya and S F Elena}, doi = {10.1046/j.1420-9101.2001.00308.x}, year = {2001}, date = {2001-01-01}, journal = {Journal of Evolutionary Biology}, volume = {14}, number = {4}, pages = {654-662}, keywords = {evolutionary biology}, pubstate = {published}, tppubtype = {article} } |

## 1999 |

Solé, R V; Ferrer-i-Cancho, R; González-Garcia, I; Quer, J; Domingo, E Red queen dynamics, competition and critical points in a model of RNA virus quasispecies Journal Article Journal of Theoretical Biology, 198 , pp. 47-59, 1999. Links | BibTeX | Tags: evolutionary biology @article{Sole1999, title = {Red queen dynamics, competition and critical points in a model of RNA virus quasispecies}, author = {R. V. Solé and R Ferrer-i-Cancho and I González-Garcia and J Quer and E Domingo}, doi = {10.1006/jtbi.1999.0901}, year = {1999}, date = {1999-01-01}, journal = {Journal of Theoretical Biology}, volume = {198}, pages = {47-59}, keywords = {evolutionary biology}, pubstate = {published}, tppubtype = {article} } |
