Analytic and Continental Philosophy: playing around with quantitative methods

by Pietro Lana


What can quantitative methods tell us about the differences between the Analytic and the Continental philosophical traditions? Attempts to define, characterize and distinguish the two have led to such a variety of positions that even the most cautious proposals have been called into question (Glock, 2008). The difficulties in identifying sufficient or necessary differentiation criteria – be they doctrinal, methodological, stylistic or thematic – seem to call for a different approach. In 1993, in his Origins of Analytical Philosophy, Michael Dummett compared the analytic and the continental traditions to the Rhine and the Danube: rivers that rise close to one another and then flow into different seas. What if, instead of attempting to provide a cartography of the two rivers, we were to dive into their waters? What follows is a brief exploration of some possible applications of quantitative methods to the study of the differences between the two traditions. Like any exploration, it is not meant to provide conclusive results on the subject, but rather to shed light on further possible routes.

The assessed corpus consists of all the articles published between 1980 and 2018 by four Anglophone philosophy journals, two of them belonging to the analytic tradition (“Philosophical Studies”, “Mind”) and two of them belonging to the continental tradition (“Continental Philosophy Review”, “Research in Phenomenology”). Because of the comparative nature of the study, the journals were chosen both on the basis of their relevance – in terms of average number of weighted citations – and on the basis of their representativeness: all four journals explicitly present themselves as being part of one of the two traditions. For the textual analysis, the “analytic” and “continental” corpora were further divided into subcorpora by decade of publication, so that the results could show possible changes over time.

The text mining software employed to assess the corpus is Lancsbox, developed at Lancaster University by Vaclav Brezina, Matthew Timperley and Anthony McEnery. It was chosen because it allows for a variety of analyses of the language data present in a given corpus, such as type/token ratio, frequency, dispersion, keyword generation and collocation. The collocation graphs below were generated with Gephi, a network exploration tool developed by Mathieu Bastian and Eduardo Ramos Ibañez.

Type/token ratio

In a given text, the type-token ratio (TTR) is the total number of unique words (types) divided by the total number of words (tokens). The higher the ratio, the greater the lexical richness of the text, that is, the variety of its vocabulary. The exploration begins with a comparison of the type-token ratios of the corpora because it offers a first, preliminary result concerning the stylistic differences between the two traditions. If reducing something as complex as the notion of style to a matter of variety in vocabulary seems like a dubious choice, it is nevertheless a first and necessary step in the direction of a deeper analysis. For this first analysis, the number of texts contained in the significantly larger analytic subcorpora was manually reduced to that of the continental subcorpora, in order to avoid the risk that a considerable difference in the number of tokens could affect the validity of a comparison of their ratios. The table and the graph below show the type/token ratios of the analytic (AA) and continental (CC) subcorpora.
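The definition above can be illustrated with a minimal Python sketch. It is not the procedure implemented in Lancsbox: the tokenizer, the helper names (tokenize, type_token_ratio, downsample) and the sample sentences are assumptions made for illustration, and the random selection of texts only stands in for the manual reduction described above.

```python
import random
import re


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens.

    This is a deliberately simple tokenizer (an assumption of this sketch);
    real corpus tools apply more refined rules.
    """
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(tokens):
    """Number of unique tokens (types) divided by the total number of tokens."""
    if not tokens:
        raise ValueError("cannot compute a type-token ratio for an empty text")
    return len(set(tokens)) / len(tokens)


def downsample(texts, n, seed=0):
    """Randomly keep n texts, so that two subcorpora hold the same number of texts.

    Random selection is an assumption: the post only says the reduction was manual.
    """
    return random.Random(seed).sample(texts, n)


# Invented toy subcorpora, only to show how the pieces fit together.
analytic_texts = [
    "The argument is valid and the argument is sound.",
    "We argue that the theory is true.",
    "The objection to the theory fails.",
]
continental_texts = [
    "Being discloses itself in the clearing of language.",
    "The trace of the other withdraws from every presence.",
]

analytic_sample = downsample(analytic_texts, len(continental_texts))
for label, texts in (("AA", analytic_sample), ("CC", continental_texts)):
    tokens = [tok for text in texts for tok in tokenize(text)]
    print(label, round(type_token_ratio(tokens), 3))
```

Because the ratio tends to fall as a text grows longer, comparing subcorpora of similar size, as done in the study, keeps the comparison meaningful.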


Figure: type/token ratios of the analytic (AA) and continental (CC) subcorpora

The graph shows that there is, indeed, a consistent difference between the corpora of the two traditions: continental articles exhibit a substantially higher lexical richness than analytic ones throughout all of the decades taken into consideration. One possible interpretation of these results can be found in the positions expressed by, among others, D’Agostini and Marconi concerning the different stylistic approaches of the two traditions. In other words, the lower lexical richness observed in the analytic articles could be a consequence of a more pronounced aspiration to formal rigor and the use of explicit arguments, as opposed to the continental tendency towards a more varied exposition, closer to that of other fields in the humanities (D’Agostini, 1997; Marconi, 2011). In order to delve deeper into this possibility, a second textual analysis was conducted, searching the corpora for terms that could give further hints in this direction.


Relative frequencies

Drawing on widespread characterizations of late analytic philosophy [1], a list of terms was built that could be traced back to the use of a rigorous and explicit argumentative style, so as to determine whether a difference in their average relative frequency could be observed between the different corpora. Below is the list of terms followed by the results of the textual analysis.

The list: Argue Argues Argued Argument Arguments Objection Objections Defend Defends Defense Reject Rejects Justify Justifies Justified Reply Replies Assume Assumed Assumes Assumption Assumptions Example Examples Define Defines Definition Conclusion Conclusions Axiom Axioms Law Laws Norm Norms Principle Principles Condition Conditions Requirement Requirements Required Criterion Criteria Theory Theories Hypothesis Hypotheses Consequence Consequences Necessary Necessarily Logical Rational Reason Reasons.
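How an average relative frequency for such a list might be computed can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names, the normalization per 10,000 tokens and the sample sentence are hypothetical, the term list is a short excerpt of the one above, and the figures reported in this post were produced with Lancsbox, not with this code.

```python
import re
from collections import Counter

# Short excerpt of the term list above, lowercased to match the tokenizer.
TERMS = ["argue", "argues", "argued", "argument", "arguments",
         "objection", "objections", "assumption", "conclusion"]


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens (a simplification)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def relative_frequencies(tokens, terms, per=10_000):
    """Occurrences of each term per `per` tokens of the corpus.

    The normalization base `per` is an assumption of this sketch.
    """
    if not tokens:
        raise ValueError("cannot compute relative frequencies for an empty corpus")
    counts = Counter(tokens)
    total = len(tokens)
    return {term: counts[term] * per / total for term in terms}


def average_relative_frequency(tokens, terms, per=10_000):
    """Mean of the relative frequencies of all terms in the list."""
    freqs = relative_frequencies(tokens, terms, per)
    return sum(freqs.values()) / len(freqs)


# Invented sample sentence, only to show the call.
sample = tokenize("We argue that the objection rests on a false assumption, "
                  "so the argument for the conclusion stands.")
print(round(average_relative_frequency(sample, TERMS), 1))
```

Normalizing by corpus size is what makes the figures comparable across corpora of different lengths.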

Figure: average relative frequency of the listed terms in the corpora

As it turns out, the terms in the list above are far more frequent in the analytic corpora than in the continental ones. This result is hardly surprising. After all, the terms in the list were selected on the basis of very popular characterizations of late analytic philosophy: the use of explicit arguments and the aspiration to formal rigor. Nevertheless, the results are valuable from a methodological point of view, to the extent that they support the validity of the method employed by quantifying a difference that is often only expressed in discursive terms.

At the same time, it is important to note that the results of this analysis only shed light on the use of explicit arguments and do not say much about the presence of implicit ones: they can show a difference of style, but hardly one of content.


Lockwords and collocations

Among the various tools that Lancsbox offers to users, the lockword tool is specifically designed for the identification of similarities between corpora: it generates a list of the most frequent terms that display a similar relative frequency in two corpora, thus highlighting their common ground. Among the many rather vague and general terms identified by means of this tool, two in particular stood out because of their inherent complexity and relevance: “analysis” and “language”. In order to examine their different uses in the context of the two traditions, an analysis of their collocates was attempted. Unfortunately, the scale of the corpora made it impossible for the software to produce the graphs initially intended: the analysis only displays the main collocates (i.e. the most frequent terms within a range of 5 words from the lockword) but does not show the relations between the collocates themselves, making the results hard to read. Further topic modeling analyses might contribute to a better understanding of the perspectives that these graphs merely suggest.
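The notion of collocate used here, a term occurring within 5 words of the node word, can be sketched in Python as below. This is a simplified illustration, not the Lancsbox implementation: the function names and the sample sentence are assumptions, and the sketch ranks collocates by raw co-occurrence counts only, without any association statistic.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its runs of letters as tokens (a simplification)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def collocates(tokens, node, span=5):
    """Count the words found within `span` tokens to the left or right of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        left = tokens[max(0, i - span):i]       # up to `span` tokens before the node
        right = tokens[i + 1:i + 1 + span]      # up to `span` tokens after the node
        counts.update(left + right)
    return counts


# Invented sample sentence, only to show the call.
sample = tokenize("The analysis of ordinary language differs from "
                  "the phenomenological analysis of lived experience.")
for word, count in collocates(sample, "analysis").most_common(5):
    print(word, count)
```

The counts returned for each node word are the kind of data from which a collocation graph can then be drawn, with the node at the centre and one edge per collocate.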

Nevertheless, for this very reason the following graphs offer an unexpectedly stimulating opportunity for the interpretative exercise of contextualizing the generated linkages: aside from some obvious cases, many collocates require some reflection and others are real head-scratchers. For example, in the graph built around the term “analysis” for the continental corpus, it is possible to notice words related to Husserl’s phenomenological analysis and to Heidegger’s ontological analysis. On the other hand, interesting terms among the analytic collocates are “conceptual” and “concepts”, which seem at least in part to undermine Williamson’s opinion that late analytic philosophy is no longer interested in conceptual analysis (Williamson, 2008). Also, the collocates of “language” in the graphs of both traditions include the term “ordinary”, which is not surprising in the analytic tradition but is somewhat unexpected in the continental context. These are only a few examples: the graphs are left below with no further comment, free for anyone willing to take up the task to examine.

Figure: collocates of “analysis” in the AA and CC corpora

Figure: collocates of “language” in the AA and CC corpora


Bastian, Mathieu, and Eduardo Ramos Ibañez. Gephi (version 0.9.2), n.d.

Brezina, Vaclav, Matthew Timperley, and Anthony McEnery. Lancsbox (version 5.0.1). Lancaster University, n.d.

D’Agostini, Franca. Analitici e continentali. Guida alla filosofia degli ultimi trent’anni, 1997.

Dummett, Michael. Origins of Analytical Philosophy. A&C Black, 2014.

Glock, Hans-Johann. What Is Analytic Philosophy? Cambridge University Press, 2008.

Marconi, Diego. Analytic Philosophy and Intrinsic Historicism. Teorema: Revista Internacional de Filosofía 30, no. 1, Book Symposium: What is Analytic Philosophy? (2011): 23–32.

Williamson, Timothy. The Philosophy of Philosophy. John Wiley & Sons, 2008.



[1] To find out more, see our project on the History of Late Analytic Philosophy.

