TEPT final event: presentation of the Tree of Philosophers

On November 22nd DR2 launched the Tree of Philosophers, an online resource providing academic genealogical trees of philosophers. The Tree of Philosophers is the main result of the project TEPT, funded by Fondazione CRT.

The event was held in the LIFE laboratory of the Department of Philosophy and was attended by students, PhD candidates, high school teachers, historians of philosophy and librarians.

Director of ILIESI-CNR Enrico Pasini and Director of DISH Research Centre Cristina Trinchero took part in the event.

Lectures by Franco Moretti

The Tree of Philosophers is branching out!

Thanks to the collaboration and commitment of a growing number of researchers, the scope of the Tree of Philosophers is expanding.

The TEPT project started out by considering the most commonly acknowledged historical relation of academic descent, that is, the relation between a PhD candidate and the university professor who supervises them. Although this kind of relation is common in contemporary academia, many academic environments have been characterised by different institutional relations.

For this reason, the first part of the project has been devoted to the reconstruction of significant samples from different historical academic contexts characterised by the availability of the relation between PhD candidate and supervisor: 19th-20th century Germany, the United States, Austria, 20th-century Canada, Australia, New Zealand.

The inclusion in the tree of other historical contexts featuring academic training in the field of philosophy is a goal of the TEPT project, but it is a difficult task. Indeed, the definition of historical relations that can be treated as institutional relations of descent requires specific knowledge of the historical configurations of academic institutions in diverse contexts. Moreover, this knowledge needs to be paired with familiarity with domain-specific archives and, unsurprisingly, with a good amount of working hours.

Such precious resources have been provided by researchers from the department of philosophy at the University of Turin (DFE Unito) and from the Northwestern Italian Philosophy PhD Curriculum, who allowed the Tree of Philosophers to grow by encompassing other kinds of relations of academic descent than PhD supervision, and thus by including new lines of descent in TEPT’s genealogy.

More specifically, the TEPT project can now rely on specific research concerning institutional landscapes and academic descent relations in:

  • 19th-century École Normale Supérieure (previously Pensionnat Normale), thanks to the work of Alessandro Taverniti (PhD candidate at FINO Curriculum/IHRIM-ENS Lyon);
  • Magdalen College, New College and Balliol College at Oxford between the 1940s and the 1960s, thanks to the work of Paolo Babbiotti (PhD candidate at FINO Curriculum) and prof. Paolo Tripodi (DFE Unito);
  • 20th-century Italian universities, with a comprehensive focus on the University of Turin, thanks to the work of Giorgio Matteoli (PhD candidate at FINO Curriculum), Giuseppe Guastamacchia (DFE Unito) and prof. Massimo Ferrari (DFE Unito);
  • Renaissance universities, thanks to the work of Giuseppe Pignatelli (PhD candidate at FINO Curriculum).

DR2 COLLOQUIUM in Turin – Lucia Pasini, “Distant Listening” – 9 February 2023

16-17 January 2023

Dipartimento di Filosofia, Sapienza Università di Roma

Villa Mirafiori, via Carlo Fea, 2 – Roma

Links for following through Webex:

16 January, 15:00-19:00

Session 1. Quantitative History of Philosophy: Methodological Peculiarities 

15:00 Enrico Pasini (Torino, Roma), Introduction

15:20 Arianna Betti (Amsterdam), The Status of Philosophy as a Data-Driven Science

16:10 Sander Verhaegh (Tilburg), Toward a Computational History of American Philosophy: Problems and Promises

17:00 Break

17:20 Christophe Malaterre & Francis Lareau (Montréal), Mining Eight Decades of Philosophy and Philosophers of Science 

18:10 Angela Ambrosino & Mario Cedrini (Torino), What is Inside the Cambridge Journal of Economics? A Topic Modelling and Network Analysis

17 January, 9:00-13:30

Session 2. Beyond Digital Humanities

The Problem of Transparency

9:00 Paolo Tripodi (Torino), Introduction

9:10 Teresa Numerico (Roma), Abstraction and Categorization without a Cause: Epistemic Opacity in the Critical Process

9:50 Davide Pulizzotto (Montréal), Methodological Transparency in Computer-Assisted Text Analysis

10:30 Break

Hybrid Figures

10:50 Dino Buzzetti (Bologna), Introduction

11:50 Charles Pence (Louvain), Interdisciplinarity and Collaboration in Digital Philosophy

12:30 Julie Giovacchini (Paris), Building a Philosophical Glossary with TEI: the Multidisciplinary Epicurei Project. Strength and Weakness of Thematical Named Entities in Context

12:10 Roberto Lalli (Torino), Hybrid Experts and Scientific Cooperation in the Historical Analysis of Socio-Epistemic Networks

Discussant: Cristina Marras (Roma)

17 January, 15:30-18:40

Session 3. TEPT: Turin Enhanced Philosophy Tree

15:30 Guido Bonino (Torino), Introduction

15:50 Stefan Heßbrüggen-Walter (Berlin), Intellectual Genealogies and Canon(s) of Philosophy: Some Reflections

16:40 Break

17:00 Eugenio Petrovich (Tilburg, Torino), Links and Ties. Information Loss in the Translation of Texts into Networks

17:50 Daniele Radicioni (Torino), Reshaping Distant Reading into Probabilistically Oriented DR: the case of the Turin Enhanced Philosophers Tree

Discussants: Michele Alessandrelli (Roma), Michele Ciruzzi (Insubria), Nicola Ruschena (Torino)


Conference co-organised by the DR2 Research Group, ILIESI-CNR, and DISH.

Part of the teaching program of the FINO PhD Consortium.

With the support of CRT Foundation (TEPT project).

TEPT data and disambiguation – The use of personal identifiers

The TEPT project is committed to the development of an infrastructure for the reconstruction of the relations of academic descent among philosophers. Such relations constitute a socio-institutional network in which nodes are philosophers and each arc connects a pair of academic parent and offspring. The Tree of Philosophers that is being developed by DR2 is meant to represent the network of relations that are reconstructed.

Evidently, philosophers are the smallest and fundamental units of the project from a structural point of view, as they are the constituents of the very subject of the project, i.e. the descent relations.

TEPT’s criteria for the inclusion of people in the Tree of Philosophers are quite broad: anyone who ever granted or received a high-level academic degree to or from someone who either granted or received a high-level academic degree in philosophy is, in principle, a proper addition to the tree.

People (philosophers) are thus included in the tree regardless of their notability, their career paths, their productivity in the intellectual domain or the reception of their works.

Trivially, then, the Tree of Philosophers is meant to record many people. A resource such as the Tree of Philosophers will hardly ever be completed, so the number of philosophers that can be expected to be included is not easy to assess. Nevertheless, the 3000 people constituting the first batch of philosophers TEPT has ordered and analysed, along with our ready acknowledgement of the limited historical coverage of such a sample, can provide some insight into the order of magnitude we expect the tree to deal with. In spite of its institutional focus, which makes it blind to non-academic transmission of knowledge, the tree and the infrastructure it relies upon have to manage and treat as “authors” (people whose works, represented by their dissertations at the bare minimum, contributed to intellectual production) a number of people that is larger than usual in historical reconstructions of the history of philosophy.

Moreover, a significant part of the tree’s domain is populated by what we can naively call non-famous philosophers. We can’t provide an estimated ratio yet (yet!), but common sense is sufficient to assume that in most historical contexts in which academic philosophical training exists, people graduating in philosophy usually come in greater numbers than people becoming notable because of their philosophical work (regardless of training and background).

The need to work on large sets of mostly unknown people heavily influences most choices and strategies in the development of the TEPT project. In order to see how this aspect affects the development of TEPT’s infrastructures, a brief description of the tasks involved in the reconstruction of lines of academic descent can be of use.

Once a specific type of descent relation is defined, e.g. the master-pupil relation of a PhD advisor and a PhD candidate, TEPT researchers work to track the academic genealogy of a philosopher who earned or granted a PhD, then proceed to recursively retrieve either their parents (their PhD supervisors and the supervisors of the supervisors) or their offspring (the PhD candidates they advised and those that the latter supervised). In order to retrieve information about such connections, researchers need to browse a variety of documents and archives. Depending on the philosophers inquired upon, on the time of their academic training and on the institutions where the training took place, we can find pieces of information about philosophers’ academic training in (auto)biographies, in institutional archives and in national registers; professional résumés, public commemorative speeches and even obituaries are valuable sources as well.
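The recursive retrieval described above can be sketched as a traversal over supervisor/candidate pairs. This is only a minimal illustration under stated assumptions: the names `SUPERVISIONS`, `build_index` and `trace` are hypothetical, the data is a toy sample, and nothing here reflects TEPT’s actual code.

```python
from collections import deque

# Hypothetical toy data: each pair is (supervisor, candidate).
SUPERVISIONS = [
    ("A", "B"),
    ("B", "C"),
    ("B", "D"),
    ("E", "C"),  # a candidate can have more than one academic parent
]

def build_index(pairs):
    """Build two lookup tables: child -> parents and parent -> children."""
    parents, children = {}, {}
    for parent, child in pairs:
        parents.setdefault(child, []).append(parent)
        children.setdefault(parent, []).append(child)
    return parents, children

def trace(start, index):
    """Return everyone reachable from `start` through `index`, breadth-first."""
    seen, queue, order = {start}, deque([start]), []
    while queue:
        person = queue.popleft()
        for nxt in index.get(person, []):
            if nxt not in seen:  # guards against duplicates and data errors (cycles)
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

parents, children = build_index(SUPERVISIONS)
ancestors_of_c = trace("C", parents)      # supervisors, then their supervisors
descendants_of_a = trace("A", children)   # candidates, then their candidates
```

The same function serves both directions; only the lookup table changes.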

Most pieces of information needed for the reconstruction of biographies are difficult to find, even at a high cost in terms of time. In some cases, non-famous people have left fewer traces (e.g. if they did not publish philosophical works because they pursued different careers), but what is noticeable in most cases is that the traces left by non-famous people are harder to find and to put together. By relying on resources such as ProQuest, for example, we can find the names of philosophers who graduated in the USA in the second half of the 20th century, along with the titles of their theses. Nonetheless, such names and titles are seldom sufficient for the attribution of descent relations: apart from cases concerning well-known philosophers, it is very difficult to assess whether different pieces of information linked to a name refer to the same person.

To give a couple of examples, we retrieved the identities of seven different philosophers whose last name is “Davidson”, all of them trained in the US and active in the 20th century; considering the forty philosophers whose last name is “Johnson”, we have been able to distinguish four different people named “David Johnson”, with different and sometimes punctuated middle names, all trained in the US between 1949 and 1978. The attribution to the same people of data retrieved from different sources is often a difficult task simply because we are not sure that they actually refer to the same people. Such attribution thus requires multiple validation steps that are hard to formalise in a set of instructions. If we are provided with the title of a dissertation, we can evaluate the disciplinary proximity of the thesis to the academic production or the professional path of someone who has the same name as the thesis’ author (at least until we prove that they are the same person). The same evidently holds for geographical and chronological information. We often find published works, but the retrieval of sets of very heterogeneous works under the same author’s name is often the first cue that we are probably dealing with a case of homonymy.

For all these reasons, two major issues that demand our attention in the development of the Tree of Philosophers are the duplication and the overlapping of the personal identities of philosophers included in the Tree.

We find a partial solution to both problems by relying on virtual identifiers of authority data, which are resources used in archival disciplines and in library institutions. Virtual identifiers are simple numbers or strings of text that are used as labels or indexes, pointing to a specific person identified as the author of a number of works. In order to mitigate the problems of duplication and overlapping of identities, we made slight modifications to TEPT’s database, enlarging the Philosophers Table with four additional fields, one for each virtual identifier we want to rely upon: Wikidata, ORCID, ISNI and VIAF.
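As an illustration of what such an enlarged table might look like, here is a minimal sketch using Python’s built-in `sqlite3` module. The table and column names are assumptions made for the example, not TEPT’s actual schema, and the identifier values are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE philosophers (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        surname     TEXT NOT NULL,
        wikidata_id TEXT,   -- the four identifier fields are nullable:
        orcid       TEXT,   -- many records will have none of them
        isni        TEXT,
        viaf        TEXT
    )
    """
)
# SQLite treats NULLs as distinct in a UNIQUE index, so records without a
# VIAF id can coexist while two records can never share the same VIAF id.
conn.execute("CREATE UNIQUE INDEX idx_philosophers_viaf ON philosophers(viaf)")

def add_philosopher(name, surname, viaf=None):
    """Insert a record; return False if its VIAF id is already in the table."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO philosophers (name, surname, viaf) VALUES (?, ?, ?)",
                (name, surname, viaf),
            )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    add_philosopher("David", "Johnson", "placeholder-1"),
    add_philosopher("David", "Johnson", "placeholder-2"),  # homonym, other person
    add_philosopher("David", "Johnson", "placeholder-1"),  # duplicate, rejected
    add_philosopher("David", "Johnson"),                   # no identifier yet
]
row_count = conn.execute("SELECT COUNT(*) FROM philosophers").fetchone()[0]
```

Homonyms with distinct identifiers are kept apart, while a second record carrying an identifier already present is flagged as a duplicate.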

VIAF (Virtual International Authority File) is an international authority data identifier assigned by OCLC by aggregating authority data from national library systems: a name that OCLC has assigned a VIAF id is a name recorded as the author of at least one work in at least one national library catalogue. ISNI is the ISO standard for the personal identification of contributors to intellectual production, and it works in a similar manner to VIAF, by aggregating authority data from national catalogues along with academic production of article-like works. Both OCLC’s VIAF and ISNI aggregate authority data from national systems using samples of published titles and authors’ birth (and sometimes death) years. ORCID identifiers are obviously available only for recent times, but their assignment is directly requested by researchers or institutions, and they can help in disambiguating nodes in recent branches of the Tree of Philosophers, at the cost of an insignificant increase in the sparsity of the Philosophers Table in TEPT’s database. Technically, ORCID is a part of ISNI, because ORCID identifiers are drawn from a block of numbers reserved within the ISNI range. Nonetheless, we prefer to keep ORCID and ISNI ids as separate fields: first, because a recent, productive philosopher can easily have different ISNI and ORCID ids; secondly, because while ISNI ids are assigned by aggregating data, the assignment of ORCID ids is directly requested by researchers or their institutions of affiliation, so that the reliability of ORCID is greater than that of ISNI. Finally, Wikidata identifiers are assigned by an automatic system supervised by users of the Wikidata community. Notably, Wikidata ids are assigned to somewhat notable people, so we cannot expect Wikidata ids to make a huge difference in the disambiguation of non-famous philosophers.
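Because ORCID and ISNI share the same 16-character format, ending in a check character computed with ISO 7064 MOD 11-2, a single routine can catch typing errors in either field before a record is saved. The sketch below is an illustration, not part of TEPT’s tooling; the function names are hypothetical, and the two valid sample values are the example identifiers used in ORCID’s public documentation.

```python
def mod11_2_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 base digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)

def is_well_formed(identifier: str) -> bool:
    """Check length, digits and check character of an ORCID or ISNI string."""
    compact = identifier.replace("-", "").replace(" ", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return mod11_2_check_character(base) == compact[15]

valid_plain = is_well_formed("0000-0002-1825-0097")
valid_with_x = is_well_formed("0000-0002-1694-233X")
mistyped = is_well_formed("0000-0002-1825-0098")  # last character altered
```

A passing check only shows that the string is well formed; it says nothing about whether the identifier belongs to the intended person, which still requires the manual validation described in this post.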

The search for such different identifiers and, whenever available, their attribution to personal records in TEPT’s database, provide means to improve the way in which TEPT works: the inclusion of four types of identifiers makes it possible for us to assess the population of the database at different stages, thus allowing for the evaluation of different strategies of data collection; we can evaluate the coverage of the database in terms of intra- or inter-disciplinary renown of philosophers (e.g. by comparing ISNI, ORCID and Wikidata coverage); furthermore, we are able to approximate a measure of the “Great Unread” that is included in the Tree of Philosophers, by measuring the philosophers that are not recorded in any of the mentioned repositories.
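The coverage measures mentioned above amount to simple counts over the identifier fields. A minimal sketch on invented records follows; the field names and values are assumptions made for the example.

```python
FIELDS = ("wikidata", "orcid", "isni", "viaf")

# Invented sample records; None means no identifier was found.
records = [
    {"name": "P1", "wikidata": "w1", "orcid": None, "isni": "i1", "viaf": "v1"},
    {"name": "P2", "wikidata": None, "orcid": None, "isni": None, "viaf": "v2"},
    {"name": "P3", "wikidata": None, "orcid": None, "isni": None, "viaf": None},
    {"name": "P4", "wikidata": None, "orcid": "o4", "isni": None, "viaf": None},
]

def coverage(rows):
    """Share of records that carry each kind of identifier."""
    if not rows:
        return {field: 0.0 for field in FIELDS}
    return {
        field: sum(1 for row in rows if row[field]) / len(rows)
        for field in FIELDS
    }

def unrecorded(rows):
    """Names of people found in none of the four repositories."""
    return [row["name"] for row in rows if not any(row[f] for f in FIELDS)]

shares = coverage(records)
great_unread = unrecorded(records)
```

Running the same two functions on snapshots of the database taken at different stages would give the comparison between data collection strategies that the paragraph describes.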

From a technical point of view, the assignment of virtual identifiers to personal records in TEPT’s database also improves TEPT data in terms of findability, accessibility, interoperability and reusability (also known as the FAIR principles): if the disambiguation allowed by the introduction of virtual identifiers trivially increases findability and accessibility of data, the indexation of philosophers’ data with external identifiers dramatically improves interoperability, and consequently reusability. Indeed, reliance on external, widely used identifiers allows for the development of semi-automatic procedures for the inclusion of large quantities of personal records. As stated above, a variety of archival sources provide data about philosophers, and some of these already come in the form of structured data (mostly spreadsheets of archival records). In the best cases, structured data are clean enough to be added almost directly (after some filtering) to TEPT’s database. Without reliance on external identifiers, we could have made such an addition only if we were sure that none of the added data overlapped with personal data already present in the database, thus duplicating philosophers. This entails that when two sources of structured data concerning the same context are available, without external identifiers we would have been forced to choose one of the two sources and discard the other. By contrast, reliance on external identifiers and on procedures for their attribution allows for the assignment of identifiers to the structured data to be added, and then for the application of filters on the identifier fields.
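The filtering step can be sketched as follows, assuming that identifiers have already been assigned to the incoming structured data. The function name, field names and sample values are hypothetical, and a real procedure would need more checks than this.

```python
ID_FIELDS = ("viaf", "isni", "orcid", "wikidata")

def merge_sources(existing, incoming):
    """Add incoming records unless one of their identifiers is already known.

    Returns (merged, skipped, unresolved): records without any identifier
    are set aside for manual review instead of being added blindly.
    """
    known = {(f, r[f]) for r in existing for f in ID_FIELDS if r.get(f)}
    merged, skipped, unresolved = list(existing), [], []
    for rec in incoming:
        keys = {(f, rec[f]) for f in ID_FIELDS if rec.get(f)}
        if not keys:
            unresolved.append(rec)
        elif keys & known:
            skipped.append(rec)   # same person already in the database
        else:
            merged.append(rec)
            known |= keys
    return merged, skipped, unresolved

database = [{"surname": "Davidson", "viaf": "v1"}]
second_source = [
    {"surname": "Davidson", "viaf": "v1"},  # overlaps with the database
    {"surname": "Davidson", "viaf": "v2"},  # homonym, different person
    {"surname": "Johnson"},                 # no identifier assigned yet
]
merged, skipped, unresolved = merge_sources(database, second_source)
```

With this kind of filter, a second source no longer has to be discarded: only the overlapping records are left out.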

Along with our collaborators in the department of Computer Science, we devised a process to link as many philosophers as possible to their respective virtual identifiers. More specifically, dedicated annotation and disambiguation tools have been developed for the assignment of the identifiers to personal records in TEPT’s database. The application works by searching for matching identifiers in four different pages, one for each repository of virtual identifiers. In each page, the application suggests potential matches for the name of the selected philosopher in the repository (e.g. VIAF) by performing automatic queries (e.g. through the VIAF search engine). If there is one matching result of a recorded author, the author’s identifier is assigned to the name, and annotators must verify the assignment, then confirming or disconfirming it. If there are multiple matches, annotators select the identifier that they consider correct, if they find any, by exploring authors’ contextual data in the repositories of identifiers.
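The three cases handled by the application (no match, one match to be confirmed, several matches to choose from) can be summarised in a short sketch. The actual tool is not shown in this post, so everything below is an assumption: the class and function names are invented, and `fake_search` stands in for a real repository query.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Candidate:
    identifier: str
    context: str  # e.g. titles or dates shown to the annotator

@dataclass
class Suggestion:
    status: str   # "no_match", "needs_confirmation" or "needs_selection"
    candidates: List[Candidate]

def suggest(name: str, search: Callable[[str], List[Candidate]]) -> Suggestion:
    """Classify the result of a repository query for one philosopher."""
    candidates = search(name)
    if not candidates:
        return Suggestion("no_match", [])
    if len(candidates) == 1:
        # A single hit is pre-assigned, but an annotator must still verify it.
        return Suggestion("needs_confirmation", candidates)
    # Several hits: the annotator picks one, or none, after checking context.
    return Suggestion("needs_selection", candidates)

def fake_search(name: str) -> List[Candidate]:
    """Stand-in for a query to one repository; returns invented data."""
    sample = {
        "Unique Name": [Candidate("id-1", "one thesis, 1960")],
        "David Johnson": [
            Candidate("id-2", "thesis, 1949"),
            Candidate("id-3", "thesis, 1978"),
        ],
    }
    return sample.get(name, [])

statuses = [
    suggest(n, fake_search).status
    for n in ("Unknown", "Unique Name", "David Johnson")
]
```

In no case does the sketch assign an identifier without human review, which mirrors the verification step described above.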

This disambiguation process is necessary because virtual identification leaves room for error and ambiguity. Data-aggregation procedures of VIAF, ISNI and Wikidata can lack precision and collapse different people with similar names in single entities, or they can assign two identifiers to the same author. Typically, this happens when, given two authored books 1 and 2 and two foreign countries A and B, book 1 is translated and published in country A but not in country B, while book 2 is translated and published in country B but not in country A.

Facing these ambiguities, we agreed on a set of explicit rules to guide us in evaluating multiple matching identifiers for each repository. In VIAF assignments, for example, if multiple matching VIAF ids do not contain mistakes in contextual data, we select the one that has the lowest serial number. Notably, we devised criteria that ideally leave no room for interpretation, in order to ensure consistent assignment of virtual identifiers when two collaborators need to assign ids to two datasets of philosophers that are to be merged.
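The VIAF rule given as an example is simple enough to be written down as a function. The sketch below assumes that candidates arrive as pairs of a numeric id string and a flag set by the annotator; the function name and data layout are hypothetical.

```python
def pick_viaf(candidates):
    """Return the lowest-numbered VIAF id among candidates without mistakes.

    `candidates` is a list of (viaf_id, has_mistakes) pairs; returns None
    when every candidate has been flagged.
    """
    clean = [viaf_id for viaf_id, has_mistakes in candidates if not has_mistakes]
    if not clean:
        return None
    # Compare as integers: as strings, "100" would sort before "25".
    return min(clean, key=int)

chosen = pick_viaf([("100", False), ("25", False), ("7", True)])
none_left = pick_viaf([("7", True)])
```

Because the rule depends only on the ids and the flags, two annotators working on separate datasets reach the same choice, which is the consistency the criteria are meant to guarantee.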

New course on “Distant Reading in the History of Philosophy” in Turin

A new “Distant Reading in the History of Philosophy” course will be held at the University of Turin by the DR2 co-founder Paolo Tripodi. The course is intended both for Philosophy students belonging to the Philosophy International Curriculum and for Digital Humanities students belonging to the Language Technologies and Digital Humanities postgraduate cycle degree of the University of Turin.

Here is a link to the course webpage, where a syllabus is included.

Franco Moretti in Turin


TEPT gets an annotation tool

Thanks to the collaboration with Enrico Mensa, Davide Colla and Matteo Delsanto (Department of Computer Science of the University of Turin), TEPT can now rely on a tool for data-entry tasks and for metadata management. 

One of the tasks of TEPT is to enhance the accuracy and the usefulness of the data originally available. That is why it was important to develop a system for easy metadata management, which allows us to easily add or edit information concerning each philosopher or relation in the tree. This annotation tool allows us to interact with the database and fill it with additional information on a philosopher or relation. The following kinds of data can be added or edited:

– Name
– Surname
– Birth Year
– Birth Country
– Death Year
– Death Country
– Graduation Year
– Graduation Country
– PhD Year
– PhD Country
– University

The tool also allows us to add comments to each relation in the tree and to assign a label to it, thus defining its nature. This is extremely relevant, as genealogical relations in the history of philosophy are historically and geographically varied.

A certain kind of institutional relation that clearly defines a philosopher as the philosophical parent of another in a specific historical context might not have the same relevance in a different context. A clear definition of each type of relation is thus essential in order for the tree to adequately represent distinct real-world institutional relations. The identification of historically accurate descent relations in different academic contexts covered by the tree is still in progress and it will rely on domain-expert advice.

A point of note: the enrichment of the tree by the addition of various kinds of metadata does not serve any genealogical purpose directly. Nonetheless, it provides a firmer grasp on the contexts of the relations of descent. Among other things, contextual information on specific relations makes it easier to detect mistakes in genealogical lines that are integrated in TEPT. In the long run (i.e., in later stages of the project), the addition of both contextual information about relationships and biographical data about people involved will allow for the introduction of filtering functions in the visual representation of the tree. Availability of specific filters and searches will make the tree a more user-friendly and refined tool with a higher value for academic research. In the short run, the enriched tree is simply easier to browse, providing additional pieces of information to the user.
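To make the idea of filtering concrete, here is a minimal sketch of labelled relations and a filter over them. The label values, field names and sample data are invented for the example and do not describe the planned interface.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Relation:
    parent: str
    child: str
    label: str                  # kind of descent relation (hypothetical values)
    year: Optional[int] = None  # e.g. the PhD year, when known
    comment: str = ""

relations = [
    Relation("A", "B", "phd_supervision", 1955),
    Relation("B", "C", "phd_supervision", 1972),
    Relation("D", "E", "college_tutorship", 1950, "source to be checked"),
    Relation("F", "G", "phd_supervision"),  # year unknown
]

def filter_relations(
    rows: List[Relation],
    label: Optional[str] = None,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
) -> List[Relation]:
    """Keep relations matching the label and falling within the year range."""
    kept = []
    for rel in rows:
        if label is not None and rel.label != label:
            continue
        if year_from is not None or year_to is not None:
            if rel.year is None:
                continue  # undated relations cannot satisfy a year filter
            if year_from is not None and rel.year < year_from:
                continue
            if year_to is not None and rel.year > year_to:
                continue
        kept.append(rel)
    return kept

early_phds = filter_relations(relations, label="phd_supervision", year_to=1960)
```

The more contextual fields are filled in through the annotation tool, the more relations such a filter can actually reach.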

