Archives of Virology 2018; 163: 2047 – 2054

Viral species, viral genomes and HIV vaccine design: is the rational design of biological complexity a utopia?

Marc H V Van Regenmortel

School of Biotechnology, CNRS, University of Strasbourg, Illkirch, France




So far, about 5000 different viral species have been demarcated and the viruses that are members of these species all have a common name that is used for referring to the infectious agent, which is a concrete physical object. A child can thus be infected by measles virus, which is a member of the genus Morbillivirus in the family Paramyxoviridae. Over the years, many thousands of non-Latinized binomial names (NLBNs) of species, for instance Measles morbillivirus, have been coined which consist of the virus name followed by the genus name, and these species names are easily memorized because virologists are familiar with the names of viruses and genera. The major journals and reference books in virology are written in English, which is the predominant communication language used by scientists, and virologists, irrespective of their mother tongue, are familiar with English virus names. The International Committee on Taxonomy of Viruses (ICTV) is currently considering replacing about 4000 of these familiar NLBNs of virus species by introducing 5000 new Latinized names for all virus species that will follow the Linnaean pattern used in biology, namely a Latin genus name followed by a Latin epithet. For more than 40 years, virologists have been opposed to the use of Latin in viral taxonomy and it seems unlikely that having to use 5000 Latin species names will be a popular alternative.

Although the sequences of viral genomes are increasingly used in taxonomy, viral species have always been demarcated on the basis of the relational properties of viruses which arise because viruses necessarily interact with biological partners such as vectors, hosts and immune systems. These relations give rise to variable phenotypic properties of viruses which are not the same in every member of the species because of the occurrence of mutations and it is impossible to predict them from the viral genome if the biological partners are unknown. It is therefore not possible to identify the numerous viruses that may be present in metagenomic databases nor to develop a species-level virus classification only on the basis of viral nucleotide sequences.

The original 1991 ICTV definition of virus species stated that a virus species is a polythetic class of viruses which is a conceptual construction of the mind and not a physical, real object located in space and time. No common defining property is present in all the members of a polythetic species class. In 2013, the ICTV redefined a virus species no longer as a class but as a material object consisting of a monophyletic group of viruses that were all physically part of the species. This new definition is an example of the logical fallacy of reification which treats abstractions such as classes as if they were physical entities. The complications that arise from this new ontology of virus species for viral taxonomy will be discussed.

Although HIV has become the virus we know most about, it is paradoxical that after three decades of high-quality research funded by billions of dollars, no effective HIV vaccine has yet been developed. The reasons for this failure will be analyzed in terms of the reductionist mindset of vaccinologists who assumed that the antigenic sites in HIV spikes that are recognized by neutralizing antibodies would also be effective immunogens capable of eliciting protective antibodies in vaccinees. This expectation amounted to reducing biological immunogenicity to chemical antigenicity.

Since the epitope structure identified by X-ray crystallography of a spike-antibody complex will always be only one of the many epitopes that a polyspecific antibody can recognize, it is unlikely to correspond to a vaccine immunogen that induced the antibody and would be able to induce neutralizing antibodies. Hundreds of attempts were made to elicit protective antibodies by this type of structure-based reverse vaccinology but they failed because investigators focused their attention only on epitope-paratope complementarity and not on determining which features of the immune system (for instance the host immunoglobulin repertoire, antibody affinity maturation and various cellular regulatory mechanisms) are responsible for the induction of neutralizing antibodies.



A single property cannot define a virus species.

Virus taxonomy makes use of a hierarchy of taxa, the lowest one being a virus species followed by higher taxa such as genera, families and orders. The relation between a lower taxon and the higher taxon immediately above it is called class inclusion, which is a crucial relation in the logic of all biological hierarchical classifications. To say that the species Measles virus is included in the genus Morbillivirus means that the properties required for classifying a virus as a member of that species include all the properties required to classify it as a member of the genus Morbillivirus. The lower species taxon, having fewer viruses as members, requires more properties to meet the qualification for membership. This situation arises from the logical principle that reducing the number of required qualifications increases membership whereas increasing the number of qualifications decreases membership [1]. It should be emphasized that this logical principle invalidates the claim commonly made today that a single property of a viral genome is sufficient for defining a virus species [2].
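The logical principle of class inclusion can be illustrated with a minimal sketch. The virus labels and property labels below (g1, g2, s1, s2) are hypothetical placeholders and not real taxonomic criteria; the sketch only shows that adding required qualifications can shrink, but never enlarge, the membership of a class.

```python
# Hypothetical viruses, each described by a set of abstract property labels.
# None of these labels corresponds to a real taxonomic criterion.
viruses = {
    "a": {"g1", "g2", "s1", "s2"},
    "b": {"g1", "g2", "s1", "s2"},
    "c": {"g1", "g2", "s1"},
    "d": {"g1", "g2"},
    "e": {"g1"},
}


def members(required):
    """Return the viruses that possess every required property."""
    return {name for name, props in viruses.items() if required <= props}


# The species qualifications include all the genus qualifications plus more.
genus_required = {"g1", "g2"}
species_required = genus_required | {"s1", "s2"}

genus_members = members(genus_required)
species_members = members(species_required)

# More required qualifications give fewer members, and every member of the
# lower class is also a member of the higher class (class inclusion).
print(sorted(genus_members), sorted(species_members))
```

In this sketch the species class is necessarily a subset of the genus class, because any virus that meets the larger set of qualifications also meets the smaller one.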

In 2013, the ICTV endorsed the following new species definition: A virus species is a monophyletic group of viruses whose properties can be distinguished from those of other species by multiple criteria. Although it was acknowledged that the criteria could be any viral property, the abundance of complete viral genome sequences encouraged the creation of new species only on the basis of a single genome metric in a particular region of the genome. Since a species was no longer considered to be a polythetic class defined by a variable distribution of several properties without any property being necessarily present in every member of the species (see Fig. 1), it was no longer required to make use of species-defining properties that allowed different species to be distinguished in the same genus [2, 3]. Since species were now given the status of physical objects instead of classes, the relation of membership inclusion that is the foundation of all hierarchical classification was abandoned and species classes were no longer included in genus classes. Once a species taxon has been demarcated by virologists, it becomes possible to identify a member of that species by relying on a diagnostic marker such as a reaction with a specific monoclonal antibody, but this diagnostic tool should not be confused with a species-defining property [3].

It is rarely possible to infer the phenotypic properties of a virus from its genome sequence

The reason for this is that alternative splicing and discarding of introns from a viral genome produce unpredictable RNA transcripts that interact with unknown host and vector gene products through mechanisms that have not been elucidated. This leads to relational biological and phenotypic properties of viruses that cannot be predicted from the viral genome on its own. The multiple causal mechanisms that arise from complex interactions between the transcripts of viral and host genomes and lead to phenotypic traits remain totally unknown. It is sometimes possible to infer that a virus is a member of a certain family because an intrinsic and stable family-defining property of virions is strongly correlated with a particular nucleotide sequence, but this is not feasible with the members of a species whose relational properties always depend on unknown gene products of hosts and vectors. A further difficulty is that the full pangenome of a virus species (i.e. the entire set of genes present in all the members of a species) is not necessarily shared by all the members of a species [4]. It is therefore unlikely that it will be possible to attribute viral sequences detected in metagenomic databases to the currently established species that the ICTV previously demarcated using relational phenotypic properties of viruses [5].

Binomial virus species names: Non-Latinized names versus Latinized Linnaean names.

Binomial virus species names were introduced by Fenner in 1976 in the 2nd ICTV Report [6] by replacing the word “virus” that occurs in all English names of viruses (e.g. measles virus) by the name of the genus to which the virus belongs, which also ends in “-virus” (e.g. Measles morbillivirus). The ICTV always followed its own rules and Code of Nomenclature and did not follow the binomial order that Linnaeus had introduced in biology for species, namely genus name first, species identifier second. These binomial names based on known species and genus names became widely used in plant virology papers and books [2], and subsequently several thousands of such NLBNs were proposed and introduced in the whole of virology [7-9]. However, instead of endorsing these names, the ICTV in 1998 introduced the rule that virus species names would be the same as virus names but italicized with a capital initial; measles virus became the virus and Measles virus the species. This unfortunate rule created considerable confusion because virologists found it difficult to distinguish between a viral object and a species taxon when they had the same name. Recently, the ICTV proposed yet another Latinized species nomenclature following the Linnaeus format of a genus name followed by a Latinized species epithet that would require the creation of about 5000 new Latin names [10]. As shown below, the similarity between familiar virus names and the NLBNs is easy to memorize whereas having to learn 5000 newly coined Latin epithets is unlikely to be welcomed by virologists. For more than 40 years virologists have been opposed to using a Latinized taxonomy and there is no convincing rationale for adopting the binomial system that Linnaeus had introduced in biology [5].

Two examples of virus names with their NLBN species names and proposed Latin binomial species names are:

1) Adelaide River virus, Adelaide River ephemerovirus and Ephemerovirus fiumenadelaidense

2) Merino Walk virus, Merino Walk mammarenavirus and Mammarenavirus viamerinense
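The construction rule for NLBNs described above (replace the final word "virus" of the English virus name by the genus name) is mechanical enough to be sketched in a few lines. The function name nlbn and its error handling are assumptions made for illustration only; the sketch assumes the virus name ends in the word "virus" and is not an ICTV tool.

```python
def nlbn(virus_name: str, genus: str) -> str:
    """Build a non-Latinized binomial name from a virus name and a genus name.

    Hypothetical helper: assumes the English virus name ends in ' virus'.
    """
    suffix = " virus"
    if not virus_name.lower().endswith(suffix):
        raise ValueError("expected a virus name ending in the word 'virus'")
    stem = virus_name[: -len(suffix)]
    # The species name takes a capital initial; the genus name is written
    # in lower case as the second word of the binomial.
    return f"{stem[0].upper()}{stem[1:]} {genus.lower()}"


print(nlbn("measles virus", "Morbillivirus"))
print(nlbn("Adelaide River virus", "Ephemerovirus"))
```

Because the species name is derived directly from two names that virologists already know, no new epithet has to be memorized, which is the point made in the text.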

The following taxonomic principles should be accepted by virologists [2]:

1) Virus species are not groups of real viruses but conceptual classes of the mind that have real, physical viruses as their members.

2) The 2013 ICTV definition of virus species is not appropriate because it applies equally to virus genera.

3) Virus species cannot be described but can only be defined by listing a minimum number of species-defining relational properties that arise from biological interactions with hosts and vectors.

4) The variable distribution of phenotypic properties that characterizes a polythetic class (see Fig. 1) is not itself a single common property of all the members of the class, since this would lead to the paradox that a polythetic class is a monothetic one.

5) A classification based only on nucleotide sequences of viral genomes is a classification of genomes and not of viruses.



Figure 1: Examples of polythetic and monothetic classes in the case of eight individuals (1-8) and eight properties (A-H). The presence of a property is indicated by a cross. A polythetic class is formed by individuals 1-4, with each member possessing 3 out of 4 properties, while no common defining property is present in all the members. Two monothetic classes are formed by individuals 5-6 and 7-8 respectively that share three properties in all the members while one monothetic class is formed by individuals 5, 6, 7 and 8 that share two properties in all the members.
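The distinction drawn in the caption can be checked with a short sketch. The figure itself is not reproduced here, so the assignment of properties to individuals below is an assumption: it is one possible pattern consistent with the caption, not necessarily the pattern shown in the original figure.

```python
# Assumed assignment of properties A-H to individuals 1-8, chosen only so
# that it satisfies the description in the caption of Figure 1.
properties = {
    1: set("ABC"), 2: set("ABD"), 3: set("ACD"), 4: set("BCD"),
    5: set("EFG"), 6: set("EFG"), 7: set("FGH"), 8: set("FGH"),
}


def shared(individuals):
    """Return the properties present in every listed individual."""
    return set.intersection(*(properties[i] for i in individuals))


def is_monothetic(individuals):
    """A class is monothetic if at least one property is common to all members."""
    return bool(shared(individuals))


# Individuals 1-4 each hold 3 of the 4 properties A-D, yet share none of them.
print(sorted(shared([1, 2, 3, 4])), is_monothetic([1, 2, 3, 4]))
# Individuals 5-6 and 7-8 each share three properties; 5-8 together share two.
print(sorted(shared([5, 6])), sorted(shared([7, 8])), sorted(shared([5, 6, 7, 8])))
```

Under this assumed assignment, the class formed by individuals 1-4 has no common defining property and is therefore polythetic, whereas each of the other three classes is monothetic.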


The search for an HIV vaccine.

Many studies have been devoted to analyzing the structure of antibodies that appear after several years of chronic HIV infection, although these antibodies are not effective in controlling the infection in the patients from whom the antibodies were obtained. Hundreds of neutralizing monoclonal antibodies (nMabs) bound to various epitopes of the HIV spikes have been studied by X-ray crystallography in an attempt to reconstruct the epitope by reverse engineering, in the hope that the epitope designed to fit a neutralizing antibody would have acquired the immunogenic capacity to induce a protective immune response. This structure-based reverse vaccinology (SBRV) strategy [11] became the predominant approach used by hundreds of investigators who concentrated on optimizing the complementarity between single epitope-paratope pairs [12] but without determining which intrinsic properties of the immune system control the formation of neutralizing antibodies. The bound epitopes studied by crystallography have a defined and constrained antigenic structure which arises from induced fit during the binding reaction, but which is absent in the highly flexible and dynamic spikes used for immunization [13]. Furthermore, since every antibody is polyspecific because it always harbours several distinct paratopes, the one binding epitope studied by SBRV does not necessarily correspond to the immunogen that elicited the neutralizing template antibody, in which case it should not be expected to act as an effective vaccine. It has also been argued [14] that SBRV does not take into account the degeneracy and other intrinsic features of the immune system, and it is noteworthy that such arguments against SBRV have never been refuted by its proponents.

Although the role of empiricism in vaccine development is widely recognized, it is astonishing that many vaccinologists regard SBRV as a successful example of rational vaccine design. Doing something by design is doing it intentionally and it involves the deliberate conceiving of a novel object or process. Rational drug design, for instance, consists in using the 3D structure of a biological target in order to design, by molecular docking, a drug that is able to selectively bind to it and inhibit its biological activity. In immunology, however, increasing the fit of an HIV epitope for an antibody only improves the epitope’s antigenicity but not its immunogenicity. When epitopes are referred to as vaccine immunogens, this does not mean that they are able by themselves to elicit neutralizing antibodies in the immunized host since they only trigger a series of reactions with B cell receptors that eventually leads the immune system to produce antibodies, some of which may be neutralizing. The type of antibody that is obtained is controlled by numerous genetic and cellular constituents as well as by various immunoregulatory mechanisms of the immune system. Enhancing epitope-paratope complementarity is thus not sufficient since in order to elicit protective antibodies by design, it is necessary to fully understand and reproduce the complex mechanisms that the immune system utilizes to produce such antibodies. Since this knowledge is not available to the investigator, such antibodies cannot be obtained by design.

The Nobel laureate Herbert Simon pointed out that humans only possess what he called bounded rationality which is due to the intrinsic limitations of human cognition that inevitably result from inaccurate available information and insufficient time for reaching a complete understanding of the dynamics of any complex system [15]. In order to make truly rational decisions it would be necessary to know all the relevant parameters that influence the behaviour of complex systems. Our limited knowledge of such systems explains our inability to make long-term predictions of the weather, of the stock exchange or of the results of a vaccination trial. A good example of this is the RV 144 HIV vaccine trial initiated in Thailand in 2004 which was immediately condemned by 28 senior vaccinologists from 22 different institutions in two papers published in 2004 in Science (vol 303: 316 and vol 305: 180). Ironically, this was the only vaccine trial which after 5 years showed a modest success [16].

In their daily work, scientists continually try to solve direct, downstream problems that consist in intervening in a system by determining experimentally which effects follow from certain causes and they then explain their results in terms of specific causal mechanisms. Vaccinologists, on the other hand, are mainly confronted with inverse problems, for instance, trying to imagine which multiple causes are responsible for the absence of deleterious HIV infection in human elite controllers. An inverse problem thus starts with an observed or wanted effect and in order to solve it, the investigator must first try to imagine a theoretical model that would explain why something took place and then to reproduce it experimentally. Since it is obviously impossible to investigate experimentally something that happened in the past, the approach used for solving direct problems is not applicable. In the case of the highly complex immune system and a virus as intractable as HIV it is not astonishing that vaccinologists have been unable to solve a series of inverse problems [15]. Since trying to solve inverse problems by imagining which multiple causes could produce a wanted result is rarely feasible, it seems likely that the so-called rational design of a preventive HIV vaccine will forever remain a utopia.

Although considerable progress has been made in the treatment of HIV patients using antiretroviral drugs, it is increasingly accepted that only an effective HIV vaccine will be able to end the pandemic. However, the failure of rational HIV vaccine design is not a reason for giving up hope. Successful vaccines in the past have always been developed empirically rather than by so-called rational design, which would require an extremely detailed understanding of the enormous complexity of immune systems. We still do not fully understand why some of our successful vaccines work, which is a sobering thought when trying to decide which research grant should be elaborated or funded. It cannot be excluded that another HIV trial might succeed even when most experts predict that it would not, since trial and error empiricism in the past has been most successful in spite of many initial errors. Currently, two empirical approaches for developing an HIV vaccine based on immunological tolerance [17] and on chemical inactivation of HIV virions [18] are being investigated and only phase III trials will tell us the outcome.



[1] R.C. Buck, D.L. Hull, The logical structure of the Linnaean hierarchy, Syst Zool. 15 (1966) 97-111

[2] M.H.V. Van Regenmortel, Classes, taxa and categories in hierarchical virus classification: a review of current debates on definition and names of virus species, Bionomina. 10 (2016) 1-21

[3] M.H.V. Van Regenmortel, The species problem in virology, Adv Virus Res. 100 (2018) 1-18

[4] A.F. Brito, C.T. Braconi, M. Weidmann et al, The pangenome of the Anticarsia gemmatalis multiple nucleopolyhedrovirus (AgMNPV), Genome Biol Evol. 8 (2015) 94-108

[5] M.H.V. Van Regenmortel, Solving the species problem in viral taxonomy: recommendations on non-Latinized binomial species names and on abandoning attempts to assign metagenomic viral sequences to species taxa, Arch Virol. 164 (2019) 2223-2229

[6] F. Fenner, The classification and nomenclature of viruses. Second Report of the International Committee on Taxonomy of Viruses, Intervirology. 7 (1976) 1-115

[7] M.H.V. Van Regenmortel, On the relative merits of italics, Latin and binomial nomenclature in virus taxonomy, Arch Virol. 145 (2000) 433-441

[8] M.H.V. Van Regenmortel, D.S. Burke, C.H. Calisher, et al, A proposal to change existing virus species names to non-Latinized binomials, Arch Virol. 155 (2010) 1909-1919

[9] J.H. Kuhn, R. Durrwald, Y. Bao, et al, Taxonomic reorganization of the family Bornaviridae, Arch Virol. 160 (2015) 621-632

[10] T.S. Postler, A.N. Clawson, G.K. Amarasinghe, et al, Possibility and challenges of conversion of current virus species names to Linnaean binomials, Syst Biol. 66 (2017) 463-473

[11] O. Ringel, V. Vieillard, P. Debré et al, The hard way towards an antibody-based HIV-1 vaccine: lessons from other viruses, Viruses. 10 (2018) 197, doi: 10.3390/v10040197

[12] R. Pejchal, I.A. Wilson, Structure-based vaccine design: blind men and the elephant? Curr Pharm Des. 16 (2010) 3744-3753

[13] M.H.V. Van Regenmortel, Immune systems rather than antigenic epitopes elicit and produce protective antibodies against HIV, Vaccine. 35 (2017) 1985-1986

[14] M.H.V. Van Regenmortel, Structure-based reverse vaccinology failed in the case of HIV because it disregarded accepted immunological theory, Int J Mol Sci. 17 (2016) 1591-1625

[15] M.H.V. Van Regenmortel, Development of a preventive HIV vaccine requires solving inverse problems which is unattainable by rational vaccine design, Front Immunol. 8 (2018) 2009

[16] S. Rerks-Ngarm, P. Pitisuttithum, S. Nitayaphan et al, Vaccination with ALVAC and AIDSVAX to prevent HIV-1 infection in Thailand, N Engl J Med. 361 (2009) 2209-2220

[17] J-M. Andrieu, W. Lu, A 30-year journey of trial and error towards a tolerogenic AIDS vaccine, Arch Virol. 163 (2018) 2025-2031

[18] A. Rios, Fundamental challenges to the development of a preventive HIV vaccine, Curr Opin Virol. 29 (2018) 26-32