Surabhi Srivastava, Sofia Banu, Priya Singh, Divya Tej Sowpati & Rakesh K. Mishra
Journal of Biosciences volume 46, Article number: 22 (2021)
Abstract
Since its emergence as a pneumonia-like outbreak in the Chinese city of Wuhan in late 2019, the novel coronavirus disease COVID-19 has spread widely to become a global pandemic. The first case of COVID-19 in India was reported on 30 January 2020 and since then it has affected more than ten million people and resulted in around 150,000 deaths in the country. Over time, the viral genome has accumulated mutations as it passes through its human hosts, a common evolutionary mechanism found in all microorganisms. This has implications for disease surveillance and management, vaccines and therapeutics, and the emergence of reinfections. Sequencing the viral genome can help monitor these changes and provides an extraordinary opportunity to understand the genetic epidemiology and evolution of the virus as well as to track its spread in a population. Here we review the past year in the context of the phylogenetic analysis of variants isolated over the course of the pandemic in India and highlight the importance of continued sequencing-based surveillance in the country.
Introduction
The SARS-CoV-2 genome
COVID-19 is caused by the RNA virus SARS-CoV-2, a betacoronavirus with a nearly 30 kb positive-sense, single-stranded RNA genome that encodes 29 proteins (Wu et al. 2020). These include structural proteins that the virus uses to package its RNA as well as proteins that enable its entry and propagation in the host by hijacking the host cellular machinery for viral replication. SARS-CoV-2 is an enveloped virus with a host-derived lipid membrane. The viral capsid assembly is mediated by several structural proteins encoded by the virus, the most important being the S (spike protein that forms a crown-like structure), M (a hydrophobic membrane protein), E (an integral membrane protein or envelope protein) and N (an abundant nucleocapsid protein that binds the RNA genome) proteins (figure 1). The spike protein encoded by the S gene has a receptor-binding domain specifically evolved to bind to the human angiotensin-converting enzyme-2 (ACE2) receptor found on the surface of many human cells, including those of the nasal cavity, lungs, kidneys, intestines, brain, heart and blood vessels (Li et al. 2020a, b). Respiratory transmission is the primary route of infection: infected individuals in close contact with uninfected people spread viral particles that enter the new hosts through the nose and mouth and bind to their epithelial cells. A few studies suggest a correlation between the extent of ACE2 expression in individuals and the clinical outcome of SARS-CoV-2 infection, especially in elderly populations and those with comorbidities (Li et al. 2020a, b; Wang et al. 2020).