Bioinformatics is a relatively new scientific subdiscipline that combines elements of biology and computer science to develop efficient and robust methods for the analysis and interpretation of large amounts of biological data, such as DNA,
As one demonstration, the researchers picked out a human protein, known as RPA43, which has three lysine-rich LCRs. This protein is one of many subunits that make up an enzyme called RNA polymerase I, which synthesizes ribosomal RNA. The scientists discovered that the copy number of lysine-rich LCRs is important for helping the protein integrate into the nucleolus, the organelle responsible for synthesizing ribosomes.
In a comparison of the proteins found in eight different species, the researchers found that some LCR types are highly conserved between species, meaning that the sequences have changed very little over evolutionary timescales. These sequences tend to be found in proteins and cell structures that are also highly conserved, such as the nucleolus.
“These sequences seem to be important for the assembly of certain parts of the nucleolus,” Lee says. “Some of the principles that are known to be important for higher order assembly seem to be at play because the copy number, which might control how many interactions a protein can make, is important for the protein to integrate into that compartment.”
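To make the idea of LCR "copy number" concrete, here is a minimal Python sketch that counts lysine-rich low-complexity stretches in a protein sequence. It is not the method used in the study: the sliding-window Shannon entropy approach, the function names, the window size, and both thresholds are assumptions chosen for illustration, and the toy sequence is made up rather than taken from RPA43.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in Counter(window).values())


def find_lcrs(seq: str, window: int = 12, max_entropy: float = 1.5):
    """Return merged, half-open (start, end) spans covered by low-entropy windows.

    `window` and `max_entropy` are illustrative values, not published parameters.
    """
    spans = []
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) <= max_entropy:
            if spans and i <= spans[-1][1]:
                # Overlaps or touches the previous span: extend it.
                spans[-1] = (spans[-1][0], i + window)
            else:
                spans.append((i, i + window))
    return spans


def count_rich_lcrs(seq: str, residue: str = "K", min_fraction: float = 0.4, **kwargs) -> int:
    """Count LCR spans in which `residue` makes up at least `min_fraction`."""
    return sum(
        1
        for start, end in find_lcrs(seq, **kwargs)
        if seq[start:end].count(residue) / (end - start) >= min_fraction
    )


# Toy example (hypothetical sequence): three lysine-rich stretches
# separated by compositionally diverse spacers.
spacer = "ACDEFGHILMNPQRSTVWY"
lysine_lcr = "KKKKEKKKKKKK"
toy = spacer + lysine_lcr + spacer + lysine_lcr + spacer + lysine_lcr + spacer

print(count_rich_lcrs(toy))  # prints 3
```

In this sketch, the copy number is simply the number of separate low-entropy spans dominated by one amino acid. Deleting one `lysine_lcr` stretch from the toy sequence lowers the count, loosely mirroring the kind of copy-number change the researchers describe for RPA43.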
The MIT team also found differences between LCRs seen in two different types of proteins that are involved in nucleolus assembly. They discovered that a nucleolar protein known as TCOF contains many glutamic acid-rich LCRs that can help scaffold the formation of assemblies, while nucleolar proteins with only a few of these glutamic acid-rich LCRs could be recruited as clients (proteins that interact with the scaffold).
Another structure that appears to have many conserved LCRs is the nuclear speckle, which is found inside the cell nucleus. The researchers also found many similarities between LCRs that are involved in forming larger-scale assemblies such as the extracellular matrix, a network of molecules that provides structural support to cells in plants and animals.
The research team also found examples of structures with LCRs that seem to have diverged between species. For example, plants have distinctive LCR sequences in the proteins that they use to scaffold their cell walls, and these LCRs are not seen in other types of organisms.
Now the researchers plan to expand their LCR analysis to additional species.
“There’s so much to explore, because we can expand this map to essentially any species,” Lee says. “That gives us the opportunity and the framework to identify new biological assemblies.”
Reference: “A unified view of low complexity regions (LCRs) across species” by Byron Lee, Nima Jaberi-Lashkari and Eliezer Calo, 13 September 2022, eLife.
The research was funded by the National Institute of General Medical Sciences, National Cancer Institute, the Ludwig Center at MIT, a National Institutes of Health Pre-Doctoral Training Grant, and the Pew Charitable Trusts.