San Diego Supercomputer Center's "Triton" Provides Testbed for DNA Research
November 12, 2022 — As humans, we have trillions of cells. Each cell has a nucleus with individual genetic information – DNA – that can mutate to create a defect. If a person is born with an abundance of abnormalities within cells, or if mutations develop over time, disease ensues. To make this even more complicated, cells are often a mixture of abnormal and normal DNA – a mosaic, so to speak, and like an art form, this complex montage is hard to decipher. However, a research team led by Joseph Gleeson, MD, Rady Professor of Neuroscience at UC San Diego School of Medicine and Director of Neuroscience Research at the Rady Children's Institute for Genomic Medicine, has been using the Triton Shared Computing Cluster (TSCC) at the San Diego Supercomputer Center (SDSC) at the University of California San Diego to process data and train models that reveal new ways to identify DNA mosaics.
Gleeson and his team recently discovered new genes and pathways in malformations of cortical development, a spectrum of disorders that account for up to 40 percent of drug-resistant focal epilepsy. Their research, which shows how computational models can perform recognition tasks more efficiently than humans, was published this week in the journal Nature Genetics. A related study was published earlier this month in Nature Biotechnology.
“We started with experimental customization on SDSC’s Comet supercomputer many years ago, and have been part of the TSCC community for nearly a decade,” said Xiaoxu Yang, a postdoctoral researcher in Gleeson’s laboratory of pediatric brain diseases. “TSCC allowed us to run models generated by a computer-vision program called DeepMosaic, and these runs showed us that once we trained the program to identify abnormal regions of cells, we could quickly examine thousands of mosaic variants from each human genome – this would not have been possible with the human eye alone.”
This type of computational modeling is known as convolutional neural network-based deep learning, a concept that has been around since the 1970s. At that time, neural networks were already being built to mimic human visual processing, but it took researchers a few more decades to develop accurate and efficient systems for this type of modeling.
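The core operation behind a convolutional neural network can be illustrated with a short sketch. This is a simplified, hypothetical example (invented image and filter values, not code from DeepMosaic): a small filter slides across a grid of pixel values and responds strongly wherever the local pattern matches, which is how such networks mimic visual feature detection.

```python
# Minimal 2-D convolution, the building block of a convolutional neural
# network (CNN). Illustrative sketch only; not the DeepMosaic code.

def convolve2d(image, kernel):
    """Valid-mode 2-D convolution (no padding, stride 1) on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Sum of elementwise products between the kernel and the
            # image patch currently under it.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(total)
        out.append(row)
    return out

# A vertical-edge filter applied to an image that is dark on the left
# and bright on the right: the response peaks exactly at the edge.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [
    [-1, 1],
    [-1, 1],
]
print(convolve2d(image, kernel))  # → [[0, 2, 0], [0, 2, 0]]
```

In a real CNN, many such filters are learned from data rather than hand-written, and their stacked outputs let the network recognize increasingly complex patterns.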
“Often the goal of machine learning and deep learning is to train computers to perform prediction or classification tasks on labeled data. When the trained models prove accurate and effective, researchers use the learned information rather than manual annotations to process large amounts of data,” explained Xin Xu, a former research assistant in Gleeson’s lab and now a data scientist at Novartis. “We’ve come a long way over the past 40 years in developing machine learning and deep learning algorithms, but we’re still using the same concept of replicating the human ability to process data.”
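The workflow described above, training on labeled examples and then letting the model label new data, can be sketched in a few lines. The classifier, feature values, and labels below are all invented for illustration; this is a generic toy example, not the lab's actual pipeline:

```python
# Train-then-predict on labeled data: a toy nearest-centroid classifier.
# All data and labels here are made up for illustration.

def train(examples):
    """Compute the mean feature vector (centroid) for each class label."""
    sums, counts = {}, {}
    for features, label in examples:
        counts[label] = counts.get(label, 0) + 1
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, v in enumerate(features):
            acc[k] += v
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}

def predict(centroids, features):
    """Assign the label whose centroid is closest in squared distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], features))

# Labeled training data: (feature vector, label).
training = [
    ([0.1, 0.2], "normal"),
    ([0.2, 0.1], "normal"),
    ([0.9, 0.8], "mosaic"),
    ([0.8, 0.9], "mosaic"),
]
centroids = train(training)
print(predict(centroids, [0.85, 0.90]))  # → mosaic
print(predict(centroids, [0.10, 0.15]))  # → normal
```

Once the model is trained and validated, the `predict` step replaces manual annotation, which is what makes it feasible to screen thousands of candidate variants per genome.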
Xu points to the knowledge needed to better understand the diseases that occur when abnormal mosaicism overtakes normal cells. Yang and Xu work in a lab that aims to do just that: better understand the mosaics that lead to diseases such as epilepsy, congenital brain disorders, and more.
“Deep learning approaches are much more effective and their ability to discover hidden structures and connections within data is sometimes beyond human ability,” Xu said. “We can process the data faster this way, which leads us more quickly to the knowledge needed.”
For more information about TSCC, visit tritoncluster.sdsc.edu
Source: Kimberly Mann Bruch, San Diego Supercomputer Center