Artificial intelligence (AI) is the development of computer systems that can perform tasks that normally require human intelligence. Advances in AI software and hardware, especially deep learning algorithms and the graphics processing units (GPUs) that power their training, have led to a recent and rapidly increasing interest in medical AI applications. In clinical diagnostics, AI-based computer vision approaches are poised to revolutionize image-based diagnostics, while other AI subtypes have begun to show similar promise in various diagnostic modalities. In some areas, such as clinical genomics, a specific type of AI algorithm known as deep learning is used to process large and complex genomic datasets. AI has two main applications in genetics: identification of harmful genes and treatment of disease. AI is also being used to identify genetic mutations within tumors using 3D imaging, and it is very useful in personalized medicine, which requires treatments to be tailored to each patient’s needs. The scale of the task is considerable: the roughly 0.1% of our DNA that differs from person to person still amounts to more than three million differences, because the human genome contains about three billion base pairs.
The rise of AI in genomics is unsurprising. Genomics – a ‘big data’ field – requires computational approaches to interrogate the enormous volume of data generated by sequencing technologies and to marry it in meaningful ways with other biological and clinical data. Analyzing these datasets for new biological insights can be especially difficult when the rules have to be explicitly predefined, step by step, within the computer code. Instead, machine learning techniques can learn from data without the need to specify explicit rules.
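To make the contrast between hand-written rules and learning from data concrete, here is a minimal sketch that estimates a position weight matrix from a handful of example sequences instead of hard-coding a motif. It is a deliberately simple stand-in for machine learning, and the example sites, the uniform background, and the function names `learn_pwm` and `score` are illustrative assumptions rather than anything taken from a real dataset or library.

```python
import math
from collections import Counter

BASES = "ACGT"

# Hypothetical aligned examples of a TATA-box-like motif (made up for illustration).
known_sites = ["TATAAA", "TATATA", "TATAAT", "TTTAAA", "TATAAG"]

def learn_pwm(sites, pseudocount=1.0):
    """Estimate per-position log-odds scores against a uniform background."""
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for pos in range(len(sites[0])):
        counts = Counter(site[pos] for site in sites)
        pwm.append({
            base: math.log2(((counts[base] + pseudocount) / total) / 0.25)
            for base in BASES
        })
    return pwm

def score(pwm, sequence):
    """Sum the learned log-odds of each base at its position."""
    return sum(column[base] for column, base in zip(pwm, sequence))

pwm = learn_pwm(known_sites)
# The most likely base at each position, recovered from the examples alone.
consensus = "".join(max(column, key=column.get) for column in pwm)
print("learned consensus:", consensus)
print("motif-like scores higher:", score(pwm, "TATAAA") > score(pwm, "GCGCGC"))
```

Nothing in the code states what the motif looks like. The scoring rule comes entirely from the examples, so feeding in different sites would yield a different model without touching the code.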
Although it is a highly useful tool, AI is not without challenges. The key issue for AI in genomics is scale, with the amount of genomics data being generated growing exponentially. Slavé Petrovski, Head of Genome Analytics and Informatics at AstraZeneca’s Centre for Genomics Research (CGR), says that “AI in genomics can be extended across different omic studies, such as transcriptomics.” Having the infrastructure and resources to cope with large datasets and mine them effectively remains a challenge, but one that can be overcome if it is managed correctly.
Slavé Petrovski highlighted a key piece of research, conducted by AstraZeneca, which presented a multi-dimensional machine learning framework that takes into account 52 layers of information, including gene expression, human disease literature, and mouse phenotypes. This approach is proposed as a “support framework for objectively and quantitatively triaging potential novel disease target genes.”
Another focus within genomic AI is improving the quality of the data. Petrovski remarks that this area is “constantly evolving,” but a key observation is that the method adopted is often not as important as the underlying data. This means that the information fed into AI systems must be of high quality, or the systems cannot be used to their full potential.
Genetics has always been a data-driven science, and it relies heavily on machine learning to capture dependencies in data and derive biological hypotheses. However, the ability to extract new insights from the exponentially increasing volume of genomics data requires more expressive machine learning models. By effectively combining large datasets, deep learning has transformed the field, and it is now becoming the method of choice for many genomics modeling tasks, including predicting the impact of genetic variation on gene regulatory mechanisms such as DNA accessibility and splicing.
In recent years, deep learning has been widely used in diverse fields of biology, where it is gaining popularity for predicting the structure and function of genomic elements such as promoters and enhancers, as well as gene expression levels.
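Before a deep learning model can predict anything about a promoter or an enhancer, the DNA sequence has to be turned into numbers. One common way to do this is one-hot encoding, sketched below. The function name `one_hot_encode` and the choice to map unknown bases such as N to all zeros are assumptions made for this illustration.

```python
import numpy as np

BASES = "ACGT"

def one_hot_encode(sequence: str) -> np.ndarray:
    """Return a (len(sequence), 4) array with one column per base (A, C, G, T).

    Unknown bases such as N are left as all-zero rows (an assumption of this sketch).
    """
    index = {base: i for i, base in enumerate(BASES)}
    encoded = np.zeros((len(sequence), len(BASES)), dtype=np.float32)
    for pos, base in enumerate(sequence.upper()):
        col = index.get(base)
        if col is not None:
            encoded[pos, col] = 1.0
    return encoded

example = one_hot_encode("ACGTN")
print(example.shape)
print(example)
```

Each row marks which base sits at that position, so a sequence of any length becomes a matrix that a neural network can take as input.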
The application of deep learning to genomic datasets is an exciting, rapidly developing area and is primed to revolutionize genome analysis. Deep learning has been successfully implemented in areas such as image recognition and robotics (e.g., self-driving cars) and is most useful when large amounts of data are available. In this respect, using deep learning as a tool in the field of genomics is entirely apt. Although it is still in somewhat early stages, deep learning in genomics has the potential to inform fields such as cancer diagnosis and treatment, clinical genetics, crop improvement, epidemiology and public health, population genetics, evolutionary or phylogenetic analyses, and functional genomics.
In recent years, deep learning (DL) methods have also been considered in the context of genomic prediction. DL methods are nonparametric models that are flexible enough to adapt to complicated associations between input data and output, including very complex patterns. In genomic selection (GS), reviewing DL applications gives a meta-picture of GS performance and highlights how these tools can help solve challenging plant breeding problems. The main requirement for using DL is a high-quality and sufficiently large training dataset. There is clear evidence that DL algorithms capture nonlinear patterns more efficiently than conventional genome-based prediction models. DL algorithms can also integrate data from different sources, as is usually needed in GS, and they show the ability to improve prediction accuracy for large breeding datasets. It is therefore important to apply DL to large training and testing datasets.
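The point about nonlinear patterns can be illustrated with a toy experiment. The sketch below simulates two hypothetical markers whose effect on a trait comes only from their interaction, then compares an additive linear model with a very small neural network written in plain NumPy. The data, the network size, the learning rate, and the number of training steps are all assumptions chosen for illustration, not a recipe from any genomic selection study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: two markers coded -1/0/1 and a purely interactive trait.
X = rng.integers(-1, 2, size=(600, 2)).astype(float)
y = X[:, 0] * X[:, 1] + rng.normal(0.0, 0.1, size=600)
X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

def add_bias(features):
    """Append a column of ones so the linear model can fit an intercept."""
    return np.hstack([features, np.ones((features.shape[0], 1))])

# Conventional additive model: ordinary least squares.
beta, *_ = np.linalg.lstsq(add_bias(X_train), y_train, rcond=None)
linear_mse = float(np.mean((add_bias(X_test) @ beta - y_test) ** 2))

# Small neural network: one hidden tanh layer, full-batch gradient descent.
hidden = 16
W1 = rng.normal(0.0, 1.0, size=(2, hidden))
b1 = rng.normal(0.0, 0.5, size=hidden)
W2 = rng.normal(0.0, 0.5, size=hidden)
b2 = 0.0
lr = 0.05
n = len(y_train)
for _ in range(5000):
    H = np.tanh(X_train @ W1 + b1)           # hidden activations, shape (n, hidden)
    err = H @ W2 + b2 - y_train              # prediction error, shape (n,)
    dH = np.outer(err, W2) * (1.0 - H ** 2)  # error backpropagated through tanh
    W2 -= lr * (H.T @ err) / n
    b2 -= lr * err.mean()
    W1 -= lr * (X_train.T @ dH) / n
    b1 -= lr * dH.mean(axis=0)

nn_pred = np.tanh(X_test @ W1 + b1) @ W2 + b2
nn_mse = float(np.mean((nn_pred - y_test) ** 2))

print(f"linear model test MSE:   {linear_mse:.3f}")
print(f"neural network test MSE: {nn_mse:.3f}")
```

Because the simulated trait has no additive component, the linear model has little to work with, while the network is free to learn the interaction. Real breeding data are far noisier and far larger, so this only shows the principle, not the size of the gain to expect.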
Functional genomic analysis is the field in which deep learning has made the most inroads to date. The availability of vast troves of data of various types (DNA, RNA, methylation, chromatin accessibility, histone modifications, chromosome interactions, and so forth) ensures that there are enough training datasets to build accurate prediction models relating to gene expression, genomic regulation, or variant interpretation. Other tasks, such as the identification of long noncoding RNAs or splice-site prediction, can also be addressed. As more data become available, better models can be trained, resulting in even more precise and accurate predictions of genomic features and functions.
Although deep learning holds enormous promise for advancing new genomics discoveries, it should also be implemented mindfully and with appropriate caution. Deep learning should be applied to biological datasets of sufficient size, usually on the order of thousands of samples.
Even though AI and deep learning can solve many problems related to genomic data, using them is challenging and not yet cost-effective: it needs high-performance machines and skilled experts. But as the saying goes, “If it’s worth it, then it’s nothing to worry about.” If we succeed in applying deep learning and AI to genetics cost-effectively, it will make history in science and biology, and humans may reach new heights in understanding genetics and the human genome.
Blackcoffer Insights 32: Manika Gupta, Shivaji College, University of Delhi