Introduction

COVID-19 (coronavirus disease 2019), the disease caused by the novel coronavirus SARS-CoV-2, has been declared a pandemic by the WHO [1] because it has spread to almost every country in the world. The entire world is in crisis for several reasons: (i) COVID-19 spreads rapidly; (ii) there is no vaccine or medicine to cure the disease; (iii) the virus spreads easily through droplets generated when an infected person coughs, sneezes, or speaks [2]; (iv) an infected person can infect others while asymptomatic as well as during the pre-symptomatic stage [3,4,5]; and (v) medical infrastructure, test kits, equipment, ventilators, intensive-care units (ICUs), masks, appropriate medical suits, etc., are insufficient to accommodate the rising number of infected cases. Since the number of people affected is huge, clinically testing every individual is infeasible because of both the limited number of testing kits available [6] and the time involved in testing [7]. Although government authorities in many countries have implemented social distancing, isolation, and lockdown (curfew) to control the spread of the virus, the number of cases keeps increasing rapidly. Given this situation, the healthcare industry urgently needs to tackle the virus effectively.

Artificial intelligence (AI), owing to the following characteristics, plays a crucial role in combating various issues of COVID-19. Machine learning (ML) algorithms are capable of learning from a huge number of examples and extracting valuable inferences automatically in less time, which facilitates faster decision-making. The availability of huge amounts of historical data, together with advances in hardware technologies such as multicore processors, graphics processing units (GPUs), and high-speed memory, has enabled deep learning and reinforcement learning paradigms built on complex neural networks, which perform object recognition, prediction, or classification with higher accuracy. Thus, AI-based tools can supplement the work of physicians and policy-makers in customizing healthcare plans and follow-ups according to the priorities and situation of the crisis, and in facing future pandemics with well-established countermeasures.

The research question of this paper is to identify the potential applications of AI to resolve the issues associated with COVID-19. Research publications were searched via the Google search engine using different keywords including ‘COVID-19’, ‘applications of AI for COVID-19’, ‘applications of machine learning and deep learning for COVID-19’, ‘digital intelligence for COVID-19’, etc. A collection of 92 publications was retrieved from different electronic databases and repositories, namely, PubMed, ResearchGate, arXiv, medRxiv, bioRxiv, chemRxiv, and Google Scholar, as in Table 1.

Table 1 Details of publications searched via electronic databases

Although all the retrieved publications dealt with the theme of COVID-19 in a broad sense, they covered a wide range of dimensions: statistical and mathematical modeling of the spread of the coronavirus, applications of the internet of things (IoT) to COVID-19 issues, applications of cloud computing and blockchain to COVID-19 challenges, applications of microbiology, biochemistry, and pathology to the development of drugs and vaccines for COVID-19, generic surveys, and AI publications. Therefore, the publications relevant to the research question were selected by systematically reviewing the retrieved literature. That is, the collected articles were manually screened against the criterion that a publication must discuss the applicability of AI for combating COVID-19 issues, which resulted in 60 publications. Short descriptions of the reviewed literature are given in Table 2, along with a note on methods, strengths, and weaknesses. The potential applications of AI for COVID-19 are shown in Fig. 1.

Table 2 Summary of reviewed literature
Fig. 1 Potential applications of AI for COVID-19

Machine learning is a subset of AI in which statistical algorithms are trained to learn from data, which may be relatively small in amount, to solve a problem, whereas deep learning is a subset of machine learning in which deep neural network architectures are constructed to learn features and their representations automatically at different hierarchical levels from huge amounts of data. ML comprises different learning strategies, namely supervised, unsupervised, semi-supervised, and reinforcement learning, which are employed according to the problem at hand. For classification and regression problems, supervised learning is employed, where the algorithms are trained with labeled data to construct models. Examples of supervised algorithms include decision tree, random forest, K-nearest neighbor, support vector machine, regression, Naïve Bayes, etc. For clustering and association rule mining, unsupervised learning is used, where the algorithms learn by themselves from unlabeled data according to the similarity among the individual data items. In supervised learning, each data item must be associated with its label before training, which consumes a huge amount of time when the dataset is large. Therefore, another strategy called semi-supervised learning has been developed, where the data are first grouped into clusters and labels are then assigned to the clusters in a very short time. In reinforcement learning, the algorithms are trained to make decisions by trial and error according to environmental factors.

From the investigation of the retrieved articles, it is found that AI supports the response to COVID-19 in different dimensions including (i) surveillance and early warnings, (ii) detection of infected individuals and assessment of risk levels, (iii) COVID-19 diagnosis, (iv) mortality prediction/prediction of individuals at high risk and prognostic treatment, (v) drug repurposing and drug discovery, and (vi) identification of protein structure and vaccine development. These applications are discussed in the subsequent sections. In addition, the paper discusses the challenges associated with the implementation of AI-based solutions and directions for further research and development.

AI for Surveillance and Early Warnings

AI plays a crucial role throughout the surveillance of an epidemic: predicting the trend, assisting policy-makers in taking preventive measures or countermeasures, and assessing the effectiveness of the interventions, as shown in Fig. 2. As in Fig. 2, data are collected from different sources and analyzed using different AI techniques such as time-series analysis and regression to identify the trend of the evolution, the places that are going to be affected by the infection, when the peak of the spread is likely to occur, etc.

Fig. 2 Role of AI in surveillance
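
The regression-based trend analysis mentioned above can be illustrated with a minimal sketch. It assumes that cumulative case counts grow roughly exponentially over a short window, so that a straight line can be fitted to the logarithm of the counts. The function names and the case counts are hypothetical and are not taken from any of the reviewed tools.

```python
import math

def fit_log_linear(counts):
    """Least-squares fit of log(count) = a + b * day; returns (a, b)."""
    n = len(counts)
    days = list(range(n))
    logs = [math.log(c) for c in counts]
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def forecast(a, b, day):
    """Predicted count on a given day under the fitted exponential trend."""
    return math.exp(a + b * day)

# Hypothetical cumulative case counts for six consecutive days (illustration only).
cases = [100, 120, 144, 173, 207, 249]
a, b = fit_log_linear(cases)
doubling_time = math.log(2) / b  # days needed for the count to double
print(f"growth rate per day: {b:.3f}, doubling time: {doubling_time:.2f} days")
print(f"forecast for day 7: {forecast(a, b, 7):.0f}")
```

Such a fit is only reasonable over short horizons; interventions such as lockdowns change the growth rate, which is one reason why surveillance systems refit their models as new data arrive.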

Public health surveillance is data driven, and the evolving Internet and the variety of open databases ease the sharing of the data required for surveillance. Several factors, such as virus resistance, population density, population mobility, climatic conditions, and geographic location (latitude and longitude), affect the spread of the virus. For example, in [32], among various parameters, namely, minimum temperature, maximum temperature, average temperature, humidity, and amount of rainfall, the average temperature was found to be significantly correlated with the spread of the virus. Also, as described in [33], absolute air temperature and humidity significantly affect the spread of the virus.

Furthermore, AI-based tools and multiagent systems are capable of mining both structured and unstructured data such as images, video, text, audio, genomic sequence data, geographic data, data from medical devices, Electronic Health Records (EHR), and data generated by wearable devices. These systems can also simulate the evolution of a pandemic over time, space, and people [34, 35].

With the help of both statistical techniques such as time-series analysis and automated AI-based tools, epidemiologists, supported by data scientists, are able to forecast and give daily updates on the number of infected cases, deaths, and recovered cases in different countries all over the world. The potential of AI in giving early warnings can be understood from tools such as BlueDot and HealthMap [8]. The Canada-based AI tool BlueDot predicted the spread of Zika virus to Florida 6 months in advance as well as the outbreak of Ebola in West Africa. In the case of COVID-19 also, this tool identified the outbreak and gave an early warning as early as 31 December 2019. The tool continuously reviews over 40 pathogen-specific datasets and integrates them with other data resources such as news, airline ticket sales, demographic data, climate data, and animal and insect population data to track over 150 infectious diseases all over the globe. In addition, another AI-based model, HealthMap at Boston Children’s Hospital in the USA, issued an alarm on 30 December 2019 and identified the major cities that were likely to be affected by the spread. HealthMap integrates data from Google, social media, blogs, discussion forums, etc., to predict the epidemic.

The ability of AI-based tools to predict the spread of a virus well in advance helps policy-makers to plan, prioritize, and implement different interventions such as lockdown, keeping isolation centers ready, etc., according to the severity of the spread. In addition, AI helps to check whether the public adheres to the preventive actions taken by policy-makers. For example, in Hyderabad, India, CCTV cameras installed across the city are equipped with deep learning and computer vision techniques to identify people who are not wearing face masks and to send alerts to the control center.

AI for Early Detection and Assessment

Infection has to be detected as early as possible, not only to treat the infected individual but also to prevent secondary infections. Because reverse transcription polymerase chain reaction (RT-PCR) testing kits are in short supply and expensive, it becomes essential to assess the risk levels associated with individuals, so that the individuals at high risk can be given priority for testing. As mentioned in [12], phone-based surveys are analyzed using machine learning algorithms to classify the risk levels of individuals into different categories, namely, high risk, moderate risk, and low risk.

In this section, the use of a machine learning algorithm to assess the risk level associated with an individual is illustrated with the Naïve Bayesian classification algorithm. This algorithm is chosen because it can handle multiclass problems, in which more than two class labels are involved. The problem at hand is a multiclass problem, i.e., the classifier has to assign an input record (the symptoms of COVID-19) to one of several risk levels or classes, namely, low, moderate, and high. In addition, the algorithm is simple and can work well even with a small training dataset. The algorithm assumes that the features are independent of one another. To describe risk assessment with this algorithm, eight attributes are considered, namely, travel (indicates an individual who had traveled abroad), contact (an individual who had been in contact with people who traveled abroad), fever, cough, sneeze, shortness of breath, sore throat, and comorbidity (indicates the presence of chronic illnesses such as heart diseases, diabetes, and lung diseases) [12, 36, 37].

A simple hypothetical training dataset is generated as given in Table 3. For simplicity, the attributes are taken as binary attributes with two possible values, 0 and 1. For an attribute, a value of 0 indicates false and a value of 1 indicates true.

Table 3 Sample training dataset

In this example, based on the attributes, the risk for COVID-19 associated with an individual is predicted using the Naïve Bayesian algorithm. The Naïve Bayesian algorithm is a probability-based prediction model which works on the basis of Bayes' theorem. According to Bayes' theorem, the probability of a class variable, say ‘c’, given that the feature (evidence), say ‘x’, has occurred (i.e., the posterior probability of the class variable) is expressed as in Eq. (1)

$$P(c|x) = \frac{{P(x|c)P(c)}}{{P(x)}},$$
(1)

where P(c|x) denotes the posterior probability of class (target) given predictor (attribute), P(c) denotes the prior probability of class, P(x|c) denotes the likelihood which is the probability of predictor, given class and P(x) denotes the prior probability of predictor.

Here, the attributes \(x_{1} ,x_{2} , \ldots ,x_{n}\) that make up the feature vector x are assumed to be independent given the class, and the classification problem is framed as in Eq. (2)

$$P(c|x_{1} ,x_{2} , \ldots ,x_{n} ) = \frac{{P(x_{1} |c) \times P(x_{2} |c) \times \cdots \times P(x_{n} |c) \times P(c)}}{{P(x)}}.$$
(2)

Here, the posterior probability of a class can be calculated by first constructing a frequency table for each attribute against the target class. Then, the frequency tables are converted to likelihood tables. Finally, the posterior probability for each class is calculated using the above equation. The class with the highest posterior probability is the outcome of the prediction. Note that the denominator term, i.e., P(x), is common to the posterior probabilities of all the class labels and hence can be ignored when comparing classes. Therefore, the posterior probability is computed, up to this common factor, using Eq. (3)

$$P(c|x_{1} ,x_{2} , \ldots ,x_{n} ) \propto P(x_{1} |c) \times P(x_{2} |c) \times \cdots \times P(x_{n} |c) \times P(c).$$
(3)
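
The three steps described above (frequency tables, likelihoods, and the unnormalized posterior of Eq. (3)) can be sketched in a few lines of code. This is a minimal illustration rather than the implementation used in any of the reviewed works: the function names are hypothetical, and the six training records with three attributes are made up for this sketch and do not reproduce Table 3.

```python
from collections import Counter, defaultdict

def train(records, labels):
    """Build class counts and per-class attribute-value frequency tables."""
    class_counts = Counter(labels)
    freq = defaultdict(Counter)  # freq[(attribute, cls)][value] -> count
    for rec, cls in zip(records, labels):
        for attr, value in rec.items():
            freq[(attr, cls)][value] += 1
    return class_counts, freq

def posterior_scores(x, class_counts, freq):
    """Unnormalized posterior of Eq. (3): P(c) times the product of P(x_i|c)."""
    total = sum(class_counts.values())
    scores = {}
    for cls, n_c in class_counts.items():
        score = n_c / total                          # prior P(c)
        for attr, value in x.items():
            score *= freq[(attr, cls)][value] / n_c  # likelihood P(x_i|c)
        scores[cls] = score
    return scores

def predict(x, class_counts, freq):
    """Return the class with the highest unnormalized posterior."""
    scores = posterior_scores(x, class_counts, freq)
    return max(scores, key=scores.get)

# Hypothetical training data (NOT Table 3): 1 = true, 0 = false.
records = [
    {"travel": 1, "fever": 1, "cough": 1},
    {"travel": 1, "fever": 1, "cough": 0},
    {"travel": 0, "fever": 1, "cough": 1},
    {"travel": 0, "fever": 1, "cough": 0},
    {"travel": 0, "fever": 0, "cough": 0},
    {"travel": 0, "fever": 0, "cough": 1},
]
labels = ["high", "high", "moderate", "moderate", "low", "low"]

class_counts, freq = train(records, labels)
x = {"travel": 0, "fever": 1, "cough": 1}
print(posterior_scores(x, class_counts, freq))
print(predict(x, class_counts, freq))
```

As in the worked example of this section, an attribute value that never occurs with a class drives that class's score to zero. Practical implementations often add Laplace smoothing to avoid this; that refinement is not part of the calculation shown in this paper.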

Calculation of P(x|c) and P(x)

Frequency tables for the eight attributes are shown in Table 4. The likelihood tables for the above attributes (i.e., P(x|c)) are given in Table 5.

Table 4 Frequency tables for the attributes
Table 5 Likelihood tables for attributes (i.e., P(x|c))

Calculation of P(c)

In Table 3, the column, risk level refers to target class label and it can have three values, high, moderate, and low. Prior probability of class [i.e., P(c)] is computed and given in Table 6.

Table 6 Prior probability of class, P(c)

Calculation of Posterior Probability, P(c|x)

Now, consider test dataset as given in Table 7.

Table 7 Test dataset

Now, consider the first record as X, which has the attribute values travel = no, contact = no, fever = yes, cough = yes, sneeze = yes, shortness of breath = no, sore throat = no, and comorbidity = no. To find the class label for X, the posterior probabilities for the different risk levels are calculated as given in Eqs. (4) to (7). Then, the class which has the highest probability is predicted as the class of the given record.

$$\begin{aligned} P({\text{high}}|X) & = P({\text{travel}} = {\text{no|high}}) \times P({\text{contact}} = {\text{no|high}}) \times P({\text{fever}} = {\text{yes|high}}) \\ & \times \;P({\text{cough}} = {\text{yes|high}}) \times P({\text{sneeze}} = {\text{yes|high}}) \\ & \times \;P({\text{shortness\_of\_breath}} = {\text{no|high}}) \\ & \times \;P({\text{sore\_throat}} = {\text{no|high}}) \times P({\text{comorbidity}} = {\text{no|high}}) \times P({\text{high}}) \\ \end{aligned}$$
(4)
$$P({\text{high}}|X) = \frac{3}{7} \times \frac{6}{7} \times \frac{4}{7} \times \frac{2}{7} \times \frac{2}{7} \times \frac{6}{7} \times \frac{6}{7} \times \frac{3}{7} \times \frac{7}{{17}} = \frac{{31104}}{{14000231}} = 0.0022216776.$$
(5)

Similarly

$$P({\text{moderate}}|X) = \frac{7}{8} \times \frac{7}{8} \times \frac{4}{8} \times \frac{2}{8} \times \frac{1}{8} \times \frac{7}{8} \times \frac{6}{8} \times \frac{6}{8} \times \frac{8}{{17}} = \frac{{98784}}{{35651584}} = 0.0027708166,$$
(6)
$$P({\text{low}}|X) = 0.$$
(7)

Now, the posterior probabilities are normalized so that they sum to 1, as given in Eqs. (8), (9) and (10)

$$P({\text{high}}|X) = \frac{{0.0022216776}}{{0.0022216776 + 0.0027708166 + 0}} = 0.4450035415,$$
(8)
$$P({\text{moderate}}|X) = \frac{{0.0027708166}}{{0.0022216776 + 0.0027708166 + 0}} = 0.5549964585,$$
(9)
$$P({\text{low}}|X) = 0.$$
(10)

Since the moderate class has the highest probability, the class of the given record X is predicted as ‘moderate’.
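
The arithmetic of Eqs. (5) to (10) can be checked with exact fractions. The sketch below uses only the likelihood and prior values that appear in Eqs. (5) and (6); the variable and function names are illustrative.

```python
from fractions import Fraction as F

# Likelihoods followed by the class prior, exactly as they appear in Eqs. (5) and (6).
high_terms = [F(3, 7), F(6, 7), F(4, 7), F(2, 7), F(2, 7),
              F(6, 7), F(6, 7), F(3, 7), F(7, 17)]
moderate_terms = [F(7, 8), F(7, 8), F(4, 8), F(2, 8), F(1, 8),
                  F(7, 8), F(6, 8), F(6, 8), F(8, 17)]

def product(terms):
    """Multiply a list of fractions without rounding error."""
    result = F(1)
    for t in terms:
        result *= t
    return result

# Unnormalized posteriors; the value for 'low' is taken from Eq. (7).
scores = {"high": product(high_terms),
          "moderate": product(moderate_terms),
          "low": F(0)}

# Normalize so that the three values sum to 1, as in Eqs. (8) to (10).
total = sum(scores.values())
normalized = {cls: float(s / total) for cls, s in scores.items()}
predicted = max(normalized, key=normalized.get)

for cls in scores:
    print(cls, float(scores[cls]), normalized[cls])
print("predicted class:", predicted)
```

Working with exact fractions avoids the small rounding differences that arise when the intermediate decimals are truncated by hand.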

Similarly, the class label has been predicted for the remaining two records as ‘moderate’ and ‘moderate’, respectively.

The accuracy of the above hypothetical example is 100%, as the predicted classes exactly match the actual class labels.

In a real application, the input dataset is split into training data and test data, where the training data are used to train the algorithm. The training dataset should be chosen carefully to cover the situations and cases that are likely to occur, as its quality is important in determining the accuracy of the algorithm. After training, the algorithm needs to be applied to the test dataset to verify that its accuracy is adequate for the given problem. Along with accuracy, precision, recall, and F-score are also used to validate the performance of the model. The first step of the evaluation is to construct the confusion matrix. The second step is to compute the different measures from the confusion matrix. The computation of the evaluation measures for a multiclass problem is shown with a sample of 100 test records with assumed values for different classes as in Table 8.

Table 8 Confusion matrix for a multiclass problem with three classes high, moderate, and low
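
As a minimal sketch of the first evaluation step, a confusion matrix can be built by counting (actual, predicted) label pairs. The six labels below are hypothetical and do not reproduce Table 8, and the orientation (rows for actual classes, columns for predicted classes) is an assumption of this sketch.

```python
from collections import Counter

CLASSES = ["high", "moderate", "low"]

def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]

def accuracy(matrix):
    """Fraction of records on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Hypothetical labels for six test records (illustration only).
actual = ["high", "high", "moderate", "moderate", "low", "low"]
predicted = ["high", "moderate", "moderate", "moderate", "low", "low"]

cm = confusion_matrix(actual, predicted)
for name, row in zip(CLASSES, cm):
    print(name, row)
print("accuracy:", accuracy(cm))
```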

After constructing the confusion matrix, the values for precision and recall can be calculated using two methods, microaverage and macroaverage [16]. In the microaverage method, the individual true positives, true negatives, false positives, and false negatives are summed over all class labels, and precision and recall are then computed from these totals. In the macroaverage method, the precision and recall values for each class are computed first, and then their average is taken. To provide insight to the reader, the macroaverage method is discussed below.

With values given in Table 8, the precision and recall for the individual classes can be computed.

Now, consider the class label, high. Precision and recall for this class are computed as

$${\text{Precision}}\_{\text{for}}\_{\text{high}} = {\text{TP}}\_{\text{high}}/\left( {{\text{TP}}\_{\text{high}} + {\text{FP}}\_{\text{high}}} \right) = 80/\left( {90} \right) = 88.9\%$$
$${\text{Recall}}\_{\text{for}}\_{\text{high}} = {\text{TP}}\_{\text{high}}/\left( {{\text{TP}}\_{\text{high}} + {\text{FN}}\_{\text{high}}} \right) = 80/\left( {100} \right) = 80\% .$$

Similarly, precision and recall for the other classes, moderate and low, are computed as follows:

$${\text{Precision}}\_{\text{for}}\_{\text{moderate}} = {\text{TP}}\_{\text{moderate}}/\left( {{\text{TP}}\_{\text{moderate}} + {\text{FP}}\_{\text{moderate}}} \right) = 90/\left( {90 + 10} \right) = 90/100 = 90\%$$
$${\text{Recall}}\_{\text{for}}\_{\text{moderate}} = {\text{TP}}\_{\text{moderate}}/\left( {{\text{TP}}\_{\text{moderate}} + {\text{FN}}\_{\text{moderate}}} \right) = 90/\left( {100} \right) = 90\%$$
$${\text{Precision}}\_{\text{for}}\_{\text{low}} = {\text{TP}}\_{\text{low}}/\left( {{\text{TP}}\_{\text{low}} + {\text{FP}}\_{\text{low}}} \right) = 90/\left( {90 + 10} \right) = 90/100 = 90\%$$
$${\text{Recall}}\_{\text{for}}\_{\text{low}} = {\text{TP}}\_{\text{low}}/\left( {{\text{TP}}\_{\text{low}} + {\text{FN}}\_{\text{low}}} \right) = 90/\left( {90} \right) = 100\% .$$

Now, the precision and recall of the model are computed by averaging the individual class values. Therefore, the macroaveraged precision of the model is (88.9 + 90 + 90)/3 = 89.6%. Similarly, the macroaveraged recall of the model is (80 + 90 + 100)/3 = 90%.
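
The macroaverage computation above, and the microaverage alternative, can be sketched as follows. The per-class true positive (TP), false positive (FP), and false negative (FN) counts are read off the numerators and denominators of the calculations above (for example, a precision of 80/90 for the class high implies 10 false positives); the function names are illustrative.

```python
# Per-class counts implied by the worked precision and recall calculations.
COUNTS = {
    "high":     {"tp": 80, "fp": 10, "fn": 20},
    "moderate": {"tp": 90, "fp": 10, "fn": 10},
    "low":      {"tp": 90, "fp": 10, "fn": 0},
}

def precision(c):
    return c["tp"] / (c["tp"] + c["fp"])

def recall(c):
    return c["tp"] / (c["tp"] + c["fn"])

def macro_average(counts):
    """Average the per-class precision and recall values."""
    n = len(counts)
    p = sum(precision(c) for c in counts.values()) / n
    r = sum(recall(c) for c in counts.values()) / n
    return p, r

def micro_average(counts):
    """Pool TP, FP, and FN over all classes, then compute precision and recall."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    return tp / (tp + fp), tp / (tp + fn)

macro_p, macro_r = macro_average(COUNTS)
micro_p, micro_r = micro_average(COUNTS)
print(f"macro precision {macro_p:.3f}, macro recall {macro_r:.3f}")
print(f"micro precision {micro_p:.3f}, micro recall {micro_r:.3f}")
```

Macroaveraging gives every class equal weight, whereas microaveraging gives every record equal weight, so the two can differ noticeably when the classes are imbalanced.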

Another interesting point about the Naïve Bayesian algorithm is that the probability value also indicates the confidence of the prediction. Thus, machine learning algorithms can be used to assess the risk level associated with individuals. With knowledge of individual risk levels, individuals at high risk can be given higher priority for testing as well as for hospital-based medical treatment. Furthermore, this kind of preliminary assessment using machine learning algorithms serves as a non-invasive and non-contact method of screening, which also reduces the spread of infection. It also helps in keeping the required medical equipment ready for those who need it.

AI for Diagnosis

As mentioned earlier, a lung CT scan is one of the testing procedures for detecting COVID-19 infection in the lungs. A Convolutional Neural Network (CNN) is a kind of neural network which takes images as input and assigns importance to various aspects of the image; when trained with a huge number of examples with annotated labels, it can classify images. The typical design of a CNN for image classification is shown in Fig. 3 [38]. The design of the input and output layers is simple and straightforward. The input layer contains the pixel values of the lung CT scan images. For example, if the image is \(64 \times 64\) pixels, then there will be 4096 inputs with intensity values ranging from 0 to 1.

Fig. 3 Higher level CNN-based deep learning for COVID-19 classification

In a CNN, one or more convolutional and pooling layers are used to extract features such as edges, lines, etc., and to reduce the spatial representation of images, respectively [15]. The convolutional layer applies a filter that moves over the input image and extracts features from it. Filters use \(3 \times 3\) or \(5 \times 5\) kernels which, as they move over the input image, perform element-wise multiplication and addition. The number of pixels by which the filter is shifted at each step is determined by a parameter called the stride. Different filters extract different features, such as edges and lines. The convolutional layer includes a rectified linear unit (ReLU) activation function, which introduces a non-linear transformation into the network. The pooling layer reduces the spatial size of the convolved feature and thereby, through dimensionality reduction, the computational power required to process the data. There are different types of pooling, namely, Max Pooling, which returns the maximum value from the portion of the image covered by the filter, and Average Pooling, which returns the average of all the values from the portion of the image covered by the filter. The fully connected layer serves as a classifier: it uses the extracted features to predict the label of the image.
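
The convolution, ReLU, and max-pooling operations described above can be sketched in plain Python. This is a minimal illustration on a hypothetical \(6 \times 6\) image with a hand-written vertical-edge kernel; in a real CNN the kernel weights are learnt during training, and an optimized deep learning library would be used instead of nested loops.

```python
def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image (no padding); multiply element-wise and sum."""
    k = len(kernel)
    rows = (len(image) - k) // stride + 1
    cols = (len(image[0]) - k) // stride + 1
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            r, c = i * stride, j * stride
            row.append(sum(image[r + a][c + b] * kernel[a][b]
                           for a in range(k) for b in range(k)))
        out.append(row)
    return out

def relu(feature_map):
    """Replace negative values with zero."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[i * size + a][j * size + b]
                 for a in range(size) for b in range(size))
             for j in range(cols)]
            for i in range(rows)]

# Hypothetical 6 x 6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hand-written 3 x 3 kernel that responds to vertical edges.
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = relu(conv2d(image, kernel, stride=1))  # 4 x 4 feature map
pooled = max_pool(features, size=2)               # 2 x 2 after pooling
print(features)
print(pooled)
```

The feature map is non-zero only where the kernel straddles the boundary between the dark and bright halves, which is what is meant by a filter extracting an edge.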

An important feature of a CNN is that the weights of these filters are learnt automatically during training and fixed afterwards. All the extracted features are then combined to form feature maps, which enable image classification in the fully connected layer. Typically, in healthcare applications, deep learning algorithms are designed as supervised learning models to ensure high accuracy. That is, the input images are annotated with the corresponding class labels and given to the algorithm for learning. When a deep learning algorithm is given a huge number of annotated training images, it learns through the hidden, connected neural network and tries to detect the common patterns/similarities in images with the same label. Once the learning step is complete, the algorithm can predict the label of an unknown image. In addition to image classification, as in [18], in the case of COVID-19-positive images, the abnormalities and ground glass opacities (GGO) are mapped and measured using 2D slice analysis and 3D volume analysis tools, and are then visualized with 3D visualization tools. Ultimately, deep learning can analyze a huge number of CT scans per day and reduce the burden on the radiologist; also, the performance of the algorithm is comparable to that of an expert radiologist, as in [19, 31].

AI for Mortality Prediction and Prognostic Support

Predicting mortality risk is very important in the case of COVID-19, as the number of infected cases keeps increasing very rapidly. Neither developed nor developing countries have sufficient medical infrastructure to accommodate all the affected individuals. In such a situation, it becomes essential to know (i) how many individuals are at high risk, (ii) the number of individuals who already have chronic diseases such as heart disease, diabetes, and lung diseases, (iii) the number of individuals who need ventilators, and (iv) the number of individuals who need ICUs, etc.

As mentioned earlier, research works [21,22,23,24,25,26] have employed AI algorithms to predict which people are at risk of mortality and to provide this information well in advance. For example, as in [21], an open-source tool using the Random Forest algorithm has been deployed online at https://ashis-das.shinyapps.io/CoCoMoRP/ for mortality prediction, which enables appropriate planning for keeping the required number of ventilators ready. In [22], a neural network-based model has been developed using data collected from different hospitals around the world; it predicts mortality with an accuracy of 89.98%, based on demographic information, physiological data, patients' symptoms, and pre-existing conditions. Three features, namely, LDH, hs-CRP, and lymphocytes, have been identified as predictors of mortality based on clinical data collected from China [23]. Also, the XGBoost algorithm was found to predict mortality more than 10 days in advance, which assists in prognostic treatment. In [39], machine learning techniques have been applied to a large collection of patient-level data to predict mortality based on three clinical features, namely, the patient's age, minimum oxygen saturation, and type of patient, i.e., inpatient or outpatient. Experiments with two different datasets in that work show that the patients who died were older, with a mean age of 73.4 years, and had lower oxygen saturation at their presentation to hospital. They also presented to a hospital comparatively later, and were likely to have hypertension and diabetes. As an alternative to mortality prediction, in the research work [40], different data mining models such as decision tree, support vector machine, naïve Bayes, etc., have been used to predict the minimum and maximum number of days for recovery of patients with respect to their age. This study predicted that patients aged from 65 to 85 are at high risk in terms of recovery from the disease.

AI for Repurposing Drugs

Drug repurposing is a method of using existing drugs to treat new diseases for which the drugs had not been approved. It provides a cost-effective and less time-consuming approach for identifying drugs for emerging diseases. As seen from Fig. 4 and Fig. 5, the timeline (excluding Food and Drug Administration post-market safety monitoring) of conventional drug discovery is very long, around 15 years, whereas it is only around 3 to 10 years in drug repurposing [41]. The time spent in the initial drug discovery and pre-clinical trial stages is greatly reduced in drug repurposing [42, 43]. In addition, the risks and cost involved are also greatly reduced.

Fig. 4 Timeline in conventional drug discovery

Fig. 5 Timeline in drug repurposing

In the compound identification stage, computational approaches are efficient in identifying existing drugs. AI-based models are used to scan through drug and disease databases containing terabytes of published and unpublished data to construct a biomedical knowledge graph with more than 31 million biomedical disease and drug concepts. The graph provides clues for identifying existing drugs which inhibit viral infection of the lungs. Consider the portion of the knowledge graph shown in Fig. 6. It is understood that the virus binds to a particular protein called ACE2 to enter lung cells. This biological process is called endocytosis, and it is regulated by another protein called AAK1. As in Fig. 6, the biomedical graph helps in identifying a number of compounds that inhibit AAK1, including Baricitinib, which is already licensed to treat rheumatoid arthritis. In addition to the knowledge graph-based approach, there are various other approaches, namely, molecular docking, target, pathway mapping, machine learning, artificial neural network, deep learning, and text mining-based approaches, as discussed in [44]. The computational methods are very effective in the high-level integration of existing knowledge and data, which in turn identifies existing drugs for the treatment of new diseases along with their side effects and other interactions. In the case of COVID-19, the computational methods help in identifying potential drugs including chloroquine, remdesivir, ritonavir, etc. [43].

Fig. 6 A portion of biomedical knowledge graph
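
The way a knowledge graph yields repurposing candidates can be illustrated with a toy example. The sketch below stores the relations described in the text as (subject, relation, object) triples and follows them from the virus to a drug; the relation names and function names are hypothetical, and a real biomedical graph contains millions of concepts rather than five triples.

```python
# Relations described in the text, encoded as triples. Relation names are illustrative.
TRIPLES = [
    ("SARS-CoV-2", "binds", "ACE2"),
    ("SARS-CoV-2", "enters_cell_via", "endocytosis"),
    ("AAK1", "regulates", "endocytosis"),
    ("Baricitinib", "inhibits", "AAK1"),
    ("Baricitinib", "treats", "rheumatoid arthritis"),
]

def subjects(relation, obj):
    """All subjects s such that (s, relation, obj) is in the graph."""
    return [s for s, r, o in TRIPLES if r == relation and o == obj]

def objects(subj, relation):
    """All objects o such that (subj, relation, o) is in the graph."""
    return [o for s, r, o in TRIPLES if s == subj and r == relation]

def repurposing_candidates(virus):
    """Drugs that inhibit a regulator of a process the virus uses to enter cells."""
    candidates = []
    for process in objects(virus, "enters_cell_via"):
        for regulator in subjects("regulates", process):
            for drug in subjects("inhibits", regulator):
                candidates.append((drug, regulator, process))
    return candidates

for drug, regulator, process in repurposing_candidates("SARS-CoV-2"):
    already_treats = objects(drug, "treats")
    print(f"{drug} inhibits {regulator}, which regulates {process}; "
          f"already used for: {already_treats}")
```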

Furthermore, with the help of entity linking tools such as Falcon, AI also helps in extracting information about the adverse effects that may occur when two or more drugs are taken together. For example, this tool enables the understanding of the potential adverse medical conditions that may occur when an existing drug is tried in the presence of chronic conditions such as high blood pressure, asthma, or diabetes. Furthermore, AI models are trained to explore and automatically screen a huge chemical space and to identify the most promising candidates based on different parameters like opportunity, expected response reliability, and safety. For example, as discussed in [30], a reinforcement learning approach is used to identify potential candidates to inhibit the interactions of the virus with 3C-like protease, one of the target proteins for which the crystal structure is known. The algorithm helped in identifying 284 molecules that inhibit the interaction, which will be explored further for the development of new drugs.

AI for Study of Protein Structure and Vaccine Development

In general, vaccines imitate the virus so that the body produces defensive white blood cells. Vaccines are of different types, namely, whole pathogen vaccines, designed using killed or weakened pathogens; subunit vaccines, which use a part of the germ such as a protein; and nucleic acid vaccines, which contain the genetic material of the germ. Here, AI is useful in accelerating the development of subunit and nucleic acid vaccines. Determining the structure of a protein is essential in understanding its function, which in turn facilitates the development of vaccines. However, a single protein is composed of many amino acids, and determining its structure is tedious. When an antigen (say, SARS-CoV-2) enters a human body, it causes the immune system to generate antibodies, specialized proteins which fight against the antigen. Different kinds of antibodies, including IgG, IgA, IgM, IgE, and IgD, are generated according to the antigen that has entered the body. Each antibody has a unique binding site shape which locks onto the specific shape of the antigen. Then, the antibodies destroy the antigen. An epitope is a cluster of amino acids found on the outside of the antigen and is the part of an antigen that is recognized by the antibodies. Finding and classifying epitopes are essential in developing new drugs.

Machine learning algorithms such as support vector machines (SVM), hidden Markov models, and artificial neural networks (specifically deep learning) have all proven to be faster and more accurate at identifying epitopes [45]. For example, Google DeepMind introduced the AlphaFold system to find the protein structure of the COVID-19 virus with a Deep Residual Network. Furthermore, AI models are helpful in analyzing the potential mutations of the virus and correspondingly identifying the best possible candidates for vaccine design [46].

Challenges and Conclusion

Despite the capabilities of AI models in finding solutions to combat COVID-19, it is very difficult to obtain all the relevant data required to build such models. For example, research works such as [21,22,23,24,25] need real patient-level data from different hospitals (after depersonalization) to build their models, but hospital data are not easy to access because of Health Insurance Portability and Accountability Act of 1996 (HIPAA) restrictions. Providing the models with relevant and correct data therefore becomes challenging. Furthermore, algorithms such as deep learning must be trained on large, accurately labeled, curated datasets to achieve sufficient accuracy. As described in [36], choosing the right dataset from the various open online databases of chest CT and chest X-ray images is a further challenge, and transfer learning is being used to address it while building models for COVID-19.

In conventional machine learning methods, the models are isolated, which means that no knowledge is transferred from one model to another. In reality, however, humans tend to reuse previously learnt knowledge when solving new tasks. For example, a person learning to drive a four-wheeler draws on prior knowledge of riding a two-wheeler rather than starting from scratch. In transfer learning, the model developed for one task is reused for solving another task. Transfer learning can be used when the available data are small as well as when the training time is long. Here, the detection of COVID-19 can be enhanced using transfer learning. Depending on the similarity between the domains and the availability of data for the source and target problems, different learning strategies, namely inductive, transductive, and unsupervised, are used, as shown in Fig. 7.

Fig. 7

Different techniques of transfer learning
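The reuse of a model across tasks described above can be illustrated with a minimal sketch of one common form of transfer learning, feature extraction, in which layers learned on a source task are kept frozen and only a small new classifier is trained on the limited target data. The names `PRETRAINED_WEIGHTS`, `extract_features`, `train_head`, and `predict`, as well as the weight values and toy samples, are assumptions made purely for illustration; they are not taken from any of the cited works, and a real system would use a deep network pretrained on a large image dataset.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical weights standing in for layers already learned on a large source task.
# They are never updated below, which is what "frozen" means here.
PRETRAINED_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def extract_features(x: Sequence[float]) -> List[float]:
    """Frozen feature extractor: a fixed linear layer followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in PRETRAINED_WEIGHTS]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples: Sequence[Sequence[float]], labels: Sequence[int],
               epochs: int = 1000, lr: float = 0.1) -> Tuple[List[float], float]:
    """Train only a new logistic-regression head on top of the frozen features."""
    feats = [extract_features(x) for x in samples]
    weights = [0.0] * len(feats[0])
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            pred = sigmoid(sum(w * v for w, v in zip(weights, f)) + bias)
            err = pred - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * v for w, v in zip(weights, f)]
            bias -= lr * err
    return weights, bias


def predict(x: Sequence[float], weights: Sequence[float], bias: float) -> int:
    """Classify a sample using the frozen extractor plus the trained head."""
    f = extract_features(x)
    return int(sigmoid(sum(w * v for w, v in zip(weights, f)) + bias) >= 0.5)


# Tiny, made-up target dataset: only four labeled samples are available.
TARGET_SAMPLES = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
TARGET_LABELS = [1, 1, 0, 0]
```

Calling `train_head(TARGET_SAMPLES, TARGET_LABELS)` fits only the head's few parameters, which is why this strategy suits small target datasets and short training budgets; fine-tuning some of the pretrained layers is a further option that this sketch does not cover.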

Data augmentation operations such as flip, rotation, scale, crop, and translation are also used to expand the data size, thereby reducing overfitting and enhancing the performance of the models [47]. In addition, when training data are constructed from online resources, they are likely to contain false data, which will bias AI-based models and affect their accuracy. Furthermore, data retrieved from different sources may conflict with one another, which poses an additional challenge in arriving at correct information to train the algorithms [48].
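As a minimal sketch of how augmentation operations of this kind enlarge a dataset, the following example applies a flip, a rotation, a translation, and a crop to a tiny grayscale image represented as a list of pixel rows. The function names and the toy image are assumptions made for illustration; practical pipelines would typically rely on an image-processing library and would keep each augmented image paired with the label of its original.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def translate_right(img: Image, shift: int, fill: int = 0) -> Image:
    """Shift pixels right by `shift` columns, padding the left edge with `fill`."""
    width = len(img[0])
    shift = min(shift, width)
    return [[fill] * shift + row[: width - shift] for row in img]


def crop(img: Image, top: int, left: int, height: int, width: int) -> Image:
    """Cut out a height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in img[top:top + height]]


def augment(img: Image) -> List[Image]:
    """Return the original image plus three augmented variants of the same size."""
    return [img, flip_horizontal(img), rotate_90_clockwise(img), translate_right(img, 1)]
```

Each call to `augment` turns one labeled image into four training samples; which operations are appropriate for a given imaging modality is a design decision that this sketch does not address.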

Existing deep learning models lack explanations for their detection or classification of inputs, so physicians may not be able to understand why the models made a given prediction [49, 50]. Therefore, the black box nature of the models needs to be improved with explainable AI. A deep learning approach presented in [51] is explainable by design. The approach explains why it assigns a particular label to a given input, using a set of IF–THEN rules. In addition to explainability or interpretability, the proposed approach has the ability to actively learn from new data samples.
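To make the idea of rule-based explanations concrete, the sketch below shows a toy classifier that returns, together with each label, the IF-THEN rule that produced it. It is not the method of [51]: the feature names (`opacity_score`, `bilateral`), the thresholds, and the labels are hypothetical values chosen only for illustration and carry no clinical meaning.

```python
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
# Each rule is (human-readable description, condition, label).
Rule = Tuple[str, Callable[[Features], bool], str]

# Hypothetical rules and thresholds, for illustration only.
RULES: List[Rule] = [
    (
        "IF opacity_score > 0.7 AND bilateral == 1 THEN suspected",
        lambda f: f["opacity_score"] > 0.7 and f["bilateral"] == 1,
        "suspected",
    ),
    (
        "IF opacity_score <= 0.3 THEN not suspected",
        lambda f: f["opacity_score"] <= 0.3,
        "not suspected",
    ),
]


def classify_with_reason(features: Features) -> Tuple[str, str]:
    """Return (label, explanation), where the explanation is the first rule that fired."""
    for description, condition, label in RULES:
        if condition(features):
            return label, description
    return "uncertain", "no rule fired"
```

For instance, `classify_with_reason({"opacity_score": 0.9, "bilateral": 1})` returns the label together with the text of the rule responsible, which is the kind of reasoning a physician could inspect and challenge.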

Beyond the challenges of implementing transfer learning and explainability in the models, another big challenge concerns the validation of the developed models. Most research works use a specific dataset for the proof of concept. Radiologists elsewhere have expressed that most COVID-19 images come from Chinese hospitals, and hence the developed models tend to suffer from selection bias [52]. Therefore, before being put into practical use, the developed models need to be tested on diverse datasets covering wider demographic features, so that the models generalize and tools can be developed [53].
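One simple way to check for the kind of selection bias described above is to report a model's accuracy separately for each data source or demographic group rather than as a single pooled figure. The following sketch assumes that predictions are available as `(group, true_label, predicted_label)` records; the function names and the record layout are illustrative assumptions, not part of any cited work.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, int, int]  # (group, true_label, predicted_label)


def accuracy_by_group(records: Iterable[Record]) -> Dict[str, float]:
    """Compute classification accuracy separately for each group."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for group, y_true, y_pred in records:
        total[group] += 1
        correct[group] += int(y_true == y_pred)
    return {group: correct[group] / total[group] for group in total}


def accuracy_gap(per_group: Dict[str, float]) -> float:
    """Difference between the best and worst group accuracy."""
    values = list(per_group.values())
    return max(values) - min(values)
```

A large gap between groups suggests that a model validated on one population may not generalize to another, which is a signal to collect more diverse data before deployment.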

Although AI is an enabling technology to support diagnosis and prognosis, it depends on various factors such as appropriate infrastructure, authenticated and trustworthy data sharing, and the unbiased, safe translation of models into usable solutions. Finding solutions to combat COVID-19 needs combined efforts from international cross-disciplinary collaborations to carefully identify time-, course-, and region-dependent clinical actions in response to COVID-19 [54]. Furthermore, AI needs to go hand in hand with other digital intelligence technologies such as wearable technology, mobile technology, the Internet of Things, big data analytics, and cloud computing [55,56,57,58,59] to develop viable, thoroughly validated tools that can provide real support to local healthcare providers. In addition, as discussed in the paper, AI has applications in finding solutions for COVID-19 in many dimensions.