# What Is Virtual Screening?

**Virtual screening** is a computational technique used in drug discovery to evaluate large libraries of small molecules or compounds to identify potential drug candidates that can bind to a biological target, such as a protein. Virtual screening can complement traditional high-throughput screening in the quest for new therapeutics.

## What Are the Different Types of Virtual Screening?

Virtual screening approaches include **structure-based** (using a known 3D conformation of the target protein) and **ligand-based virtual screening** (using known active molecules as templates).

### **Structure-Based Virtual Screening**

**Structure-based virtual screening (SBVS)** is a computational chemistry technique used when the **3D structure of the target protein** is known, often from experimental methods such as X-ray crystallography or cryogenic electron microscopy (cryo-EM). The primary method in SBVS is **molecular docking** (also called molecular ligand docking), in which virtual screening software simulates the binding of small molecules (ligands) to a region of the target protein, such as a kinase active site. The goal is to predict the best orientation, interactions, and fit between the ligand and the protein. **Scoring functions** are then applied to estimate the strength of these interactions, helping prioritize compounds with the highest predicted binding affinity. The principle is to identify compounds that bind to the target and inhibit or modulate its bioactivity, making them potential drug candidates.
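The prioritization step can be sketched in a few lines of Python. This is only an illustration: the compound IDs and scores are made up, and it assumes the scoring function reports a predicted binding energy where lower (more negative) values mean stronger predicted binding, which is a common convention but not a universal one.

```python
from typing import NamedTuple


class DockingResult(NamedTuple):
    compound_id: str
    score: float  # assumption: predicted binding energy, lower is better


def top_candidates(results, n):
    """Return the n best-scoring compounds, lowest score first."""
    return sorted(results, key=lambda r: r.score)[:n]


# Hypothetical docking output for four library compounds.
results = [
    DockingResult("cmpd-001", -7.2),
    DockingResult("cmpd-002", -9.1),
    DockingResult("cmpd-003", -5.4),
    DockingResult("cmpd-004", -8.3),
]

shortlist = top_candidates(results, 2)
print([r.compound_id for r in shortlist])  # ['cmpd-002', 'cmpd-004']
```

If a scoring function reports higher values for better binders, the sort order would need to be reversed.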

### **Ligand-Based Virtual Screening**

In cases where the 3D structure of the target protein is unavailable, researchers use a cheminformatics approach called **ligand-based virtual screening (LBVS)**. This approach identifies compounds similar to known active ligands based on their chemical structure or properties. The principle here is that compounds structurally similar to known active compounds are likely to exhibit similar biological activity in protein-ligand interactions. Two key methods used in LBVS include **quantitative structure-activity relationship (QSAR)** modeling, which correlates molecular properties with biological activity, and **pharmacophore modeling**, which identifies essential chemical features responsible for biological activity. These features are then used to screen compound libraries for molecules with similar characteristics.
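The similarity principle can be illustrated with the Tanimoto coefficient, a widely used measure for comparing molecular fingerprints. The sketch below is a simplification under stated assumptions: each fingerprint is represented as a set of "on" bit positions, and the compound names, bit values, and the 0.6 threshold are all hypothetical. In practice, fingerprints would come from a cheminformatics toolkit.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared bits divided by total distinct bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def screen_by_similarity(query, library, threshold):
    """Return (name, similarity) pairs at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, sim) for name, sim in scored if sim >= threshold]
    return sorted(hits, key=lambda h: h[1], reverse=True)


# Hypothetical fingerprints, stored as sets of "on" bit positions.
known_active = frozenset({1, 4, 7, 9, 12})
library = {
    "cmpd-A": frozenset({1, 4, 7, 9, 13}),  # shares 4 of 6 distinct bits
    "cmpd-B": frozenset({2, 3, 5}),         # shares none
    "cmpd-C": frozenset({1, 4, 7, 9, 12}),  # identical fingerprint
}

for name, sim in screen_by_similarity(known_active, library, threshold=0.6):
    print(f"{name}: {sim:.2f}")
```

Only `cmpd-C` and `cmpd-A` pass the threshold here; the cutoff itself is a tunable choice that trades recall against the number of compounds carried forward.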

Using **virtual screening**, researchers can significantly reduce the time and cost associated with traditional wet-lab high-throughput screening (HTS) assays. Instead of testing millions of compounds in the lab, virtual screening allows for the rapid evaluation of thousands or millions of compounds in silico, focusing only on the most promising candidates for experimental validation.

## What Are Major Design Considerations for Virtual Screening?

A few key design aspects guide the identification of potential drug candidates in **virtual screening**:

1. **Molecular databases:** Large virtual libraries of small molecules are screened computationally to identify compounds with **drug-like properties**. These libraries often contain millions of molecules, which can be evaluated for potential interactions with a target protein.
2. **Scoring functions:** Specialized algorithms predict the **binding affinity** or **interaction strength** between a compound and its biological target, helping to prioritize compounds based on their likelihood of efficacy.
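3. **Filtering:** Preprocessing steps eliminate molecules that don't meet key drug-likeness criteria, such as **Lipinski's rule of five**, a heuristic that flags compounds unlikely to have the balance of properties (e.g., solubility, permeability) needed for an orally active drug. The rule checks molecular weight (at most 500 Da), lipophilicity (logP at most 5), hydrogen-bond donors (at most 5), and hydrogen-bond acceptors (at most 10).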
4. **Post-screening validation:** The top-ranked compounds identified during virtual screening undergo **experimental validation** to confirm their biological activity. This ensures that predicted interactions are functional in a biological system before further drug development.

These steps allow researchers to efficiently filter and evaluate large molecular datasets, streamlining the discovery of viable drug candidates.
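As a rough illustration of the filtering step, the following sketch applies Lipinski's rule of five to precomputed molecular descriptors. The `Descriptors` type, the compound names, and the descriptor values are hypothetical, and tolerating one violation is an assumed convention rather than a fixed requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    name: str
    mol_weight: float  # daltons
    logp: float        # octanol-water partition coefficient
    h_donors: int      # hydrogen-bond donors
    h_acceptors: int   # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def drug_like(library, max_violations=1):
    """Keep compounds with at most max_violations rule violations."""
    return [d for d in library if lipinski_violations(d) <= max_violations]


# Hypothetical precomputed descriptors for three library compounds.
library = [
    Descriptors("cmpd-001", 342.4, 2.1, 2, 5),  # no violations
    Descriptors("cmpd-002", 712.9, 6.3, 4, 9),  # weight and logP violations
    Descriptors("cmpd-003", 498.6, 5.4, 1, 7),  # logP violation only
]

print([d.name for d in drug_like(library)])  # ['cmpd-001', 'cmpd-003']
```

Setting `max_violations=0` gives a stricter filter that would also drop `cmpd-003`.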

## What Is the Hit Rate of Virtual Screening?

The **hit rate** of virtual screening refers to the percentage of compounds identified as hits out of the total screened. The hit rate can vary widely depending on the **quality of the compound library**, the **target protein**, and the **screening methods** used. Generally, hit rates for virtual screening are **low**, often ranging between 0.1% and 5%. However, the hit rate can improve with higher-quality target structures and more sophisticated molecule scoring algorithms, among other techniques.
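The definition above reduces to simple arithmetic. A minimal sketch follows; the function name and the example counts are made up for illustration.

```python
def hit_rate(hits: int, screened: int) -> float:
    """Percentage of screened compounds that were identified as hits."""
    if screened <= 0:
        raise ValueError("screened must be a positive number of compounds")
    return 100.0 * hits / screened


# Hypothetical campaign: 12 hits among 1,000 screened compounds.
print(f"{hit_rate(12, 1000):.1f}%")  # 1.2%
```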

Traditional HTS methods may yield more hits but are more resource-intensive. Virtual screening, on the other hand, efficiently narrows down large datasets, making the process faster and more cost-effective.

**Generative virtual screening** is a computational method that uses [generative AI](https://www.nvidia.com/en-us/glossary/generative-ai.md) to design and optimize chemical structures with desirable properties. It offers several advantages over **traditional virtual screening** in terms of **efficiency, flexibility, and accuracy**.

Here’s how traditional and generative virtual screening compare.

## Traditional Versus Generative Virtual Screening

### **Approach to Molecule Generation**

* **Traditional virtual screening** typically relies on **brute-force screening** methods, where large virtual libraries of compounds are tested against a target protein to identify potential binders. This method involves evaluating millions of compounds in silico, often without focusing on intelligent or directed design.
* **Generative virtual screening**, on the other hand, uses **generative AI models** like **MolMIM** to iteratively design molecules. These models are guided by seed molecules and optimized to meet specific drug properties, such as binding affinity and absorption, distribution, metabolism, excretion, and toxicity (ADMET) profiles. Instead of searching blindly through vast chemical libraries, generative models focus on smart design, reducing the number of compounds tested while ensuring higher-quality candidates.

### **Speed and Efficiency**

* In **traditional screening**, the **time** required to evaluate millions of compounds can be substantial, especially if the scoring functions are computationally intensive. This can create bottlenecks, particularly when moving from in silico predictions to experimental validation.
* **Generative virtual screening** uses **accelerated AI models** like **DiffDock**, which is **6.3x faster** than traditional docking methods, significantly speeding up the screening process. The generative models also reduce the computational burden by focusing only on a subset of promising molecules, which are designed intelligently rather than screened randomly.

### **Flexibility and Modularity**

* **Traditional virtual screening** methods often follow a **linear, rigid** pipeline, focusing on a single type of scoring function or computational model. This limits flexibility and makes it difficult to adapt the screening process to evolving scientific needs or new models.
* **Generative virtual screening** is highly **modular**, allowing for interchangeable components for protein folding, molecule generation, and docking. As newer, more advanced AI models emerge, they can be easily integrated into workflows, ensuring that the screening process remains state of the art. [Machine learning](https://www.nvidia.com/en-us/glossary/machine-learning.md) algorithms can be seamlessly incorporated to enhance the predictive power and adaptability of the generative models.

### **Data Utilization and Human Input**

* **Traditional screening** is largely data-driven but may **not** incorporate feedback loops that allow for directed iterative molecule refinement.
* In **generative virtual screening**, human input can guide the AI models to further optimize the molecules based on drug properties, disease-specific requirements, or safety concerns. This iterative loop ensures that the molecules generated are tailored to the researchers' specific needs, allowing for a more targeted approach.
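The iterative loop described above can be sketched as a toy propose-score-select cycle. Everything here is a stand-in under stated assumptions: molecules are represented as plain numeric vectors, `propose` is a hypothetical placeholder for a generative model, and `score` is a hypothetical placeholder for a property predictor. The `target` profile is where researcher-specified preferences would enter. It does not reflect how any particular model works internally.

```python
import random


def propose(seed, rng, n, step=0.5):
    """Hypothetical stand-in for a generative model: perturb the seed."""
    return [[x + rng.gauss(0, step) for x in seed] for _ in range(n)]


def score(candidate, target):
    """Hypothetical property score: closer to the target profile is better."""
    return -sum((c - t) ** 2 for c, t in zip(candidate, target))


def optimize(seed, target, rounds=20, batch=16, rng_seed=0):
    """Repeatedly propose candidates around the current best and keep the top one."""
    rng = random.Random(rng_seed)
    best = list(seed)
    history = [score(best, target)]
    for _ in range(rounds):
        candidates = propose(best, rng, batch)
        candidates.append(best)  # retain the current best so the score never regresses
        best = max(candidates, key=lambda c: score(c, target))
        history.append(score(best, target))
    return best, history


# The target profile encodes the desired properties (illustrative values).
best, history = optimize(seed=[0.0, 0.0, 0.0], target=[1.0, -2.0, 0.5])
print(f"score improved from {history[0]:.2f} to {history[-1]:.2f}")
```

Changing `target` between rounds would model a researcher redirecting the search as new requirements emerge.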

### **Cost and Resource Savings**

* **Traditional virtual screening** can be resource-intensive due to the sheer volume of compounds that must be screened and the computational power required to screen them.
* **Generative virtual screening** reduces costs by focusing on **smarter, more efficient molecule design**. The reduced need for brute-force approaches, combined with the faster processing speeds of AI-driven models, translates into lower overall costs and a faster time to discovery.

## Next Steps

### Generative AI for Virtual Screening

Learn more about the NVIDIA-accelerated computing platform for virtual screening.

[Check Out NVIDIA Clara for Biopharma](https://www.nvidia.com/en-us/clara/biopharma.md)
