Bioinformatics

44 skills with this tag

deepTools is a bioinformatics skill for analyzing next-generation sequencing (NGS) data. It helps with quality control, normalization, and visualization of ChIP-seq, RNA-seq, and ATAC-seq experiments, providing workflow generators and comprehensive documentation for common analyses.

Bioinformatics, NGS, ChIP-seq (+3 more)

COBRApy is a documentation skill for systems biology and metabolic engineering analysis. It provides comprehensive guidance on using the COBRApy Python library for constraint-based reconstruction and analysis (COBRA) of metabolic models, including flux balance analysis, gene knockouts, flux sampling, and SBML model handling.

Systems Biology, Metabolic Modeling, COBRA (+3 more)

CELLxGENE Census

This skill provides comprehensive guidance for programmatically accessing the CZ CELLxGENE Census, a collection of single-cell genomics data covering 61+ million cells. It covers querying expression data by cell type, tissue, or disease, integrating with PyTorch for machine learning, and using Scanpy for analysis workflows.

Bioinformatics, Single Cell, Genomics (+3 more)

A comprehensive bioinformatics skill that provides a unified Python interface to 40+ biological databases including UniProt, KEGG, ChEMBL, and Reactome. It enables protein analysis, pathway discovery, compound searches, sequence similarity searches, and cross-database identifier mapping for scientific research workflows.

Bioinformatics, Protein Analysis, KEGG (+3 more)

A comprehensive reference skill for the Biopython library, providing documentation and code examples for computational molecular biology tasks. Covers sequence manipulation, NCBI database access (GenBank, PubMed), BLAST searches, protein structure analysis, phylogenetics, and advanced features like motif analysis and restriction enzyme mapping.

Bioinformatics, Molecular Biology, Biopython (+3 more)

This skill provides comprehensive documentation and reference material for working with AnnData, a Python package for handling annotated data matrices used in single-cell genomics. It covers creating, reading, writing, and manipulating AnnData objects, along with best practices for memory management and integration with the scverse ecosystem (Scanpy, Muon) and PyTorch.

AnnData, Single Cell, Bioinformatics (+3 more)

Exploratory Data Analysis

Perform comprehensive exploratory data analysis on scientific data files across 200+ file formats. This skill should be used when analyzing any scientific data file to understand its structure, content, quality, and characteristics. Automatically detects file type and generates detailed markdown reports with format-specific analysis, quality metrics, and downstream analysis recommendations. Covers chemistry, bioinformatics, microscopy, spectroscopy, proteomics, metabolomics, and general scientific data formats.

Data Analysis, Scientific Computing, Bioinformatics (+3 more)

DrugBank Database

Access and analyze comprehensive drug information from the DrugBank database including drug properties, interactions, targets, pathways, chemical structures, and pharmacology data. This skill should be used when working with pharmaceutical data, drug discovery research, pharmacology studies, drug-drug interaction analysis, target identification, chemical similarity searches, ADMET predictions, or any task requiring detailed drug and drug target information from DrugBank.

DrugBank, Pharmaceutical, Drug Interactions (+3 more)

This skill should be used when working with LaminDB, an open-source data framework for biology that makes data queryable, traceable, reproducible, and FAIR. Use when managing biological datasets (scRNA-seq, spatial, flow cytometry, etc.), tracking computational workflows, curating and validating data with biological ontologies, building data lakehouses, or ensuring data lineage and reproducibility in biological research. Covers data management, annotation, ontologies (genes, cell types, diseases, tissues), schema validation, integrations with workflow managers (Nextflow, Snakemake) and MLOps platforms (W&B, MLflow), and deployment strategies.

Bioinformatics, Data Management, Ontologies (+3 more)

High-performance toolkit for genomic interval analysis in Rust with Python bindings. Use when working with genomic regions, BED files, coverage tracks, overlap detection, tokenization for ML models, or fragment analysis in computational genomics and machine learning applications.

Genomics, Bioinformatics, Rust (+3 more)

This skill should be used when working with genomic interval data (BED files) for machine learning tasks. Use for training region embeddings (Region2Vec, BEDspace), single-cell ATAC-seq analysis (scEmbed), building consensus peaks (universes), or any ML-based analysis of genomic regions. Applies to BED file collections, scATAC-seq data, chromatin accessibility datasets, and region-based genomic feature learning.

Genomics, Machine Learning, Bioinformatics (+3 more)

Comprehensive toolkit for protein language models including ESM3 (generative multimodal protein design across sequence, structure, and function) and ESM C (efficient protein embeddings and representations). Use this skill when working with protein sequences, structures, or function prediction; designing novel proteins; generating protein embeddings; performing inverse folding; or conducting protein engineering tasks. Supports both local model usage and cloud-based Forge API for scalable inference.

Protein Engineering, Bioinformatics, Machine Learning (+3 more)

Infer gene regulatory networks (GRNs) from gene expression data using scalable algorithms (GRNBoost2, GENIE3). Use when analyzing transcriptomics data (bulk RNA-seq, single-cell RNA-seq) to identify transcription factor-target gene relationships and regulatory interactions. Supports distributed computation for large-scale datasets.

Bioinformatics, Gene Regulatory Networks, Transcriptomics (+3 more)

Cloud laboratory platform for automated protein testing and validation. Use when designed proteins need experimental validation, including binding assays, expression testing, thermostability measurements, enzyme activity assays, or protein sequence optimization. Also use for submitting experiments via API, tracking experiment status, downloading results, optimizing protein sequences for better expression using computational tools (NetSolP, SoluProt, SolubleMPNN, ESM), or managing protein design workflows with wet-lab validation.

Protein Engineering, Bioinformatics, Laboratory Automation (+3 more)