Skills
A curated collection of bioinformatics tutorials, organized by topic. Every post is a hands-on guide with practical examples.
Version Control & Git
- The Evolution of Version Control - Git's Role in Reproducible Bioinformatics (Part 1)
- The Evolution of Version Control - CI/CD in bioinformatics (Part 2)
- Running GitHub Actions Locally with act: 5x Faster Development
Containers & Docker
- Containers in Bioinformatics: Community Tooling and Efficient Docker Building
- Containers on HPC: From Docker to Singularity and Apptainer
- Docker Out of Docker: Running Interactive Web Applications for Data Analysis
Package Management & Environment Setup
- Pixi - New conda era
- Upgrade Your Shell: From Bash to Zsh for a Better Terminal Experience
- Setting Up a Local Nextflow Training Environment with Code-Server and HPC
Nextflow & Workflow Management
- RIVER - A Web Application to Run Nf-Core
- How to Migrate from In-House Pipelines to Enterprise-Level Workflows: A Proven 3-Step Validation Framework
- Bioinformatics Cost Optimization for Computing Resources Using Nextflow (Part 1)
- Bioinformatics Cost Optimization for Input Using Nextflow (Part 2)
Variant Calling & NGS Pipelines
- Building a Reproducible GATK Variant Calling Bash Workflow with Pixi (Part 1) - Academic proof-of-concept implementation
- From Bash to Nextflow: GATK Best Practice With Nextflow (Part 2) - MD5 validation and scientific equivalence testing
- Variant Calling at Production Scale: HPC Deployment and Performance Optimization (Part 3) - SLURM, resource optimization, and scaling to 100+ samples
Slurm & HPC Clusters
- Building a Slurm HPC Cluster (Part 1) - Single Node Setup and Fundamentals
- Building a Slurm HPC Cluster (Part 2) - Scaling to Production with Ansible
- Building a Slurm HPC Cluster (Part 3) - Administration and Best Practices
CI/CD & Testing
- The Evolution of Version Control - CI/CD in bioinformatics (Part 2)
- Running GitHub Actions Locally with act: 5x Faster Development
- Bioinformatics Workflow Template: Standardizing Python Pipelines with Modular Design
Machine Learning & Data Analysis
- Introduction to AI/ML in Bioinformatics: Classification Models & Evaluation
- Machine Learning in Bioinformatics Part 1: Building KNN from Scratch
Data Management & Cloud Storage
- Working with Remote Files using bcftools and samtools (HTSlib) - S3, HTTP, and cloud file access