River-utils

river-utils is a command-line interface (CLI) tool that streamlines bioinformatics workflows, HPC job management, and cloud storage setup. It provides commands to manage cloud configurations, generate job scripts, and set up standard analysis tools.

Installation

Prerequisites

  • Python 3.8 or higher
  • AWS credentials configured locally (for cloud commands)
  • Required system utilities: wget, curl
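The tool-related prerequisites above can be checked before installing. The following is a minimal sketch, not part of river-utils: the function names `python_version_ok` and `check_prerequisites` are hypothetical, and the script assumes the interpreter is available as `python3`. It does not check AWS credentials.

```shell
#!/usr/bin/env bash
# Sketch: verify the prerequisites listed above (Python >= 3.8, wget, curl).

# Return 0 if the given "major.minor" version string is at least 3.8.
python_version_ok() {
    local major minor
    major="${1%%.*}"      # text before the first dot
    minor="${1#*.}"       # text after the first dot
    minor="${minor%%.*}"  # drop any patch component
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 8 ]; }
}

# Print one status line per prerequisite; return non-zero if any check fails.
check_prerequisites() {
    local status=0 tool version
    for tool in wget curl; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    if command -v python3 >/dev/null 2>&1; then
        version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"
        if python_version_ok "$version"; then
            echo "ok: python $version"
        else
            echo "too old: python $version (need 3.8 or higher)"
            status=1
        fi
    else
        echo "missing: python3"
        status=1
    fi
    return "$status"
}

check_prerequisites || echo "Install the missing prerequisites before continuing."
```

The version comparison is done on integers rather than strings so that 3.10 is correctly treated as newer than 3.8.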

Development Version

To install the development version, use the following commands for x64. For other CPU architectures, follow these instructions:

# Adjust the installation path for micromamba
git clone https://github.com/riverxdata/river-utils.git -b dev
cd river-utils
bash base.sh
make build
pip install -e .

Latest Version

To install the latest stable version, use the following command. Besides the Python package, it offers several useful software tools:

  • aws: The AWS client to interact with AWS services or AWS-compatible services
  • singularity: A container engine suitable for root-less environments
  • r-base (4.4.0): The R environment for AWS-related tasks
  • python (3.9.21): The Python environment
  • zsh: A shell with plugins to improve productivity when working on Unix systems

Note: This will install micromamba, create an environment called river, and install the above software without restricting versions.

version="add_awscli_zsh"
bash <(curl -Ls https://raw.githubusercontent.com/riverxdata/river-utils/${version}/install/setup.sh) $HOME $version
source ~/.river.sh

Usage

Overview

The river-utils CLI consists of three main subcommands:

  • cloud: Manage AWS S3 configurations and mount buckets using Goofys. Note that the current AWS CLI does not support region.
  • job: Manage and generate job scripts for HPC systems and the River web server.
  • setup: Set up standard tools for bioinformatics analysis.

1. Storage

The cloud command is used to configure and manage S3 buckets with Goofys.

Subcommands:

  • s3-config: Add a new AWS profile to your local configuration.
cloud s3-config --profile-name PROFILE_NAME --region REGION --aws-access-key-id AWS_ACCESS_KEY_ID --aws-secret-access-key AWS_SECRET_ACCESS_KEY
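For orientation, a named AWS profile in the standard AWS shared files looks like the output of the sketch below. This is an illustration only: the source does not say how s3-config stores the profile, so the assumption here is that it follows the usual AWS CLI layout (a `[NAME]` section in the credentials file and a `[profile NAME]` section in the config file). The function `write_aws_profile` is hypothetical, and the demo writes placeholder values to a temporary directory instead of `~/.aws`.

```shell
#!/usr/bin/env bash
# Sketch: append a named profile in the standard AWS shared-file layout.
# Arguments: profile name, region, access key ID, secret access key.
# Note: this does not detect or replace an existing profile of the same name.
write_aws_profile() {
    local name="$1" region="$2" key_id="$3" secret="$4"
    # AWS_SHARED_CREDENTIALS_FILE and AWS_CONFIG_FILE override the default paths.
    local credentials="${AWS_SHARED_CREDENTIALS_FILE:-$HOME/.aws/credentials}"
    local config="${AWS_CONFIG_FILE:-$HOME/.aws/config}"
    mkdir -p "$(dirname "$credentials")" "$(dirname "$config")"
    printf '[%s]\naws_access_key_id = %s\naws_secret_access_key = %s\n' \
        "$name" "$key_id" "$secret" >> "$credentials"
    printf '[profile %s]\nregion = %s\n' "$name" "$region" >> "$config"
    chmod 600 "$credentials"  # keep the secret readable by the owner only
}

# Demo with placeholder values, written to a temporary directory.
demo_dir="$(mktemp -d)"
export AWS_SHARED_CREDENTIALS_FILE="$demo_dir/credentials"
export AWS_CONFIG_FILE="$demo_dir/config"
write_aws_profile demo us-east-1 EXAMPLE_KEY_ID EXAMPLE_SECRET
cat "$AWS_SHARED_CREDENTIALS_FILE" "$AWS_CONFIG_FILE"
```

Passing the secret key on the command line, as the s3-config flags require, can leave it in shell history, so consider clearing the history entry afterwards.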

2. Slurm job monitoring

The job command is used for managing HPC job scripts and information.

Subcommands:

  • create: Generate a job script based on a provided Git repository and version.
  • info: Fetch and display information about jobs.
job create --job-id JOB_ID
job config --job-id JOB_ID
job info JOB_UUIDS

To load micromamba and the river command-line tool in a new shell session, run the following command or add it to your .bashrc:

source ~/.river.sh
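If you add the line to your .bashrc, it helps to avoid appending it more than once when the setup is repeated. A minimal sketch follows; the helper `append_once` is hypothetical and not part of river-utils.

```shell
#!/usr/bin/env bash
# Sketch: append a line to a file only if that exact line is not already there.
append_once() {
    local line="$1" file="$2"
    touch "$file"  # make sure the file exists before searching it
    # -x matches whole lines only, -F treats the pattern as a literal string.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Typical use (uncomment to modify your real .bashrc):
# append_once 'source ~/.river.sh' "$HOME/.bashrc"
```

Running the helper repeatedly leaves a single copy of the line in the file.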

License

This project is licensed under the MIT License.