It is one of the two main functions of the MitoHEAR package (together with get_heteroplasmy). The function returns a matrix of counts (n_row = number of samples, n_col = 4 * number of bases) of the four alleles at each base, for every sample. It takes as input a vector of sorted BAM files (one BAM file per sample) and a FASTA file for the genomic region of interest. It is based on the pileup function of the Rsamtools package.

get_raw_counts_allele(bam_input, path_fasta, cell_names, cores_number = 1)

Arguments

bam_input

Character vector of sorted BAM files (full paths). Each sample is defined by one BAM file. Each BAM file must also have its index file (.bai) at the same path.

path_fasta

Character string with the full path to the FASTA file of the genomic region of interest.

cell_names

Character vector of sample names.

cores_number

Number of cores to use. Default is 1.
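As an illustration of how the arguments fit together, here is a minimal sketch of a call. The file paths and sample names are hypothetical placeholders, and the check for an index named <file>.bam.bai is an assumption about the index naming, not something stated above. The call is only attempted when the files and the package are actually present.

```r
# Hypothetical inputs: replace these paths and names with your own data.
bam_input  <- c("/path/to/sample_1.bam", "/path/to/sample_2.bam")
path_fasta <- "/path/to/region_of_interest.fasta"
cell_names <- c("sample_1", "sample_2")

# Assumption: each index sits next to its BAM file and is named <file>.bam.bai.
bai_input <- paste0(bam_input, ".bai")

# TRUE only if every BAM file, every index and the FASTA file exist.
inputs_ready <- all(file.exists(bam_input, bai_input, path_fasta))

if (inputs_ready && requireNamespace("MitoHEAR", quietly = TRUE)) {
  result <- MitoHEAR::get_raw_counts_allele(
    bam_input    = bam_input,
    path_fasta   = path_fasta,
    cell_names   = cell_names,
    cores_number = 1
  )
  # Show the top-level structure of the returned list.
  str(result, max.level = 1)
} else {
  message("Skipping: input files or the MitoHEAR package were not found.")
}
```

Checking the inputs before the call is a convenience of this sketch, so that a missing index is noticed before any counting starts.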

Value

A list with three elements:

matrix_allele_counts

Matrix of counts (n_row = number of samples, n_col = 4 * number of bases) of the four alleles at each base, for every sample. The row names are equal to cell_names.

name_position_allele

Character vector with length equal to n_col of matrix_allele_counts. Each element specifies the genomic coordinate of a base together with the allele.

name_position

Character vector with length equal to n_col of matrix_allele_counts. Each element specifies the genomic coordinate of a base.
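To make the shape of the three elements concrete, the following base R sketch builds toy objects with the same dimensions for 2 samples and 3 bases. The counts are made up, and both the label format ("1_A", "1_C", ...) and the column order (four consecutive allele columns per base) are assumptions chosen for illustration; the labels produced by the package may look different.

```r
# Toy stand-ins for the three returned elements (made-up values).
cell_names <- c("cell_1", "cell_2")
positions  <- 1:3
alleles    <- c("A", "C", "G", "T")

# One entry per column: the base position, repeated once per allele.
name_position <- rep(as.character(positions), each = length(alleles))

# One entry per column: position plus allele (hypothetical label format).
name_position_allele <- paste(
  name_position,
  rep(alleles, times = length(positions)),
  sep = "_"
)

# 2 samples x (4 * 3 bases) = 2 x 12 matrix of counts.
matrix_allele_counts <- matrix(
  c(10, 0, 0, 0,   0, 7, 1, 0,   0, 0, 0, 12,
     9, 1, 0, 0,   0, 8, 0, 0,   0, 0, 2, 10),
  nrow = length(cell_names), byrow = TRUE,
  dimnames = list(cell_names, name_position_allele)
)

# Use name_position to pick the four allele columns of base "2".
cols_base_2   <- which(name_position == "2")
counts_base_2 <- matrix_allele_counts[, cols_base_2, drop = FALSE]

# Total coverage per sample at that base.
coverage_base_2 <- rowSums(counts_base_2)
```

Because name_position has one entry per column, it can be used directly as a column selector, as done here for base "2". In this toy data both samples have a coverage of 8 at that base.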

Author

Gabriele Lubatti gabriele.lubatti@helmholtz-muenchen.de