File handling functions and calculation of distances for sequence data in nexus format.

#include "distance_matrix.h"
#include "nexus_common.h"

[Figure: include dependency graph for alignment.h]

[Figure: graph of the files that directly or indirectly include this file]

Data Structures
struct	alignment_struct
	Data from alignment file.

Typedefs
typedef struct alignment_struct *	alignment

Functions
alignment	read_alignment_from_file (char *seqfilename)
	Reads a DNA alignment from file (guessing whether the format is FASTA or NEXUS) and stores the info in alignment_struct.

alignment	read_fasta_alignment_from_file (char *seqfilename)
	Reads a DNA FASTA alignment from file and stores the info in alignment_struct.

alignment	read_nexus_alignment_from_file (char *seqfilename)
	Reads a DNA NEXUS alignment from file and stores the info in alignment_struct.

void	print_alignment_in_fasta_format (alignment align, FILE *stream)
	Prints alignment to FILE stream in FASTA format (debug purposes).

void	del_alignment (alignment align)
	Frees memory from alignment_struct.

distance_matrix	new_distance_matrix_from_valid_matrix_elems (distance_matrix original, int *valid, int n_valid)
	Creates a new matrix of pairwise distances by excluding the original elements not present in valid[].

distance_matrix	new_distance_matrix_from_alignment (alignment align)
	Creates and calculates the matrix of pairwise distances based on the alignment.

void	store_likelihood_info_at_leaf (double *l, char align, int n_pat, int n_state)
	Transforms an aligned sequence into likelihoods for terminal taxa (e.g. A -> 0001, C -> 0010, etc.).
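
The leaf encoding described for store_likelihood_info_at_leaf() can be illustrated with a minimal sketch. It is not the library's implementation: the function encode_leaf_likelihood(), the helper base_to_mask(), the state order (A, C, G, T), the fixed four states, and the treatment of gaps and unknown characters as compatible with every state are all assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

enum { N_STATE = 4 }; /* assumed state order: A, C, G, T */

/* Hypothetical helper: map a base to a bitmask where bit 0 = A, bit 1 = C,
 * bit 2 = G, bit 3 = T (so A -> 0001, C -> 0010, and so on). */
static unsigned base_to_mask(char base)
{
  switch (toupper((unsigned char) base)) {
    case 'A': return 1u;
    case 'C': return 2u;
    case 'G': return 4u;
    case 'T': case 'U': return 8u;
    case 'R': return 1u | 4u; /* purine: A or G */
    case 'Y': return 2u | 8u; /* pyrimidine: C or T */
    default:  return 15u;     /* assumption: gap or unknown fits any state */
  }
}

/* Hypothetical function: fill l[] (n_pat * N_STATE doubles) so that
 * l[i * N_STATE + s] is 1.0 when state s is compatible with seq[i]
 * and 0.0 otherwise. */
void encode_leaf_likelihood(double *l, const char *seq, int n_pat)
{
  assert(l != NULL && seq != NULL && n_pat >= 0);
  for (int i = 0; i < n_pat; i++) {
    unsigned mask = base_to_mask(seq[i]);
    for (int s = 0; s < N_STATE; s++)
      l[i * N_STATE + s] = ((mask >> s) & 1u) ? 1.0 : 0.0;
  }
}
```

Under these assumptions, encoding the three-character sequence "AC-" would give the vectors (1,0,0,0), (0,1,0,0) and (1,1,1,1).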

Detailed Description

File handling functions and calculation of distances for sequence data in nexus format.

Reads sequence data in NEXUS format (sequential or interleaved) and in FASTA format. For FASTA format the sequences do not need to be aligned. For any format, if the sequences are aligned, the data are compressed so that only the distinct site (column) patterns are kept, together with a mapping between original and compressed site columns. The matrix of distances between sequences can also be calculated from the sequence pairs.
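
The site-pattern compression described above can be sketched as follows. This is an illustration only, assuming a simple quadratic search for matching columns. The type site_compression and the functions compress_sites() and free_site_compression() are hypothetical names that do not appear in alignment.h, and the actual fields of alignment_struct may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical result type, not part of alignment.h. */
typedef struct {
  int n_pat;          /* number of distinct column patterns */
  int *site_pattern;  /* site_pattern[i]: pattern index of original column i */
  int *pattern_first; /* pattern_first[p]: first original column with pattern p */
  int *weight;        /* weight[p]: number of original columns with pattern p */
} site_compression;

void free_site_compression(site_compression *c)
{
  free(c->site_pattern);
  free(c->pattern_first);
  free(c->weight);
  c->site_pattern = c->pattern_first = c->weight = NULL;
  c->n_pat = 0;
}

/* Returns 1 when columns a and b are identical across all sequences. */
static int columns_equal(char **seq, int n_seq, int a, int b)
{
  for (int s = 0; s < n_seq; s++)
    if (seq[s][a] != seq[s][b]) return 0;
  return 1;
}

/* Keeps one representative per distinct column and records the mapping
 * from each original column to its pattern. On allocation failure the
 * result has n_pat == 0 and NULL arrays. */
site_compression compress_sites(char **seq, int n_seq, int n_site)
{
  site_compression c = {0, NULL, NULL, NULL};
  assert(seq != NULL && n_seq > 0 && n_site > 0);

  c.site_pattern  = malloc((size_t) n_site * sizeof *c.site_pattern);
  c.pattern_first = malloc((size_t) n_site * sizeof *c.pattern_first);
  c.weight        = malloc((size_t) n_site * sizeof *c.weight);
  if (!c.site_pattern || !c.pattern_first || !c.weight) {
    free_site_compression(&c);
    return c;
  }

  for (int i = 0; i < n_site; i++) {
    int p;
    /* look for an already stored pattern equal to column i */
    for (p = 0; p < c.n_pat; p++)
      if (columns_equal(seq, n_seq, i, c.pattern_first[p])) break;
    if (p == c.n_pat) { /* new pattern */
      c.pattern_first[p] = i;
      c.weight[p] = 0;
      c.n_pat++;
    }
    c.site_pattern[i] = p;
    c.weight[p]++;
  }
  return c;
}
```

The per-pattern weights let later calculations (distances or likelihoods) visit each distinct column once and multiply by its count, rather than revisiting identical columns.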
