alignment_pruner.pl --chi2_prune parameter #1

Description

@skeiii

Hi, I have recently been constructing an Alphaproteobacteria tree and using alignment_pruner.pl to trim the 20% most compositionally heterogeneous sites.

According to the script's help text:

Use the chi2 statistic to choose columns to prune. This will first order
all of the columns by the chi2 statistic by comparing the chi2
statistic for the alignment with and without each column. Then the option
specifies how to remove columns:

    half    Remove half of the sites (starting with the most biased).
    f#      Remove sites until # fraction of sites remains, half can be
            specified as f0.5
    n#      Remove sites until only # number of sequences show significant bias.
    min     Remove sites until a minimum of sequences show significant bias.
    plot    Will print statistics to the screen suitable for plotting. It will
            contain 4 columns: idx, number of biased sequences, chi2 delta for
            this column and the names of the biased sequences.

I therefore ran:
alignment_pruner_broCode.pl --file input.faa --chi2_prune f0.8 > pruned20_Alpa117.fasta

However, compared with my original input file, the output retains only about 20% of the sites instead of the expected 80%:
Original file: 18823 sites
f0.8: 3765 sites
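
As a quick arithmetic check on the two counts above, here is a minimal Python sketch. It is not part of alignment_pruner.pl, and the variable names are only illustrative; it just computes which fraction of the sites remained and which was removed.

```python
# Site counts reported in this issue.
original_sites = 18823
remaining_sites = 3765

# Fraction of columns kept and fraction pruned.
fraction_remaining = remaining_sites / original_sites
fraction_removed = 1 - fraction_remaining

print(f"fraction remaining: {fraction_remaining:.3f}")  # 0.200
print(f"fraction removed:   {fraction_removed:.3f}")    # 0.800
```

So with f0.8 the fraction removed is about 0.8 and the fraction remaining is about 0.2, which is the opposite of what the f# description states.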

Am I using the wrong command to trim the most compositionally heterogeneous sites, or is the description of f# in the help text incorrect? Judging by this result, f# appears to be the fraction of sites removed rather than the fraction remaining, so trimming 20% of the sites would require f0.2.
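
To confirm site counts independently of the script, the number of alignment columns can be counted directly from the FASTA files. The following is a minimal Python sketch under the assumption that both files are aligned FASTA (every sequence has the same length); the function name alignment_length is hypothetical and not part of alignment_pruner.pl.

```python
def alignment_length(fasta_text):
    """Return the number of columns (sites) in an aligned FASTA string.

    Raises ValueError if there are no sequences or if the sequences
    do not all have the same length.
    """
    lengths = []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            lengths.append(0)  # a header starts a new sequence
        elif lengths:
            lengths[-1] += len(line)  # sequences may wrap over several lines
    if not lengths:
        raise ValueError("no sequences found")
    if len(set(lengths)) != 1:
        raise ValueError("sequences differ in length; input is not an alignment")
    return lengths[0]


# Toy alignment: two sequences of 8 columns, the first wrapped over two lines.
example = ">seqA\nMKV-LA\nTT\n>seqB\nMKI-LAST\n"
print(alignment_length(example))  # 8
```

Applied to the files in this issue, the same function could be called on the contents of input.faa and pruned20_Alpa117.fasta, for example `alignment_length(open("input.faa").read())`, and the two results compared.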
