# There are four directories: bin, DETECTER, data and doc

# /bin contains the main script: DETECTER.pl
# To run the code, type the following command from the /bin directory:
perl DETECTER.pl inputfilename

# To save the output to a file, do the following:
perl DETECTER.pl > myoutput.txt

# If no inputfilename is given, the program takes ABC.Redo.CFTRsubtree.Crop.txt
# as the default input. This file is located in the /data directory.

# /DETECTER contains three modules: Node, NodesBlock and Site

# /data contains two sample input files and the corresponding output files

# /doc contains the auto-generated HTML documentation. It has already been
# generated, so there is no need to do it again. To regenerate it, run this
# command from PAML_dir/bin:
perl ~/Programs/pdoc-1.0/scripts/perlmod2www.pl -source ../DETECTER/ -target ../doc

# To view the help files, open doc/index.html in any web browser.

Description of the script

This script parses a PAML input file. It locates the probability distribution
at each node, by site, and stores the node number, the site of the alignment,
the extant sequence, and a list of posterior probabilities with their
associated amino acids.

The script then combines all of the extant sequences for each node, by site,
and removes the redundancies. Similarly, all of the posterior probabilities
and their associated amino acids are combined and the redundant amino acids
are removed. For the posterior probabilities, however, an amino acid is kept
only if its posterior probability is greater than or equal to 0.05.

The output of this script contains the header "Prob distribution at node xxx,
by site" for all of the nodes, followed by 3 columns: the site of the
alignment, a list of tolerated residues (those not resulting from the extant
sequence are in lower case), and a list of non-tolerated residues.
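
The combining and filtering steps described above can be illustrated with a short sketch. This is not the code in DETECTER.pl or its modules. It is a Python illustration of the per-site logic only, written under assumptions: the function name classify_site, the input shapes, the 20-letter amino acid alphabet, and the rule that every residue not tolerated is reported as non-tolerated are all hypothetical. Only the 0.05 cutoff and the lower-case convention come from the description.

```python
# Illustrative sketch only; not the DETECTER.pl implementation.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabet: the 20 standard residues
THRESHOLD = 0.05  # cutoff stated in the description


def classify_site(extant, posteriors, threshold=THRESHOLD):
    """Return (tolerated, non_tolerated) residue strings for one site.

    extant     -- residues seen in the extant sequences, e.g. "AAGA"
    posteriors -- (amino_acid, probability) pairs gathered across nodes
    """
    # Remove redundancies among extant residues, keeping first-seen order.
    observed = list(
        dict.fromkeys(r.upper() for r in extant if r.upper() in AMINO_ACIDS)
    )

    # Keep reconstructed residues whose posterior probability is >= threshold,
    # skipping any residue that is already listed.
    inferred = []
    for aa, prob in posteriors:
        aa = aa.upper()
        if prob >= threshold and aa not in observed and aa not in inferred:
            inferred.append(aa)

    # Residues not found in the extant sequences are written in lower case.
    tolerated = "".join(observed) + "".join(aa.lower() for aa in inferred)

    # Assumption: everything else in the alphabet counts as non-tolerated.
    seen = set(observed) | set(inferred)
    non_tolerated = "".join(aa for aa in AMINO_ACIDS if aa not in seen)
    return tolerated, non_tolerated


if __name__ == "__main__":
    tolerated, non_tolerated = classify_site(
        "AAGA", [("A", 0.90), ("S", 0.06), ("T", 0.04), ("G", 0.10)]
    )
    print(tolerated, non_tolerated)
```

In this example, A and G come from the extant sequences, S passes the 0.05 cutoff and is reported in lower case, and T falls below the cutoff and ends up among the non-tolerated residues.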