File I/O -- Input and output sequence data.


Input

Phyclust accepts three types of input.

The data-reading functions read.*() return a list object of class seq.data. Suppose the returned object is called ret. Then ret$org.code and ret$org are two matrices that store the data, each with one row per sequence and one column per site. Matrix ret$org.code holds the original characters, e.g. A, G, C, T for nucleotides, and ret$org holds the corresponding numeric codes, e.g. 0, 1, 2, 3 for nucleotides.

Matrix ret$org is translated from ret$org.code according to the standard encoding of the chosen data type, and most calculations are done with ret$org.
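The translation from characters to numeric codes can be sketched in base R as follows. This is only an illustration of the idea, not Phyclust's own implementation: the lookup table nucleotide.id and the toy matrix are hypothetical names and data, and only the A, G, C, T to 0, 1, 2, 3 mapping quoted above is assumed.

```r
# Hypothetical lookup table: A, G, C, T -> 0, 1, 2, 3 (the order quoted above).
nucleotide.id <- c(A = 0, G = 1, C = 2, T = 3)

# A toy org.code-style matrix: 2 sequences (rows) by 4 sites (columns).
org.code <- matrix(c("a", "g", "c", "t",
                     "G", "G", "T", "A"),
                   nrow = 2, byrow = TRUE)

# Look up each character by name; toupper() lets lower-case letters map too.
ids <- unname(nucleotide.id[as.vector(toupper(org.code))])

# Restore the sequences-by-sites shape (both steps are column-major).
org <- matrix(ids, nrow = nrow(org.code), ncol = ncol(org.code))
org
```

In this sketch any character outside the table would become NA; how Phyclust itself codes gaps or ambiguous characters is not covered here.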


Output

Phyclust outputs sequence data in two formats: PHYLIP or FASTA.

We use the "Great pony EIAV rev datasets" as examples: pony524.phy in PHYLIP format and pony625.fas in FASTA format. Other example data sets can be found here. The following code reads in the two files, creates objects of class seq.data, and saves each data matrix to a new file in the working directory. Note that the data read from the FASTA file (pony625) are written in PHYLIP format, and the data read from the PHYLIP file (pony524) are written in FASTA format.


Read a PHYLIP file
> data.path <- paste(.libPaths()[1], "/phyclust/data/pony524.phy", sep = "")
> (my.pony.524 <- read.phylip(data.path))
code.type: NUCLEOTIDE, n.seq: 146, seq.len: 405.
> str(my.pony.524)
List of 7
 $ code.type: chr "NUCLEOTIDE"
 $ info     : chr " 146 405"
 $ nseq     : num 146
 $ seqlen   : num 405
 $ seqname  : Named chr [1:146] "AF314258" "AF314259" "AF314260" "AF314261" ...
  ..- attr(*, "names")= chr [1:146] "1" "2" "3" "4" ...
 $ org.code : chr [1:146, 1:405] "g" "g" "g" "g" ...
 $ org      : num [1:146, 1:405] 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "class")= chr "seq.data"

Read a FASTA file
> data.path <- paste(.libPaths()[1], "/phyclust/data/pony625.fas", sep = "")
> (my.pony.625 <- read.fasta.nucleotide(data.path))
code.type: NUCLEOTIDE, n.seq: 62, seq.len: 406.
> str(my.pony.625)
List of 6
 $ code.type: chr "NUCLEOTIDE"
 $ nseq     : num 62
 $ seqlen   : int 406
 $ seqname  : chr [1:62] "AF512608" "AF512609" "AF512610" "AF512611" ...
 $ org.code : chr [1:62, 1:406] "G" "G" "G" "G" ...
 $ org      : num [1:62, 1:406] 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "class")= chr "seq.data"

Save files
> # PHYLIP
> write.phylip(my.pony.625$org, "new.625.txt")
> edit(file = "new.625.txt")
> # FASTA
> write.fasta(my.pony.524$org, "new.524.txt")
> edit(file = "new.524.txt")
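
The write functions above are given the numeric matrix $org, so the codes have to be turned back into letters before a text file is written. The base R sketch below shows one way this could be done for a FASTA-style layout (a ">" header line followed by the sequence). It is not the code behind write.fasta(): the names nucleotide.code, seqname and out.file and the toy data are hypothetical, and only the 0, 1, 2, 3 to A, G, C, T mapping from the Input section is assumed.

```r
# Hypothetical reverse lookup: position i + 1 holds the letter for id i.
nucleotide.code <- c("A", "G", "C", "T")

# Toy numeric matrix in the style of ret$org, plus made-up sequence names.
org <- matrix(c(0, 1, 2, 3,
                1, 1, 3, 0),
              nrow = 2, byrow = TRUE)
seqname <- c("seq1", "seq2")

# Decode each row (one sequence) into a single string of letters.
seqs <- apply(org, 1, function(ids) {
  paste(nucleotide.code[ids + 1], collapse = "")
})

# Interleave header and sequence lines: >seq1, AGCT, >seq2, GGTA.
fasta.lines <- as.vector(rbind(paste0(">", seqname), seqs))

# Write to a temporary file and read it back to inspect the result.
out.file <- tempfile(fileext = ".fas")
writeLines(fasta.lines, out.file)
readLines(out.file)
```

A temporary file is used so the sketch does not overwrite anything in the working directory.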