Class SparseDoubleMatrixReader


public class SparseDoubleMatrixReader extends DoubleMatrixReader
Author:
pavlidis
  • Constructor Details

    • SparseDoubleMatrixReader

      public SparseDoubleMatrixReader()
  • Method Details

    • read

      public DoubleMatrix<String,String> read(InputStream stream) throws IOException
      Read a sparse matrix that is expressed as an adjacency list in a tab-delimited file:
        item1 item2 weight
        item1 item5 weight
       

      By definition the resulting matrix is square and symmetric.

      Items are ordered as they are first encountered in the file.
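
      As a rough illustration of the adjacency-list format described above, the following standalone sketch parses tab-delimited "item1 item2 weight" lines into a dense symmetric array, keeping items in first-seen order. It does not use this class or DoubleMatrix: the class name AdjacencyListSketch, the parse method, and the choice to leave unmentioned cells at 0.0 are all assumptions made for the example, not the library's implementation.

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of the documented library.
final class AdjacencyListSketch {

    /**
     * Parses "item1 TAB item2 TAB weight" lines into a dense symmetric array.
     * Item names are appended to namesOut in the order they are first seen.
     */
    static double[][] parse(InputStream stream, List<String> namesOut) {
        List<String> lines = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))
                .lines().collect(Collectors.toList());

        Map<String, Integer> index = new LinkedHashMap<>(); // preserves encounter order
        List<String[]> edges = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split("\t");
            if (fields.length != 3) {
                throw new IllegalArgumentException("Expected 3 tab-delimited fields: " + line);
            }
            index.putIfAbsent(fields[0], index.size());
            index.putIfAbsent(fields[1], index.size());
            edges.add(fields);
        }

        int n = index.size();
        double[][] matrix = new double[n][n]; // unmentioned cells stay 0.0 (assumption)
        for (String[] edge : edges) {
            int i = index.get(edge[0]);
            int j = index.get(edge[1]);
            double weight = Double.parseDouble(edge[2]);
            matrix[i][j] = weight;
            matrix[j][i] = weight; // mirror the entry so the matrix is symmetric
        }
        namesOut.addAll(index.keySet());
        return matrix;
    }
}
```

      Because both item columns feed the same index, the result is square, and writing each weight to both [i][j] and [j][i] makes it symmetric.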

      Overrides:
      read in class DoubleMatrixReader
      Parameters:
      stream - InputStream
      Returns:
      the resulting DoubleMatrix
      Throws:
      IOException
    • read

      public DoubleMatrix<String,String> read(InputStream stream, Collection<String> wantedRowNames)
      Overrides:
      read in class DoubleMatrixReader
      Parameters:
      stream - InputStream
      wantedRowNames - Collection of wanted row names
      Returns:
      read( stream, wantedRowNames, createEmptyRows ) with createEmptyRows set to true.
    • read

      public DoubleMatrix<String,String> read(InputStream stream, Collection<String> wantedRowNames, boolean createEmptyRows, int skipColumns, int maxRows)
      Overrides:
      read in class DoubleMatrixReader
      Parameters:
      stream - InputStream
      wantedRowNames - Collection of wanted row names
      createEmptyRows - if true, any row listed in wantedRowNames that is not found in the file is created as an empty row filled with Double.NaN; if false, no such row is created.
      maxRows -
      Returns:
      matrix
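
      The effect of wantedRowNames and createEmptyRows can be sketched independently of the reader. The example below is only an illustration under stated assumptions: WantedRowsSketch, select, and the numColumns argument are hypothetical names, rows are modelled as a plain Map, and the output follows the order of wantedRowNames, which may differ from what this class does.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration class; not part of the documented library.
final class WantedRowsSketch {

    /**
     * Keeps only the wanted rows. A wanted row that is missing from rowsInFile
     * is added as a NaN-filled row when createEmptyRows is true, and is
     * omitted otherwise.
     */
    static Map<String, double[]> select(Map<String, double[]> rowsInFile,
            Collection<String> wantedRowNames, boolean createEmptyRows, int numColumns) {
        Map<String, double[]> result = new LinkedHashMap<>();
        for (String name : wantedRowNames) {
            double[] row = rowsInFile.get(name);
            if (row != null) {
                result.put(name, row);
            } else if (createEmptyRows) {
                double[] empty = new double[numColumns];
                Arrays.fill(empty, Double.NaN); // placeholder for the missing row
                result.put(name, empty);
            }
        }
        return result;
    }
}
```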
    • readJW

      public DoubleMatrix<String,String> readJW(InputStream stream) throws IOException
      Read a sparse matrix in "JW" (Jason Weston) format. The format is like this:
       2          <--- number of items - the first line of the file only. NOTE - this line is often blank or not present.
       1 2        <--- item 1 has 2 edges
       1 2        <--- edge indices are to items 1 & 2
       0.1 100    <--- with the following weights
       2 2        <--- item 2 also has 2 edges
       1 2        <--- edge indices are also to items 1 & 2 (fully connected)
       100 0.1    <--- with the following weights
       

      Item numbering must start at 1.

      Note that this cannot handle very large matrices: the product rows x columns must not exceed Integer.MAX_VALUE. This is a limitation of Colt's sparse matrix implementation.
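
      A caller can test the size limit before reading. The check below is a minimal sketch: SparseSizeCheckSketch and fitsIntIndex are hypothetical names, and the method assumes non-negative dimensions. It multiplies in long arithmetic so the product cannot overflow before it is compared.

```java
// Hypothetical illustration class; not part of the documented library.
final class SparseSizeCheckSketch {

    /** Returns true when rows x columns does not exceed Integer.MAX_VALUE. */
    static boolean fitsIntIndex(int rows, int columns) {
        // Widen to long first; an int product could wrap around silently.
        return (long) rows * (long) columns <= Integer.MAX_VALUE;
    }
}
```

      For example, a 46340 x 46340 matrix passes this check, while 46341 x 46341 does not.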

      Parameters:
      stream - InputStream supplying the matrix in JW format
      Returns:
      Throws:
      IOException
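
      To make the JW layout described above concrete, here is a standalone parsing sketch. It is not this class's implementation: JwFormatSketch and parse are hypothetical names, the result is a plain dense array rather than a sparse DoubleMatrix, a first line that is blank or holds a single token is assumed to be the optional item count, and every item is assumed to have at least one edge.

```java
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration class; not part of the documented library.
final class JwFormatSketch {

    /** Parses three-line records: "item numEdges", edge indices, edge weights. */
    static double[][] parse(InputStream stream) {
        List<String> lines = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))
                .lines().map(String::trim).collect(Collectors.toList());

        // Optional header: blank or a single token (assumed to be the item count).
        int pos = 0;
        if (!lines.isEmpty() && lines.get(0).split("\\s+").length < 2) {
            pos = 1;
        }

        List<int[]> cells = new ArrayList<>();     // {item, target}, both 1-based
        List<Double> weights = new ArrayList<>();  // weight for the matching cell
        int maxIndex = 0;
        while (pos < lines.size()) {
            if (lines.get(pos).isEmpty()) {
                pos++; // tolerate blank lines between records (assumption)
                continue;
            }
            if (pos + 2 >= lines.size()) {
                throw new IllegalArgumentException("Truncated record at line " + (pos + 1));
            }
            String[] head = lines.get(pos).split("\\s+");
            String[] targets = lines.get(pos + 1).split("\\s+");
            String[] values = lines.get(pos + 2).split("\\s+");
            if (head.length != 2) {
                throw new IllegalArgumentException("Bad record header at line " + (pos + 1));
            }
            int item = Integer.parseInt(head[0]);
            int numEdges = Integer.parseInt(head[1]);
            if (item < 1 || targets.length != numEdges || values.length != numEdges) {
                throw new IllegalArgumentException("Malformed record at line " + (pos + 1));
            }
            maxIndex = Math.max(maxIndex, item);
            for (int k = 0; k < numEdges; k++) {
                int target = Integer.parseInt(targets[k]);
                if (target < 1) {
                    throw new IllegalArgumentException("Item numbering starts at 1");
                }
                maxIndex = Math.max(maxIndex, target);
                cells.add(new int[] {item, target});
                weights.add(Double.parseDouble(values[k]));
            }
            pos += 3;
        }

        double[][] matrix = new double[maxIndex][maxIndex];
        for (int k = 0; k < cells.size(); k++) {
            int[] cell = cells.get(k);
            matrix[cell[0] - 1][cell[1] - 1] = weights.get(k); // shift to 0-based
        }
        return matrix;
    }
}
```

      Applied to the two-item example above, this sketch yields a 2 x 2 array with 0.1 on the diagonal and 100 off the diagonal, whether or not the leading count line is present.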