pfogmultipledb.cpp File Reference

#include "Range.h"
#include "Match.h"
#include <mysql++.h>
#include <iostream>
#include <map>
#include <fstream>
#include "bioseq.h"
#include "dbinfo.h"
#include "Tblastnmy.h"

Classes

struct  ProgParameters

Defines

#define DEBUG

Functions

int bufferProteinsFromFile (const string &infile, map< string, string > &store)
void fetchGenomic (Connection &dbh, const string &seqid, DNA &seq)
void storeBestModels (Tblastn &tbn, ostream &SUM, ostream &DET, const ProgParameters &par)
void usage (const ProgParameters &param)
bool processAll (const map< string, string > &pepstore, const MysqlDBInfo &mysqlauth, const ProgParameters &par)
int main (int argc, char *argv[])
void createRemoveTable (Connection &dbh, const string &table)
void storeDeleteRows (Connection *dbh, list< string > &rows, const string &table)
void loadProteinInfo (Query &query, map< string, string > &prtinf, Connection &nrdb, const string &nrtab)
int bufferProteins (Connection &conn, map< string, string > &pepstore, int offset, Connection &nrdb, const ProgParameters &par)

Define Documentation

#define DEBUG

Referenced by bufferProtein().


Function Documentation

int bufferProteins ( Connection &  conn,
map< string, string > &  pepstore,
int  offset,
Connection &  nrdb,
const ProgParameters &  par 
)

int bufferProteinsFromFile ( const string &  infile,
map< string, string > &  store 
)

This function buffers all the proteins from the input file into a map< string, string > of protein ID (pid) -> sequence string. Not fully coded yet.
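The buffering described above could be sketched as follows. This is only an illustration, not the original implementation (which the note says is not fully coded): it assumes the input is FASTA-formatted, which the source does not state, and it reads from a stream rather than a file name so it can be tried without a file. The name bufferProteinsFromStream is hypothetical.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not part of pfogmultipledb.cpp.
// ASSUMPTION: the input is FASTA-formatted (">id description" header lines
// followed by one or more sequence lines).
// Fills `store` with id -> sequence and returns the number of records read.
int bufferProteinsFromStream(std::istream &in,
                             std::map<std::string, std::string> &store)
{
    std::string line;
    std::string id;   // id of the record currently being read
    int count = 0;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        if (line.empty()) continue;
        if (line[0] == '>') {
            // The id is the first whitespace-delimited token after '>'.
            std::istringstream header(line.substr(1));
            id.clear();
            header >> id;
            if (!id.empty()) {
                store[id].clear();   // start (or restart) this record
                ++count;
            }
        } else if (!id.empty()) {
            // Sequence lines are concatenated without the line breaks.
            store[id] += line;
        }
    }
    return count;
}
```

To read from a file, the caller could open a std::ifstream on infile and pass it to this helper.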

void createRemoveTable ( Connection &  dbh,
const string &  table 
)

void fetchGenomic ( Connection &  dbh,
const string &  seqid,
DNA &  seq 
)

Assumes that the target database has a genomic table.

void loadProteinInfo ( Query &  query,
map< string, string > &  prtinf,
Connection &  nrdb,
const string &  nrtab 
)

References string().

Referenced by main().

int main ( int  argc,
char *  argv[] 
)

bool processAll ( const map< string, string > &  pepstore,
const MysqlDBInfo &  mysqlauth,
const ProgParameters &  par 
)

Processes a subset of the whole table. In this particular case, I am scanning a reverse transcriptase (RT) against different databases: I took RT from SwissProt and searched it against a dozen fungal genomic databases using tblastn. Because this program only deals with a whole table, I can either write a wrapper that uses a dozen tables or use an SQL query to select a portion of the tables.

proteinSource: the buffering of proteins is done outside this function.

The reason for using a database table is to sort on the target. Otherwise we would have to do the sorting ourselves, which might not be a bad idea. Using the iterator version of the Result object in mysql++ 2.3 can slow the process down by 1000x; use the fetch_row() method instead.
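The alternative mentioned above, sorting on the target in the program rather than in the database, might look roughly like the sketch below. The Hit struct and its fields are hypothetical stand-ins for whatever a tblastn result row holds, and ordering hits within a target by descending score is an assumption made for illustration only.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record for one tblastn hit; the field names are assumptions,
// not taken from the real table layout.
struct Hit {
    std::string target;   // genomic sequence id
    std::string query;    // protein id
    double score;
};

// Groups hits by target (ascending). Within one target, higher scores come
// first. stable_sort keeps the input order for complete ties.
void sortHitsByTarget(std::vector<Hit> &hits)
{
    std::stable_sort(hits.begin(), hits.end(),
                     [](const Hit &a, const Hit &b) {
                         if (a.target != b.target) return a.target < b.target;
                         return a.score > b.score;
                     });
}
```

Once the hits are sorted this way, all rows for one target are adjacent, so each genomic sequence would need to be fetched only once while walking the list.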

References fetchGenomic(), MysqlDBInfo::getPassword(), MysqlDBInfo::getUser(), ProgParameters::identitycut, bioseq::length(), ProgParameters::showevery, storeBestModels(), string(), ProgParameters::tbndb, ProgParameters::tbntab, and user.

void storeBestModels ( Tblastn &  tbn,
ostream &  SUM,
ostream &  DET,
const ProgParameters &  par 
)

This is the main workhorse helper function, used by processOneBatch and main.

Make it work with different streams.

The par argument mainly provides the cutoff information. Results are stored into the output streams SUM and DET.

void storeDeleteRows ( Connection *  dbh,
list< string > &  rows,
const string &  table 
)

void usage ( const ProgParameters &  param  ) 

Describes how to use this program. After this function is called, the program terminates.


Generated on Wed Oct 14 21:49:19 2009 for Softwares from Orpara by  doxygen 1.5.6