Sohel Merchant


GeneCodeXML (Master's Project)

I am developing an evolutionary XML representation of bio-sequence data based on the current W3C XML Schema 1.1 standards. Through this work, I intend to unify existing efforts to represent bioinformatics data in XML. I plan to combine existing standard XML representations of various aspects of biological data, such as bio-sequence (RNA/DNA/AA), gene expression, and protein structure data, with evolutionary data to form GeneCodeXML. My project is a subproject of a larger effort by Dr. Stephen Freeland and his lab at UMBC's Biological Sciences department to use evolutionary data to address challenges in bioinformatics on three frontiers: a) sequence alignment and homology detection, b) heterogeneous gene expression, and c) protein structure prediction. For this work, I am exploring existing XML representations, namely ProXML, BSML, RNAML, GAME, MAGE-ML, SPTr-ML, and others.
In the next phase of the project, I plan to apply GeneCodeXML to derive customized PAM matrices for the vertebrate mitochondrion. I am currently implementing web services for the Non-Standard Genetic Code database.
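
As a small illustration of why a non-standard genetic code matters for sequence analysis, the following Python sketch translates the same DNA fragment under the standard code and under the vertebrate mitochondrial code (NCBI translation table 2). The function names and the example sequence are hypothetical and are not part of GeneCodeXML or of the database's web services; only the two codon tables follow the published NCBI tables.

```python
from itertools import product

# Codons in NCBI table order: first base varies slowest, bases ordered T, C, A, G.
BASES = "TCAG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

# Amino acids per codon ('*' marks a stop), in the same order as CODONS.
STANDARD = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"


def make_table(amino_acids):
    """Build a codon -> amino acid lookup from a 64-character table string."""
    return dict(zip(CODONS, amino_acids))


def translate(dna, table):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(table[dna[i:i + 3]] for i in range(0, usable, 3))


def differing_codons(table_a, table_b):
    """List the codons that the two tables translate differently."""
    return [c for c in CODONS if table_a[c] != table_b[c]]


if __name__ == "__main__":
    standard = make_table(STANDARD)
    mito = make_table(VERT_MITO)
    fragment = "ATGATATGAAGA"  # hypothetical example: ATG ATA TGA AGA
    print(translate(fragment, standard))
    print(translate(fragment, mito))
    print(differing_codons(standard, mito))
```

The two codes disagree on only four codons (TGA, ATA, AGA, and AGG), yet those differences change where a translated protein starts and stops. This is one reason substitution matrices derived under the standard code may fit mitochondrial sequences poorly.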

The role of web services in developing extensible bioinformatics tools for sequence analysis (Independent Study)
Biological sequence analysis is a cornerstone of the emerging field of bioinformatics research, addressing problems that range from applied biology (e.g. reconstructing long sequences from a random pool of shorter, overlapping fragments, preparing genetic maps, etc.) to theoretical biology (homology searching, sequence alignment, phylogenetic reconstruction, etc.). Indeed, more than 18,000 publications have included "sequence analysis" in their title, keywords, or abstract since 2001. In this work, I will review the free, open-source tools available for homology searching, sequence alignment, and phylogenetic reconstruction. I will also test the applicability of the web services model to constructing extensible sequence analysis tools.

Microarray data analysis using artificial neural networks

Microarrays allow us to study the molecular mechanisms underlying tumors and past patient outcomes by monitoring the expression of thousands of genes in a single experiment. Cluster analysis has been applied to microarray-generated breast cancer data with inadequate results. Artificial neural networks (ANNs) are very powerful tools for classification problems and can give accurate patient diagnosis and prognosis. In this study, we apply artificial neural networks to the prognosis of breast cancer.