Programming Using Java: Peptide Viewer

naveen kumar
Dec 17, 2020

Write a Java program to visually represent peptides identified using output from peptide identification software.

The program should read two file inputs from the user (using a File Chooser):

1. A peptide FASTA file (its contents should be displayed within the program GUI)

2. A .CSV file containing OMSSA output*

(*) The Open Mass Spectrometry Search Algorithm (OMSSA) is an efficient search engine for identifying MS/MS peptide spectra by searching libraries of known protein sequences. OMSSA scores significant hits with a probability score developed using classical hypothesis testing, the same statistical method used in BLAST. OMSSA is free and in the public domain. Detailed information can be found at http://pubchem.ncbi.nlm.nih.gov/omssa/.

3. The protein should be parsed from the FASTA file, and your program should graphically represent the identified peptides as retrieved by OMSSA.

4. The program should also highlight the identified peptides, as shown in the following figures.

Figure 1 (solid blue indicates the observed portion of the protein): Protein X:

Figure 2 (Tip: you can use a JTextPane to show text in different colours, e.g. jTextPane1.setContentType("text/html");):

MRLAVGALLVCAVLGLCLAVPDKTVRWCAVSEHEATKCQSFRDHMKSVIPSDGPSVACVK
KASYLDCIRAIAANEADAVTLDAGLVYDAYLAPNNLKPVVAEFYGSKEDPQTFYYAVAVV
KKDSGFQMNQLRGKKSCHTGLGRSAGWNIPIGLLYCDLPEPRKPLEKAVANFFSGSCAPC
ADGTDFPQLCQLCPGCGCSTLNQYFGYSGAFKCLKDGAGDVAFVKHSTIFENLANKADRD
QYELLCLDNTRKPVDEYKDCHLAQVPSHTVVARSMGGKEDLIWELLNQAQEHFGKDKSKE
FQLFSSPHGKDLLFKDSAHGFLKVPPRMDAKMYLGYEYVTAIRNLREGTCPEAPTDECKP
VKWCALSHHERLKCDEWSVNSVGKIECVSAETTEDCIAKIMNGEADAMSLDGGFVYIAGK
CGLVPVLAENYNKSDNCEDTPEAGYFAIAVVKKSASDLTWDNLKGKKSCHTAVGRTAGWN
IPMGLLYNKINHCRFDEFFSEGCAPGSKKDSSLCKLCMGSGLNLCEPNNKEGYYGYTGAF
RCLVEKGDVAFVKHQTVPQNTGGKNPDPWAKNLNEKDYELLCLDGTRKPVEEYANCHLAR
APNHAVVTRKDKEACVHKILRQQQHLFGSNVTDCSGNFCLFRSETKDLLFRDDTVCLAKL
HDRNTYEKYLGEEYVKAVGNLRKCSTSSLLEACTFRRP

NB: These figures are made up examples – they don’t represent any real protein!
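The JTextPane tip above can be sketched roughly as follows. This is only one possible approach, not the assignment's reference solution: the class name PeptideHighlighter and its methods are hypothetical, and the code assumes that the Start and Stop positions are 1-based and inclusive, which you should verify against your own OMSSA files.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: turns a protein sequence plus peptide ranges into HTML for a JTextPane. */
final class PeptideHighlighter {

    /** Marks every residue covered by at least one peptide (ranges assumed 1-based, inclusive). */
    static boolean[] coverage(String sequence, List<int[]> ranges) {
        boolean[] covered = new boolean[sequence.length()];
        for (int[] range : ranges) {
            // Clamp to the sequence so a bad row cannot cause an index error.
            int from = Math.max(1, range[0]);
            int to = Math.min(sequence.length(), range[1]);
            for (int i = from; i <= to; i++) {
                covered[i - 1] = true;
            }
        }
        return covered;
    }

    /** Wraps each covered run of residues in a blue font tag; overlapping peptides merge into one run. */
    static String toHtml(String sequence, List<int[]> ranges) {
        boolean[] covered = coverage(sequence, ranges);
        StringBuilder html = new StringBuilder("<html><body><tt>");
        boolean open = false;
        for (int i = 0; i < sequence.length(); i++) {
            if (covered[i] && !open) {
                html.append("<font color=\"blue\">");
                open = true;
            } else if (!covered[i] && open) {
                html.append("</font>");
                open = false;
            }
            html.append(sequence.charAt(i));
        }
        if (open) {
            html.append("</font>");
        }
        return html.append("</tt></body></html>").toString();
    }

    public static void main(String[] args) {
        List<int[]> ranges = new ArrayList<>();
        ranges.add(new int[] {3, 5}); // made-up peptide covering residues 3..5
        System.out.println(toHtml("MRLAVGALLV", ranges));
    }
}
```

In the GUI, the returned string would be passed to the pane after jTextPane1.setContentType("text/html"), for example with jTextPane1.setText(html). Computing a coverage array first keeps overlapping peptides from producing nested or broken tags.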

Additional files

You are provided with two OMSSA output files (.csv) that you can use to develop your program, plus some extra files for testing:

• omssaResults.csv

• omssaResultsFiltered.csv

• Some extra (large) protein files for further testing, in case you implement additional functions

At a minimum, you will need to use the data from these columns:

• Start: The position in the protein where the peptide starts.

• Stop: The position in the protein where the peptide stops.

• Defline: The name of the protein to which the peptide has been mapped. Most importantly, the accession number is given between the first two bars (|).
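Reading these three columns could look something like the sketch below. It is an illustration under assumptions rather than a description of the actual OMSSA file layout: columns are located by the header names Start, Stop and Defline, fields containing commas are assumed to be double-quoted, and the class OmssaRowParser, the nested Hit type and the sample row are all made up.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for one row of an OMSSA .csv file. */
final class OmssaRowParser {

    /** One identified peptide: protein accession plus Start/Stop exactly as given in the CSV. */
    static final class Hit {
        final String accession;
        final int start;
        final int stop;

        Hit(String accession, int start, int stop) {
            this.accession = accession;
            this.start = start;
            this.stop = stop;
        }
    }

    /** Splits one CSV line; commas inside double quotes do not split (assumed quoting style). */
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString().trim());
        return fields;
    }

    /** Returns the text between the first two bars, or the trimmed defline if there are fewer than two bars. */
    static String accessionOf(String defline) {
        int first = defline.indexOf('|');
        int second = defline.indexOf('|', first + 1);
        if (first < 0 || second < 0) {
            return defline.trim();
        }
        return defline.substring(first + 1, second);
    }

    /** Finds a column by name (case-insensitive) in the header row. */
    static int columnIndex(List<String> header, String name) {
        for (int i = 0; i < header.size(); i++) {
            if (header.get(i).equalsIgnoreCase(name)) {
                return i;
            }
        }
        throw new IllegalArgumentException("Missing column: " + name);
    }

    static Hit parseRow(List<String> header, String line) {
        List<String> fields = splitCsvLine(line);
        int start = Integer.parseInt(fields.get(columnIndex(header, "Start")));
        int stop = Integer.parseInt(fields.get(columnIndex(header, "Stop")));
        String defline = fields.get(columnIndex(header, "Defline"));
        return new Hit(accessionOf(defline), start, stop);
    }

    public static void main(String[] args) {
        // Made-up header and row, for illustration only.
        List<String> header = splitCsvLine("Peptide,Start,Stop,Defline");
        Hit hit = parseRow(header, "AVGALLVCAVLGLCLAVPDK,4,23,\"sp|P00000|DEMO made-up protein, example\"");
        System.out.println(hit.accession + " " + hit.start + "-" + hit.stop);
    }
}
```

Looking columns up by name rather than by fixed position means the same code should cope with both provided files even if their column order differs.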

There are lots of other refinements that could be made, such as:

• Allowing the user to choose from the proteins that exist in the file

• Annotating the output with data from the other columns, such as the p-value (a measure of how likely the match is to have occurred by chance)

• Reading a FASTA file with multiple proteins and highlighting the corresponding peptides (see next screenshot)

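For the multi-protein refinement, a FASTA reader might be sketched as below. The class FastaParser and its method names are hypothetical, and the sample entries are made up. It relies only on the usual FASTA convention that each entry begins with a ">" header line followed by one or more sequence lines.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical reader for FASTA files that may contain several proteins. */
final class FastaParser {

    /** Returns header (without the leading '>') mapped to its full sequence, in file order. */
    static Map<String, String> parse(Reader source) throws IOException {
        Map<String, String> proteins = new LinkedHashMap<>();
        String header = null;
        StringBuilder sequence = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines between entries
                }
                if (line.startsWith(">")) {
                    // A new header closes the previous entry, if there was one.
                    if (header != null) {
                        proteins.put(header, sequence.toString());
                    }
                    header = line.substring(1).trim();
                    sequence.setLength(0);
                } else if (header != null) {
                    sequence.append(line); // sequence lines are concatenated
                }
            }
        }
        if (header != null) {
            proteins.put(header, sequence.toString());
        }
        return proteins;
    }

    /** Convenience wrapper for in-memory text; rethrows I/O errors unchecked. */
    static Map<String, String> parseString(String text) {
        try {
            return parse(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up entries, for illustration only.
        String fasta = ">sp|P00000|DEMO_A made-up\nMRLAV\nGALLV\n\n>sp|P00001|DEMO_B made-up\nKASYLD\n";
        parseString(fasta).forEach((name, seq) -> System.out.println(name + " -> " + seq));
    }
}
```

With a real file, a java.io.FileReader (or Files.newBufferedReader) for the path returned by the file chooser could be passed to parse. The accession between the first two bars of each header can then be matched against the accession taken from the Defline column to decide which peptides belong to which protein.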

To get Java assignment help at an affordable price, contact us at contact@codersarts.com.
