Number of RhoGAP29 homologs in the human genome

  1. RhoGAPs are one of the many numerous protein families and domains in the human genome, amd by no means the biggest. A RhoGAP is give here. Put the sequence in SMART or PFAM and inspect its domain composition.
  2. How many homologs are in the human genome. We are going to first use blast. Go to ensembl blast. For "sequence data" select protein. For "search against" select protein. For "maximum number of hits to report" select at least 1000.
  3. Press view results. Scroll down to see the pattern of hits across the human chromosomes. Look at the tabular output. How many hits do you have? Are these all the different genes? (look at the second column).
  4. How now to get the total number of RhoGAP homologs in the human genome? Press on the "download what you see" option in the upper right corner of the table. This list contains duplicates, count the number of unqiue hits in a scripting language like R or python. Or open the file in excel, copy the column containing the names of the hits and paste them in the list 1 field the the venny webtool (see ). You should get your number in a venn diagram of a single set. How many RhoGAP homologs does the human genome seem to contain?
  5. Do the same in blast. In order to only search human restrict your search to human in the "Organism" field. What do you see? Explain the difference. For this you maybe need to read a little bit about the NR database.
  6. Now let's see if we can also get an estimate for the number of RhoGAPs using profiles. Find the Rhogap model in pfam (either via text searching or via sequence searching our query gene). Then go the curation and model entry. Download the RhoGAP model to your machine. Then go to Select as database human ensembl. Search with the RhoGAP model. Look at two hits with identical identifiers (e.g. ENSG00000180448.10), what is different about the alignments of the hits (open the alignment by clicking on the ">" symbol to fold open the alignment)? Download the results as a tab delimited file. Again filter for unqiue either via a scripting language, or via opening the file in excel and pasting the list at .