| Source organism / Homologs in | Fruitfly | Human | Mouse | Mosquito | Weed | Worm | Yeast | Zebrafish | Rat | Rice | E. coli | Any | N genes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fruitfly | -- | 5624 44% | 4302 33% | 7470 58% | 2427 19% | 4096 32% | 1772 13% | 826 6% | 2392 18% | 989 7% | 442 3% | 7888 61% | 12728 |
| Human | 8708 19% | -- | 15159 33% | 9400 20% | 4534 9% | 6750 14% | 2842 6% | 3421 7% | 7674 16% | 1593 3% | 608 1% | 17892 39% | 45767 |
| Mouse | 4424 48% | 8463 92% | -- | 4296 46% | 2072 22% | 3550 38% | 1428 15% | 1663 18% | 4266 46% | 876 9% | 344 3% | 8652 94% | 9186 |
| Mosquito | 7288 61% | 5137 43% | 3737 31% | -- | 2359 19% | 3822 32% | 1565 13% | 812 6% | 2217 18% | 992 8% | 413 3% | 7834 66% | 11868 |
| Weed | 3783 15% | 4468 18% | 3382 13% | 3690 15% | -- | 3543 14% | 2939 11% | 366 1% | 2070 8% | 7886 32% | 1076 4% | 10732 43% | 24521 |
| Worm | 4266 22% | 4918 26% | 3808 20% | 4428 23% | 2426 12% | -- | 1817 9% | 839 4% | 2241 11% | 996 5% | 456 2% | 5684 30% | 18893 |
| Yeast | 1508 25% | 1623 26% | 1251 20% | 1452 24% | 1601 26% | 1376 22% | -- | 179 2% | 832 13% | 697 11% | 440 7% | 2059 34% | 6031 |
| Zebrafish | 548 55% | 852 86% | 829 84% | 514 52% | 131 13% | 417 42% | 85 8% | -- | 625 63% | 68 6% | 46 4% | 876 89% | 982 |
| Rat | 3484 52% | 6213 93% | 5708 85% | 3415 51% | 1578 23% | 3031 45% | 1210 18% | 1810 27% | -- | 0 0% | 0 0% | 6413 96% | 6674 |
| Rice | 979 10% | 1092 11% | 784 8% | 1297 14% | 4577 49% | 1038 11% | 889 9% | 118 1% | 0 0% | -- | 0 0% | 4601 49% | 9226 |
| E. coli | 325 7% | 373 8% | 266 6% | 326 7% | 516 12% | 311 7% | 354 8% | 42 1% | 0 0% | 0 0% | -- | 667 15% | 4174 |

| | Fruitfly | Human | Mouse | Mosquito | Weed | Worm | Yeast | Zebrafish | Rat | Rice | E. coli |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Paralogs | 4708 36% | 18602 40% | 4912 53% | 5232 44% | 18138 73% | 10071 53% | 1939 32% | 697 70% | 5470 81% | 5868 63% | 1194 28% |
Methods:
Gene homology (similarity) is computed here from the reference protein sequences
identified by the source databases. Every sequence is compared against all others
with BLAST, using the specific parameters given below.
The sequences used in these calculations are found in the "Reference proteins" data files in each organism's folder. This summary is determined from the "Homologous genes table" data files also in each folder.
The table gives, for each organism, the number of available genes (N genes) and the number of those genes that have one or more significant homologs in each of the other organisms. Each percentage is the count of genes with any homolog (one or several) in the other organism, divided by the total number of available genes in the source organism, times 100.
The homology summary table is best read by row, then column. To find what percent of all fruitfly genes have any homologs in C. elegans, look at the Fruitfly row and find the Worm column. For the reverse, the Worm row and Fruitfly column give the percent of all worm genes that have any fruitfly homologs.
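The percentage calculation can be sketched in a few lines of Python. This is only an illustration, not the euGenes code: the function name `percent_with_homolog` is hypothetical, and truncating to a whole number (rather than rounding) is an assumption inferred from the table, where 4302 of 12728 fruitfly genes (33.8%) is shown as 33%.

```python
def percent_with_homolog(genes_with_homolog: int, total_genes: int) -> int:
    """Percent of an organism's genes that have at least one homolog.

    Truncates to a whole number; this is an assumption that happens to
    match the values shown in the summary table.
    """
    if total_genes <= 0:
        raise ValueError("total_genes must be positive")
    return genes_with_homolog * 100 // total_genes


# Fruitfly row, Human column: 5624 of 12728 fruitfly genes
print(percent_with_homolog(5624, 12728))  # 44

# Human row, Fruitfly column: 8708 of 45767 human genes
print(percent_with_homolog(8708, 45767))  # 19
```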
Note the asymmetry in these counts and percentages. In the table above, 5624 of 12728 fruitfly genes (44%) show some homology to human genes, whereas 8708 of 45767 human genes (19%) show some homology to fruitfly genes. Both counts are drawn from the same set of gene pairs.
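A small sketch may help show why one set of gene pairs yields different counts in each direction. The gene identifiers and the helper `genes_with_homologs` below are made up for illustration; the point is only that one gene can pair with several genes in the other organism, so the number of distinct genes on each side of the pairs need not match.

```python
# Hypothetical homolog pairs (fly_gene, human_gene); the IDs are invented.
pairs = [
    ("fly1", "hs1"),
    ("fly1", "hs2"),
    ("fly1", "hs3"),
    ("fly2", "hs3"),
]


def genes_with_homologs(pairs):
    """Count the distinct genes on each side of a list of homolog pairs."""
    left = {a for a, _ in pairs}
    right = {b for _, b in pairs}
    return len(left), len(right)


fly_count, human_count = genes_with_homologs(pairs)
print(fly_count, human_count)  # 2 3
```

Dividing each count by that organism's own gene total then widens the gap further, since the totals also differ.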
Please note that the summary you find here is based on a specific set of data and homology/similarity (BLAST) calculation parameters. Someone else may publish a different set of values, which would be equally valid, using a different set of data and options for determining "homology". The euGenes summary tries to stay up-to-date by using reference sequences identified by genome databases, and by matching current gene information from these databases with their current sequences.
These values are all moving targets; new data from genome sequencing projects will change them. In particular, in the euGenes summary the fruitfly, worm (C. elegans) and yeast data are based on full genome sequences, while the human, mouse, weed, and zebrafish data are based only on partial sequencing, which may skew values such as percent homology higher than the true values for these organisms.
Homology computations:
blastall -v 10 -b 10 -m 9 -a 4 -p blastp -e 1e-30 -d org1/refprot.fasta -i org2/refprot.fasta
Homologous genes for fly computed 27-June-2002 with BLASTP 2.2.2
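In the command above, `-m 9` asks blastall for tab-delimited output with `#` comment lines, and `-e 1e-30` sets the E-value cutoff, so every reported hit already counts as significant under these parameters. The following Python sketch shows one way such output could be reduced to the "genes with any homolog" count used in the summary table. It assumes the standard 12-column tabular layout (query ID first, subject ID second); the sequence IDs, the sample rows, and `total_genes` are hypothetical, and this is not necessarily how the euGenes "Homologous genes table" files are built.

```python
import io


def genes_with_hits(blast_tabular):
    """Return the set of query IDs that have at least one reported hit.

    `blast_tabular` is any iterable of lines in blastall -m 9 format:
    comment lines start with '#', data lines are tab-separated with the
    query ID in the first column.
    """
    hits = set()
    for line in blast_tabular:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and -m 9 comment lines
        fields = line.split("\t")
        hits.add(fields[0])
    return hits


# Hypothetical sample output; IDs and scores are invented for illustration.
sample = "\n".join([
    "# BLASTP 2.2.2",
    "# Query: geneA",
    "\t".join(["geneA", "geneX", "85.00", "200", "30", "0",
               "1", "200", "5", "204", "1e-100", "350"]),
    "\t".join(["geneA", "geneY", "70.00", "180", "54", "0",
               "1", "180", "3", "182", "1e-60", "220"]),
    "# Query: geneB",
    "\t".join(["geneB", "geneX", "65.00", "150", "52", "1",
               "10", "159", "8", "157", "1e-40", "160"]),
])

found = genes_with_hits(io.StringIO(sample))
print(sorted(found))  # ['geneA', 'geneB']

total_genes = 4  # hypothetical number of genes in the query organism
print(len(found) * 100 // total_genes)  # 50
```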