Source (row) / Homologs in (column) | Fruitfly | Human | Mouse | Worm | Yeast | Weed | Zebrafish | Any | N genes |
---|---|---|---|---|---|---|---|---|---|
Fruitfly | 307 2% | 4684 36% | 2969 23% | 4096 32% | 1733 13% | 1167 9% | 457 3% | 5698 44% | 12749 |
Human | 5079 49% | 459 4% | 6007 59% | 4153 40% | 1756 17% | 1208 11% | 1143 11% | 7812 76% | 10174 |
Mouse | 2557 51% | 4521 90% | 183 3% | 2103 42% | 808 16% | 581 11% | 863 17% | 4624 92% | 4989 |
Worm | 4477 23% | 4083 21% | 2721 14% | 1672 8% | 1722 9% | 1206 6% | 430 2% | 6249 32% | 19073 |
Yeast | 1530 24% | 1469 23% | 896 14% | 1397 22% | 398 6% | 813 13% | 93 1% | 2100 33% | 6190 |
Weed | 960 15% | 961 15% | 622 10% | 880 14% | 746 12% | 169 2% | 63 1% | 1357 22% | 6028 |
Zebrafish | 391 56% | 590 85% | 563 81% | 283 40% | 51 7% | 42 6% | 94 13% | 619 89% | 694 |
E. coli (bacteria) | 345 8% | 339 7% | 190 4% | 340 7% | 367 8% | 256 5% | 21 0% | 593 13% | 4288 |
Methods:
This table summarizes gene homology (sequence similarity) computed from reference protein sequences identified by the source databases. Every sequence is compared against all the others with BLAST, using the parameters given below.
The sequences used in these calculations are found in the "Reference proteins" data files in each organism's folder. This summary is determined from the "Homologous genes table" data files also in each folder.
Read the homology summary table from row to column. To find what percent of all fruitfly genes have homologs in the worm (C. elegans), look at the Fruitfly row and find the Worm column (32%). For the reverse, the Worm row and Fruitfly column give the percent of all worm genes that have fruitfly homologs (23%).
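The row-to-column lookup described above can be sketched in a few lines of Python. This is only an illustration: the dictionary and function names are hypothetical, and the counts are copied from two cells of the table. The percentages in the table appear to be truncated rather than rounded (for example, Fruitfly to Human is 4684 of 12749, about 36.7%, and is shown as 36%), so the sketch uses integer division on that assumption.

```python
# Counts copied from the summary table:
# (source row, homolog column) -> number of source genes with a homolog there.
HOMOLOG_COUNTS = {
    ("Fruitfly", "Worm"): 4096,
    ("Worm", "Fruitfly"): 4477,
    ("Fruitfly", "Human"): 4684,
}

# "N genes" column: total number of genes for each source organism.
GENE_TOTALS = {"Fruitfly": 12749, "Worm": 19073}


def percent_with_homolog(source, target):
    """Percent of `source` genes with a homolog in `target`, truncated
    to a whole number (assumed to match how the table was produced)."""
    return 100 * HOMOLOG_COUNTS[(source, target)] // GENE_TOTALS[source]


print(percent_with_homolog("Fruitfly", "Worm"))   # 32
print(percent_with_homolog("Worm", "Fruitfly"))   # 23
```

The denominator is always the gene total of the source (row) organism, which is why the two directions give different percentages even for the same pair of organisms.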
Please note that the summary you find here is based on a specific set of data and homology/similarity (BLAST) calculation parameters. Someone else may publish a different set of values, which would be equally valid, using a different set of data and options for determining "homology". The euGenes summary tries to stay up-to-date by using reference sequences identified by genome databases, and by matching current gene information from these databases with their current sequences.
These values are all moving targets; more data from genome sequencing projects will change them. In particular, the fruitfly, worm (C. elegans), and yeast data in the euGenes summary are based on full genome sequences, while the human, mouse, weed, and zebrafish data are based only on partial sequencing, which may make values such as percent homology appear higher than they really are for these organisms.
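The table is asymmetric (for example, the Fruitfly row, Worm column count differs from the Worm row, Fruitfly column count); this is due to the BLAST query filter.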
Homology computations:
# BLAST command: blastall -b 30 -a 4 -e 1e-30 -p blastp -d meowprot.fa -i meowprot.fa
# No. sequences=61,635; No. letters=28,932,922; Expectation cutoff=1.0e-30
# Homologous genes for ecoli computed 28-November-2000 with BLASTP 2.1.2
# Homologous genes for fish computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for fly computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for man computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for mouse computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for weed computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for worm computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for yeast computed 25-November-2000 with BLASTP 2.1.2
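As a reading aid for the BLAST command listed above, the following Python sketch rebuilds the same argument list with each flag annotated. The flag descriptions follow the legacy NCBI `blastall` usage text; the function name and its parameters are hypothetical and are not part of the euGenes pipeline.

```python
import shlex


def build_blastall_command(fasta_path, evalue="1e-30",
                           max_alignments=30, processors=4):
    """Assemble the all-against-all blastp command as an argument list."""
    return [
        "blastall",
        "-b", str(max_alignments),  # database sequences to show alignments for
        "-a", str(processors),      # number of processors to use
        "-e", evalue,               # expectation value cutoff
        "-p", "blastp",             # program: protein query vs. protein database
        "-d", fasta_path,           # database to search
        "-i", fasta_path,           # query file (same file: each vs. all others)
    ]


command = build_blastall_command("meowprot.fa")
print(shlex.join(command))
```

Passing the same file as both database and query is what makes this an all-against-all comparison. To actually run it, the list could be handed to `subprocess.run(command, check=True)`, assuming a legacy BLAST installation is available and the FASTA file has already been formatted as a protein database.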