euGenes: Homologous Genes

euGenes: Homologous Genes Summary Table

Genes with one or more similar genes or homologs in other organisms.
Number and percent with homologs, of total protein sequences available per organism.
Rows: Source organism. Columns: organism in which homologs were found.
Each cell gives the number of source proteins with homologs, and the percent of N genes.

Source               Fruitfly     Human        Mouse        Worm         Yeast        Weed         Zebrafish    | Any          N genes
Fruitfly             307 (2%)     4684 (36%)   2969 (23%)   4096 (32%)   1733 (13%)   1167 (9%)    457 (3%)     | 5698 (44%)   12749
Human                5079 (49%)   459 (4%)     6007 (59%)   4153 (40%)   1756 (17%)   1208 (11%)   1143 (11%)   | 7812 (76%)   10174
Mouse                2557 (51%)   4521 (90%)   183 (3%)     2103 (42%)   808 (16%)    581 (11%)    863 (17%)    | 4624 (92%)   4989
Worm                 4477 (23%)   4083 (21%)   2721 (14%)   1672 (8%)    1722 (9%)    1206 (6%)    430 (2%)     | 6249 (32%)   19073
Yeast                1530 (24%)   1469 (23%)   896 (14%)    1397 (22%)   398 (6%)     813 (13%)    93 (1%)      | 2100 (33%)   6190
Weed                 960 (15%)    961 (15%)    622 (10%)    880 (14%)    746 (12%)    169 (2%)     63 (1%)      | 1357 (22%)   6028
Zebrafish            391 (56%)    590 (85%)    563 (81%)    283 (40%)    51 (7%)      42 (6%)      94 (13%)     | 619 (89%)    694
E. coli (bacteria)   345 (8%)     339 (7%)     190 (4%)     340 (7%)     367 (8%)     256 (5%)     21 (0%)      | 593 (13%)    4288
Gene homologies of E. coli (bacteria) are provided for comparison. They are not part of the euGenes data set.

Source (rows): organism whose proteins are used as BLAST queries
Homologs (columns): sequences in the BLAST database that match with an expectation value <= 1e-30
Any column: source organism proteins that have one or more homologs in any organism
N genes column: number of protein sequences available for the organism
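
The row and column definitions above can be sketched as a small counting routine. This is only an illustration, not the euGenes pipeline: the Hit record, the organism_of mapping, and the sequence identifiers are hypothetical, and dropping self-matches is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set

E_CUTOFF = 1e-30  # expectation cutoff stated in the legend


class Hit(NamedTuple):
    """One BLAST match (hypothetical record layout)."""
    query: str     # source protein ID
    subject: str   # matched protein ID
    evalue: float  # BLAST expectation value


def count_homologs(hits: Iterable[Hit],
                   organism_of: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Return counts[source_org][target_org] = number of source proteins
    with at least one match in target_org. The "Any" key counts source
    proteins with a match in any organism."""
    seen: Dict[str, Dict[str, Set[str]]] = defaultdict(lambda: defaultdict(set))
    for hit in hits:
        if hit.evalue > E_CUTOFF or hit.query == hit.subject:
            continue  # too weak, or a self-match (assumed to be excluded)
        source = organism_of[hit.query]
        target = organism_of[hit.subject]
        seen[source][target].add(hit.query)
        seen[source]["Any"].add(hit.query)
    return {src: {tgt: len(ids) for tgt, ids in cols.items()}
            for src, cols in seen.items()}


# Toy data: IDs and organisms are made up for illustration.
organism_of = {"fly1": "Fruitfly", "fly2": "Fruitfly",
               "worm1": "Worm", "yeast1": "Yeast"}
hits = [
    Hit("fly1", "worm1", 1e-50),
    Hit("fly1", "yeast1", 1e-40),
    Hit("fly2", "worm1", 1e-35),
    Hit("fly2", "yeast1", 1e-10),  # above the cutoff, ignored
    Hit("worm1", "fly1", 1e-50),
]
counts = count_homologs(hits, organism_of)
print(counts["Fruitfly"])
```

Each source protein is counted once per column no matter how many matches it has there, which is why the sketch stores sets of query IDs rather than incrementing a counter per hit.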

Methods:
These are computed gene homologies (similarities), based on reference protein sequences identified by the source databases. All of these sequences are compared with BLAST, each sequence against all others, using the specific parameters given below.

The sequences used in these calculations are found in the "Reference proteins" data files in each organism's folder. This summary is determined from the "Homologous genes table" data files also in each folder.

The homology summary table is best read by row, then column. To find what percent of all fruitfly genes have homologs in C. elegans, look at the Fruitfly row and find the Worm column. For the reverse, the Worm row and Fruitfly column give the percent of all worm genes that have fruitfly homologs.
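
As a worked example of this reading, the two percentages can be recomputed from the counts and the N genes values in the summary table. The helper name is hypothetical, and truncating to a whole percent is an assumption inferred from the published values (for example, 4684 of 12749 is shown as 36%), not something the page states.

```python
def percent_with_homologs(n_with_homologs: int, n_genes: int) -> int:
    """Whole-number percent, truncated (assumed to match the table's rounding)."""
    return 100 * n_with_homologs // n_genes


# Values copied from the summary table.
fly_genes, worm_genes = 12749, 19073
fly_with_worm_homologs = 4096   # Fruitfly row, Worm column
worm_with_fly_homologs = 4477   # Worm row, Fruitfly column

fly_pct = percent_with_homologs(fly_with_worm_homologs, fly_genes)
worm_pct = percent_with_homologs(worm_with_fly_homologs, worm_genes)
print(f"Fruitfly proteins with worm homologs: {fly_pct}%")
print(f"Worm proteins with fruitfly homologs: {worm_pct}%")
```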

Please note that the summary you find here is based on a specific set of data and homology/similarity (BLAST) calculation parameters. Someone else may publish a different set of values, which would be equally valid, using a different set of data and options for determining "homology". The euGenes summary tries to stay up-to-date by using reference sequences identified by genome databases, and by matching current gene information from these databases with their current sequences.

These values are all moving targets: additional data from genome sequencing projects will change them. In particular, the fruitfly, worm (C. elegans), and yeast data in the euGenes summary are based on full genome sequences, while the human, mouse, weed, and zebrafish data are based only on partial sequencing, which may skew values such as percent homology higher than is actually the case for these organisms.

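The counts are not symmetric (for example, 4096 fruitfly proteins have worm homologs, while 4477 worm proteins have fruitfly homologs). This asymmetry is due to the BLAST query filter.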
Homology computations:
# BLAST command: blastall -b 30 -a 4 -e 1e-30 -p blastp -d meowprot.fa -i meowprot.fa
# No. sequences=61,635; No. letters=28,932,922; Expectation cutoff=1.0e-30
# Homologous genes for ecoli computed 28-November-2000 with BLASTP 2.1.2
# Homologous genes for fish computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for fly computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for man computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for mouse computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for weed computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for worm computed 25-November-2000 with BLASTP 2.1.2
# Homologous genes for yeast computed 25-November-2000 with BLASTP 2.1.2
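
For readers who want to script the command listed above, the following is a minimal sketch that only assembles the argument list and does not run BLAST. The function name and its parameter names are hypothetical. The flag descriptions in the comments follow the legacy NCBI blastall usage text as best understood here, and should be checked against the installed version.

```python
import shlex
from typing import List


def blastall_argv(protein_fasta: str, evalue: float = 1e-30,
                  max_alignments: int = 30, processors: int = 4) -> List[str]:
    """Build the all-against-all blastp command (nothing is executed)."""
    return [
        "blastall",
        "-b", str(max_alignments),  # database sequences to show alignments for
        "-a", str(processors),      # number of processors to use
        "-e", f"{evalue:.0e}",      # expectation value cutoff
        "-p", "blastp",             # protein query against protein database
        "-d", protein_fasta,        # database name
        "-i", protein_fasta,        # query file (same set: all against all)
    ]


argv = blastall_argv("meowprot.fa")
print(shlex.join(argv))
```

Passing the same file as both database and query is what makes the search all-against-all, as described under Methods.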


Send comments to us at eugenes@iubio.bio.indiana.edu
euGenes uses Argos: A Replicable Genome infOrmation System