euGenes: Homologous Genes

euGenes: Homologous Genes Summary Table

Genes with one or more similar genes or homologs in other organisms.
Number and percent with homologs, of total protein sequences available per organism.
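Rows are source organisms; columns are the organisms in which homologs were found. Each cell gives the count, then the percent of the source organism's genes.

| Source | Fruitfly | Human | Mouse | Mosquito | Weed | Worm | Yeast | Zebrafish | Rat | Rice | E. coli | Chimp | Any | N genes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Fruitfly | -- | 5959 (45%) | 5620 (42%) | 8088 (61%) | 2617 (19%) | 4063 (31%) | 1848 (14%) | 1082 (8%) | 0 (0%) | 2445 (18%) | 511 (3%) | 336 (2%) | 8330 (63%) | 13092 |
| Human | 9451 (42%) | -- | 17418 (78%) | 9766 (44%) | 4444 (20%) | 7364 (33%) | 3417 (15%) | 3741 (16%) | 0 (0%) | 4135 (18%) | 705 (3%) | 2088 (9%) | 19136 (86%) | 22058 |
| Mouse | 7158 (45%) | 14076 (89%) | -- | 7334 (46%) | 3445 (21%) | 5565 (35%) | 2427 (15%) | 2642 (16%) | 0 (0%) | 3204 (20%) | 556 (3%) | 1201 (7%) | 14806 (94%) | 15706 |
| Mosquito | 9578 (61%) | 7167 (45%) | 6680 (42%) | -- | 3216 (20%) | 4946 (31%) | 2257 (14%) | 1337 (8%) | 0 (0%) | 2971 (18%) | 1024 (6%) | 420 (2%) | 10415 (66%) | 15686 |
| Weed | 4763 (17%) | 5620 (21%) | 5447 (20%) | 5128 (19%) | -- | 3886 (14%) | 3569 (13%) | 967 (3%) | 0 (0%) | 17894 (67%) | 1383 (5%) | 428 (1%) | 18013 (67%) | 26595 |
| Worm | 4457 (24%) | 4948 (26%) | 4642 (25%) | 4751 (25%) | 2461 (13%) | -- | 1800 (9%) | 961 (5%) | 0 (0%) | 2255 (12%) | 498 (2%) | 302 (1%) | 5692 (30%) | 18467 |
| Yeast | 1589 (25%) | 1766 (28%) | 1673 (26%) | 1672 (26%) | 1757 (28%) | 1371 (22%) | -- | 272 (4%) | 0 (0%) | 1627 (26%) | 496 (8%) | 123 (1%) | 2194 (35%) | 6200 |
| Zebrafish | 685 (53%) | 1055 (83%) | 1032 (81%) | 667 (52%) | 206 (16%) | 487 (38%) | 137 (10%) | -- | 0 (0%) | 194 (15%) | 53 (4%) | 160 (12%) | 1064 (83%) | 1270 |
| Rat | 12244 (44%) | 25251 (90%) | 23897 (85%) | 12655 (45%) | 5538 (19%) | 9625 (34%) | 4377 (15%) | 5468 (19%) | -- | 5184 (18%) | 896 (3%) | 3267 (11%) | 26986 (97%) | 27811 |
| Rice | 6719 (12%) | 5779 (11%) | 5119 (9%) | 7553 (14%) | 25239 (48%) | 5175 (9%) | 5160 (9%) | 845 (1%) | 0 (0%) | -- | 1288 (2%) | 337 (0%) | 25306 (48%) | 52250 |
| E. coli | 348 (8%) | 406 (9%) | 380 (9%) | 854 (20%) | 566 (13%) | 317 (7%) | 370 (8%) | 60 (1%) | 0 (0%) | 578 (13%) | -- | 18 (0%) | 1159 (27%) | 4174 |
| Chimp | 119 (14%) | 770 (95%) | 649 (80%) | 125 (15%) | 45 (5%) | 98 (12%) | 53 (6%) | 261 (32%) | 0 (0%) | 42 (5%) | 8 (0%) | -- | 771 (95%) | 810 |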

Genes with one or more similar genes, or paralogs, in same organism.
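| | Fruitfly | Human | Mouse | Mosquito | Weed | Worm | Yeast | Zebrafish | Rat | Rice | E. coli | Chimp |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Paralogs | 4901 (37%) | 13772 (62%) | 8502 (54%) | 7932 (50%) | 19720 (74%) | 9996 (54%) | 2030 (32%) | 814 (64%) | 0 (0%) | 35852 (68%) | 1194 (28%) | 668 (82%) |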

Source (rows): organism whose proteins are the BLAST queries
Homologs (columns): organism whose sequences in the BLAST database match with an expectation value (E-value) <= 1e-30
Any column: source organism proteins that have one or more homologs in any other organism
N genes column: number of protein sequences available for the source organism

Methods:
This is a computed gene homology, or similarity, based on reference protein sequences identified by the source databases. All of these sequences are compared with BLAST, each sequence against all others, using the specific parameters given below.

The sequences used in these calculations are found in the "Reference proteins" data files in each organism's folder. This summary is determined from the "Homologous genes table" data files also in each folder.

The counts in this table are the number of available genes for an organism and the number of those genes that have one or more significant homologs in each of the other organisms. Each percentage is the count of genes with any homolog (one or several) in the other organism, divided by the total available genes in the source organism, times 100.
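The percentage calculation described above can be sketched in a few lines of Python. The function name is hypothetical, and the use of truncation rather than rounding is an assumption inferred from the table's own figures (for example, 5959 of 13092 fruitfly genes is about 45.5%, shown as 45%).

```python
def percent_with_homologs(n_with_homolog: int, n_genes: int) -> int:
    """Percent of a source organism's genes that have at least one homolog.

    Assumption: the result is truncated to a whole number, which matches
    the values shown in the summary table.
    """
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return n_with_homolog * 100 // n_genes


# Fruitfly genes with a Human homolog, out of all Fruitfly genes.
fly_vs_human = percent_with_homologs(5959, 13092)
# Fruitfly genes with a homolog in any organism.
fly_vs_any = percent_with_homologs(8330, 13092)
print(fly_vs_human, fly_vs_any)
```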

The homology summary table is best read this way. To answer the question "what percent of all fruitfly genes have any homologs in C. elegans?", look at the Fruitfly row and find the Worm column. For the reverse question, the Worm row and Fruitfly column give the percent of all worm genes that have any fruitfly homologs.

Note the asymmetry in these counts and percents. The fruitfly genome may have 5600 of its 13000 genes (43%) showing some homology to human genes, whereas the human genome may have 8700 of its 47000 genes (18%) showing some homology to fruitfly genes. Both counts include the same gene pairs.
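The row/column reading rule and the asymmetry can be checked against the table's own Fruitfly, Worm and Human figures. This is only an illustrative sketch: the dictionary layout and the `lookup` helper are hypothetical, and the percentages are truncated on the assumption that the table does the same.

```python
# Counts keyed by (source row, homolog column), copied from the summary table.
homolog_counts = {
    ("Fruitfly", "Worm"): 4063,
    ("Worm", "Fruitfly"): 4457,
    ("Fruitfly", "Human"): 5959,
    ("Human", "Fruitfly"): 9451,
}

# "N genes" column: total protein sequences available per source organism.
n_genes = {"Fruitfly": 13092, "Worm": 18467, "Human": 22058}


def lookup(source: str, target: str) -> tuple[int, int]:
    """Return (count, percent) of source genes with a homolog in target."""
    count = homolog_counts[(source, target)]
    return count, count * 100 // n_genes[source]


# The two directions differ because the denominator is the source organism's
# gene total and one gene may match several genes in the other organism.
fly_to_worm = lookup("Fruitfly", "Worm")
worm_to_fly = lookup("Worm", "Fruitfly")
print(fly_to_worm, worm_to_fly)
```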

Please note that the summary you find here is based on a specific set of data and homology/similarity (BLAST) calculation parameters. Someone else may publish a different set of values, which would be equally valid, using a different set of data and options for determining "homology". The euGenes summary tries to stay up-to-date by using reference sequences identified by genome databases, and by matching current gene information from these databases with their current sequences.

These values are all moving targets; more data from genome sequencing projects will change them. In particular, in the euGenes summary the fruitfly, worm (C. elegans) and yeast data are based on full genome sequences, while the human, mouse, weed, and fish data are based only on partial sequencing, which may skew values such as percent homology higher than these organisms in fact have.


Homology computations:
blastall -v 10 -b 10 -m 9 -a 4 -p blastp -e 1e-30 -d org1/refprot.fasta -i org2/refprot.fasta
Homologous genes for fly computed 31-October-2003 with BLASTP 2.2.6
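As a rough sketch of how counts like those in the table could be derived from the tabular output that `-m 9` produces, the Python below collects the distinct query sequences that have at least one hit at or below the E-value cutoff. It is not the euGenes pipeline itself: the function name and the sample lines (with made-up sequence IDs) are hypothetical. It assumes tab-separated fields with the query ID first and the E-value in the eleventh field, and comment lines beginning with `#`.

```python
from typing import Iterable, Set


def queries_with_hits(lines: Iterable[str], max_evalue: float = 1e-30) -> Set[str]:
    """Collect query IDs that have at least one hit with E-value <= max_evalue."""
    hits: Set[str] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete 12-field hit line
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            hits.add(query_id)
    return hits


# Made-up sample output; geneA has two qualifying hits, geneB has none.
sample = [
    "# BLASTP 2.2.6",
    "# Fields: Query id, Subject id, % identity, ...",
    "geneA\tgeneX\t85.00\t200\t30\t0\t1\t200\t1\t200\t1e-100\t350",
    "geneA\tgeneY\t60.00\t180\t72\t0\t1\t180\t5\t184\t2e-40\t160",
    "geneB\tgeneZ\t40.00\t100\t60\t0\t1\t100\t1\t100\t1e-10\t60",
]
homolog_queries = queries_with_hits(sample)
print(len(homolog_queries))
```

Counting distinct query IDs, rather than hit lines, is what makes a gene with several homologs count once, in line with the "one or several" wording in the Methods text.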


Send comments to us at eugenes@iubio.bio.indiana.edu
euGenes uses Argos: A Replicable Genome infOrmation System