
euGenes: Homologous Genes Summary Table August-2005

Genes with one or more similar genes or homologs in other organisms.
Each cell gives the number and percent of the source organism's available protein sequences that have homologs.
Maximum e-value: 1e-30
NOTE: Read across table rows, not down columns (see below).
Homologs >                Human      Chimp        Dog      Mouse        Rat    Chicken  Pufferfish  Zebrafish     Fruitfly   Mosquito        Bee         Worm          Worm    Mustard       Rice  Soil amoeba       Yeast   Bacteria  |       Any  N genes
Source                                                                                      (Fugu)             (D.melano.)                        (C.elegans)  (C.briggsae)       weed                (Dicty.)  (S.cerev.)   (E.coli)  |
Human                        --  29529 87%  27761 81%  26830 79%  23860 70%  22947 67%   22166 65%  21368 63%    13276 39%  13123 38%  11446 33%    10538 31%     10418 30%   5832 17%   5210 15%     5662 16%    3919 11%    838  2%  | 31101 91%    33860
Chimp                 18898 87%         --  16865 78%  16194 75%  14283 66%  13792 64%   13137 61%  12579 58%     7673 35%   7636 35%   6532 30%     6009 27%      5956 27%   3430 15%   3050 14%     3343 15%    2358 10%    499  2%  | 19179 89%    21506
Dog                   28144 92%  27139 89%         --  26740 88%  23816 78%  23302 76%   22771 75%  22053 72%    13992 46%  13947 46%  12092 39%    11259 37%     11157 36%   6249 20%   5492 18%     6013 19%    4177 13%    952  3%  | 29034 95%    30308
Mouse                 16384 86%  15674 82%  16066 84%         --  14067 74%  13069 68%   12662 66%  12231 64%     7887 41%   7789 41%   6759 35%     6331 33%      6268 33%   3630 19%   3196 16%     3558 18%    2432 12%    557  2%  | 17167 90%    18941
Rat                   12409 94%  11907 90%  12184 92%  12420 94%         --  10631 80%   10412 79%  10037 76%     6592 50%   6521 49%   5660 43%     5378 40%      5298 40%   3062 23%   2741 20%     2974 22%    2097 15%    484  3%  | 12844 97%    13142
Chicken               22452 79%  21676 76%  22190 78%  21528 75%  19697 69%         --   20547 72%  19946 70%    12368 43%  12307 43%  10725 37%     9997 35%      9943 34%   5504 19%   4866 17%     5448 19%    3721 13%    889  3%  | 23740 83%    28416
Pufferfish (Fugu)     25223 76%  24411 73%  24984 75%  24492 74%  22639 68%  23945 72%          --  25854 78%    14835 44%  14844 44%  13024 39%    12108 36%     11921 36%   6277 19%   5603 16%     6225 18%    4189 12%    991  3%  | 29356 88%    33003
Zebrafish             24213 75%  23096 72%  23729 74%  23224 72%  21507 67%  22744 70%   25550 79%         --    13041 40%  13273 41%  11558 36%    10588 33%     10329 32%   5547 17%   5109 15%     5572 17%    3695 11%    860  2%  | 28224 88%    32062
Fruitfly (D.melano.)   6057 44%   5636 41%   5997 44%   5900 43%   5387 39%   5539 41%    5882 43%   5421 40%           --   8161 60%   5980 44%     4632 34%      4581 34%   2742 20%   2421 17%     2708 20%    1925 14%    485  3%  | 11259 83%    13472
Mosquito               6856 43%   6327 40%   6759 42%   6662 42%   6099 38%   6288 39%    6680 42%   6212 39%     9437 59%         --   6854 43%     5198 32%      5107 32%   3089 19%   2735 17%     2994 18%    2169 13%    778  4%  | 10282 65%    15802
Bee                    8795 51%   8281 48%   8627 50%   8517 50%   7810 46%   8186 48%    8657 51%   8166 48%    10209 60%  10206 60%         --     6525 38%      6411 37%   3292 19%   2957 17%     3286 19%    2315 13%    551  3%  | 11563 68%    16948
Worm (C.elegans)       5019 25%   4568 23%   4922 24%   4891 24%   4490 22%   4612 23%    4824 24%   4519 22%     4685 23%   4709 23%   3817 19%           --     14883 75%   2567 12%   2297 11%     2495 12%    1829  9%    480  2%  | 15003 75%    19764
Worm (C.briggsae)      4949 25%   4504 23%   4860 24%   4811 24%   4419 22%   4588 23%    4771 24%   4424 22%     4615 23%   4597 23%   3757 19%    14329 73%            --   2494 12%   2226 11%     2430 12%    1800  9%    469  2%  | 14383 73%    19528
Mustard weed           5527 19%   5040 17%   5410 18%   5460 18%   4881 16%   5024 17%    5404 18%   4904 16%     4753 16%   4831 16%   3812 13%     4491 15%      4399 15%         --  17730 61%     5086 17%    3612 12%   1326  4%  | 18356 63%    28860
Rice                   4649  9%   4208  8%   4532  9%   4584  9%   4151  8%   4174  8%    4634  9%   4438  9%     3937  8%   4079  8%   3148  6%     3708  7%      3637  7%  16808 34%         --     4180  8%    3051  6%   1254  2%  | 17453 36%    48467
Soil amoeba (Dicty.)   3044 22%   2722 19%   2996 21%   2948 21%   2622 19%   2745 20%    2924 21%   2745 20%     2614 19%   2609 19%   2119 15%     2408 17%      2362 17%   2758 20%   2475 18%           --    1900 13%    524  3%  |  3939 28%    13671
Yeast (S.cerev.)       1785 30%   1617 27%   1763 30%   1733 29%   1614 27%   1593 27%    1710 29%   1579 27%     1675 28%   1665 28%   1280 22%     1541 26%      1514 26%   1732 29%   1559 26%     1650 28%          --    473  8%  |  2833 49%     5777
Bacteria (E.coli)       412  9%    355  8%    394  9%    388  9%    349  8%    372  8%     390  9%    348  8%      365  8%    668 15%    278  6%      350  8%       351  8%    574 13%    543 12%      431 10%     369  8%         --  |  1062 25%     4242

Genes with one or more similar genes, or paralogs, in the same organism.
Organism              Paralogs  Percent
Human                    23366      69%
Chimp                     9941      46%
Dog                      21728      71%
Mouse                    10735      56%
Rat                       7205      54%
Chicken                  18226      64%
Pufferfish (Fugu)        22907      69%
Zebrafish                24431      76%
Fruitfly (D.melano.)      4993      37%
Mosquito                  7037      44%
Bee                       8411      49%
Worm (C.elegans)          9240      46%
Worm (C.briggsae)         8988      46%
Mustard weed             21461      74%
Rice                     25802      53%
Soil amoeba (Dicty.)      5276      38%
Yeast (S.cerev.)          1930      33%
Bacteria (E.coli)         1155      27%

Source (rows): organism whose proteins are used as the BLAST query
Homologs (columns): organism whose sequences in the BLAST database match the query with an e-value <= 1e-30
Any column: source organism proteins that have one or more homologs in any other organism
N genes column: number of protein sequences available for the source organism

Methods:
This table reports computed gene homology (sequence similarity) based on reference protein sequences identified by the source databases. All of these sequences are compared with BLAST, each sequence against all others, using the parameters given below.

The sequences used in these calculations are found in the "Reference proteins" data files in each organism's folder. This summary is determined from the "Homologous genes table" data files also in each folder.

The counts in this table are the number of an organism's available genes that have one or more significant homologs in each of the other organisms. Percentages are the count of genes with any homolog (one or several) in the other organism, divided by the total number of available genes in the source organism (x 100).
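The percentage calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name is made up here, and the use of truncation rather than rounding is an assumption. The page does not state which is used, but the printed values checked against it are consistent with truncation (for example, 31101 of 33860 is 91.85% and appears as 91%).

```python
def percent_with_homologs(n_with_homologs: int, n_genes: int) -> int:
    """Whole-number percent (truncated) of genes that have at least one homolog."""
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    # Integer arithmetic truncates toward zero; this is an assumption about
    # how the table was produced, not something the page states.
    return n_with_homologs * 100 // n_genes


# Human row of the table: 33860 protein sequences in total.
print(percent_with_homologs(29529, 33860))  # Chimp column -> 87
print(percent_with_homologs(31101, 33860))  # Any column   -> 91
```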

The homology summary table is best read as follows. To find what percent of all fruitfly genes have any homologs in C.elegans, look at the Fruitfly row and find the Worm (C.elegans) column. For the reverse question, the Worm (C.elegans) row and Fruitfly column give the percent of all worm genes that have any fruitfly homologs.
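The same row-then-column lookup, sketched in Python with a two-cell excerpt of the table. The dictionary layout and the function name are illustrative assumptions; only the four numbers come from the table itself.

```python
# Excerpt of the summary table: table[source_row][homolog_column] = (count, percent).
table = {
    "Fruitfly (D.melano.)": {"Worm (C.elegans)": (4632, 34)},
    "Worm (C.elegans)": {"Fruitfly (D.melano.)": (4685, 23)},
}


def homolog_percent(source: str, target: str) -> int:
    """Percent of `source` genes with any homolog in `target` (row first, then column)."""
    return table[source][target][1]


# Percent of fruitfly genes with a C.elegans homolog, then the reverse question.
print(homolog_percent("Fruitfly (D.melano.)", "Worm (C.elegans)"))  # 34
print(homolog_percent("Worm (C.elegans)", "Fruitfly (D.melano.)"))  # 23
```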

Note the asymmetry in these counts and percents. The fruitfly genome may have 5600 of its 13000 genes (43%) showing some homology to human genes, whereas the human genome may have 8700 of its 47000 genes (18%) showing some homology to fruitfly genes. Both counts are drawn from the same gene pairs.
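One way to see where the asymmetry comes from: each direction counts distinct source genes, so a single fruitfly gene that matches three human genes adds one to the fruitfly count but three to the human count. The sketch below uses made-up gene identifiers and a hypothetical tuple layout for hits; it is not the euGenes code.

```python
from collections import defaultdict


def genes_with_homologs(hits):
    """Count distinct source genes per (source organism, target organism) pair.

    `hits` is an iterable of (source_org, source_gene, target_org, target_gene).
    """
    sources = defaultdict(set)
    for src_org, src_gene, tgt_org, _tgt_gene in hits:
        if src_org != tgt_org:  # same-organism hits are paralogs, not counted here
            sources[(src_org, tgt_org)].add(src_gene)
    return {pair: len(genes) for pair, genes in sources.items()}


# Toy data: one fly gene matches three human genes, and each matches back.
toy_hits = [
    ("fly", "f1", "human", "h1"),
    ("fly", "f1", "human", "h2"),
    ("fly", "f1", "human", "h3"),
    ("human", "h1", "fly", "f1"),
    ("human", "h2", "fly", "f1"),
    ("human", "h3", "fly", "f1"),
]
counts = genes_with_homologs(toy_hits)
print(counts[("fly", "human")])  # 1 fly gene has human homologs
print(counts[("human", "fly")])  # 3 human genes have fly homologs
```

In the August-2005 table the same effect shows up as 6057 fruitfly genes (44%) with human homologs against 13276 human genes (39%) with fruitfly homologs.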

Please note that this summary is based on a specific set of data and a specific set of homology/similarity (BLAST) calculation parameters. Someone else may publish a different set of values, using different data and different options for determining "homology", and those values would be equally valid. The euGenes summary tries to stay up to date by using reference sequences identified by genome databases, and by matching current gene information from these databases with their current sequences.

These values are all moving targets; new data from genome sequencing projects will change them. In particular, in the euGenes summary the fruitfly, worm (C.elegans) and yeast data are based on full genome sequences, while the human, mouse, weed, and fish data are based only on partial sequencing, which may skew values such as percent homology higher than is actually the case for these organisms.


Homology computations:
BLAST: BLASTP 2.2.10 [Oct-19-2004]
CMDLINE: blastall -v 10 -b 10 -m 9 -p blastp -e 1e-3
DATE: August-2005
Database: /gpfs/ux455375/prots/protdb/sanPF
minbits: 0
mineval: 1e-30
minident: 0
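
The listing above suggests a two-step procedure: blastall is run with a loose cutoff (-e 1e-3) and tabular output (-m 9), and the minbits, mineval and minident values are then applied to keep only strong hits. A minimal sketch of such a post-filter follows. It assumes the standard 12-column tabular layout written by -m 9 (query id, subject id, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score). The function name, the sample lines, and the reading of minbits and minident as lower bounds and mineval as an upper bound are assumptions; the actual euGenes filtering script is not shown on this page.

```python
from typing import Iterable, Iterator, Tuple

MIN_BITS = 0.0       # "minbits" from the listing (assumed to be a lower bound)
MAX_EVALUE = 1e-30   # "mineval" from the listing (assumed to be an upper bound)
MIN_IDENT = 0.0      # "minident" from the listing (assumed to be a lower bound)


def significant_hits(lines: Iterable[str]) -> Iterator[Tuple[str, str, float]]:
    """Yield (query, subject, e-value) for tabular BLAST hits that pass the cutoffs."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # -m 9 output includes '#' comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete hit line
        query, subject = fields[0], fields[1]
        ident = float(fields[2])
        evalue = float(fields[10])
        bits = float(fields[11])
        if evalue <= MAX_EVALUE and bits >= MIN_BITS and ident >= MIN_IDENT:
            yield query, subject, evalue


# Made-up sample lines in the assumed layout; the identifiers are hypothetical.
sample = [
    "# BLASTP 2.2.10 [Oct-19-2004]",
    "flyA\thumX\t62.50\t320\t118\t2\t1\t318\t5\t324\t1e-110\t 398",
    "flyA\thumY\t31.00\t200\t130\t4\t10\t205\t8\t203\t2e-12\t70.5",
]
kept = list(significant_hits(sample))
print(kept)  # only the flyA/humX hit passes the 1e-30 cutoff
```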


Send comments to us at eugenes@iubio.bio.indiana.edu
euGenes uses Argos: A Replicable Genome infOrmation System