Tuesday, July 19, 2011

Peak Fallacy: Evolutionists Citing Other Evolutionists or, How to Drink Your Own Bathwater

In my previous post I discussed a paper published in the leading journal Nature on protein evolution. In spite of the scientific evidence showing the evolution of proteins is unlikely, the paper finds that proteins are indeed excellent examples of evolution—past, present and future. For instance, the evolutionists write:

Figure 3 shows that, regardless of their similarity, ancient proteins are still diverging from each other and therefore have not yet reached the limit of their sequence divergence. … Our data reveal an ongoing expansion of the protein universe, such that most extant protein sequences are still diverging from each other and from the ancestral LUCA sequence, and have not yet reached the structural and functional limits in sequence space.

But in fact their results show no such thing. Indeed, that would be quite amazing given that science clearly shows protein evolution to be unlikely. It would be the reversal of a wealth of evidence. What they do show is the results of a circuitous analysis and comparison of many different proteins. It would require substantial scientific investigation to come to any firm conclusions about just what those comparisons portend.

What is obvious is that the evolutionists, in typical fashion, have presented a bizarre and heroic conclusion that is without scientific basis. There is nothing in the data about “ancient proteins” that are “still diverging” from an “ancestral LUCA.” Once again, the results are force fitted into the evolutionary narrative, over against scientific evidence and without warrant.

Evolutionists drink their own bathwater


In spite of its scientific problems, this paper is being cited by other evolutionists as an authoritative finding and confirmation of protein evolution. For example, one evolutionist, referring to this paper, wrote “The protein universe is currently still expanding …”

In another instance, evolutionists wrote that their research “is entirely in accord with a recent insightful analysis of protein evolution that invoked extensive epistasis to account for the retarded divergence seen in ancient proteins.”

Call it garbage-in, garbage-out, or blowback, or drinking their own bathwater, this is how evolutionary claims propagate. It begins with religious and metaphysical claims for evolution. From there science is enlisted to service the religion. Empirical evidence is twisted and force-fit as necessary, and then cited uncritically as though it is legitimate science.

The duty of a scientist

All of this is a serious breach of scientific duty. For scientists must practice their profession with integrity and, above all, interpret and explain the evidence accurately to the rest of society. Unfortunately evolutionists not only promote a religious theory, they also misrepresent the many scientific failures of the theory. In fact, incredibly, they maintain there are no scientific problems with the theory. All of the science, they insist, confirms their bizarre idea, with only the details left to be figured out.

Even this paper on protein evolution is staunchly defended as yet another scientific confirmation of evolution. One professor commented that the paper has a sound basis and that my review of the paper was due to my “profound misunderstanding of the article.” Regarding my point, that for two distant proteins a single amino acid substitution is not likely to change the distance between them, the professor wrote:

It is well known that a single random walker moves away from the point of origin, on average: the distance squared is proportional to the number of steps taken. The same is true for two random walkers: the square of the distance between them grows, on average, no matter what the initial distance was.

This is true for random walkers in Euclidean space, and is typically taught in introductory material. This is probably what the professor was thinking of. But of course protein sequence space is non-Euclidean. Amino acids are categorical and unordered. The professor continued:

Cornelius seems to think that closely related proteins should diverge fast and more distant relatives less so.

I think that because it is rather obvious, as I pointed out with some simple examples.

What Cornelius forgets is that we are dealing not with two proteins but with many. Suppose that the proteins have been mutating for a long time and have uniformly filled the configuration space. (For simplicity, think that there is one protein in every configuration.) … This factor, missing from Cornelius's analysis, makes the observed numbers of proteins mutating to and away from a reference equal.

Here the professor raises a meaningless distinction. First, while problems with one or two or three dimensions are common, protein sequence spaces are in the hundreds of dimensions. It would be impossible to have enough different species to have a protein in every configuration. For instance, for a 200 amino acid protein there are 20^200, or about 10^260, different possible sequences (a one with 260 zeros after it).
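The size of that space is easy to check. The following is a minimal sketch, assuming nothing more than 20 possible amino acids at each of 200 positions; the function name `sequence_space_size` is made up for illustration.

```python
import math

def sequence_space_size(length: int, alphabet: int = 20) -> int:
    """Number of distinct sequences of the given length over the alphabet."""
    return alphabet ** length

size = sequence_space_size(200)
# Order of magnitude: 200 * log10(20)
exponent = 200 * math.log10(20)
print(f"20^200 has {len(str(size))} digits, i.e. about 10^{exponent:.1f}")
```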

But beyond that, the professor thinks that the effect of a substitution on the distance between two protein sequences is independent of the distance. Specifically, he thinks the substitution has an equal chance of moving toward or away from a reference sequence. But he gets his math all wrong trying to arrive at this impossible conclusion. Let’s have a look at his example.

Consider the five residue reference sequence: NLKIG. There are a total of 5*19 or 95 sequences that have only one amino acid different from the reference. So these 95 sequences completely fill the sequence space at this given distance from our five residue reference sequence.

Now for each of these neighboring sequences, consider a single amino acid substitution. There are 19 such substitutions possible per residue. So the number of possible substitutions, in each of the 95 sequences, is also 5*19 or 95.

Now of these 95 substitutions, exactly one of them changes the neighboring sequence toward the reference sequence. 18 of them change the neighboring sequence into a different neighboring sequence. And 4*19 or 76 move the neighboring sequence away from the reference sequence.

You can repeat these computations for the other 94 neighboring sequences, but of course the answers are the same. It makes no difference whether you look at one neighboring sequence or all 95 of them. The probability of a substitution moving toward the reference sequence is 1/95, whereas the probability of a substitution moving away from the reference sequence is 76/95.
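The counts above can be reproduced by brute force. This is a sketch under the same assumptions as the example (Hamming distance to the reference, every one of the 19 alternative amino acids equally likely at every position); the neighbor `NLKIY` and the helper names are chosen here for illustration only.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def classify_substitutions(seq: str, reference: str) -> dict:
    """Count single substitutions of seq by their effect on distance to reference."""
    counts = {"toward": 0, "same": 0, "away": 0}
    d0 = hamming(seq, reference)
    for pos, aa in product(range(len(seq)), AMINO_ACIDS):
        if aa == seq[pos]:
            continue  # replacing a residue with itself is not a substitution
        mutant = seq[:pos] + aa + seq[pos + 1:]
        d1 = hamming(mutant, reference)
        key = "toward" if d1 < d0 else "away" if d1 > d0 else "same"
        counts[key] += 1
    return counts

reference = "NLKIG"
neighbor = "NLKIY"  # one substitution away from the reference
print(classify_substitutions(neighbor, reference))
```

The three counts sum to 95, the number of possible substitutions in a five residue sequence.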

The professor continues with an analogy about people randomly walking about an island:

At any given moment, a friend's step can be toward or away from you with equal probabilities

That is false. In two dimensions it is close to equal, but with more dimensions the disparity increases. Nonetheless the professor concludes:

In equilibrium, the outward and inward fluxes are the same at any radius. This is a key idea behind the work of Povolotskaya and Kondrashov. Cornelius completely missed it.

Again this is false. It doesn’t matter how long sequences have been changing, or how full the sequence space is. The paper provides results which no doubt mean something, but a confirmation of evolution they are not.

How can evolutionists so consistently misinterpret science so badly? There is no mystery here. Evolutionists must have their theory so they will do whatever is necessary to force-fit the science to support it. Religion drives science and it matters.

Addendum

We are now entering the irrational stage of the discussion. When evolutionists are confronted with their logical fallacies, metaphysical mandates or misrepresentations of science, the discussion inevitably does not end well.

In this case, the professor now says that his concerns have dealt with the academic problem of infinitesimal steps in his random walk. That’s strange, I thought he was discussing an example of people walking around on an island. That is a real-world problem with finite, not infinitesimal, step sizes.

This is important because the problem at hand, amino acid substitutions in proteins, also deals with finite step sizes (not to mention categorical and unordered variables). So now the professor can claim victory for an irrelevant problem.

84 comments:

  1. Cornelius,

    When you are in a hole, Rule # 1 is stop digging.

    Most of what you wrote in this post is nonsense. I don't have time to address everything now, but here are a couple of quick points.

    1. In a Euclidean space, an infinitesimal step in a random direction takes you to and away from the origin with equal probabilities. This is valid exactly for any number of dimensions. To see that for 3 dimensions, draw a sphere around the origin. A walker on that sphere takes a step in a random direction. For an infinitesimal size of the step, the sphere is well approximated by a flat surface normal to the radius connecting the origin and the walker. A random step has equal chances of landing inside and outside. The same works for any number of dimensions. If you want to argue about steps of length comparable to the radius, that will change, but that is not relevant to the example I discussed.

    2. When you do your calculation for five residues, you have to compare the number of moves (overlap M to overlap M+1) to the number of moves (overlap M+1 to overlap M), not (overlap M to overlap M-1). If you don't understand this, you still don't understand why equilibrium is important.

  2. Shorter Cornelius:

    "Here's a really big number I made up.

    Here's another really big number I made up.

    When I multiply them I get a really really big number.

    Therefore all scientists are poopyheads!"

  3. Oleg:

    When you do your calculation for five residues, you have to compare the number of moves (overlap M to overlap M+1) to the number of moves (overlap M+1 to overlap M), not (overlap M to overlap M-1). If you don't understand this, you still don't understand why equilibrium is important.

    I'm having trouble understanding this. We're not dealing with an equilibrium situation here. We have 2 sequences, an ancestor sequence A to 2 sister sequences S1 and S2, and a reference sequence R. The question here is about dynamics: do the mutations A->S1 and A->S2 tend to move away or towards R? And the answer to this question obviously depends on the distance D between A and R. But your calculation, (bizarrely) assuming a uniform distribution over sequences, only shows that Nt=Na in equilibrium, regardless of D. So that calculation is no help at all in trying to understand the empirical relation between D and Nt/Na.

    Do you see my point?

  4. Correction: we have 4 sequences, an inferred ancestral sequence A of 2 sister sequences S1 and S2, and a reference sequence R.

  5. troy,

    Equilibrium is reached after the proteins have had enough time to explore the space. Then they are distributed uniformly. Right? In that case, N_t=N_a*. Povolotskaya and Kondrashov make a point about that at the bottom of the first page. Do you understand why they see a need to point that out?

    *That is also confirmed by the combinatorial calculation in the other thread.

  6. Oleg,

    No, it's not necessary to assume that proteins have had enough time to explore the space. Sure, a uniform distribution implies equilibrium, but it's not necessary, and given the vastness of the space it's not a good assumption.

    What P&K are saying at the bottom of the first page is that D is at an equilibrium when Nt=Na. That's a different kind of equilibrium than the kind you're talking about. You're talking about a stationary distribution (a uniform distribution) over sequence space. Your calculations do not allow a prediction of the equilibrium D-value. That makes them pretty useless.

    In contrast, my model predicts how Nt/Na depends on D, and what the equilibrium value of D is. I'm not saying it's a good model, since it neglects selection, but it's a dynamic model that does not make the very unrealistic assumption that the proteins have explored the entire vast sequence space.

  7. troy,

    I am not sure what you mean by "the equilibrium value of D." Could you explain?

  8. CH: "We are now entering the irrational stage of the discussion"

    Except of course CH now does not involve himself in actual discussions but prefers to take potshots from the sidelines in the form of "addendums". So to say "we" here is not exactly correct.

    Again, if CH believes himself to be so right about the errors or inaccuracies in this paper, why does he not write to Nature? If he is so concerned about the integrity of science as he says, why doesn't he do something other than write an obscure and irrelevant blog?

  9. Toy protein universe again

    In this post, Cornelius attempted to compute the numbers N_t and N_a for a toy model of proteins with just 5 amino acids. He got it wrong as he still does not seem to understand the concept of equilibrium. Here is how it works.

    Our toy proteins contain only L=5 amino acids (AA). Each AA can take on one of the N=20 values. Assuming a flat fitness landscape (all proteins are allowed and equally functional), we can expect to find 20^5 = 3,200,000 different configurations.

    So our starting point will be 3,200,000 identical proteins, all of them NLKIG as in Cornelius's example. They will undergo random mutations, one AA at a time. We will record what happens at every state: the numbers of proteins for every distance from the original state NLKIG and the ratio N_t/N_a of the number of proteins moving "to" and "away" from the reference protein.

    At t=0, every protein randomly replaces one of its AAs with any of the 20 available AAs. With a probability p=1/20 the new AA will be the same as the old, so 160,000 proteins will stay unchanged. A vast majority, 3,040,000 proteins, will mutate (to something like NLKIY) ending up one step away from the original state. Here is a table showing the distribution:

    t=1
    D     N            Na           Nt
    0.0   1.6*10^(5)   3.04*10^(6)  0
    0.2   3.04*10^(6)  0            0
    0.4   0            0            0
    0.6   0            0            0
    0.8   0            0            0
    1.0   0

    D = distance from the original protein,
    N = number of proteins at that distance,
    Na = number of proteins that moved away from D to D+1/5,
    Nt = number of proteins that moved back from D+1/5 to D.

  10. Toy protein universe again (continued)

    Let's keep going. At the next step, proteins at distance 1/5 can mutate further away, to distance 2/5. To do so, the mutation has to occur in one of the 4 original AAs (p=4/5) and that mutation has to change the AA (p=19/20). The overall probability of that is 76/100. A total of 2,310,400 proteins will move away from the origin, from D=1/5 to 2/5.

    A small number of proteins will return to the origin. For that, the mutation has to occur in the newly acquired AA (p=1/5), which has to change to a specific value (p=1/20), for an overall probability of 1/100. So 30,400 proteins will go toward the origin, from D=1/5 to D=0.

    Here is the table:

    t=2
    D     N            Na           Nt
    0.0   3.84*10^(4)  1.52*10^(5)  3.04*10^(4)
    0.2   8.51*10^(5)  2.31*10^(6)  0
    0.4   2.31*10^(6)  0            0
    0.6   0            0            0
    0.8   0            0            0
    1.0   0

    We can keep doing this by hand or we can entrust Mathematica with the task (in fact, the tables above were produced by Mathematica). Here is the distribution and the fluxes after 5 steps, to which I have added one more column, the ratio N_t/N_a:

    t=5
    D     N            Na           Nt           Nt/Na
    0.0   1.22*10^(3)  3.14*10^(3)  1.06*10^(3)  0.336973
    0.2   4.56*10^(4)  8.04*10^(4)  1.81*10^(4)  0.224954
    0.4   5.02*10^(5)  5.16*10^(5)  5.06*10^(4)  0.0980592
    0.6   1.53*10^(6)  6.41*10^(5)  2.*10^(4)    0.03125
    0.8   1.03*10^(6)  9.51*10^(4)  0            0
    1.0   9.51*10^(4)

    It can be seen that, at all distances, more proteins are moving away from the origin (from D to D+1/5) than moving back (from D+1/5 to D). So we are not yet in equilibrium. Let's mutate a few more times.

    t=10
    D     N            Na           Nt           Nt/Na
    0.0   4.03*10^(1)  6.5*10^(1)   3.69*10^(1)  0.566961
    0.2   2.31*10^(3)  2.8*10^(3)   1.4*10^(3)   0.499824
    0.4   4.79*10^(4)  3.99*10^(4)  1.64*10^(4)  0.410563
    0.6   4.23*10^(5)  2.08*10^(5)  6.09*10^(4)  0.293424
    0.8   1.43*10^(6)  2.89*10^(5)  5.29*10^(4)  0.182772
    1.0   1.29*10^(6)

    After 20 iterations, the fluxes are getting closer:

    t=20
    D     N            Na           Nt           Nt/Na
    0.0   2.23         2.46         2.1          0.852547
    0.2   1.84*10^(2)  1.59*10^(2)  1.34*10^(2)  0.838363
    0.4   6.03*10^(3)  3.81*10^(3)  3.13*10^(3)  0.820581
    0.6   9.69*10^(4)  3.96*10^(4)  3.16*10^(4)  0.797514
    0.8   7.63*10^(5)  1.5*10^(5)   1.15*10^(5)  0.766188
    1.0   2.33*10^(6)

    After 30, they are almost equal:

    t=30
    D     N            Na           Nt           Nt/Na
    0.0   1.12         1.09         1.06         0.972561
    0.2   1.04*10^(2)  8.06*10^(1)  7.84*10^(1)  0.971919
    0.4   3.86*10^(3)  2.23*10^(3)  2.17*10^(3)  0.971244
    0.6   7.16*10^(4)  2.75*10^(4)  2.67*10^(4)  0.970532
    0.8   6.64*10^(5)  1.27*10^(5)  1.23*10^(5)  0.969782
    1.0   2.46*10^(6)

    After 40 mutations, the proteins have pretty much reached an equilibrium:

    t=40
    D     N            Na            Nt            Nt/Na
    0.0   1.01         9.65*10^(-1)  9.62*10^(-1)  0.996725
    0.2   9.59*10^(1)  7.31*10^(1)   7.29*10^(1)   0.996715
    0.4   3.64*10^(3)  2.08*10^(3)   2.07*10^(3)   0.996705
    0.6   6.89*10^(4)  2.62*10^(4)   2.61*10^(4)   0.996695
    0.8   6.53*10^(5)  1.24*10^(5)   1.24*10^(5)   0.996685
    1.0   2.47*10^(6)

    This is the kind of equilibrium Povolotskaya and Kondrashov mentioned in the paper. After proteins have dispersed through the space, fluxes N_t and N_a become equal, so their ratio is 1.

    Note that by now we are left with only one protein at D=0, as we should expect, and the number of proteins at D=1/5 is close to 95, as Cornelius computed above.
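The tables in this comment can be regenerated without Mathematica. The sketch below assumes the toy model exactly as described above: each step, every protein picks one of its L positions uniformly and redraws it uniformly from all N amino acids (so with probability 1/N nothing changes). It tracks expected counts rather than individual proteins, and the function names `step` and `run` are illustrative.

```python
def step(counts, L=5, N=20):
    """Advance the distribution by one mutation attempt per protein.

    counts[k] is the expected number of proteins with k mismatches
    (distance D = k / L). Returns (new_counts, away, toward), where
    away[k] is the flux k -> k+1 and toward[k] is the flux k+1 -> k.
    """
    up = [(L - k) / L * (N - 1) / N for k in range(L + 1)]  # hit a matching site, change it
    down = [k / L * 1 / N for k in range(L + 1)]            # hit a mismatch, restore it
    away = [counts[k] * up[k] for k in range(L)]
    toward = [counts[k + 1] * down[k + 1] for k in range(L)]
    new = []
    for k in range(L + 1):
        stay = counts[k] * (1 - up[k] - down[k])
        gain_from_below = away[k - 1] if k > 0 else 0.0
        gain_from_above = toward[k] if k < L else 0.0
        new.append(stay + gain_from_below + gain_from_above)
    return new, away, toward

def run(steps, L=5, N=20):
    """Start with N**L identical proteins and apply `steps` rounds of mutation."""
    counts = [float(N ** L)] + [0.0] * L
    away = toward = None
    for _ in range(steps):
        counts, away, toward = step(counts, L, N)
    return counts, away, toward

for t in (2, 40):
    counts, away, toward = run(t)
    ratios = [nt / na if na else float("nan") for nt, na in zip(toward, away)]
    print(t, [round(c) for c in counts], [round(r, 4) for r in ratios])
```

Here `away[k]` corresponds to the Na column and `toward[k]` to the Nt column for the row with D = k/5.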

  11. So, what was wrong with Cornelius's argument?

    His "away" number N_a is the number of proteins moving from D to D+1/5, same as mine. However, his "to" number N_t is the number of proteins moving from D to D-1/5, whereas mine is D+1/5 to D. For this reason, he will never get N_t/N_a = 1, even in equilibrium. That is simply wrong.

    In equilibrium, proteins are still moving around. However, the number of proteins going from D to D+1/5 is the same as the number of returning proteins, i.e. those coming back from D+1/5 to D. And although it is easier for one protein to move away from the origin, there are many more proteins farther away (see the last table). When the numbers reach equilibrium, the two factors cancel out and then N_t and N_a (properly defined) become equal. As they should, in equilibrium.

  12. Oleg,

    With the equilibrium value of D I mean the average D-value in equilibrium. E.g., the mean of the D distribution you obtained in your simulation at t=40.

    The distribution of D-values in equilibrium according to my diffusion approximation should be

    C exp(-(210/19)(1-D))(1+19(1-D))^(39/361)

    Where C is a normalization constant. I think you'll find that this distribution fits your simulation data quite well.

    But still, the data for proteins in figure 3 are clearly not in equilibrium, and an interesting question is how Nt/Na depends quantitatively on D when the system is not in equilibrium. That you can't answer with equilibrium arguments.

  13. I see what you mean, troy.

    The overlap with the original protein has a binomial distribution,
    P_M = (N-1)^{L-M} L!/[M!(L-M)!].
    (Derivation). It has a mean value of L/N, so the mean distance in equilibrium is 1-1/N = 0.95 for N=20 amino acids.
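A quick check of those counts, as a sketch for the toy case L=5, N=20. P_M as written is the number of sequences with overlap M; dividing by N^L gives the binomial probabilities. The function name `overlap_counts` is illustrative.

```python
from math import comb

def overlap_counts(L=5, N=20):
    """Number of sequences sharing exactly M residues with a reference, for M = 0..L."""
    return [(N - 1) ** (L - M) * comb(L, M) for M in range(L + 1)]

counts = overlap_counts()
total = sum(counts)
mean_overlap = sum(M * c for M, c in enumerate(counts)) / total
print(counts)                 # counts for M = 0, 1, ..., 5
print(total == 20 ** 5)       # the counts cover the whole sequence space
print(1 - mean_overlap / 5)   # mean distance from the reference
```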

    Here is a table of mean distances as a function of time:
    t     D_mean
    0     0
    1     0.190
    2     0.342
    5     0.639
    10    0.848
    20    0.939
    30    0.9488
    40    0.9499
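These mean distances also follow from a one-line recursion. The closed form below is an inference from the toy model as described (one position chosen uniformly per step and redrawn uniformly from all N amino acids), not something stated in the thread: the probability q that a given position differs from the original obeys q(t+1) = (1 - 1/L) q(t) + (1/L)(1 - 1/N), so D_mean(t) = (1 - 1/N)(1 - (1 - 1/L)^t).

```python
def mean_distance(t, L=5, N=20):
    """Mean fractional distance from the starting sequence after t steps."""
    return (1 - 1 / N) * (1 - (1 - 1 / L) ** t)

for t in (0, 1, 2, 5, 10, 20, 30, 40):
    print(t, round(mean_distance(t), 4))
```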

    Your formula, by the way, does not provide a good fit for the equilibrium distribution. That might not be your fault: we have a short L and the discrete nature of the distances might make your diffusion approximation inapplicable. You can see yourself whether it works for larger L. It's a binomial distribution, so one should be able to derive the large-L limit with little effort. Could you outline how you derived the formula? I seem to have missed that.

    I think the orange data points in Figure 3 fall off the N_t/N_a = 1 line for a simple reason. When proteins diffuse throughout the space, very few of them (relatively speaking) stay close to the origin. Unless you have a very large ensemble of proteins, you will have a hard time finding proteins close to the original point. So it could simply be a problem of statistical noise at short distances.

  14. Cornelius wrote: In this case, the professor now says that his concerns have dealt with the academic problem of infinitesimal steps in his random walk. That’s strange, I thought he was discussing an example of people walking around on an island. That is a real-world problem with finite, not infinitesimal, step sizes.

    Talk is cheap, Cornelius. Show us some math. Of course you won't, so let me do that for you.

    The average step size is 2.5 ft, or 0.76 m. An island is, say, 2 miles in diameter. So typical distances will be hundreds of meters. Draw a circle with a radius R = 100 m, put the walker onto its circumference and compute the probabilities of staying inside and stepping outside after one random step of length r = 0.76 m. Here they are:
    Inside: 1/2 - r/(2πR) = 0.4988.
    Outside: 1/2 + r/(2πR) = 0.5012.
    Approximating these as 0.5 ain't bad.
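Those two numbers can be checked against the exact planar result. A sketch, assuming a step of fixed length r in a uniformly random direction from a point at distance R; the function name `step_probabilities` is illustrative.

```python
import math

def step_probabilities(R: float, r: float):
    """Walker at distance R from the origin takes one step of length r
    in a uniformly random direction in the plane.
    Returns (P(end inside radius R), P(end outside radius R))."""
    # New squared distance is R^2 + r^2 + 2*R*r*cos(theta); it exceeds R^2
    # when cos(theta) > -r / (2R).
    outside = math.acos(-r / (2 * R)) / math.pi
    return 1 - outside, outside

inside, outside = step_probabilities(R=100.0, r=0.76)
approx_outside = 0.5 + 0.76 / (2 * math.pi * 100.0)
print(round(inside, 4), round(outside, 4), round(approx_outside, 4))
```

For r much smaller than R the exact expression and the 1/2 + r/(2πR) approximation are indistinguishable at four decimal places.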

    At any rate, this point is irrelevant to the discussion we are having. Equilibrium, or lack thereof, is one of the main concerns of the article. You totally missed that. Your calculation is half-baked and in no way shows that the authors did something silly.

    So, are you going to continue your speaker-in-the-ceiling routine or will you find courage to come down and argue mano a mano?

  15. oleg:

    When you are in a hole, Rule # 1 is stop digging. Most of what you wrote in this post is nonsense.

    I appreciate your digging into this and analyzing the paper, but there isn’t much resemblance between your model and the paper’s approach.

    In a Euclidean space, an infinitesimal step in a random direction takes you to and away from the origin with equal probabilities.

    That’s irrelevant.

    Na = number of proteins that moved away from D to D+1/5,
    Nt = number of proteins that moved back from D+1/5 to D.


    No. In the paper Na and Nt are substitution counts, not protein counts.

    When you do your calculation for five residues, you have to compare the number of moves (overlap M to overlap M+1) to the number of moves (overlap M+1 to overlap M), not (overlap M to overlap M-1). If you don't understand this, you still don't understand why equilibrium is important.

    No, you’ve taken your diffusion analogy too far.

    His "away" number N_a is the number of proteins moving from D to D+1/5, same as mine. However, his "to" number N_t is the number of proteins moving from D to D-1/5, whereas mine is D+1/5 to D. For this reason, he will never get N_t/N_a = 1, even in equilibrium. That is simply wrong.

    Yes, from a diffusion perspective it may seem “wrong,” but my method reflects the paper’s approach. The final full paragraph on the first page gives sufficient information to see this. I’ll just paste in the final sentence, which should clear things up:

    Thus, from a cluster-reference alignment we can obtain the numbers of substitutions in the sister sequences that moved the sequence away from (Na) and towards (Nt) the reference sequence.

    In other words, Na and Nt are substitution counts for protein sequences that fall within given distance bins from the reference. So Nt/Na is the ratio of (i) the protein sequence substitutions that went toward a reference sequence to (ii) the protein sequence substitutions that went away from a reference sequence, for sequences in given distance bins from the reference sequence.

  16. Cornelius,

    Show me how, with your definition of Nt and Na, you can get them equal—in equilibrium. You can't. And it's a must. See the bottom of the first page, right column.

    If Nt/Na < 1 then the sister sequences are evolving away from the reference in sequence space, and, conversely, if Nt/Na > 1 then the sister sequences are evolving towards the reference. If Nt/Na = 1 then the distance in sequence space between them is at an evolutionary equilibrium.

  17. The orange data points in Figure 3 are easily-mutating proteins in equilibrium. They are described by the theory with a flat landscape. They have N_t=N_a. How can you get that? Only with my approach.

  18. CH: Again this is false. It doesn’t matter how long sequences have been changing, or how full the sequence space is. The paper provides results which no doubt mean something, but a confirmation of evolution they are not.

    And here we reach the point where Cornelius hedges his bets with an exit strategy.

    Should his interpretation of the paper be explicitly shown to be wrong, he'll simply fall back with a variant of his argument against the fossil record: The results of the experiment mean something, but they certainly are not observations of proteins evolving. Nor are they proof that evolution is True with a capital 'T'

    However, this would be based on his empirically naive interpretation of science itself, along with hidden assumptions which he smuggles into his argument.

    Again, as physicist David Deutsch points out, the majority of scientific philosophers accept Karl Popper's explanation for scientific progress. New theories are not inferred from observations, but are hypotheses based on conjecture and which are discarded based on refutation. They also accept that progress made in this manner is found to be reliable. However, most scientific philosophers do not see why this should be the case. It cannot be justified due to the problem of induction.

    As such, that evolutionary theory can be justified using induction is a straw man.

    What Cornelius either fails to realize, or disingenuously omits, is that the problem of induction isn't limited to evolutionary theory. It's pervasive throughout all fields of science. And it comes into play at all stages, including observations, which are theory laden.

    Without a solution to the problem of induction, Cornelius' objections are yet another example of hand waving against a theory he personally disagrees with.

    However, the question remains, how do we justify theories? Deutsch, standing on Popper's shoulders, points out that we justify theories based on the quality of their underlying explanations. And given our current knowledge, evolutionary theory represents our best explanation for the biological complexity we observe.

  19. oleg:

    The average step size is 2.5 ft, or 0.76 m. An island is, say, 2 miles in diameter. So typical distances will be hundreds of meters. Draw a circle with a radius R = 100 m, put the walker onto its circumference and compute the probabilities of staying inside and stepping outside after one random step of length r = 0.76 m. Here they are:
    Inside: 1/2 - r/(2πR) = 0.4988.
    Outside: 1/2 + r/(2πR) = 0.5012.
    Approximating these as 0.5 ain't bad.

    At any rate, this point is irrelevant to the discussion we are having.


    No, this point is not irrelevant. The error you are treating as a close approximation that can be ignored is, in fact, neither. Yes, with infinitesimal step sizes, or very long distances, or low dimensions, the error is small. But none of these conditions hold. This is what I showed with your 5 residue example. At a distance of 0.2, the probability of a random step moving closer is 1/95 (0.0105) whereas the probability of a random step moving farther away is 76/95 (0.8000). Not exactly close. And this is with only 5 residues. With real proteins the probabilities become far more extreme.

    You can see this in Euclidean space as well. Change your radius value from 100 m to something like a few meters, and change your 2D example to a 3D example. You’ll see the sensitivity to dimension. Of course it just gets worse in hyperspace.
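The sensitivity to radius and dimension can be put in numbers for the Euclidean case. This is a sketch only: it assumes a fixed step length r in a uniformly random direction, takes R = 3 m as a stand-in for "a few meters", and uses simple midpoint integration. It covers the single-step probabilities alone, not the equilibrium flux argument discussed elsewhere in the thread.

```python
import math

def prob_outside(dim: int, R: float, r: float, n: int = 100_000) -> float:
    """P(step of length r from radius R ends farther from the origin)
    for a uniformly random direction in `dim` Euclidean dimensions (dim >= 2).

    The angle theta between the step and the outward radial direction has
    density proportional to sin(theta)**(dim - 2); the step ends outside
    when cos(theta) > -r / (2R). Both integrals use the midpoint rule.
    """
    cutoff = math.acos(-r / (2 * R))
    h = math.pi / n
    total = favorable = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        w = math.sin(theta) ** (dim - 2)
        total += w
        if theta < cutoff:
            favorable += w
    return favorable / total

for dim in (2, 3, 10, 100):
    print(dim,
          round(prob_outside(dim, R=3.0, r=0.76), 4),
          round(prob_outside(dim, R=100.0, r=0.76), 4))
```

In three dimensions the result reduces to 1/2 + r/(4R), which gives a convenient check on the integration.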

  20. It's good to see you responding to comments again, CH. I was getting a little worried about you.

    I've not said this to you before, but I've thought it often. Thanks for bucking the trend and providing an actual open forum for debate about evolution. So few anti-evolution sites allow any kind of dissent, and it's nice to see one that isn't just an echo chamber.

  21. Cornelius, troy: I have the impression that you are talking past oleg. Take the example from the OP. 95 is actually not the sequence space. And with one substitution you are not in equilibrium but the diffusion has just started. So of course most subsequent substitutions will lead you away from the reference sequence.

    What is still not clear to me is how this paper demonstrates common descent. Maybe oleg could continue his explanation?

  22. Cornelius,

    You are completely missing the point of the island example. The point was not to show that the walkers had equal probabilities of stepping to and away from the origin. (That happens to be the case in one Euclidean dimension and works well for large distances or infinitesimal steps in higher-dimensional Euclidean spaces.) The point was to illustrate how we can tell from the movements of walkers whether or not they have reached an equilibrium distribution.

    In an equilibrium state, the net flux of walkers crossing any line is zero. This is known as the principle of detailed balance, a fundamental concept in statistics. In models with continuous dynamics, you deal with gradients. In discrete models, you subtract the number of transitions from state i to state j and back, from state j to state i. When they become equal, you have reached an equilibrium.

    This is the reason why you have to compare the number of mutations taking you from distance D to D+1 to the number of mutations going back, from D+1 to D. Crossing the line at D+1/2, so to speak.
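For the five residue toy model, the balance condition can be verified exactly. A sketch, assuming the uniform distribution over sequences and the toy mutation rule used earlier in the thread (pick a position uniformly, redraw it uniformly from all N amino acids); the function name is illustrative.

```python
from fractions import Fraction
from math import comb

def detailed_balance_holds(L=5, N=20) -> bool:
    """Check n[k] * P(k -> k+1) == n[k+1] * P(k+1 -> k) for every k."""
    # Uniform distribution over sequences: n[k] sequences at k mismatches.
    n = [comb(L, k) * (N - 1) ** k for k in range(L + 1)]
    for k in range(L):
        up = Fraction(L - k, L) * Fraction(N - 1, N)   # k -> k+1
        down = Fraction(k + 1, L) * Fraction(1, N)     # k+1 -> k
        if n[k] * up != n[k + 1] * down:
            return False
    return True

print(detailed_balance_holds())
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.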

  23. second opinion,

    I have already started commenting on that in the previous thread:
    Single origin vs. multiple origins in the island analogy.

  24. Oleg:

    Your formula, by the way, does not provide a good fit for the equilibrium distribution.

    Are you kidding? I plotted the formula on top of your data points, and all 6 points but one touch the curve. I think that is remarkably good for such small L. But then, as a biologist, perhaps I have lower standards of goodness-of-fit than you physics guys. The average R-squared in ecology/evolution papers is only 0.04 after all!

    Here's how I derived the formula. Start with Kolmogorov forward equation

    pf/pt = -p/px(mu(x)f(x,t))+(1/2)p^2/px^2(sigma^2(x)f(x,t))

    p: partial differential
    f(x,t): distribution of state x at time t
    [x will be our 1-D]
    mu(x): drift coefficient
    sigma^2(x): diffusion coefficient

    I derive mu and sigma^2 from a Markov model for changes in the number M of overlapping amino acids, assuming at most a single mutation per unit time, with small mutation rate z (=O(1/L)).

    Probability of increasing overlap: z(1/N)(L-M)/L, probability decreasing zM/L. Let x=M/L, and scale time with 1/L^2 (found by trial and error),

    mu(x) = lim(L->inf) E(delta x) = zL(1-x-Nx)/N

    sigma^2(x) = lim(L->inf) E(delta x^2) = z(1-x+Nx)/N

    Higher order moments vanish with this scaling of time.

    The stationary distribution f*(x) is then (up to normalization), according to standard diffusion results [e.g. Allen 2003, Stochastic Processes with Applications to Biology]:

    exp(A(x))/sigma^2(x)

    where A(x)= integral(2 mu(x)/sigma^2(x))

    This gives

    f*(x)= C N exp(-2L(N+1)x/(N-1))(1-x+Nx)^(4LN/(N-1)^2-1)

    C is normalization. Replace x with 1-D to get the equilibrium D-distribution.
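
    A rough numerical check, offered as a sketch rather than as part of the derivation above. It assumes N=19 and L=5 purely for illustration, computes the exact stationary distribution of the Markov chain for M (z cancels out), and evaluates exp(A(x))/sigma^2(x) on the grid x=M/L with A(x) integrated directly from the mu and sigma^2 given above. The function names are hypothetical.

```python
import math


def exact_stationary(L, N):
    """Stationary distribution of M for up-rate (L-M)/(N L) and down-rate M/L.

    Detailed balance gives pi(M+1)/pi(M) = (L-M)/(N (M+1)),
    i.e. a binomial with success probability 1/(N+1).
    """
    p = 1.0 / (N + 1)
    return [math.comb(L, M) * p**M * (1 - p)**(L - M) for M in range(L + 1)]


def diffusion_stationary(L, N):
    """exp(A(x))/sigma^2(x) at x = M/L, normalized to sum to 1 over the grid.

    Uses 2 mu/sigma^2 = 2L(1-(N+1)x)/(1+(N-1)x); requires N > 1.
    """
    vals = []
    for M in range(L + 1):
        x = M / L
        s = 1 + (N - 1) * x  # same as 1 - x + N x
        log_f = (-2 * L * (N + 1) * x / (N - 1)
                 + (4 * L * N / (N - 1) ** 2 - 1) * math.log(s))
        vals.append(math.exp(log_f))
    total = sum(vals)
    return [v / total for v in vals]


L, N = 5, 19
for M, (e, d) in enumerate(zip(exact_stationary(L, N), diffusion_stationary(L, N))):
    print(M, round(e, 4), round(d, 4))
```

    For small L the two columns need not agree closely, since the diffusion result is a large-L limit.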


    I think the orange data points in Figure 3 fall off the N_t/N_a = 1 line for a simple reason. When proteins diffuse throughout the space, very few of them (relatively speaking) stay close to the origin. Unless you have a very large ensemble of proteins, you will have a hard time finding proteins close to the original point. So it could simply be a problem of statistical noise at short distances.

    I doubt it. I think the neutral nucleotides reach equilibrium faster than the proteins because the proteins are constrained by selection.

  25. So, the data supports common descent, but not special creation. Got it. No wonder CH is hyperventilating.

  26. troy,

    Here is a figure with the data at t=40 and your curve with the best fit. I might come back to that, but I am not particularly interested in this side issue. I don't doubt that diffusion with drift should work, particularly when L is sufficiently large. Let's not split points.

  27. oleg:

    Show me how, with your definition of Nt and Na, you can get them equal—in equilibrium. You can't. And it's a must.

    Of course I can. In general, if we didn’t know D, a protein residue has 19/20 chance of being different, and 1/20 chance of being the same as a reference. In the case of being different, with a substitution, it has zero chance of moving farther away, and 1/19 chance of moving toward the reference. OTH, in the case of being the same, with a substitution, it has unity chance of moving farther away, and zero chance of moving toward the reference. So you have:

    Nt/Na = [19/20*1/19] / [1/20*1] = unity

  28. oleg:

    The orange data points in Figure 3 are easily-mutating proteins in equilibrium. They are described by the theory with a flat landscape. They have N_t=N_a. How can you get that? Only with my approach.

    No, they are not easily-mutating proteins, but rather (under evolutionary assumptions) easily-changing nucleotides. And yes, most of the orange data points have Nt/Na of unity, and that is expected under the paper’s approach which I have been explaining. The first two data points, however, don’t make sense under evolution.

    First, these orange data points are of a different sort. Unlike the blue and green data, the orange are not directly linked to the abscissa. So interpreting these data is a bit different.

    For the first couple data points (D = 0.05 and 0.1), you do have significant correlation with the abscissa. IOW, just as the protein sequence distance to the reference is very short (very similar sequences), so too these fourfold-synonymous sites are very similar to the reference. So when you do find a substitution, it usually takes you away from the reference, and the result is small Nt/Na values. This makes no sense under evolution. After all this time these sites should have long since drifted away from the reference.

    Now in general the orange points should show a flat trend, with an average value of unity:

    Nt/Na|fourfold = [3/4*1/3] / [1/4*1] = 1

    This is because a given site has 3/4 probability of being different from the reference. With a substitution, it has zero chance of moving farther away, and 1/3 chance of moving toward the reference. OTH, the site has a 1/4 chance of being identical to the reference. With a substitution, it has unity chance of moving farther away, and zero chance of moving toward the reference.
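
    The arithmetic above can be sketched as follows, assuming a uniformly chosen site and uniform substitutions among k possible states (k=4 for fourfold sites, k=20 for amino acids). The function names are hypothetical. The second function applies the same per-site reasoning when the fraction D of differing sites is known.

```python
from fractions import Fraction


def expected_ratio(k):
    """Nt/Na for a site whose relation to the reference is unknown."""
    p_diff = Fraction(k - 1, k)            # site differs from the reference
    p_same = Fraction(1, k)                # site matches the reference
    toward = p_diff * Fraction(1, k - 1)   # a differing site hits the reference state
    away = p_same * 1                      # a matching site always moves away
    return toward / away


def ratio_at_distance(D, k):
    """Nt/Na under the same reasoning when a fraction D of sites differ (D < 1)."""
    D = Fraction(D)
    toward = D * Fraction(1, k - 1)
    away = (1 - D) * 1
    return toward / away


print(expected_ratio(4), expected_ratio(20))
print(ratio_at_distance(Fraction(1, 20), 4))   # D = 0.05
print(ratio_at_distance(Fraction(3, 4), 4))    # D = 3/4
```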

    So the paper’s approach, which I have been explaining, expects the unity value, in spite of your claims. And the orange data present yet another mismatch with evolution at the short D values.

  29. Cornelius:

    So when you do find a substitution, it usually takes you away from the reference, and the result is small Nt/Na values. This makes no sense under evolution. After all this time these sites should have long since drifted away from the reference.

    After all this time? How much time corresponds with D=0.1? How much time is needed to reach equilibrium for neutral nucleotides? Until you can answer these questions, why should we believe your assertion?

  30. Oleg,

    On a log scale it looks worse, like this.

    But, as you say, let's not split points.

  31. Oleg,

    Thanks, I must have overlooked it amongst all these irrelevant Neal Tedford posts. But if you don't mind, I have a follow-up question:
    What would the curve look like if you had multiple ancestry? Would the slope be different?

  32. That's a good question. Off the top of my head I can only tell you that it will get to equilibrium (N_t/N_a=1) for distances beyond the typical group separation. How it would behave for smaller distances, I am not sure, but it would be less than 1 in that range until all groups disperse enough to overlap. One could do numerical experiments to determine that.

  33. troy:

    After all this time? How much time corresponds with D=0.1?

    Several billion years.

    How much time is needed to reach equilibrium for neutral nucleotides?

    Much less than several billion years.

    Until you can answer these questions, why should we believe your assertion?

    My assertion is not heroic under evolutionary assumptions. Under evolution a warm little pond can single-handedly construct life forms in a few 10s of millions of years. Throughout evolutionary history everything from completely new life forms to new proteins can arise in even shorter time frames.

    So yes, there certainly is sufficient time to tweak a few neutral nucleotides.

  34. Cornelius:

    troy:

    After all this time? How much time corresponds with D=0.1?

    [Cornelius:] Several billion years.


    Really? Seems like a bit much when there are also D-values of about 0.9. How do you reckon?

    [troy:] How much time is needed to reach equilibrium for neutral nucleotides?

    [Cornelius:] Much less than several billion years.


    I agree, but could you provide a more precise estimate, based on evolutionary models?

  35. troy:

    Really? Seems like a bit much when there are also D-values of about 0.9. How do you reckon?

    Well the paper is dealing with ancient proteins, going back to early life. As they write:

    “We applied this approach to alignments obtained from 572 clusters of orthologous groups (COGs) that have been previously inferred to have been present in the LUCA,” and “Thus, 3.5*10^9 yr has not been enough to reach the limit of divergent evolution of proteins.”

    I agree, but could you provide a more precise estimate, based on evolutionary models?

    Under neutral evolution the fixation rate is identical to the mutation rate. So for a given bacterial population, for example, you can compute the number of substitutions expected at a site over 3.5 billion years by multiplying your favorite mutation rate (say 10^-8 per generation) by the number of generations, that is, the time span divided by the generation time (say a few months). You easily end up with multiple substitutions.
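
    A back-of-the-envelope sketch of that estimate. The specific inputs are assumptions for illustration: a rate of 10^-8 per site per generation, a generation time of 3 months (the comment only says "a few months"), and a span of 3.5 billion years. The function name is hypothetical.

```python
def expected_substitutions(rate_per_generation, generation_time_years, span_years):
    """Expected neutral substitutions at one site: rate times number of generations."""
    generations = span_years / generation_time_years
    return rate_per_generation * generations


# Assumed inputs: 1e-8 per generation, 0.25 yr per generation, 3.5e9 yr.
print(expected_substitutions(1e-8, 0.25, 3.5e9))
```

    With these inputs the count comes out well above one substitution per site; a longer generation time or a lower rate scales the result down proportionally.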

  36. Cornelius:
    "For the first couple data points (D = 0.05 and 0.1), you do have significant correlation with the abscissa. IOW, just as the protein sequence distance to the reference is very short (very similar sequences), so too these fourfold-synonymous sites are very similar to the reference. So when you do find a substitution, it usually takes you away from the reference, and the result is small Nt/Na values."


    Do you mean why the synonymous Nt/Na values are below one for points of D 0.05 and 0.1?

  37. Cornelius:

    Of course I can. In general, if we didn’t know D, a protein residue has 19/20 chance of being different, and 1/20 chance of being the same as a reference. In the case of being different, with a substitution, it has zero chance of moving farther away, and 1/19 chance of moving toward the reference. OTH, in the case of being the same, with a substitution, it has unity chance of moving farther away, and zero chance of moving toward the reference. So you have:

    Nt/Na = [19/20*1/19] / [1/20*1] = unity


    Thank you, Cornelius. I thought proteins with 5 amino acids were a great toy model, but you did one better and showed us that even a "protein" with just one residue can be an excellent pedagogical device.

    Let's review what you did and compare your earlier calculation for L=5 (described in the opening post) and for L=1 (this comment). There is an important difference as we shall see.

    A protein of length L mutates away from the original state (distance D=0) with the probability 19/20. From distance 1, it mutates away (to D=2) with the probability (19/20)(L-1)/L and to the original state (D=0) with the probability (1/20)(1/L). In your previous (L=5) calculation, you compared the last two probabilities, that for D=1 to D=2 and that for D=1 to D=0, whose ratio happens to be 19(L-1)=76 for L=5. For L=1, the protein cannot move further away from D=1 as there are no states with D=2, so the ratio is 19(L-1)=0, which you mention… and then do something else. You compare the probability for the move from D=1 to D=0, multiplied by 1, to that for the move from D=0 to D=1, multiplied by 1/19. This wasn't in your previous repertoire, so what did you do here? You followed my recipe for detailed balance!

    Let's see how detailed balance works for our very toy protein of length L=1. There are a total of 20 possible states, so we start with 20 proteins, all in the same configuration (say, G) at t=0.

    We next let each protein mutate randomly to one of the 20 states, including the original one. 19 of them do and 1 stays put. Here is the table:

    t=1
    D N Na Nt
    0 1 19 0
    1 19

    Clearly, the state at t=0 was not in equilibrium because the numbers of proteins going from 0 to 1 (Na) and from 1 to 0 (Nt) were not the same. The distribution has changed.

    What happens next? The lonely protein in the original state mutates away with the probability 19/20, so 19/20th of a protein marches forward. The 19 mutants mutate back with a probability 1/20, so 19/20th of a protein move back. Exactly the same numbers of proteins move from D=0 to D=1 as do from D=1 to D=0, so the numbers of proteins at D=0 and 1 remain unchanged! Here is the table:

    t=2
    D N Na Nt
    0 1 19/20 19/20
    1 19

    If we repeat the procedure, the table will look exactly the same. We have reached equilibrium in just two steps! Ta-da!

    Once again, here is what you did, my dear Cornelius. You multiplied the probability for a protein to move away from D=0 (19/20) by the number of proteins in that state at equilibrium (1) to obtain the flux Na from D=0 to D=1. Then you computed the flux Nt of proteins coming back, from D=1 to D=0, by multiplying the probability to mutate back (1/20) by the number of proteins at D=1 at equilibrium (19). The fluxes are equal in equilibrium.

    Now that you have done detailed balance for L=1, you should have no difficulty following my worked out example for L=5. If you need further assistance please let me know.
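
    The two tables above can be reproduced with a short sketch that tracks expected counts, assuming 20 one-residue proteins that each redraw uniformly among 20 states per step. The function name is hypothetical.

```python
from fractions import Fraction


def iterate(n_steps, total=20, k=20):
    """Track expected counts at D=0 and D=1 and the fluxes Na, Nt for each step."""
    n0, n1 = Fraction(total), Fraction(0)   # all proteins start in the original state
    rows = []
    for t in range(1, n_steps + 1):
        na = n0 * Fraction(k - 1, k)   # expected movers D=0 -> D=1 during this step
        nt = n1 * Fraction(1, k)       # expected movers D=1 -> D=0 during this step
        n0, n1 = n0 - na + nt, n1 + na - nt
        rows.append((t, n0, n1, na, nt))
    return rows


for t, n0, n1, na, nt in iterate(3):
    print(f"t={t}  N(D=0)={n0}  N(D=1)={n1}  Na={na}  Nt={nt}")
```

    From the second step on, Na and Nt are equal and the counts at D=0 and D=1 stop changing.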

  38. olegt: I thought proteins with 5 amino acids were a great toy model, but you did one better and showed us that even a "protein" with just one residue can be an excellent pedagogical device.

    It's a good idea to look at the one-dimensional case. If you can't explain that, then it's unlikely you will understand more complex cases.

    olegt: We have reached equilibrium in just two steps! Ta-da!

    It should be easy for most readers who made it this far to visualize equilibrium for a population of one-residue proteins.

  39. oleg:

    Thank you, Cornelius. I thought proteins with 5 amino acids were a great toy model, but you did one better and showed us that even a "protein" with just one residue can be an excellent pedagogical device.

    No, I was simply referring to a single residue within a given protein. You claimed that Nt/Na cannot be unity and I not only showed that unity is quite easy to obtain, but that it is the expected value for a single substitution in a protein of unknown distance, D, to the reference.

    Let's review what you did and compare your earlier calculation for L=5 (described in the opening post) and for L=1. There is an important difference as we shall see.

    It is a distinction without a difference. In the OP I used your specific 5 residue segment as an example. In my comment to which you are responding, I used a generic residue. The Nt/Na calculation follows the same method in both examples.

    You compare the probability for the move from D=1 to D=0, multiplied by 1, to that for the move from D=0 to D=1, multiplied by 1/19. This wasn't in your previous repertoire, so what did you do here? You followed my recipe for detailed balance!

    No, in both the OP and the comment, I account for the two possibilities: the residue may be identical to the reference or different. The D value (before the substitution) is fixed. But there are, in general, some residues in the sequence that are identical to the reference, and the others are different. You need to account for both. I did that in your example segment in the OP, and I did that for a given, generic, residue, in the comment. Two ways of explaining the same thing.

    I could just as easily have used an example sequence to demonstrate to you that there is no big mystery in generating a unity value for Nt/Na.

    Once again, here is what you did, my dear Cornelius. You multiplied the probability for a protein to move away from D=0 (19/20) by the number of proteins in that state at equilibrium (1) to obtain the flux Na from D=0 to D=1. Then you computed the flux Nt of proteins coming back, from D=1 to D=0, by multiplying the probability to mutate back (1/20) by the number of proteins at D=1 at equilibrium (19). The fluxes are equal in equilibrium.

    Now that you have done detailed balance for L=1, you should have no difficulty following my worked out example for L=5. If you need further assistance please let me know.


    Earlier I quoted from the paper. I thought that would clear things up. Here is the quote again:

    Thus, from a cluster-reference alignment we can obtain the numbers of substitutions in the sister sequences that moved the sequence away from (Na) and towards (Nt) the reference sequence.

    You seem to have ignored that quote, and other passages in the paper explaining their method. There is no mystery here. Yes, there are details of their method that are complicated and circuitous, but their basic approach is obvious. Their Na and Nt values are taken from protein alignments. They first infer a non synonymous mutation event (ie, a substitution in a protein sequence), and they then compare that to a reference sequence. The mutation can move away from the reference (thus contributing to Na), or it can move toward the reference (thus contributing to Nt).

    They repeat this for many residues in many proteins, and accrue their statistics. Figure 3 plots the Nt/Na ratio versus distance bins (distance from the sequence to the reference).

    There is information in their results, it just isn’t anything close to what the evolutionists claim. A correlation with D is to be expected, as is seen in Fig. 3. There is no independent evidence for common descent here, protein evolution, etc, as the evolutionists erroneously claim.

  40. Evey Solara:

    Do you mean why the synonymous Nt/Na values are below one for points of D 0.05 and 0.1?

    Yes, I was referring to the orange data points at D 0.05 and 0.1.

  41. Cornelius, and I thought you were so close to making progress! Too bad.

    But perhaps you can give us your calculation for the L=1 case. That would be fun.

  42. And I have no reason to doubt that Kondrashov is familiar with the concept of detailed equilibrium and understands which transitions should be compared. In fact, in another article (Reference 30 in the paper we are discussing) he writes this:

    It is routinely assumed that extant proteins are in detailed equilibrium and their evolution is a stationary and reversible process: reciprocal fluxes of amino acid substitutions are equal, amino acid frequencies are constant, and nothing would change if time were to flow backwards.

    I. K. Jordan, F. A. Kondrashov, I. A. Adzhubei, Y. I. Wolf, E. V. Koonin, A. S. Kondrashov, and S. Sunyaev, "A universal trend of amino acid gain and loss in protein evolution," Nature 433, 633 (2005). doi:10.1038/nature03306

  43. Do you understand the word reciprocal, Cornelius?

  44. Cornelius:
    Yes, I was referring to the orange data points at D 0.05 and 0.1.

    If synonymous substitutions were actually neutral then we would expect Nt/Na to be close to 1 much sooner. However, synonymous sites are most likely not neutral in bacteria and the lag in reaching the equilibrium distance (Nt/Na=1) is probably due to selection slowing down their rate of divergence.

  45. oleg:

    It is routinely assumed that extant proteins are in detailed equilibrium and their evolution is a stationary and reversible process: reciprocal fluxes of amino acid substitutions are equal, amino acid frequencies are constant, and nothing would change if time were to flow backwards.

    I. K. Jordan, F. A. Kondrashov, I. A. Adzhubei, Y. I. Wolf, E. V. Koonin, A. S. Kondrashov, and S. Sunyaev, "A universal trend of amino acid gain and loss in protein evolution," Nature 433, 633 (2005). doi:10.1038/nature03306.


    Good find, sounds like an interesting paper. But that doesn’t change this paper, and the fact that they obtained “the numbers of substitutions in the sister sequences that moved the sequence away from (Na) and towards (Nt) the reference sequence.”

    It’s strange that you consistently deny the obvious. What is it about substitutions “that moved the sequence away from (Na) and towards (Nt) the reference sequence” you don’t understand?

    A reader may misunderstand the paper on first reading, but I’ve pointed this out several times now.

  46. oleg:

    Let me try to explain it yet again, as simply as possible:

    1. In Figure 3 they plot Nt/Na versus the distance, D, the distance between the sequence and a reference sequence.

    2. Na and Nt are the respective numbers of inferred substitutions that move the sequence away from and towards the reference sequence.

    3. They draw evolutionary conclusions from the fact that Nt/Na is sub unity at smaller D values.

    4. But a sub unity Nt/Na is a natural consequence of smaller D values.

    No doubt there is some information in their results, but not what they claim. Therefore their evolutionary conclusions are unwarranted.

  47. I agree with 1 and 3, but 2 is just plain silly. You simply cannot determine whether the proteins are at equilibrium by comparing the from and to substitutions with D as a starting point. For that, you have to compare Na for D to D+1 and Nt for the reciprocal process D+1 to D, period. That is well established. It's a very basic thing. The authors are not so stupid as to use the right approach in multiple other papers and to perform a silly thing in this one.

    You're in denial.

  48. Evey Solara:

    However, synonymous sites are most likely not neutral in bacteria and the lag in reaching the equilibrium distance (Nt/Na=1) is probably due to selection slowing down their rate of divergence.

    Keep in mind that the x-axis is not time. Therefore Fig. 3 does not reveal a lag, so to speak and your argument, as I understand it, should apply across all the D values.

  49. oleg:

    You simply cannot determine whether the proteins are at equilibrium by comparing the from and to substitutions with D as a starting point.

    I didn’t say you can.

    You're in denial.

    So when you point out the inconvenient facts to the evolutionist he denies them and then says you are the one who is in denial.

    As I said earlier we’re in the irrational stage of the discussion. This couldn’t be a better example of it. Granted the paper is not always obvious, but on this point it is crystal clear. There is no room for misinterpretation as the authors explained precisely what their Nt and Na quantities are.

    I’ve repeatedly pointed this out to the evolutionist but he puts his fingers in his ears and says “no, no, no – you’re in denial.”

    As I said, these don’t end well. The evolutionist demands that A ~= A. He must be right and you must be wrong, and that’s all there is to it. I asked the evolutionist:

    It’s strange that you consistently deny the obvious. What is it about substitutions “that moved the sequence away from (Na) and towards (Nt) the reference sequence” you don’t understand?

    Of course he has no answer other than to insist I’m in denial. He’s ignoring the paper and yet I’m the one who is in denial. Apparently the authors didn’t really mean what they carefully wrote because, after all, they would know better.

  50. CH,
    I think you're reading too much into the shorthand they used in the main text. If you read the methods, they say

    "With the Bayesian approach, for each substitution we used the distribution of posterior probabilities of all ancestral states, with Nt and Na representing the sum of all substitutions multiplied by their posterior probabilities."

    the "posterior probabilities" indicates that they used exactly the process oleg described, comparing Na for D to D+1 and Nt for the reciprocal process D+1 to D.

  51. 4afb9302-32ec-11e0-becb-000bcdcb471e:

    "With the Bayesian approach, for each substitution we used the distribution of posterior probabilities of all ancestral states, with Nt and Na representing the sum of all substitutions multiplied by their posterior probabilities."

    the "posterior probabilities" indicates that they used exactly the process oleg described, comparing Na for D to D+1 and Nt for the reciprocal process D+1 to D.


    This is either humor, desperation or ignorance. Or a combination?

  52. Oleg:

    I agree with 1 and 3, but 2 is just plain silly.

    I am astonished at the level of misunderstanding. Here's #2 again:

    2. Na and Nt are the respective numbers of inferred substitutions that move the sequence away from and towards the reference sequence.

    Cornelius has it exactly right. That is what the authors calculated. They had a whole bunch of pairs Ancestor-Reference, where the Ancestor refers to an inferred ancestor of 2 closely related "Sister" species (i.e. that diverged relatively recently), and where the Reference refers to a not-so-closely related species. By simply counting how many substitutions from Ancestor to Sister occurred away from the Reference (i.e. Na) and how many towards the Reference (i.e. Nt), and by averaging that over multiple pairs with roughly the same D between Ancestor and Reference, you get the results from figure 3. There is no need whatsoever to talk about equilibrium to understand what these numbers represent.
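
    A simplified sketch of that counting, assuming three already-aligned sequences of equal length and a hard count per site. The paper's Bayesian version weights substitutions by posterior probabilities of ancestral states; that is not reproduced here. The function name and the example sequences are made up for illustration.

```python
def count_away_toward(ancestor, sister, reference):
    """Count Ancestor -> Sister substitutions that move away from or toward the Reference."""
    if not (len(ancestor) == len(sister) == len(reference)):
        raise ValueError("sequences must be aligned to the same length")
    n_away = n_toward = 0
    for a, s, r in zip(ancestor, sister, reference):
        if a == s:
            continue              # no substitution at this site
        if a == r:
            n_away += 1           # site matched the reference, now differs
        elif s == r:
            n_toward += 1         # site differed from the reference, now matches
        # otherwise both states differ from the reference: distance unchanged
    return n_away, n_toward


# Hypothetical 5-residue alignment.
na, nt = count_away_toward("GASTV", "GCSTL", "GASKL")
print(na, nt)
```

    Summing such counts over many Ancestor-Reference pairs in the same D bin and taking Nt/Na gives one point of the kind plotted in figure 3, under the simplifications stated above.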

    If Nt/Na<1 the conclusion is that the protein "universe" expands because on average the change from Ancestor -> Sister proteins tends to be away from the Reference.

    Of course it's true that in equilibrium overlap M->M+1 has to equal M+1->M (M, not D; D is between zero and 1). But that is completely irrelevant for understanding what the Nt/Na ratios in figure 3 represent.

    I think we're talking past each other here for some reason.

  53. The devil is in the detail, troy.

    I maintain that Nt for M->M+1 is compared to Na for M+1->M. With that, I accept 2. That way, one can meaningfully discuss whether the proteins are in equilibrium or not (at least in simple models without fitness).

    Cornelius disagrees with that. He thinks Na is computed for M->M-1. That makes no sense whatsoever. Such a comparison provides no information about the detailed balance (or lack thereof), which is a central part of the paper.

    That's the rub.

  54. There are other papers by Fyodor Kondrashov, in which he explicitly states that he compares reciprocal fluxes. (I quoted one above.) That means he understands perfectly well how detailed balance works and he would not goof off.

  55. oleg:

    That makes no sense whatsoever. Such a comparison provides no information about the detailed balance (or lack thereof), which is a central part of the paper.

    You are merely pointing out yet another problem with the paper.

  56. Cornelius Hunter said...

    This is either humor, desperation or ignorance. Or a combination?


    Speaking of humor, desperation or ignorance, when will you be submitting your paradigm-shaking "Evolution has been falsified" evidence to any mainstream scientific journals?

    Don't you think it unfair to keep such an epic discovery from the masses?

  57. It's not a problem with the paper. Its main point is to determine whether or not the ancient proteins have reached evolutionary equilibrium. These researchers are well aware how to probe for equilibrium, they have done so in the past.

    The problem is with your misunderstanding of a basic statistical concept known as detailed balance.

  58. oleg:

    The problem is with your misunderstanding of a basic statistical concept known as detailed balance.

    Oh, right, I forgot, I’m the one who has the misunderstanding.

    The evolutionist thought steps had equal probability of moving toward or away, and when reminded that is false he said that he actually was referring to infinitesimal steps. And when reminded that he wasn’t referring to infinitesimal steps he said it didn’t matter because for finite steps the approximation is pretty good. And when reminded that it isn’t a good approximation he said it was all my fault because I had missed the point anyway.

    Then the evolutionist thought Na and Nt were protein counts. But Na and Nt are not protein counts but sequence substitution counts.

    Then the evolutionist said the Na and Nt counts are really flux counts anyway, with Na indicating the number of proteins that move away from D to D+1/5, and Nt indicating the number of proteins that move back from D+1/5 to D. But that’s obviously false since the substitutions have very little effect on D, and so the flux values would be zero or near zero.

    Then the evolutionist said the paper is really about detailed balance even though (i) it doesn’t say that anywhere and (ii) it explicitly says something else. Also the methods wouldn’t support that anyway as the step sizes are way too small to cross the boundaries in most cases.

    But then again, it’s all my fault because I don’t understand basic statistical concepts. I wonder what it is that I don’t understand about them? Oops, yet another thing I don’t understand.

  59. The evolutionist thought steps had equal probability of moving toward or away, and when reminded that is false he said that he actually was referring to infinitesimal steps. And when reminded that he wasn’t referring to infinitesimal steps he said it didn’t matter because for finite steps the approximation is pretty good. And when reminded that it isn’t a good approximation he said it was all my fault because I had missed the point anyway.

    Yes, you did. My island story was preceded by the following: "If you find the above discussion too technical, here is a simple analogy. As any analogy, it is imperfect but it serves to illustrate the key difference between equilibrium and out-of-equilibrium situations."

    The aim of that analogy was not to convince the reader that the probabilities of moving to and away from the origin are the same in the protein case. It was to illustrate the concept of equilibrium. I could hardly have been clearer.

    Then the evolutionist thought Na and Nt were protein counts. But Na and Nt are not protein counts but sequence substitution counts.

    I called Nt "the expectation number for proteins moving toward the reference." That's the same as the number of substitutions. What's the problem?

    Then the evolutionist said the paper is really about detailed balance even though (i) it doesn’t say that anywhere and (ii) it explicitly says something else. Also the methods wouldn’t support that anyway as the step sizes are way too small to cross the boundaries in most cases.

    The last paragraph on the first page of the paper contains everything a person familiar with the subject needs to figure out what they do. Here it is, once again:

    "If Nt/Na < 1 then the sister sequences are evolving away from the reference in sequence space, and, conversely, if Nt/Na > 1 then the sister sequences are evolving towards the reference. If Nt/Na = 1 then the distance in sequence space between them is at an evolutionary equilibrium."

    I asked you specifically to show how your recipe for the fluxes reproduces the equilibrium condition, Nt/Na = 1. What you did in response was reproduce the standard detailed balance argument for a single protein.

    But then again, it’s all my fault because I don’t understand basic statistical concepts. I wonder what it is that I don’t understand about them? Oops, yet another thing I don’t understand.

    Yes, your understanding could have been better.

  60. Cornelius,

    Your prescription simply cannot be squared with the key paragraph mentioned above:

    "If Nt/Na < 1 then the sister sequences are evolving away from the reference in sequence space, and, conversely, if Nt/Na > 1 then the sister sequences are evolving towards the reference. If Nt/Na = 1 then the distance in sequence space between them is at an evolutionary equilibrium."

    Because you consider fluxes from the same distance D, the numbers of substitutions to and away are Nt = Qt P(D) and Na = Qa P(D), where P(D) is the current number of proteins at distance D, whereas Qt and Qa are the probabilities for individual proteins to step from D to D-1 and from D to D+1, respectively. Because both fluxes are proportional to P(D), the current number of proteins cancels out when we take the ratio: Nt/Na = Qt/Qa. With this prescription, you are forever stuck in the same state: forever evolving away if Qt < Qa, forever evolving toward if Qt > Qa, and forever in equilibrium if Qt = Qa. This is of course nonsense. When proteins begin in the same original state (see my examples), they are out of equilibrium and are dispersing away. Eventually, they reach an equilibrium state. So the calculation must reproduce two situations, Nt/Na < 1 and Nt/Na = 1. And it is impossible with your prescription. The prescription makes no sense whatsoever. No competent computational biologist would suggest it. Not Kondrashov who obviously knows better.
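
    The contrast between the two prescriptions can be illustrated with a small sketch, assuming the flat-landscape toy model from this thread (L=5, 20 residues, one randomly chosen site redrawn uniformly per step, all proteins starting at D=0). The function names are hypothetical. The same-D ratio Qt/Qa never changes, while the reciprocal-flux ratio across the first boundary starts at 0 and approaches 1.

```python
K = 20  # alphabet size (amino acids); toy-model assumption
L = 5   # toy protein length


def q_away(d):
    """Probability that one step takes a protein from d to d+1 mismatches."""
    return (L - d) / L * (K - 1) / K


def q_toward(d):
    """Probability that one step takes a protein from d to d-1 mismatches."""
    return d / L * 1 / K


def step(P):
    """Advance the distribution over d = 0..L by one mutation step."""
    new = [0.0] * (L + 1)
    for d, p in enumerate(P):
        new[d] += p * (1 - q_away(d) - q_toward(d))
        if d < L:
            new[d + 1] += p * q_away(d)
        if d > 0:
            new[d - 1] += p * q_toward(d)
    return new


def same_d_ratio(d):
    """Nt/Na when both fluxes start from the same d: P(d) cancels."""
    return q_toward(d) / q_away(d)


def reciprocal_ratio(P, d):
    """Nt/Na with Nt the flux d+1 -> d and Na the flux d -> d+1."""
    return (P[d + 1] * q_toward(d + 1)) / (P[d] * q_away(d))


def trajectory(n_steps):
    """Reciprocal-flux ratio across the d=0 | d=1 boundary over time."""
    P = [1.0] + [0.0] * L
    out = [reciprocal_ratio(P, 0)]
    for _ in range(n_steps):
        P = step(P)
        out.append(reciprocal_ratio(P, 0))
    return out


ratios = trajectory(200)
print(same_d_ratio(1))                      # time-independent
print(ratios[0], ratios[1], ratios[-1])     # changes as the ensemble disperses
```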

  61. Cornelius Hunter:

    Addendum: [on main page]

    "We are now entering the irrational stage of the discussion. When evolutionists are confronted with their logical fallacies, metaphysical mandates or misrepresentations of science, the discussion inevitably does not end well.

    "In this case, the professor now says that his concerns have dealt with the academic problem of infinitesimal steps in his random walk. That’s strange, I thought he was discussing an example of people walking around on an island. That is a real-world problem with finite, not infinitesimal, step sizes.

    "This is important because the problem at hand, amino acid substitutions in proteins, also deals with finite step sizes (not to mention categorical and unordered variables).

    "So now the professor can claim victory for an irrelevant problem."
    ===

    Ultimately this is what it is all about. The truth of the matter is irrelevant when there is a game to be played, spun and won. If you head on over to the safe-haven forum where they all collectively get figuratively drunk on each other's good ol' boy back-slapping contest, they [including this Professor] are trashing your name with every play-by-play post over here. It's not about the science. It's about the religion, dogma, philosophy, ideology and ultimate politically correct worldview. These supposedly intellectually superior mature adults have an entire "Let's make fun of Cornelius" thread over there dedicated to doing nothing more than allowing folks [intellect wannabes] with no life to add some type of perverted meaning to their otherwise purposeless, dull existence.

    The discussion was very interesting for a time, but as you've pointed out it's reached the mountain's summit and is downhill from here on out.

  62. oleg:

    I called Nt "the expectation number for proteins moving toward the reference." That's the same as the number of substitutions. What's the problem?

    You do not in general have a one-to-one mapping. There may be one inferred substitution in a protein sequence that they analyze, but there could be more than one as well.

    The last paragraph on the first page of the paper contains everything a person familiar with the subject needs to figure out what they do. Here it is, once again:

    "If Nt/Na < 1 then the sister sequences are evolving away from the reference in sequence space, and, conversely, if Nt/Na > 1 then the sister sequences are evolving towards the reference. If Nt/Na = 1 then the distance in sequence space between them is at an evolutionary equilibrium."


    Huh? Your persistence is exceeded only by your creativity in interpreting this paper. The sentence before this quote of yours directly contradicts your creative reading exercise:

    Thus, from a cluster-reference alignment we can obtain the numbers of substitutions in the sister sequences that moved the sequence away from (Na) and towards (Nt) the reference sequence.

    And even the section you quote doesn’t support your strange interpretation. The paper only mentions “equilibrium” a couple of times. It is referring to a distance in sequence space being in evolutionary equilibrium.

    Your prescription simply cannot be squared with the key paragraph mentioned above:

    If you have sequences whose inferred substitutions have Nt/Na < 1, then an evolutionist could say the sequences are evolving away from the reference. Conversely, if Nt/Na > 1 then an evolutionist could say the sequences are evolving toward the reference. And if Nt/Na is near unity then an evolutionist could say the sequences are at a distance which is at an evolutionary equilibrium. Again, one doesn’t need your creative reading of the paper in order to understand it.

    That said, I am by no means defending the paper which is mostly junk science. I don’t agree with these evolutionary interpretations of the Nt/Na ratio because evolution is entirely superfluous, contradicted by immediate factors which the authors ignore, and contradicted by a wealth of other evidences.

  63. Eocene:

    Ultimately this is what it is all about. The truth of the matter is irrelevant when there is a game to be played, spun and won. If you head on over to the safe-haven forum where they all collectively get figuratively drunk on each other's good ol' boy back-slapping contest, they [including this Professor] are trashing your name with every play-by-play post over here. It's not about the science. It's about the religion, dogma, philosophy, ideology and ultimate politically correct worldview. These supposedly intellectually superior mature adults have an entire "Let's make fun of Cornelius" thread over there dedicated to doing nothing more than allowing folks [intellect wannabes] with no life to add some type of perverted meaning to their otherwise purposeless, dull existence.

    I forgive whoever does that.

  64. Hunter:

    I don’t agree with these evolutionary interpretations of the Nt/Na ratio because evolution is entirely superfluous…

    Alternatively, should we believe that each of the 100s of protein variants analyzed by the authors was separately created? That seems improbable.

  65. CH: That said, I am by no means defending the paper which is mostly junk science. I don’t agree with these evolutionary interpretations of the Nt/Na ratio because evolution is entirely superfluous, contradicted by immediate factors which the authors ignore, and contradicted by a wealth of other evidences.

    Cornelius,

    Isn't this essentially the same argument you make against "evolutionary interpretation" of the fossil record? Specifically, we could substitute 'Nt/Na' with 'fossil record' to obtain….

    I don’t agree with these evolutionary interpretations of the fossil record because evolution is entirely superfluous, contradicted by immediate factors which the authors ignore, and contradicted by a wealth of other evidences.

    Would this also be an accurate statement?

  66. You do not in general have a one-to-one mapping. There may be one inferred substitution in a protein sequence that they analyze, but there could be more than one as well.

    We were discussing a pedagogical toy model in which substitutions happened one at a time. There is no difference in that case.

    Huh? Your persistence is exceeded only by your creativity in interpreting this paper. The sentence before this quote of yours directly contradicts your creative reading exercise:

    Once again, we were discussing a toy model whose purpose is to illustrate how proteins reach equilibrium and how one can check, by measuring the substitutions, whether the equilibrium has been reached. The models we have considered do not apply directly to the ancient proteins. However, they make it crystal clear that measuring approach to equilibrium must be done in terms of reciprocal fluxes. They also make it clear that the prescription you ascribe to the authors simply cannot be used to study equilibrium.

    The authors would have to be fools to follow your approach and fools they are not. Their other publications show that they are well aware of this "subtle" point. It is amazing that you still don't understand that equilibrium must be measured by observing reciprocal fluxes. Admit it, Cornelius. For the toy model I have discussed. Or show a calculation that your prescription can be used instead.

    If you have sequences whose inferred substitutions have Nt/Na < 1, then an evolutionist could say the sequences are evolving away from the reference. Conversely, if Nt/Na > 1 then an evolutionist could say the sequences are evolving toward the reference. And if Nt/Na is near unity then an evolutionist could say the sequences are at a distance which is at an evolutionary equilibrium. Again, one doesn’t need your creative reading of the paper in order to understand it.

    Forget the paper. Focus on the toy model of proteins mutating one AA at a time. How do you measure approach to equilibrium? By looking at reciprocal fluxes, that's how.

  67. And even the section you quote doesn’t support your strange interpretation. The paper only mentions "equilibrium" a couple of times. It is referring to a distance in sequence space being in evolutionary equilibrium.

    You are technically correct that the word "equilibrium" only appears twice. But they also use the expression "limit of sequence divergence." (What sort of state do proteins reach when they have dispersed through protein space? A state of equilibrium.) That term appears four times on page 3.

    Whether the ancient proteins have reached an evolutionary equilibrium is the central question addressed by the paper. How can we tell this? Read Nature's Guide to Authors. Here is how the introductory paragraph is to be structured:

    This paragraph starts with a 2-3 sentence basic introduction to the field; followed by a one-sentence statement of the main conclusions starting 'Here we show' or equivalent phrase; and finally, 2-3 sentences putting the main findings into general context so it is clear how the results described in the paper have moved the field forwards.

    So what are the authors' main conclusions? Here is the introductory paragraph. See if you can tell.

    "We formulate a computational approach to study the rate of divergence of distant protein sequences and measure this rate for ancient proteins, those that were present in the last universal common ancestor. We show that ancient proteins are still diverging from each other, indicating an ongoing expansion of the protein sequence universe… Thus, 3.5*10^9 yr has not been enough to reach the limit of divergent evolution of proteins, and for most proteins the limit of sequence similarity imposed by common function may not exceed that of random sequences."

  68. Oleg,

    Definition of equilibrium:

    1. a stable condition in which forces cancel one another
    2. a state or feeling of mental balance; composure
    3. (Physics / General Physics) any unchanging condition or state of a body, system, etc., resulting from the balance or cancelling out of the influences or processes to which it is subjected See thermodynamic equilibrium
    4. (Physics / General Physics) Physics a state of rest or uniform motion in which there is no resultant force on a body
    5. (Chemistry) Chem the condition existing when a chemical reaction and its reverse reaction take place at equal rates
    6. (Physics / General Physics) Physics the condition of a system that has its total energy distributed among its component parts in the statistically most probable manner
    7. (Life Sciences & Allied Applications / Physiology) Physiol a state of bodily balance, maintained primarily by special receptors in the inner ear
    8. (Economics) the economic condition in which there is neither excess demand nor excess supply in a market.

    Please have mercy on the group here and tell me that this is not going to be another debate about creative word definitions. Have a nice weekend.

  69. Neal,

    This is about equilibrium in sense 3 listed in your comment. More specifically, detailed equilibrium, or detailed balance. See, for example, this article, where the concept and its applications are explained.

    V. Mustonen and M. Lässig, "Fitness flux and ubiquity of adaptive evolution," PNAS 107, 4248 (2010). doi:10.1073/pnas.0907953107.

    Here is a brief excerpt:

    "We define evolutionary equilibrium as a stationary (time-independent) distribution Peq(x), which satisfies the so-called detailed balance condition
    G(x′, t′, x, t) Peq(x) = G(x, t′, x′, t) Peq(x′)
    for arbitrary times and frequencies. Detailed balance says that in equilibrium, the probability of any evolutionary transition equals the probability of the reverse transition. This definition is well-known in statistical physics, but it is more restrictive than the definitions in much of the population-genetics literature, where any stationary state is called equilibrium."

  70. oleg:

    The authors would have to be fools to follow your approach and fools they are not.

    You didn’t do the work, nor did you document the work, nor do you seem interested in reading the explanations of the work. In spite of a crystal clear description of that aspect of the work, you impose your view of what you think should have been done, which by the way probably is not even feasible.

    It is amazing that you still don't understand that equilibrium must be measured by observing reciprocal fluxes. Admit it, Cornelius.

    This continues to be an example of the irrationality of evolutionary thinking and how these discussions degrade. In this case the evolutionist grossly misrepresents the paper, makes various misstatements, and then blames me for not understanding simple concepts.

    Forget the paper. Focus on the toy model of proteins mutating one AA at a time. How do you measure approach to equilibrium? By looking at reciprocal fluxes, that's how.

    Sure in a diffusion problem. But selection changes all that. The diffusion analogy takes you only so far, a fact you continue to resist.

    You are technically correct that the word "equilibrium" only appears twice. But they also use the expression "limit of sequence divergence." (What sort of state do proteins reach when they have dispersed through protein space? A state of equilibrium.) That term appears four times on page 3.

    Unbelievable. This is becoming comical. The expression “limit of sequence divergence” is in an evolutionary context, as is their use of “equilibrium.” You have passed over crystal clear explanations and are divining hidden meanings in phrases, which mysteriously the authors don’t expand upon, to arrive at the “real” method which in fact probably is not even feasible.

  71. Sure in a diffusion problem. But selection changes all that. The diffusion analogy takes you only so far, a fact you continue to resist.

    Great. You finally concede that equilibrium (more precisely, detailed balance) requires the reciprocal fluxes to be equal in a flat landscape. We can call this progress.

    Unfortunately for you, detailed balance applies to nontrivial landscapes as well. In statistical mechanics, thermal equilibrium sets in when reciprocal fluxes between microstates become equal. That includes pairs of microstates with different energies. That's the equivalent of selection pressure.

    The above mentioned paper by Mustonen and Lässig deals with detailed balance in the presence of selection.
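
The statistical-mechanics statement in this comment, that reciprocal fluxes balance even between microstates of different energy, can be illustrated with a small numerical check. This is a textbook-style sketch, not anything taken from the paper or from Mustonen and Lässig: the three energies are arbitrary, the Metropolis acceptance rule is one assumed choice of transition rates among many that satisfy detailed balance, and reading energy differences as selection pressure is the commenter's analogy.

```python
import math


def metropolis_rate(e_from, e_to, beta=1.0):
    """Assumed transition rate: downhill moves always accepted, uphill moves suppressed."""
    return min(1.0, math.exp(-beta * (e_to - e_from)))


def boltzmann(energies, beta=1.0):
    """Equilibrium occupation probabilities of the microstates."""
    weights = [math.exp(-beta * e) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def reciprocal_fluxes(energies, i, j, beta=1.0):
    """Equilibrium flux i -> j and flux j -> i."""
    p = boltzmann(energies, beta)
    forward = p[i] * metropolis_rate(energies[i], energies[j], beta)
    backward = p[j] * metropolis_rate(energies[j], energies[i], beta)
    return forward, backward


if __name__ == "__main__":
    energies = [0.0, 1.0, 2.5]  # arbitrary illustrative values
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        print((i, j), reciprocal_fluxes(energies, i, j))
```

The individual rates differ in the two directions, because uphill moves are less likely, yet each pair of reciprocal fluxes is equal once the occupation probabilities take their equilibrium values.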

  72. oleg:

    Great. You finally concede that equilibrium (more precisely, detailed balance) requires the reciprocal fluxes to be equal in a flat landscape. We can call this progress.

    No, progress would be if you read the methods used, as described in the paper. They did not measure reciprocal fluxes of proteins, no matter how many times you say otherwise. The paper doesn’t say that, it explicitly states otherwise, and in any case it would be quite a trick to do so. Another way I would measure progress would be in a reduction of your silly accusations. But this is what evolutionists do—just throw mud in all directions.

  73. They did not measure reciprocal fluxes of proteins, no matter how many times you say otherwise. The paper doesn’t say that, it explicitly states otherwise, and in any case it would be quite a trick to do so.

    For me, this paper has been a mental challenge, and my math skills are rudimentary, but from the discussion above, it seems clear that the authors measured reciprocal fluxes of mutations in proteins. What other possible meaning could Nt/Na have?

    Where do the authors explicitly state otherwise?

    Why would those measurements be hard to make?

    Doesn't micro-evolution happen all the time? Or is it special creation that's happening all the time?

  74. They did not measure reciprocal fluxes of proteins, no matter how many times you say otherwise. The paper doesn’t say that, it explicitly states otherwise, and in any case it would be quite a trick to do so.

    Now, this is plain silly, Cornelius. Comparing pairs of fluxes (D,D+1) and (D+1,D) is no more difficult than comparing pairs (D,D+1) and (D-1,D). You need to measure the same fluxes for either comparison, namely fluxes (1,2), (2,3), (3,4) and so on and fluxes (2,1), (3,2), (4,3) etc. If one comparison could be done, so could be the other.

  75. Pedant:

    For me, this paper has been a mental challenge, and my math skills are rudimentary, but from the discussion above, it seems clear that the authors measured reciprocal fluxes of mutations in proteins. What other possible meaning could Nt/Na have?

    Well, how about that Nt and Na are the numbers of substitutions toward and away from the reference sequence? That is, after all, exactly what they say. If you are correct that they measured reciprocal fluxes, then why wouldn’t they explain how they calculated those quantities? There are several nontrivial issues in how one would compute a flux from Nt or Na. Wouldn’t it be strange that they never say a peep about this?


    Where do the authors explicitly state otherwise?

    Well, that would be on the first page. They write:

    Thus, from a cluster-reference alignment we can obtain the numbers of substitutions in the sister sequences that moved the sequence away from (Na) and towards (Nt) the reference sequence.



    Why would those measurements be hard to make?

    Well there’s the issue that you have many different types of proteins which, as you know, according to evolution must evolve at significantly different rates. Why would the evolutionists never mention anything about that if they were computing fluxes? Setting that aside, there is the problem of how to compute the fluxes and step sizes.

    Let me give an example of how one might do it (absent details from oleg or the paper). Let’s say in one case your proteins are 300 residues long. And you infer, say, 10 substitutions in one of the sequences. Now you compare with the reference sequence which, let’s say, is at a D of 0.78 (66 residues in common). You find that of those 10 substitutions, 6 were toward the reference and 4 were away. So how are you going to obtain a flux from that?

    Well, presumably you would look at the effect of those 10 substitutions on D. You’re now at 0.78, and before those 10 substitutions you would have been at 0.787 (64 residues in common). So what do you do with that? A move from 0.787 to 0.78 doesn’t cross any of the D boundaries they are using. Would you just ignore it and say it made no contribution to any flux, and just take those proteins that happened to be close enough to a boundary to cross it?
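
The arithmetic in this example can be reproduced in a few lines. The sketch assumes, as the example does, that D is the fraction of residues that differ from the reference, that each substitution toward the reference adds one shared residue, and that each substitution away removes one. The bin width of 0.05 is purely hypothetical, since the thread does not state the D boundaries used in the paper.

```python
import math


def distance(length, shared):
    """Fraction of residues that differ from the reference."""
    return 1.0 - shared / length


def shared_before(shared_now, n_toward, n_away):
    """Undo the substitutions: toward steps added shared residues, away steps removed them."""
    return shared_now - n_toward + n_away


def same_bin(d1, d2, bin_width):
    """True if two distances fall in the same bin of the given (hypothetical) width."""
    return math.floor(d1 / bin_width) == math.floor(d2 / bin_width)


if __name__ == "__main__":
    length, shared_now = 300, 66
    earlier = shared_before(shared_now, n_toward=6, n_away=4)
    d_now = distance(length, shared_now)
    d_earlier = distance(length, earlier)
    print(earlier, round(d_earlier, 3), round(d_now, 3))
    print(same_bin(d_earlier, d_now, bin_width=0.05))
```

With these numbers the sequence had 64 shared residues before the 10 substitutions and its distance moved from about 0.787 to 0.78, which stays inside a single bin for the assumed width.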

    I’m by no means guaranteeing my calculations are correct here. I’m simply trying to do what oleg won’t do. He imagines a vast set of calculations going on behind the scenes with no mention of them in the paper. So I’m guessing at what he’s thinking of.

    My point is simply that there are plenty of modeling issues here and potentially issues of statistical significance, depending on how many proteins they end up with crossing the boundaries. And of course this would raise the issue of D bin size selection, and related tradeoffs.

    Don’t you think the evolutionists would have at least given a brief mention to these issues? Why would they write something different in their paper, but actually be doing this, and without any mention of it?

  76. oleg:

    =====
    Cornelius wrote: They did not measure reciprocal fluxes of proteins, no matter how many times you say otherwise. The paper doesn’t say that, it explicitly states otherwise, and in any case it would be quite a trick to do so.

    oleg replies: Now, this is plain silly, Cornelius. Comparing pairs of fluxes (D,D+1) and (D+1,D) is no more difficult than comparing pairs (D,D+1) and (D-1,D). You need to measure the same fluxes for either comparison, namely fluxes (1,2), (2,3), (3,4) and so on and fluxes (2,1), (3,2), (4,3) etc. If one comparison could be done, so could be the other.
    =====


    Except the minor detail that they did neither. The problem here is that you are committed to defending evolution, regardless of the evidence. That’s dogma.

    There are no (D,D+1) and (D-1,D) flux values given. The paper’s Fig. 3 gives Na/Nt ratios which are not fluxes. You have no idea what the data are saying and are making absurd claims as you rush to defend evolution.

    The real problem here is not someone making a few mistakes. We all make mistakes. The problem is the dogma that mandates and drives evolution. The paper simply has no basis for its claims and ignores massive problems. It has no place in a scientific journal. And yet here we see it, in a leading journal.

  77. Cornelius Hunter said...

    The real problem here is not someone making a few mistakes. We all make mistakes. The problem is the dogma that mandates and drives evolution. The paper simply has no basis for its claims and ignores massive problems. It has no place in a scientific journal. And yet here we see it, in a leading journal.


    Then why don't you write up your critique and send it to Nature? Maybe at the same time you can submit your 'Falsification of Evolution' iron clad evidence.

    What are you afraid of CH? You make big noises on this little backwater blog but you're nowhere to be found in the real scientific world. That's why you get laughed at in so many other science forums.

  78. Thorton:

    "Then why don't you write up your critique and send it to Nature?"
    ===

    Why does he have to disprove your religion to you??? Don't you have to prove your articles of FAITH first???
    ---

    Thorton:

    "What are you afraid of CH?"
    ===

    I believe it's clear here that he is not afraid of anything. You've all called him out these past weeks and insisted he come back and wear that T-shirt with the bull's-eye zeros on it front and back for you all to sling mud and virtual feces at. So he did, but then he cleans your collective materialist clocks, and the best you can do is shout insults, filth and vulgarities at him. That ultimately proves you have nothing to really fall back on as far as foundational unbiased scientific conclusions, and instead we see what is nothing more than religious faith-based statement making, dogmatically defended.
    ---

    Thorton:

    "You make big noises on this little backwater blog but you're nowhere to be found in the real scientific world."
    ===

    I'm afraid the only really big noises heard are those from the safe-haven forums. It's funny, however, that the local resident geniuses here take time out of their big important scientific research projects, which no doubt are for humankind's benefit, to actually waste their time on what is considered nothing more than a "backwater blog" as you say.
    ---

    Thorton:

    "That's why you get laughed at in so many other science forums."
    ===

    Really??? The "other science forums"??? I didn't realize TalkRational, AntiEvolution, Athiest-Forums, TalkOrigins, etc. were actually considered academic scientific forums. I've been around many important researchers and scientists over the last 30 years and I've never seen or heard them arrive at profound scientific conclusions using perverted filthy insulting humor and degrading personal attacks like some wounded vicious animal.

  79. Dr Hunter, thank you for your reply, which makes your objections much clearer. You said:

    Pedant: What other possible meaning could Nt/Na have?

    Hunter: Well, how about that Nt and Na are the numbers of substitutions toward and away from the reference sequence? That is, after all, exactly what they say.
    ...

    They write:

    "Thus, from a cluster-reference alignment we can obtain the numbers of substitutions in the sister sequences that moved the sequence away from (Na) and towards (Nt) the reference sequence."


    This looks like a "flux," a flow, to me:

    [movement away from]/[movement towards].

    [The tide goes out/the tide comes in].

    Over time, the average level of the tide changes.

    [Amino acids move away from a reference sequence/amino acids move towards that sequence.]

    The numbers of substitutions in either direction are snapshots in sequence time, from which flow is inferred.

    Pedant: Why would those measurements be hard to make?

    Hunter: Well there’s the issue that you have many different types of proteins which, as you know, according to evolution must evolve at significantly different rates. Why would the evolutionists never mention anything about that if they were computing fluxes?


    I'm not seeing this as an issue. Didn't the authors examine many different types of proteins (in clusters of orthologous groups)? Wasn't that the point of the study?

    Hunter: Let me give an example of how one might do it (absent details from oleg or the paper). Let’s say in one case your proteins are 300 residues long. And you infer, say, 10 substitutions in one of the sequences.

    Why "infer"? I thought the substitutions in each direction were "counted."

    Hunter: Well, presumably you would look at the effect of those 10 substitutions on D.

    Why would you want to do that? That would get you into an endless circle. The authors say (p. 923):

    "To investigate the dynamics and the limits of protein divergence, we relate Nt/Na to D, the protein distance between the ancestral state of the sister sequences and the reference sequence."

    I take this to mean that for any given D, as defined above, there is a set of Nt/Na to be determined. D is the independent variable in this x/y comparison.

  80. Pedant:

    The numbers of substitutions in either direction are snapshots in sequence time, from which flow is inferred.

    Well I wouldn’t say “snapshots in sequence time.” First, you would have to define what you mean by “sequence time.” But in any case, I would hardly refer to substitutions occurring over 3.5 billion years as “snapshots.” Also, the substitutions themselves are inferred. But otherwise, yes, the flow must be computed from the inferred substitutions.


    I'm not seeing this as an issue. Didn't the authors examine many different types of proteins (in clusters of orthologous groups)? Wasn't that the point of the study?

    Well I’m not saying it is a huge issue. My point was merely that it might have merited some discussion.


    Why "infer"? I thought the substitutions in each direction were "counted."

    Well one can do both, right? I can infer how many cars have arrived this morning by counting them in the parking lot. I’m counting, but it is nonetheless an inference. I’m assuming none of the cars have been in the parking lot overnight.

    Regarding the protein substitutions, they are assuming evolution happened. So there is a religious inference, going against the scientific evidence, right up front. Beyond that there is also a minor, much more reasonable inference at work. Taking evolution as a given, the substitutions in a given sequence must be inferred by aligning with sister sequences. But in such comparisons you don’t know which residue did the changing. If three sequences have an alanine and one has a valine, then you assume, via parsimony, that the valine had switched from an alanine. Nothing wrong with that, but it is an inference.
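
The alanine/valine inference described here, together with the toward/away bookkeeping discussed throughout the thread, can be sketched as follows. This is a deliberately simplified illustration and not the paper's procedure: it uses a plain majority vote over the sister sequences in place of parsimony on a tree, it assumes gap-free sequences of equal length that are already aligned, and the sequences and function names are invented for the example.

```python
from collections import Counter


def infer_ancestral(column):
    """Majority residue among the sisters at one site; None if the top two are tied."""
    ranked = Counter(column).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def classify(ancestral, derived, reference):
    """Label one inferred substitution relative to the reference residue."""
    if derived == ancestral:
        return "unchanged"
    if derived == reference:
        return "toward"   # the change created a match with the reference
    if ancestral == reference:
        return "away"     # the change destroyed a match with the reference
    return "neither"      # mismatch before and after


def count_toward_away(sisters, reference):
    """Return (Nt, Na) summed over all sites and all sister sequences."""
    n_toward = n_away = 0
    for pos, ref_residue in enumerate(reference):
        column = [seq[pos] for seq in sisters]
        ancestral = infer_ancestral(column)
        if ancestral is None:
            continue  # ambiguous site, skipped in this sketch
        for residue in column:
            kind = classify(ancestral, residue, ref_residue)
            n_toward += kind == "toward"
            n_away += kind == "away"
    return n_toward, n_away


if __name__ == "__main__":
    sisters = ["AKV", "AKL", "AKL", "VRL"]  # made-up sister sequences
    reference = "VKL"                        # made-up reference sequence
    print(count_toward_away(sisters, reference))
```

In the first column three sisters carry A and one carries V, so A is taken as ancestral and the V is counted as a substitution, exactly the kind of inference described in the comment.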


    Hunter: Well, presumably you would look at the effect of those 10 substitutions on D.

    Pedant: Why would you want to do that? That would get you into an endless circle.


    I don’t see why. oleg wants the paper to compute fluxes (even though it doesn’t), so you would have to compute those fluxes from the substitutions. As you said yourself above:

    The numbers of substitutions in either direction are snapshots in sequence time, from which flow is inferred.


    The authors say (p. 923):

    "To investigate the dynamics and the limits of protein divergence, we relate Nt/Na to D, the protein distance between the ancestral state of the sister sequences and the reference sequence."

    I take this to mean that for any given D, as defined above, there is a set of Nt/Na to be determined. D is the independent variable in this x/y comparison.


    Yes, you are describing what they did in the paper. I was referring to oleg’s desire for fluxes to be computed for the different D boundaries.

  81. Eocene said...


    Waaaah! Waaaah! Waaaah!


    LOL! No one does chest-thumping righteous indignation like you do Eocene. You're a pompous blowhard's pompous blowhard.

  82. Rich Huges:

    "Ah - real science!"
    ===

    Thorton:

    "Heh. Nothing clears out a room full of Creationists quicker than the introduction of scientific details."
    ===

    ETC, ETC, ETC, ETC, ETC and on and on.

    This is funny. I mean the above snarks were common a few days ago when all seemed supreme in evo-world. But then Cornelius did come back and turned the lights on and then suddenly all the cockroaches scattered.

  84. Thanks for your further responses, Dr Hunter. It looks like you and I, at least, are in reasonable agreement. Some feedback:

    Pedant (on the issue of different types of proteins evolving at different rates): I'm not seeing this as an issue. Didn't the authors examine many different types of proteins (in clusters of orthologous groups)? Wasn't that the point of the study?

    Hunter: Well I’m not saying it is a huge issue. My point was merely that it might have merited some discussion.


    See Supplementary Information - Supplementary Figure 8 | Rate of protein divergence of proteins with different rates of evolution:

    http://www.nature.com/nature/journal/v465/n7300/extref/nature09105-s1.pdf

    Hunter: Yes, you are describing what they did in the paper.

    Thanks be to Darwin that I got something right.
