Thursday, December 15, 2016

Is genetics still metaphysical? Part II. Is that wrong?

What is the role of theory vs empiricism in science?  How do these distinctions apply to genetics?

Yesterday, we discussed some of the history of contesting views on the subject.  Much of the division occurred before there was a systematically theoretical biology.  In particular, when creationism, or divine creative acts rather than strictly material processes, was the main explanation for life and its diversity, the issues were contended in the burgeoning physical sciences, with their dramatic technological advances and experimental settings, and where mathematics was a well-established part of the science and its measurement aspects.


Around the turn of the 20th century, Darwinian evolution was an hypothesis that not even all the leading biologists could accept.  Inheritance was fundamental to any evolutionary view, and inherited somethings seemed obviously to be responsible for the development of organisms from single cells (fertilized eggs). Mendel had shown examples of discretely inherited traits, but not all traits were like that.  Ideas about what the inherited units were (Darwin called them gemmules, Mendel called them Elements, and hereafter I'll use the modern term 'genes') were simply guesses (or just words).  They were stand-ins for what was assumed to exist, but in the absence of their direct identification they were, essentially, only metaphysical or hypothetical constructs.


The cloak over their identity had serious implications.  For example, evolution is about inherited variation, but genes as known in Darwin's time and most of the later 19th century didn't seem to change over generations, except perhaps due to grotesquely nonviable effects called 'mutations'.  How could these 'genes', whatever they were, be related to evolution, which is inherently about change, and about relatively positive effects leading to selection among the organisms that carried them?


Many critics thought the gene was just a metaphysical concept, that is, used for something imagined, that could not in a serious way be related to the empirical facts about inherited traits. The data were real, but the alleged causal agent, the 'gene', was an unseen construct, yet there was a lot of dogma about genes.  Many felt that the life sciences should stick to what could be empirically shown, and shy away from metaphysical speculation.  As we saw yesterday, this contention between empiricism and theory was a serious part of the debate about fundamental physics at the time.


That was more than a century ago, however, and today almost everyone, including authors of textbooks and most biologists themselves, asserts that we definitely do know what a gene is, in great detail, and it is of course as real as rain and there's nothing 'metaphysical' about it.  To claim that genes are just imagined entities whose existential reality cannot be shown would today be held to be not just ignorant, but downright moronic.  After all, we spend billions of dollars each year studying genes and what they do!  We churn out a tsunami of papers about genes and their properties, and we are promised genetically based 'precision' medicine, and many other genetic miracles besides, that will be based on identifying 'genes for' traits and diseases, that is enumerable individual genes that cause almost any trait of interest, be it physical, developmental, or behavioral.  That's why we're plowing full budget ahead to collect all sorts of Big Data in genetics and related areas.  If we know what a gene is then the bigger the data the better, no?


Or could it be that much of this is marketing that invokes essentially metaphysical entities to cover what, despite good PR to the contrary, remains just empiricism?  And if it is just empiricism, why the 'just'?  Isn't it good that, whatever genes 'are', if we can measure them in some way we can predict what they do and live to ripe old ages with nary a health problem?  Can't we in fact make do with what is largely pure empiricism, without being distracted by any underlying law of biological causation, or the true nature of these causative entities--and deliver the miraculous promises? The answer might be a definitive no!


The metaphysical aspects of genes, still today

In essence, genes are not things, they are not always discrete DNA sequence entities with discrete functions, and they are not independently separable causative agents.  Instead, even the term 'gene' remains a vague, generically defined one.  We went through decades in the 20th century believing that a gene was a distinct bit of DNA sequence, carrying protein code. But it is not so simple.  Indeed, it is not simple at all. 

It is now recognized, by those who want to pay attention to reality, that the concept of the 'gene' is still very problematic, and to the extent that assertions are made about 'genes' they are metaphysical assertions, no matter how clothed in the rhetoric of empiricism they may be.  For example, many DNA regions code for functional RNA rather than protein.  Much DNA function has to do with expression of these coding regions.  Many coding regions are used in different ways (for example, different exon splicing) in different circumstances.  Some DNA regions act only when they are chemically modified by non-DNA molecules (and gene expression works exclusively in that way). Some of 'our' DNA is in microbes that are colonizing us.  And 'traits' as we measure them are the result of many--often hundreds or more--DNA elements, and of interactions among cells.  Each cell's DNA is different at least in some details from that of its neighbors (due to somatic mutation, etc.).  And then there is 'the' environment!  This is central to our biological state but typically not accurately measurable.


Some discussion about these issues can be seen in a report of a conference on the gene concept in 2011 at the Santa Fe Institute.  Even earlier, in 2007, when it seemed we had really learned about genomes, hardly suspecting how much more there was (and is) still to be learned, a review in Genome Research defined the gene in an almost useless way, as follows:

Finally, we propose a tentative update to the definition of a gene: A gene is a union of genomic sequences encoding a coherent set of potentially overlapping functional products. Our definition sidesteps the complexities of regulation and transcription by removing the former altogether from the definition and arguing that final, functional gene products (rather than intermediate transcripts) should be used to group together entities associated with a single gene. It also manifests how integral the concept of biological function is in defining genes.

Really?!  Is that a definition or an academically couched but empty kicking of the can down the road while seeming to be knowledgeable and authoritative?  Or is it simply so empty as to be risible?

There are many now who advocate a 'Third Way', in the rather generic sense of advocating less dogma and more integrative and indeed innovative approaches.  But even this doesn't say what the Third Way actually is, though one thing for sure is that it's every Third Way member's favorite way of coopting the concept of biological causation as his or her own.  I'm being cynical, and I'm associated with the Third Way myself and believe that serious rethinking about biological causation and evolution is in order, but that doesn't seem to be too unfair a way to characterize the Third Way's characterization of mainline genome-centered or perhaps genome-obsessed thinking. At least, it acknowledges that we don't just have 'genes' and 'environment', but that biological causality is based fundamentally on interactions of many different kinds.

DNA is basically an inert molecule on its own
In genetic terminology, DNA is basically an inert molecule.  That is, whatever you want to call genes act in a context-specific way, and this goes beyond what is known as cis interactions among local DNA elements (like regulatory sequences flanking coding sequences) along a given strand. Instead, genetic function is largely a trans phenomenon, requiring interaction among many or even countless other parts of DNA on the different chromosomes in the cell.  And often if not typically, nothing happens until the coded product--RNA or protein--itself is modified by or interacts with other compounds in the cell (and responds to external things the cell detects).

Beyond even that complexity lies a comparable evolutionary and physiological complexity.  There are many, perhaps often also countless, alternative biological pathways to essentially the same empirical result (say, height or blood pressure or intelligence).  These causally equivalent combinations, if we can even use the term 'causal', are many and un-enumerated, and perhaps un-enumerable.  The alternatives may be biochemically different, but if they confer essentially no difference in terms of natural selection, they are evolutionarily as well as physiologically equivalent. Indeed, the fact is that every cell, and hence every organism, is different in regard to the 'causal' bases of traits.  We may be able to define and hence measure some result, such as blood pressure or reproductive fitness; but to speak of causes as if they are individually distinct or discrete entities is still essentially being metaphysical. Yet, for various sociocultural and economic reasons, we seem unwilling to acknowledge this.

You might object by saying that in fact most geneticists, from Francis Collins down to the peons who plead for his funding support, are being essentially empirical and not indulging in theory.  Yes, they drop words like 'gene' and 'epigenome' and 'microbiome' or 'network' or 'system', but these are on or over the edge of metaphysics (speculative guessing).  Many who feed at the NIH (and NSF) trough might proudly proclaim that they are in fact not dealing with airy-fairy theory, but simply delivering empirical and hence practical, useful results.  They do genomewide mapping because they have, and even proudly declare that they have, no causative theory for this disease or that behavioral trait.  Usually, however, they confound statistical significance with formal theory, even if they don't so declare explicitly.

For example, most studies of genotypes and genetic variation relative to traits like disease are based on internal comparisons (cases vs controls, tall vs short, smart vs not-smart, criminal vs non-criminal, addictive vs sober, etc.).  They don't rest on any sort of theory except that they do implicitly identify entities like 'genes'.  Often this is so metaphysical as to be rather useless, but it is only right to acknowledge that these results are occasionally supported by finding an indicated 'gene' (DNA sequence element), whose manipulation or variation can be shown to have molecular function relevant to the trait, at least under some experimental conditions.  But this causative involvement is usually quite statistical, providing only weak causative effects, rather than in any clear sense deterministic.  We are enabled by this largely pure empiricism to argue that the association we saw in our retrospective study is what we'll see prospectively as causation in the future.  And we now know enough to know that when it seems to work (as, indeed, in Mendel's own time) it's only the simplest tip of the causative iceberg.

We are tempted to believe, and to suggest, that this 'gene' (or genetic variant, an even cruder attempt at identifying a causative element) will be predictive of, say, a future disease at least in some above-average sense, even if we don't know the exact amount of associated risk.  But even that is not always the case: the associated risks are usually small and data-specific and often vary hugely from study to study, over time, or among populations.  That means, for example, that people--typically by far most people--carrying the risk variant will not get the associated disease! The variant may often do nothing when put into, say, a transgenic mouse.  The reason has to be context, but we usually have scant idea about those contexts (even when they are environmental, where the story is very similar). That is a profound but far under-appreciated (or under-acknowledged) fact with very widespread empirical support!
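
To make the arithmetic behind that last point concrete, here is a minimal sketch in Python. The numbers are purely hypothetical, chosen for illustration and not taken from any study: a disease that strikes 1 percent of non-carriers, and a variant that multiplies that risk by 1.5. The function name and its parameters are likewise just illustrative.

```python
def carrier_outcomes(baseline_risk, relative_risk):
    """Return (absolute risk for carriers, fraction of carriers unaffected).

    baseline_risk: disease risk among non-carriers (a probability, 0..1).
    relative_risk: factor by which carrying the variant multiplies that risk.
    Both inputs are assumptions supplied by the caller, not measured values.
    """
    carrier_risk = baseline_risk * relative_risk
    if not 0.0 <= carrier_risk <= 1.0:
        raise ValueError("inputs imply a carrier risk outside the range 0..1")
    return carrier_risk, 1.0 - carrier_risk


# Hypothetical illustration only: 1% baseline risk, relative risk of 1.5.
carrier_risk, unaffected = carrier_outcomes(baseline_risk=0.01, relative_risk=1.5)
print(f"Risk among carriers:  {carrier_risk:.1%}")
print(f"Carriers unaffected:  {unaffected:.1%}")
```

With these made-up inputs the carriers' risk rises to 1.5 percent, so roughly 98.5 percent of people carrying the 'risk' variant would never get the disease. The relative risk can sound impressive while the individual prediction remains nearly worthless.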


Indeed, the defense of pure empiricism is one of convenience, funding-wise among other reasons; but it is perhaps, with today's knowledge, all we can do if we are wedded to Big Data science and public promises of 'precision' genomic prediction.  When or if we have a proper theory, a generalization about Nature, we can not only test our empirical data against the theory's predictions, but also use the theory to predict new, future outcomes with a convincing level of, yes, precision. Prediction is our goal and the promises (and, notably, research funding) rest on prediction, not just description. So, as Einstein (and Darwin) felt, an underlying theory of Nature makes data make sense. Without it we are just making hopeful guesses.  Anyone who thinks we have such a theory based on all the public rhetoric by scientists is, like most of the scientists themselves, confusing empiricism with theory, and description with understanding. Those who are thoughtful know very well that they are doing this, but can't confess it publicly.  Retired people (like me) are often less inhibited!

Or could there perhaps be another way to think about this, in which genetics as currently understood remains largely metaphysical: that genetics is real, but we simply don't yet have an adequate way of thinking that will unite empiricism to some underlying global reality, some theory in the proper scientific sense?


Tomorrow we'll address the possibility that genetics is inherently metaphysical in that there isn't any tractably useful universal natural law out there to be discovered.

1 comment:

Anonymous said...

I do hope a book emerges from these thoughtful post-retirement posts :)