Genomics euphoria: ramblings of a scientist on genetics, genomics and the meaning of life

Category Archives: Scientific method

Genome editing via the CRISPR system: the triumph of basic research

I wanted to write a quick note about the use of the CRISPR/Cas system in gene editing. Several labs have, in parallel, developed the CRISPR system (Clustered Regularly Interspaced Short Palindromic Repeats) for gene editing purposes in a variety of organisms, from zebrafish to humans. The CRISPR/Cas systems and their function as an immunity-type response in bacteria are on their own very exciting, and I encourage you to read about them if you haven’t (e.g. see this paper in Science from 2010 or simply visit Wikipedia).

In short, this system records foreign DNA and uses an RNA intermediate (crRNA) to target later encounters with the same invasive DNA species via specific nucleases (i.e. Cas). But setting aside how unbelievably “cool” this system is, it has recently been adopted for gene editing. In this context, the crRNA is replaced with a guide matching a target sequence (a sequence that we want to alter in the genome). The activity of the Cas/crRNA complex, if properly expressed, then results in double-stranded breaks at the site of interest. The cell uses its error-prone end-joining repair machinery to correct the break; however, this mechanism often leaves deletions at the site of action. Now, if the target sequence was selected from an active gene, this mechanism effectively mutates the gene into an inactive copy.
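As an aside, the targeting logic is simple enough to sketch in code. The snippet below is my own toy illustration (not from any of the papers): it scans a DNA string for candidate target sites, assuming the 20-nt protospacer length and the NGG PAM requirement of the commonly used S. pyogenes Cas9.

```python
import re

def find_cas9_targets(seq, guide_len=20):
    """Return (position, protospacer) pairs: guide_len bases followed
    by an NGG motif (the PAM that SpCas9 requires next to its target).
    A lookahead is used so overlapping sites are all reported."""
    pattern = r'(?=([ACGT]{%d})[ACGT]GG)' % guide_len
    return [(m.start(), m.group(1)) for m in re.finditer(pattern, seq)]

# toy sequence: the 20-mer ending right before the "TGG" qualifies
seq = "AAACCCGGGTTTAAACCCGGGTGGATATAT"
print(find_cas9_targets(seq))  # → [(1, 'AACCCGGGTTTAAACCCGGG')]
```

Real guide design tools also score off-target matches elsewhere in the genome; this only finds the candidate sites.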

Cas system structure


Even better, by modifying the Cas enzymes, we can limit the nuclease activity to a single nick in the DNA, as opposed to a double-stranded break. In this case, the cell employs homologous recombination to repair the nick, and if we provide a mutated homologous sequence in trans, the system may use it as a template to correct the nicked site, in effect transferring the mutation to the genome with surgical precision.
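To make the in-trans template idea concrete, here is a deliberately simplified sketch (my own invention; the function name, arm length and sequences are made up, and real homology arms are hundreds of bases long): a donor carrying a point mutation between two homology arms gets written into the locus.

```python
def hdr_edit(genome, donor, arm=4):
    """Toy model of homology-directed repair: locate the donor's two
    homology arms in the genome and swap the stretch between them for
    the donor's (mutation-carrying) sequence."""
    left, right = donor[:arm], donor[-arm:]
    i = genome.find(left)
    j = genome.find(right, i + arm) if i != -1 else -1
    if i == -1 or j == -1:
        return genome  # no homology match: nick repaired, no edit made
    return genome[:i] + donor + genome[j + arm:]

# donor differs from the locus by one base (G -> A) between the arms
print(hdr_edit("AAAACGTGCCCC", "AAAACATGCCCC"))  # → AAAACATGCCCC
```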

Not only is all of this very exciting, it is a prime example of how important basic research is. Just imagine the first grant written for studying the CRISPRs; I’m paraphrasing here: “ahem… there are these repetitive sequences in bacteria and some obscure archaea… we have no idea what they do, but we kind of wanna know… so fund us maybe?”. Pursuing this simple curiosity, however, has resulted in a promising method for genome editing in humans that is poised to transform how we do genetics (mainly due to its low cost of implementation). This is a very good example of where targeted funding of “translational research” fails. Ultimately, leaps in life sciences (and science in general) come from systems that we don’t even know exist. The same goes for other amazing tools that have become mainstays of molecular biology. I assume the proposal for studying fluorescent proteins went something like this: “well… we have this cool organism that glows in the dark. We kind of wanna know why. Will it cure cancer? Probably not…”. But all kidding aside, these are all very good reminders of how important basic research is to our collective knowledge.

Sources: Cong et al., 2013, “Multiplex Genome Engineering Using CRISPR/Cas Systems”, Science, DOI: 10.1126/science.1231143, among others.

The triumph of mathematics (or how Nate Silver got drunk)

“Drunk Nate Silver stumbles through traffic on the Jersey Turnpike, screaming out what time each driver will get home.” @davelevitan

I know… I am late to the game… let’s chalk it up to a very busy schedule in the lab. But I want to write about the elections (cue eyes rolling).

I arrived in the US in 2006, so I was fortunate enough to witness the Obamania that swept this nation in 2008. I was quite fascinated with the dynamism of the elections and I was watching them VERY closely. That was the first time I came across a blog started by a sports statistician named Nate Silver. His simple yet elegant model correctly predicted the election outcome in 49 out of 50 states. Despite his rise, the one-sided 2008 election was not a very good indicator of his model’s supremacy. In 2012, however, everyone believed the race to be a very close one. While pundits called the race a virtual toss-up, Nate Silver (and other statistician-bloggers like him, including Sam Wang of the Princeton Election Consortium) assigned very high chances of winning to President Obama throughout the campaign season. This made Nate Silver a punching bag for the TV hosts, and the punditry in general, in the run-up to the elections… however, the accuracy of his statistical model proved to be quite impressive (it called all 50 states and all but one of the Senate races). This made him the true winner of the elections… with his book becoming a best-seller and #drunknatesilver becoming a popular hashtag on Twitter (where I got the quote at the beginning of this post).

Nate Silver’s model is fairly simple and is very similar to the models used by other poll aggregators (who predicted pretty much the same outcomes). I think anyone with an adequate knowledge of statistics could have come up with a comparable model. I really don’t want to talk about the model or why it worked so well (which I don’t think is very surprising to any scientist). What caught my attention, however, was the extent to which people were shocked by the efficacy of these statistical models. This, I think, clearly indicates that people underestimate science and its ability to deliver. As scientists, we should be worried about this. Why this is the case, I really don’t know… is it the successful war on science? Is it the botched PR dramas by fraudulent scientists? Is it going head-to-head with religion and losing? I don’t know… what I do know is that Nate Silver is not an extraordinary researcher/mathematician. He has a job and he does it well, but what he’s doing is not groundbreaking. Nevertheless, in this election, science squared off against ideology and won a decisive victory. We should take this as an opportunity and build upon it. How? I am again not sure… I just know that this opportunity should not be wasted.
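For the curious, the core of such an aggregator fits in a few lines. What follows is my own bare-bones caricature, not Silver’s actual model (which also weights polls by recency, sample size and house effects); the error magnitudes here are invented for illustration.

```python
import random

def win_probability(state_polls, electoral_votes, error_sd=2.5,
                    n_sims=10_000, needed=270):
    """Average each state's polled margins (candidate A minus B), then
    simulate elections with a shared national polling error plus an
    independent state-level error; return A's win frequency."""
    margins = {s: sum(p) / len(p) for s, p in state_polls.items()}
    wins = 0
    for _ in range(n_sims):
        national = random.gauss(0, error_sd)  # correlated across states
        ev = sum(electoral_votes[s]
                 for s, m in margins.items()
                 if m + national + random.gauss(0, error_sd) > 0)
        wins += ev >= needed
    return wins / n_sims
```

Feed it per-state poll margins and it returns a single headline probability; the lesson of 2012 is mostly that averaging many polls washes out the noise that pundits obsess over.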

DrunkNateSilver from Gawker

Dave Levitan started a game on Twitter, #DrunkNateSilver: things Nate Silver might do/say when he’s drunk.

Shut up and take my money

Probably everyone knows that science funding is not doing well in the US (or anywhere else, for that matter). Grant application success rates have dipped below 15%, or even 10% for larger grants. Scientists have been reduced to grant writers: a long and seemingly futile endeavor that takes more and more away from research time. Basically, people are spending more and more time explaining what they want to do, and less and less time actually doing it.

This is not the only problem… with low success rates, the funding process becomes conservative and less imaginative, and the word “feasible” transforms into an utterly subjective concept in the mind of the reviewer. Basically, as a young scientist, you need a proposal that is both conventional and innovative at the same time… which seems like a paradox. To be honest, scientists themselves are part of the problem… like anyone else, every scientist comes with biases, convictions and unfounded belief systems that cloud his/her judgment. And as the number of grants per researcher shrinks, these biases become an important factor in ranking and scoring applications.

The funding problem needs to be dealt with, and I think it will be dealt with in one form or another in the next 4-5 years (things simply cannot go on like this). But those who have the power to change anything have not felt the problem yet, and, as in any other profession, the young and less-established investigators suffer the most. Now Ethan Perlstein and his colleagues have come up with a short-term solution to fund their innovative ideas: they have started a project on Rockethub to crowdfund their research. I think this is a step in the right direction. At this point, they are halfway there (their goal is 25,000 dollars)… if you are reading this, head to their project, read their statement and consider fueling this study.

Shut up and take my money


Decoding the ENCODed DNA: You get a function, YOU get a function, EVERYBODY gets A function

It has been almost half a century since we started drilling the concept of the “central dogma” (which is DNA -> RNA -> protein, in some sense, equals life) into the psyche of the scientific community and the human population as a whole. The idea was that everything which makes us human, or a chimp a chimp, is encoded in As, Gs, Cs and Ts, efficiently packaged into the nuclei of every cell. Every cell, it went, has the capacity to reproduce the complete organism. What seemed to be missing in our daily conversations (or conveniently omitted) was how the cells in our body come to have such different fates if they start with the same information, which they hang on to for the entirety of their lifespan.

The answer came, miraculously enough, from Jacob and Monod and their work on the lac operon in E. coli: it is not the book, but how it is read, that defines the fate of every cell. Which parts of this genomic library are transcribed (into RNA) and expressed (via the protein products) is ultimately decided by the “regulatory” agents toiling away in the cell. These regulatory agents come in many forms; the first generation were themselves proteins (first repressors and then activators). Then came micro-RNAs, small RNA molecules that can locate specific target sequences on RNA molecules and affect their expression (for example, by changing the life-span of an RNA molecule). And now we have identified an arsenal of these regulatory mechanisms: chromatin structure (how DNA is packaged and marked affects its accessibility), transcription factors, miRNAs, long non-coding RNAs and… At the end of the day, it seems that the complexity of an organism largely stems from the diversity and complexity of these regulatory agents rather than the number of protein-coding genes in the genome. It’s like chemistry: the elements are there, but what you do with them and how you mix them, in what proportions, gives you a functional and miraculous product.

Genome Project

The “Human Genome Project” was the product of the classic “central dogma”-oriented viewpoint. Don’t get me wrong… this was a vital project, and what we know now largely depends on it; however, the project was initially sold as the ultimate experiment. If we read the totality of the human DNA, the reasoning went, we’ll know EVERYTHING about humans and what makes them tick. But obviously, that wasn’t the case. We realized that it is not the DNA alone but the regulatory networks and interactions that matter (hence the birth and explosion of the whole genomics field).

The ENCODE project


The ENCODE project was born from this more modern and regulation-centric view of genomics, and a recent issue of Nature has published a dozen papers from ENCODE, along with accompanying papers in other journals. This was truly an accomplishment for science this year, rivaled only by the discovery of the Higgs boson (if it is in fact the Higgs boson) and Curiosity landing on Mars. At its core, what they have done in this massive project is simple: throw whatever we have in terms of methods for mapping regulatory interactions at the problem, from DNase I footprints to chromatin structure and methylation. And what they report as their MAIN big finding is the claim that there is in fact no junk DNA in the genome, since for 80% of the genomic DNA they find at least one regulatory interaction, which they deem “functional”.
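Mechanically, that 80% headline is an interval-coverage calculation: pool every annotated interaction, merge the overlaps, and ask what fraction of bases is touched at least once. A minimal sketch (my own, with toy coordinates; real pipelines work per chromosome on BED-style intervals):

```python
def covered_fraction(intervals, genome_len):
    """Merge half-open (start, end) annotation intervals and report the
    fraction of the genome covered by at least one of them."""
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:          # flush the finished run
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:                                # overlap: extend the run
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_len

print(covered_fraction([(0, 10), (5, 20), (30, 40)], 100))  # → 0.3
```

Note that the number says nothing about what any single interval does; it only counts touched bases, which is exactly the point of contention below.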

As I said, this was a great project and will be a very good resource for our community for many years to come. But there are some issues that I want to raise here:

  1. I think we’re over-hyping this. Not every observed interaction means “functionality”. We already know from ChIP-seq datasets that, for example, transcription factors bind to regions other than their direct targets. Some of these sites are in fact neutral, and their interactions may very well be biochemical accidents. Now, one might claim that if the number of transcription factor molecules is limited, these non-functional sites may show some functionality by competing with actual sites to decrease the effective concentration of the transcription factor in vivo.
  2. The take-home message from the ENCODE project seems to be debunking the existence of “junk DNA”. But to be honest, not many of us thought the genome had a significant amount of junk anyway. I am sure that ENCODE provided us with a great resource, but pointing to this as its major achievement does not seem logical. To be honest, I think a resource project like this doesn’t really have an immediate, obviously groundbreaking discovery; however, the policy makers want to see something when they fund these types of projects… and this is one way of giving it to them.
  3. Funding is another issue here. This was a very expensive endeavor (200 million dollars, was it?). Now, I am all for spending as much money on science as possible; however, that is not happening, and funding in the biosciences is tight nowadays. We can legitimately ask if this amount of money might have been better spent on 200 projects in different labs as opposed to one big project. A project, let me remind you, that would have been significantly cheaper to do in the near future due to plummeting sequencing costs. I’m not saying ENCODE was a waste of money; I just think we’re at a point where things like this should be debated across the community.

Nevertheless, the ENCODE consortium should be commended for performing one of the most well-coordinated projects in the history of the biosciences, with astounding quality. Compared to the Human Genome Project, this was a definite success. I have never seen the community this amped up, with everyone poring over the gorgeous interactive results, going over their favorite genes and making noise on Twitter. This is a proud moment to be a biologist… I think we have officially entered the post-“central dogma” age of biology.

Living an “organic” life: debunking the supremacy of organic produce

A couple of years ago, I took a course titled “The use of science in public policy”, taught by Prof. Lee Silver at Princeton University. The goal of the course was to bring basic science and policy students together and expose them to the challenges generally faced by each group. The take-home message for me, as a scientist, was the paucity of black-and-white issues and how complex even mundane policies become when they’re applied to the entirety of a society. However, what shocked me the most was how difficult it is to bring the scientific world-view, which is sometimes counter-intuitive, into policy making. And this is nowhere more obvious than in issues like “genetically modified food”, “homeopathic medicine” or even “organic food”. Organic produce, which has supposedly been grown chemical-free, is branded as “natural food” and has formed one of the most successful industries in the US, with growth of about 10,000% over 10 years. What distinguishes organic food from conventional products, however, is for the most part the sticker price. But consumers are buying organic products in droves, assuming that the health benefits far outweigh the cost. There have been studies looking at this claim, but a recently published study in the Annals of Internal Medicine does a good job of bringing together all the available data over many years to perform an effective meta-analysis of this subject.


I think the main challenge in discussing organic produce is the laxity of standards in its definition. From what I understand, every farm has its own way of defining organic food. And without a national standard, it is more difficult (although not impossible) to study the health benefits of this category. Nevertheless, organic farms have been very successful in convincing consumers about this matter, so much so that other industries are taking notes. Case in point: the rise of “organic laundry” and “organic detergents” that (spoiler alert) have nothing to do with “organic” in the sense we use in our everyday lives.

In general, industries seem to be very good at persuading the public about the health benefits of food (e.g. brand names like Vitamin water). Also, as humans, it seems difficult for us to take into account non-linear relationships; for example, if a little bit of Vitamin E is essential for your health, adding as much as you can to your diet should be even more beneficial. Teaching the public that there isn’t a linear relationship between health impact and intake seems to be a daunting task. Labels that connote “nature” are very potent, which for me is very counter-intuitive. Nature is full of dangerous toxins… not every natural product is beneficial.

Another important challenge has to do with how well we can disentangle “organic food” consumption from other aspects of life. We can assume that people who care enough to pay 2-3 times more for organic produce just because they think it’s healthier are also more likely to go for their annual checkups, exercise and, in general, be concerned about their health. Statistically speaking, it would be very difficult to correct for these covariates without doing a true double-blind experiment. Double-blind experiments need sponsors, which leads me to ask whether the government may in fact have a role to play in this matter. At this point, the FDA doesn’t regulate anything that is “natural”, which puts organic food outside of its jurisdiction.
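This covariate worry is easy to demonstrate in silico. In the toy simulation below (the numbers are entirely invented; only the structure matters), a hidden “health-conscious” trait drives both organic purchases and good outcomes, so organic buyers come out looking healthier even though organic food does exactly nothing in the model:

```python
import random

def simulate_confounded_study(n=100_000, seed=0):
    """A lurking 'health consciousness' trait raises both the odds of
    buying organic and the odds of a good health outcome; organic food
    itself has zero causal effect in this model."""
    rng = random.Random(seed)
    good = {True: 0, False: 0}    # good outcomes, keyed by buys_organic
    count = {True: 0, False: 0}   # group sizes
    for _ in range(n):
        conscious = rng.random() < 0.5
        buys_organic = rng.random() < (0.7 if conscious else 0.2)
        healthy = rng.random() < (0.8 if conscious else 0.5)
        count[buys_organic] += 1
        good[buys_organic] += healthy
    return good[True] / count[True], good[False] / count[False]

organic_rate, conventional_rate = simulate_confounded_study()
# organic buyers look ~15 points "healthier" despite zero true effect
print(organic_rate, conventional_rate)
```

An observational comparison of the two rates would “find” a benefit; only conditioning on the hidden trait (or randomizing who eats what) removes it, which is the author’s point about needing true experiments.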

Given these challenges, it is rather obvious why we needed several decades’ worth of data to be able to perform a decent meta-analysis. And to be honest, I still think this study could be much better and further removed from academic hype.


Despite these challenges, these researchers, I think, have done a decent job of analyzing the data. They find very little evidence in support of organic produce. Sure, they see marginally higher pesticide levels in conventional produce, but the levels are far below the risky threshold. Also, organic farms do use pesticides; they simply use natural ones instead of synthetic ones, and we don’t know if natural pesticides are in any way safer than synthetic ones. More importantly, because natural pesticides are less potent, higher quantities need to be used (adding more stuff to the soil and environment).

There isn’t enough data to fully debunk the perceived value of organic food, but at this point, I’m pretty sure it’s not worth the significantly higher sticker price.

The best book of all time

What is the best book of all time? That seems like a stupid question. It’s vague, subjective and quite meaningless. However, before I get replies like “Harry Potter”, “Twilight” or “Fifty Shades of Grey”, let me provide some context for this question. As this is a scientific blog, first and foremost, the answer should be a scientific book… and no, I am not talking about “On the Origin of Species”. I am looking for something grander, something without which Darwin’s publication would not even have been possible. The answer to this question, I think resoundingly, is Novum Organum by Francis Bacon, the 17th-century philosopher-scientist. Why is that? Well, maybe because it formulated the scientific method in its most basic form. For centuries, thinkers, authors and philosophers had been expanding upon the ideas put forth by Aristotle and Plato. However, Francis Bacon made the case that, for the most part, these were just ideas (what we now call hypotheses). He advocated a new setting in which philosophers could rise above these ideas and formulate new ones. However, he also clearly indicated that the validity of these ideas needs to be tested through a rigorous “method” involving data collection, its interpretation, and even the design of new experiments.

Novum Organum

Now why do I think Novum Organum is the best scientific book of all time? Because it made a case for the scientific method before it was cool. And most important of all, he didn’t only talk the talk… he actually walked the walk: he contracted pneumonia while studying the effects of freezing on preservation. While I don’t condone working oneself to literal death, we should realize that the scientific process owes an undeniable debt of gratitude to giants who devoted their lives to science. Novum Organum is available on Amazon for $1 (the Kindle edition, that is); I don’t have to tell you that it is worth every penny. I don’t believe in the “Great man theory”. I am not saying that had he not published this body of work, we would still follow an Aristotelian method. If Edmund Hillary hadn’t climbed Everest, someone else would have. But that fact does not make his feat any more trivial or his adventure any less dangerous. The same holds for Francis Bacon and his seminal work, Novum Organum.

Francis Bacon
