Improve your data management in clinical research

Data management is boring and, quite likely, you are not happy with how you do it today. Research data collected in practice typically draws complaints when it finally reaches the statisticians or bioinformaticians, who claim that it is not properly organized. That’s not because you made a mistake: it’s simply hard to do when you set up your small-scale study. Tools like Excel give you practically infinite freedom with little guidance, and every study seems so different from the last. Basic data management isn’t really taught in most research environments unless you’re talking about clinical trials that require elaborate and specialized software systems. Luckily, there is a Coursera course starting on June 2nd, 2014 that teaches the basics of data management. While it is focussed on the tool the organizers provide, the contents of the course should allow you to build better data structures.

The final EuroEPINOMICS General Assembly – Impressions from Helsinki

Time flies by. Last week, we had the final General Assembly of the EuroEPINOMICS project in Tuusula, Finland. All four projects of the EuroEPINOMICS consortium presented their current, ongoing work, and it’s good to hear that there are multiple publications in various stages coming up. Over the three years of the consortium, the diverse groups grew closer together. During this meeting many unpublished results were shown, including extensions of studies on genes such as HCN1, CHD2, GRIN2A, GRIN2B or RBFOX1, as well as more data on epigenetics in acquired epilepsy.

The wonders of Medical Neuroscience

MOOC. People have been hailing Massive Open Online Courses (MOOCs) as the next big thing in higher education. Accordingly, the number of people complaining about their failures is now substantial. MOOCs are following the usual hype cycle and we could close the post here. Then again, I recently became a MOOC disciple and need to vent some praise for a course on the Coursera platform that people reading this blog should be aware of: Medical Neuroscience, presented by Leonard White (Duke).

The genetics of emergent phenotypes

This article was written by Kevin Mitchell, was first published on his blog “Wiring The Brain”, and appears here with his consent.

Why are some brain disorders so common? Schizophrenia, autism and epilepsy each affect about 1% of the world’s population, over their lifetimes. Why are the specific phenotypes associated with those conditions so frequent? More generally, why do particular phenotypes exist at all? What constrains or determines the types of phenotypes we observe, out of all the variations we could conceive of? Why does a system like the brain fail in particular ways when the genetic program is messed with? Here, I consider how the difference between “concrete” and “emergent” properties of the brain may provide an explanation, or at least a useful conceptual framework.

The cat in the bag

And the hairball. What is the value of network analysis of genetic data except for being an undefined label for any work including the use of external data sources for the evaluation of hmm, some genetic data? Let’s be specific: what is the value of this recent high-profile paper in Nature Neuroscience describing the distribution of variants in a schizophrenia network?

SpotOn 2012 is on air

The biggest European meeting on Science online – policy, outreach, tools – started this Sunday. SpotOn brings together open source coders, librarians, scientists from a variety of fields, and publishers in London.

You can follow the keynotes and sessions online, and evaluation and comments can be followed in real time on Twitter. #solo12 is the hashtag of the overall conference; the individual sessions have their own tags.

Gamification of life science – The CAGI challenge

Everybody wins. The scientific publication process is not ideal for finding the best bioinformatics methodology for a given problem. Most predictions are not performed blind, as our data sets are so small that separating them into several disjoint sets for training and testing purposes is not possible or sensible. The structural biology community has started to tackle these problems by establishing a competition called Critical Assessment of protein Structure Prediction (CASP). For example, the solution of the 3D structure of a protein is announced but the data is withheld for a couple of months to give computational groups time to submit a prediction, which is then evaluated by an independent team. A concluding conference crowns the best prediction groups. In recent years, systems biology and sequence interpretation have produced sufficient data to make similar challenges possible.

To do: read ENCODE papers

ENCODE will change the way we analyse genomes. The comparison of long non-coding RNA and transcription factor binding sites will require more CPU time. Anything else? I don’t know, I am only writing this because Ingo asked me to. It’ll take time to study the 30+ papers, sift through the data and discuss it with colleagues. Only then can something like that understanding we hear so much about happen, and I am sure it will in journal clubs around the globe in the coming weeks. But smaller things might already be interesting.
