TL;DR. Job hunting advice from a guy with a good track record of landing bioinformatics jobs.
I woke up this morning with the urge to give some bioinformatics career advice. The bioinformatics job market right now is tight. It is difficult to land a good job, particularly for those who are ill-prepared.
Assumptions. You know what bioinformatics is and you
are interested in working in the field.
Old Bioinformatician. I am one of the oldest career
bioinformatic scientists out there. Not the oldest by any means, but I’ve been
in the field of bioinformatics since around 1999 and did a fair amount of
wet-lab before that.
I probably shouldn’t, but I read comments on Reddit
r/bioinformatics and r/bioinformaticscareers all the time. And those two
subreddits are full of people who don’t understand the field. You will see
comments like “I applied to x hundred jobs and didn’t get a single interview”.
Buddy, you didn’t apply to a hundred jobs, you spammed a
hundred employers. I had a master’s student who was looking for a job. We
would look at each job description. I’d help him understand what they were
looking for and what the work would be like. He would then craft a cover letter
and resume for that position. He applied for about a half-dozen jobs and got two
interviews and an offer. (Of course, he was well qualified and a great student.)
To apply for a job, you need to make it clear to the hiring
manager why you are the right person for the role. If you just spam them with
some premade documents and force them to try and infer your qualifications,
they will ignore you in favor of the person who put the time into applying for
the only role the hiring manager is currently looking to fill.
There’s an App for That. But that’s not what motivated
this post. There is a population of bioinformaticians who approach
bioinformatics incorrectly. I’ve been struggling for a long time for a way to
articulate the problem, and recently I was talking with my wife about a student
I had a consult with, and she came up with the phrase that summarizes the
problem, “There is an app for that.”
In translational medicine we very frequently come across
omic studies with complex experimental designs. We might have multiple samples
from a patient, before and after treatment, affected and unaffected tissue, and
blood. And they are almost always unbalanced designs. And there are frequently
repeated measures and time courses and
all sorts of good stuff.
I work in a Medical College so the task of analyzing the
data will go to some master’s student who has no specialized bioinformatics
training. They will learn a little programming and some statistics, and they
will lean into their assigned problem. These students are not the problem. My
job is to help them learn how to correctly analyze the data. But it was one of
these students that helped me articulate my concern.
This student had a complex experimental design with hundreds
of samples, all run through the appropriate discovery omic. For the sake of
conversation, let’s say it was bulk RNASeq. They needed to perform a differential
expression analysis to find the molecular entities that change between before and
after treatment, and so on.
They came in, we did a little prep work to get them ready,
and then I routed them to our drop-in statistical coaching service. In this student’s
case, they got a little over an hour with one of our PhD biostatisticians. I
joined in for a bit of the conversation. They were given some instructions on
what to do for the next week (drop-in stats is every Tuesday at two) and they went
away. They were to write out a list of all the specific questions they wanted
to ask of the data, along with some other prep. Then they would come back and we
would work on the model statements for their analysis.
Three days later, they swung by my office. “Hey Rich, did I
do this right?”
“What did you do?”
“I ran it through a specialized R library I found. It’s
written for this type of data. I ran it with the default parameters. It should
be right, right?”
If you do the math wrong, you get the wrong answer.
The probability that a bioinformatic tool will default to
the correct parameters for your obscure experimental design is asymptotically
approaching nil. I sent them off to see if they could find what model statements
the tool had auto-magically generated before it sent them into LME for them.
And we hope it was LME; it might have passed the differential expression
analysis to a different library. Who knows.
The student assumed that if there was an “app” for the
job, it would just work. And when it spit out p values, they must be the
correct p values.
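To make the point concrete, here is a minimal sketch, not the student's actual analysis. It uses invented numbers for a single gene measured in six hypothetical patients before and after treatment, and compares two test statistics computed from the same data: one that treats the groups as independent (the kind of thing a default setting might do) and one that respects the repeated measures. The function names and the data are assumptions made up for illustration.

```python
import math
import statistics

# Hypothetical log2 expression values for one gene in six patients,
# measured before and after treatment. Invented numbers, illustration only.
before = [5.0, 7.0, 9.0, 11.0, 13.0, 15.0]
after = [5.6, 7.4, 9.7, 11.5, 13.6, 15.5]


def unpaired_t(x, y):
    """Welch t statistic: treats the two groups as independent samples."""
    var_x, var_y = statistics.variance(x), statistics.variance(y)
    se = math.sqrt(var_x / len(x) + var_y / len(y))
    return (statistics.mean(y) - statistics.mean(x)) / se


def paired_t(x, y):
    """Paired t statistic: models each patient's own before/after change."""
    diffs = [b - a for a, b in zip(x, y)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


t_unpaired = unpaired_t(before, after)
t_paired = paired_t(before, after)

print(f"unpaired t = {t_unpaired:.2f}")
print(f"paired t   = {t_paired:.2f}")
```

Both calls run without complaint and both return a number. With this made-up data, the unpaired statistic is small because patient-to-patient variation swamps the treatment effect, while the paired statistic is large because every patient moved in the same direction. Neither function can tell you which model matches your design. That part is on the operator.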
But this is my job, not a problem. They are a clever
student and will quickly see where the biostatistician and I are guiding them.
That’s not the problem.
The Real Problem. The problem is that a lot of
bioinformaticians are learning to hit data with “apps”. Like the student, they
are trusting that the tool will do all the correct steps without the operator needing
to know what those steps are.
“If it is bulk RNASeq data, I run it through DESeq2, I get
the differential expression results and I send it on to the next step in the
workflow.”
But bioinformatics is mathematical inference—each computational
step needs to be undertaken for a reason—and the researcher needs to understand
both the reason for the step and how to perform it.
Nowadays, at least in translational medicine, almost all experiments
are performed by a large team. Not every researcher on the team needs to
understand every step in the computation, but they need to be able to trust
that their bioinformatician does.
And there is an endless number of tools being written which
do some neat and worthwhile computation, and then the tool’s authors just tack
a few other steps onto it so the tool can complete a full workflow. Some people
treat these as “Apps”. They assume that the tool will do “its job” and that the
numbers that come out are correct.
Unfortunately, you have to understand 1) what the tool is
actually doing, 2) what computation you need in order to make your inference,
and 3) what the question actually is that you are trying to make an inference
about. So, in order, you need to understand the computation, statistics, and
biology of the problem.
Career Advice. I see many students who are taking
programs which teach you that this tool reads in this kind of data and spits
out those numbers. “There’s an App for that.” This is fine in two situations. First, if the
experimental design of your data exactly matches the design the tool assumes, it
should work. Second, if there is a researcher above you who knows what they are doing
and gives you the exact right instructions. In either of these cases, just
knowing how to manipulate data into the correct input format and how to graph
the output will be sufficient.
But, in my experience, most people who are looking to hire a
bioinformatician aren’t looking for either of those (though those types of
jobs do occasionally come up). Instead, they typically are researchers who
understand the bench part of the job, and they need someone who can understand
the computation and inference parts.
So, my career advice is: 1) Show that you understand their
needs. Don’t make them guess or try to read between the lines. Write your
cover letter to exactly match the tasks in the job description. 2) Be brutally
honest about your own limitations. Most hiring managers want to know what you
can and cannot do. If you claim you can do everything, you end up looking like
a doofus that no one would want to hire. For example, I can’t write complex model
statements in R. I’m a SAS guy. So, if we need a complex model in R with nested
effects and random this and fixed that, I defer to the biostatisticians. But I
can at least understand the design. 3) Remember that you are applying for a
role in a team. Show that you can work well with others, that you listen to
what people are saying, and that you understand your own strengths and
weaknesses. And the best way to do this is to listen to what they put in the
job description.
Anyone can write some R code or Google what tool to use.
Showing that you will be an asset to the project will increase the probability
of getting the interview. And interviews lead to offers.
Good luck. And, as always, I thank you for reading my posts
and remind you that I welcome comments or questions below.