Sunday, August 31, 2025

Need a Bioinformatics Job?

 


TL;DR. Job hunting advice from a guy with a good track record of landing bioinformatics jobs.


I woke up this morning with the urge to give some bioinformatics career advice. The bioinformatics job market right now is tight. It is difficult to land a good job, particularly for those who are ill-prepared.

Assumptions. You know what bioinformatics is and you are interested in working in the field.

Old Bioinformatician. I am one of the oldest career bioinformatic scientists out there. Not the oldest by any means, but I’ve been in the field of bioinformatics since around 1999 and did a fair amount of wet-lab before that.

I probably shouldn’t, but I read comments on Reddit r/bioinformatics and r/bioinformaticscareers all the time. And those two subreddits are full of people who don’t understand the field. You will see comments like “I applied to x hundred jobs and didn’t get a single interview”.

Buddy, you didn’t apply to a hundred jobs, you spammed a hundred employers. I have a master’s student who was looking for a job. We would look at each job description. I’d help him understand what they were looking for and what the work would be like. He would then craft a cover letter and resume for that position. He applied for about a half-dozen jobs. Got two interviews and an offer. (Of course he was well qualified and a great student.)

To apply for a job, you need to make it clear to the hiring manager why you are the right person for the role. If you just spam them with some premade documents and force them to try to infer your qualifications, they will ignore you in favor of the person who put the time into applying for the only role the hiring manager is currently looking to fill.

There’s an App for That. But that’s not what motivated this post. There is a population of bioinformaticians who approach bioinformatics incorrectly. I’ve been struggling for a long time for a way to articulate the problem, and recently I was talking with my wife about a student I had a consult with, and she came up with the phrase that summarizes the problem, “There is an app for that.”

In translational medicine we very frequently come across omic studies with complex experimental designs. We might have multiple samples from a patient: before and after treatment, affected and unaffected tissue, and blood. And they are almost always unbalanced designs. And there are frequently repeated measures and time courses and all sorts of good stuff.

I work in a Medical College so the task of analyzing the data will go to some master’s student who has no specialized bioinformatics training. They will learn a little programming and some statistics, and they will lean into their assigned problem. These students are not the problem. My job is to help them learn how to correctly analyze the data. But it was one of these students that helped me articulate my concern.

This student had a complex experimental design with hundreds of samples, all run through the appropriate discovery omic. For the sake of conversation, let’s say it was bulk RNASeq. They needed to perform a differential expression analysis to find the molecular entities that change between before and after treatment, and so on.

They came in, we did a little prep work to get them ready, and then I routed them to our drop-in statistical coaching service. In this student’s case, they got a little over an hour with one of our PhD biostatisticians. I joined in for a bit of the conversation. They were given some instructions on what to do for the next week (drop-in stats is every Tuesday at two) and they went away. They were to write out a list of all the specific questions they wanted to ask the data, along with some other stuff, and then come back so we could work on the model statements for their analysis.

Three days later, they swung by my office. “Hey Rich, did I do this right?”

“What did you do?”

“I ran it through a specialized R library I found. It’s written for this type of data. I ran it with the default parameters. It should be right, right?”

If you do the math wrong, you get the wrong answer.

The probability that a bioinformatic tool will default to the correct parameters for your obscure experimental design is asymptotically approaching nil. I sent the student off to see if they could find what model statements the tool had auto-magically generated before it passed them to LME on the student’s behalf. And we hope it was LME; it might have passed the differential expression analysis to a different library. Who knows.

The student assumed that if there was an “app” for the job, it would just work. And when it spit out p values, they must be the correct p values.
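To make that concrete, here is a minimal sketch of how the same numbers give very different answers depending on whether the analysis respects the design. It is written in Python using only the standard library (the student was working in R, so this is purely an illustration), and the expression values, the six patients, and the function names are all made up. It compares a t statistic that ignores the before/after pairing with one computed on the within-patient differences.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Made-up expression values for one gene in six patients (illustrative only).
# Patients differ a lot from each other; treatment adds a small, consistent shift.
before = [10.0, 14.0, 18.0, 22.0, 26.0, 30.0]
after = [11.2, 14.9, 19.1, 23.0, 27.3, 30.8]


def unpaired_t(x, y):
    """Student's two-sample t statistic (pooled variance), ignoring pairing."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(y) - mean(x)) / sqrt(pooled_var * (1 / nx + 1 / ny))


def paired_t(x, y):
    """Paired t statistic computed on the within-patient differences."""
    diffs = [b - a for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


t_unpaired = unpaired_t(before, after)
t_paired = paired_t(before, after)
print(f"unpaired t = {t_unpaired:.2f} (10 degrees of freedom)")
print(f"paired t   = {t_paired:.2f} (5 degrees of freedom)")
```

With these toy numbers the unpaired statistic lands well below the two-sided 5% critical value for 10 degrees of freedom (about 2.23), while the paired statistic lands far above the critical value for 5 degrees of freedom (about 2.57). Same data, opposite conclusions, and a tool running on defaults will happily print a p value either way.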

But this is my job, not a problem. They are a clever student and will quickly see where the biostatistician and I are guiding them. That’s not the problem.

The Real Problem. The problem is that a lot of bioinformaticians are learning to hit data with “apps”. Like the student, they are trusting that the tool will do all the correct steps without the operator needing to know what those steps are.

“If it is bulk RNASeq data, I run it through DESeq2, I get the differential expression results and I send it on to the next step in the workflow.”

But bioinformatics is mathematical inference—each computational step needs to be undertaken for a reason—and the researcher needs to understand both the reason for the step and how to perform it.

Nowadays, at least in translational medicine, almost all experiments are performed by a large team. Not every researcher on the team needs to understand every step in the computation, but they need to be able to trust that their bioinformatician does.

And there is an endless number of tools being written which do some neat and worthwhile computation, and then the tool’s authors just tack on a few other steps so the tool can complete a full workflow. Some people treat these as “Apps”. They assume that the tool will do “its job” and that the numbers that come out are correct.

Unfortunately, you have to understand 1) what the tool is actually doing, 2) what computation you need in order to make your inference, and 3) what the question actually is that you are trying to make an inference about. So, in order, you need to understand the computation, statistics, and biology of the problem.

Career Advice. I see many students who are taking programs which teach you that this tool reads in this kind of data and spits out those numbers. “There’s an App for that.” This is fine for two situations. First, if the experimental design of your data exactly matches the one the tool assumes, it should work. Second, if there is a researcher above you who knows what they are doing and gives you the exact right instructions. In either of these cases, just knowing how to manipulate data into the correct input format and how to graph the output will be sufficient.

But, in my experience, most people who are looking to hire a bioinformatician aren’t looking for either of those (though those types of jobs do occasionally come up). Instead, they typically are researchers who understand the bench part of the job, and they need someone who can understand the computation and inference parts.

So, my career advice is, 1) show that you understand their needs. Don’t make them guess or try to read between the lines. Write your cover letter to exactly match the tasks in the job description. 2) Be brutally honest about your own limitations. Most hiring managers want to know what you can and cannot do. If you claim you can do everything, you end up looking like a doofus that no one would want to hire. For example, I can’t write complex model statements in R. I’m a SAS guy. So, if we need a complex model in R with nested effects and random this and fixed that, I defer to the biostatisticians. But I can at least understand the design. 3) Remember that you are applying for a role in a team. Show that you can work well with others, that you listen to what people are saying, and that you understand your own strengths and weaknesses. And the best way to do this is to listen to what they put in the job description.

Anyone can write some R code or Google what tool to use. Showing that you will be an asset to the project will increase the probability of getting the interview. And interviews lead to offers.

Good luck. And, as always, I thank you for reading my posts and remind you that I welcome comments or questions below.
