I'm a PhD candidate in Medical Biophysics at the University of Toronto studying epigenetics and cancer. Math and physics background taking a step into bioinformatics and evolution

Snarking at techbros 

I wasn't aware that Linux kernel developers were doing unethical research on , or defending those who did so. Maybe the victims of unethical research should sympathise with each other and work together against the problem?

I submitted my journal article earlier this week and I got approval from my committee to write my thesis and plan my oral examination! The end is in sight!

hello everyone, I'm stephanie (she/her)!

I'm a researcher at a big tech company, based in the UK. I did my phd in computational biology+medicine. I focus very much on ML for health, specifically critical care, physiological time series, "how to do ML well in health", etc.

Non-research activities include rollerskating/derby, gaming (computer/table-top/board), sewing, looking at birds.

I gave a talk yesterday as part of the Fragile Nucleosome Seminar Series. The talk was recorded and put online, so if anyone wants to learn about prostate cancer, the 3D genome, and structural variants, take a look!

As with all my scientific work, feedback and questions are always appreciated

Today on the #bioinformatics chat, Lindsay Pino discusses the difficulties of comparing proteomic (mass spec) measurements across different samples, potentially acquired in different labs, as well as a method she has developed recently for calibrating these measurements without the need for expensive reagents

This was a really neat discussion of various computer processes and how they can get slowed down or take up lots of space.

The most interesting one, for me, was about the slowness of zlib. In bioinformatics, I use gzipped files all the time, and I've always wondered about the silly FASTQ format (zipped or unzipped). I wonder how much faster and more compressed things would be if we used zstandard instead
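
One rough way to explore that question is sketched below. It is not a real benchmark. It builds a synthetic FASTQ payload (random bases and qualities, so it compresses differently from real reads), times `gzip.compress` from the standard library, and also times the third-party `zstandard` package if it happens to be installed. The helper names `make_fastq`, `benchmark`, and `compare` are made up for illustration, and the compression levels are arbitrary choices.

```python
import gzip
import random
import time

try:
    import zstandard  # third-party and optional: pip install zstandard
except ImportError:
    zstandard = None


def make_fastq(n_reads=2000, read_len=100, seed=0):
    """Build a synthetic FASTQ payload (hypothetical data, not real reads)."""
    rng = random.Random(seed)
    records = []
    for i in range(n_reads):
        seq = "".join(rng.choice("ACGT") for _ in range(read_len))
        qual = "".join(rng.choice("FFFF:,#") for _ in range(read_len))
        # Four lines per record: header, sequence, separator, qualities.
        records.append(f"@read{i}\n{seq}\n+\n{qual}\n")
    return "".join(records).encode("ascii")


def benchmark(name, compress, data):
    """Time one compression call and report the compression ratio."""
    start = time.perf_counter()
    out = compress(data)
    elapsed = time.perf_counter() - start
    return {"name": name, "ratio": len(data) / len(out), "seconds": elapsed}


def compare(data):
    """Compare gzip (zlib) against zstandard when the latter is available."""
    results = [
        benchmark("gzip -6", lambda d: gzip.compress(d, compresslevel=6), data)
    ]
    if zstandard is not None:
        cctx = zstandard.ZstdCompressor(level=3)
        results.append(benchmark("zstd -3", cctx.compress, data))
    return results


if __name__ == "__main__":
    for r in compare(make_fastq()):
        print(f"{r['name']}: ratio {r['ratio']:.2f}, {r['seconds'] * 1000:.1f} ms")
```

To get numbers that mean anything, swap `make_fastq()` for the bytes of a real FASTQ file and repeat the timing several times.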

All too often I hear "how can we justify funds to maintain research software?"

But really, we should be asking "How can we justify _not_ funding it?"

My colleagues and I wrote a thing:

Wow, I just had my first interaction with GitHub's dependabot.

It was able to
1. detect the requirements.txt file with Python packages
2. recognize the vulnerable version
3. test for compatibility with the other package requirements
4. open a new branch and PR that addresses the security issue
5. display all the relevant information in an easy to read format

That's amazing, to be honest. Just wow

I got my hands on an old Illumina NextSeq flow cell and got to tear it apart.

There are a lot of cool design choices in here! As a computational biologist, I don't see this kind of stuff that often, but it was fun to see this side of sequencing machines.

The latest bioinformatics chat episode is up! We interviewed Molly Gasperini about the essential, yet sometimes underappreciated, problem of defining enhancer elements and subsequently identifying them.

A good comparison of Rust's csv crate and pandas for parsing large tables of data.

tl;dr Pandas is way easier to work with, since the tooling for tabular data in Rust is still immature. As expected, though, certain operations in Rust are WAY faster, showing it's got potential for building efficient crates for processing data like this

This is a good description of how to work well with your manager

This also works as good advice for effectively communicating with your academic supervisor!

Here's a blog post I wrote about small p-values in hypothesis testing.

There are lots of ways your analysis can go wrong, and here I make the argument that an absurdly small p-value is one heuristic that can tell you if something is off.
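
As a small illustration of how p-values can get absurdly small (an assumed example, not necessarily the scenario the blog post covers), the sketch below runs a one-sample z-test on a fixed, practically negligible effect. As the sample size grows, the p-value collapses towards zero even though the effect never changes. The function name and the numbers are made up for illustration.

```python
import math


def two_sided_z_pvalue(mean_diff, sd, n):
    """Two-sided p-value for a one-sample z-test of mean_diff against zero."""
    z = mean_diff / (sd / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Same tiny effect (0.01 standard deviations), increasingly large samples.
    for n in (100, 10_000, 1_000_000, 100_000_000):
        p = two_sided_z_pvalue(mean_diff=0.01, sd=1.0, n=n)
        print(f"n = {n:>11,}  p = {p:.3g}")
```

A vanishingly small p-value here says more about the sample size than about the importance of the effect, which is one reason such values deserve a second look.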

Paper got rejected, oh well. This time reviewer 1 was the harsh one, not reviewer 2.

Time to take the useful criticisms and try again

I first started using D3 in 2012. I've followed Mike Bostock's work since then, and he's got a lot of great stuff. He summarizes a lot of good lessons in this blog post

Good discussions about open source, different kinds of data visualization, and how to have fun with data

For academics answering questions after a seminar they give (or asking questions at a seminar they're attending), this article might be useful to keep in mind.

You are soliciting feedback when presenting. Not all feedback is equal, and you are free to ignore certain comments or questions because they're not applicable, or low priority.

There will be certain comments or questions that are critical to handle. Focus on those

This is just cool science. Precious DNA collected from mammoth molars that are more than 1 million years old

On the new bioinformatics chat episode, Bárbara Bitarello helps us understand how polygenic risk scores (PRS) work and why they don’t transfer well across ancestries.

Interesting post by Paul Graham:

My most (and least) favourite quote:

> In other words, like many a grad student, I was working energetically on multiple projects that were not my thesis.
