@eleonsan That's cool stuff, and the reverse direction of what I was talking about, I think. They're using insight from outside machine learning to build better models in machine learning

I've managed to put some order to my thoughts about interpretable models, how they're used in biological research, and why many published articles using machine learning fail to excite biologists in any meaningful way.

jrhawley.github.io/2020/05/28/

Reminder: You don't need to justify taking a break in terms of it being an investment in future productivity

wait
I just realized I have a fucking Masters in political science now

technically I am a 2020 graduate except not really cause I'm a PhD student

A fun experience of watching remote seminars is seeing the Java Updater popping up in the middle of a presentation

Genome sequencing data from consortia is staggeringly large. A recent release of the project included whole genome sequencing data from 15 708 samples.

nature.com/articles/d41586-020

If you have high-quality whole genome sequencing data, say at 30X coverage, 1 person's entire genome, encoding only the base calls, would be ~1 GB. With quality scores and other information, this can be ~3 GB of raw sequencing data.

That means this dataset alone would be ~46 TB of raw data
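
For anyone who wants to check the arithmetic, here's a quick sketch. The sample count and the ~3 GB per genome are the figures from the posts above; the function name and the decimal/binary unit split are just my own illustration, not anything from the consortium.

```python
# Back-of-the-envelope estimate of the raw data volume described above.
SAMPLES = 15_708      # whole genomes in the release (figure from the post)
GB_PER_GENOME = 3     # rough raw data size per genome in GB (figure from the post)


def total_size_gb(samples: int, gb_per_sample: float) -> float:
    """Total raw data size in GB for a given number of samples."""
    return samples * gb_per_sample


total_gb = total_size_gb(SAMPLES, GB_PER_GENOME)
total_tb_decimal = total_gb / 1000   # using 1 TB = 1000 GB
total_tb_binary = total_gb / 1024    # using 1 TB = 1024 GB

print(f"{total_gb} GB is about {total_tb_decimal:.1f} TB (decimal) "
      f"or {total_tb_binary:.1f} TB (binary)")
```

So the ~46 TB figure corresponds to counting 1024 GB per TB; with 1000 GB per TB it comes out closer to 47 TB. Either way, it's a rough estimate.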

covid, privacy 

@VictorVenema Fair point. I've seen those comics and related content about the decentralized tracing algorithms, like the DP3T protocol.

github.com/DP-3T/documents/blo

I have high hopes for those. And I understand that some of the permissions required by these apps are necessary, like accessing file storage to store the messages from nearby devices.

But access to email addresses, location, etc., can easily slip under the radar for most apps in this initial burst

covid, privacy 

> 16 of the 50 apps indicate that the user’s data will be made anonymous, encrypted and secured and will be transmitted online and reported only in an aggregated format.
> What is not clear is whether any of the data collected are protected by any laws or regulations such as the Health Insurance Portability and Accountability Act or electronic protected health information.

covid, privacy 

> In addition, some apps explicitly state that they will collect information about the person’s age, email address, phone number and postal code; the device’s location, unique device identifiers, mobile IP address and operating system; and the types of browsers used on the mobile device.

covid, privacy 

> We found that 30 of the 50 apps require permission for numerous types of access to users’ mobile devices. For example, some demand access to contacts, photos, media, files, location data, the camera, the device ID, call information, the WiFi connection, the microphone, full network access, the Google service configuration, and the ability to change network connectivity and audio settings, to name just a few types of access.

covid, privacy 

So it turns out that all those privacy advocates who were warning about COVID contact tracing apps being an easy way to breach people's privacy were totally right

nature.com/articles/s41591-020

I've had a bit of trouble installing some crates on my machine due to not having a C compiler properly configured.

So I wrote a tutorial on how to install everything you need to get a C compiler working as a part of your Rust toolchain on Windows.

jrhawley.github.io/2020/05/25/

Supporting open-source 

I just donated to NumFOCUS! Join me and show your support for the open source projects we love. #NumFOCUS #pydata #opensource #python #rstats #openscience #datascience numfocus.salsalabs.org/donate

Why does every paper have the word "reveals" in the title?

If every single-cell technique reveals that there's heterogeneity in cells and that there's a heterogeneous response to your experimental treatment, are any of them "revealing" anything?

My lab mate and good friend, @StanInToronto@twitter.com, just defended his PhD thesis! Congrats Dr. Zhou!

vd is the tool that I have been looking for for a long time

github.com/saulpw/visidata/

Got tabular data that you want to view in the terminal? This is the way to do it

Science reporters, provide a link to the primary article 2020 challenge
