The ethical considerations are really something.
@jaranta The database appears to have been retracted already
@bgcarlisle That was fast. Good.
@jaranta Thanks for the actual paper link, let's see it. :P I'll look at the ethics once I see it, but that title and its "inappropriate" is hilarious in itself from a methods perspective - this is why you don't rely on automated data collection without actually having a living human enter the context and look around first.
@werekat It's as if you need to understand the topic you're researching.
@ansugeisler Doesn't really smell like "we needed to get this result" to me - at the very least I can't imagine a practical goal for that particular result. So my money's on "people who know how to write data crawlers, but not do social science".
@werekat Well, an article that says "That strange social network that grabbed some headlines a while ago but isn't Commercial is bad, actually" is an article that will likely get some headlines, which in turn is what some researchers think will help them in their moribund careers (because they have no scruples and don't understand science).
@jaranta Yes. I only read the abstract but this is painfully clear.
At the same time, I've seen some absolute crap get presented at conferences because all they require is a 250-word abstract and the author didn't follow through with the project after acceptance.
@jaranta just wow.
@jaranta Well, it's all written by computer scientists, so would you expect the ethical component of the paper to actually be meaningful?
@jaranta They anonymized the data, right ... and those were public posts.
What am I missing?
@qcat It's not possible to anonymise textual data in a way that stops someone from finding the original text afterwards, so in practice public posts can't really be anonymised. There are methodological work-arounds that are used in internet research, but these computer scientists probably did not know about them.
@jaranta Yeah, they said they anonymised the user data, but if nobody knows who posted what, then how relevant is the content? Also, I don't see them using any explicit text in their analysis. Of course if they keep that dataset lying around (or even worse, the non-anonymised version) and provide it to third parties, that would indeed be problematic.
@qcat They published the dataset. It's since been taken down.
@jaranta Ah, ok. And the implication being that anyone can find the original posts, even without knowing the user and instance, by searching through the instance list they used. That's the concern, I guess?
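To make the concern in this exchange concrete, here is a rough Python sketch of how a post whose author field was stripped can be linked back to an account when the same text is still publicly visible. Everything in it is hypothetical: the post texts, the account and instance names, and the `normalise` and `reidentify` helpers are made up for illustration, and this is not a description of how the dataset in question was built or queried.

```python
import re


def normalise(text: str) -> str:
    """Lowercase and collapse whitespace so trivial edits do not hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()


def reidentify(anonymised_posts, public_posts):
    """Look up each anonymised text among publicly visible posts.

    Returns a dict mapping every anonymised text to the list of
    (author, instance) pairs whose public post has the same normalised text.
    """
    index = {}
    for post in public_posts:
        key = normalise(post["text"])
        index.setdefault(key, []).append((post["author"], post["instance"]))
    return {text: index.get(normalise(text), []) for text in anonymised_posts}


# Hypothetical public timeline data (assumption: made-up accounts and instances).
public_posts = [
    {"author": "alice", "instance": "example.social",
     "text": "Just adopted a three-legged cat named Tripod!"},
    {"author": "bob", "instance": "example.town",
     "text": "Good morning, fediverse."},
]

# Hypothetical "anonymised" dataset: author and instance removed, text kept.
anonymised_posts = ["just adopted a three-legged  cat named Tripod!"]

matches = reidentify(anonymised_posts, public_posts)
for text, authors in matches.items():
    print(text, "->", authors)
```

The point of the sketch is only that removing the author column does little when the text itself works as a search key; the work-arounds mentioned earlier in the thread are about avoiding exactly this kind of lookup.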