Want to study the #fediverse or write about it?
Ask to talk to people. Many will talk to you. They have voices and know how to use them
@firstname.lastname@example.org I sort of scrape, but the data is very heavily mangled because my goal is not to store the text itself but only its structure, which is used later to generate brand new text.
as you can see an example here...
@logan Yeah, I guess my concern is more direct quoting exact people/accounts without consent.
I do know there is leeriness about scraping -- there was a paper a few years back that raised hackles for scraping Mastodon, but I think they failed to anonymize
@email@example.com well... i don't really have to anonymize the data thankfully because it is very scrambled. at most i remove the @ mentions, but as you can see...
you can't really get anything useful from this.
@firstname.lastname@example.org they're not specific. they just happen to come into my bot's timeline.
the bot considers them words but if needed i can add this to what i also weed out.
@IceWolf@meow.social @email@example.com if you want to know what it can see you can look here
because this is what it uses.
@IceWolf@meow.social @firstname.lastname@example.org well i can try to figure out how to get a list of people that my bot follows.
this is very early development. this data that i am collecting will likely just be deleted anyways as i adjust how it reads it.
so now i have 2 things to implement for the next adjustments.
scrub urls and figure out how to get a list of people the bot follows.
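A rough sketch of what those two adjustments could look like in Python. Everything here is an assumption for illustration: the thread does not say what language the bot is written in, the regex patterns are guesses at what counts as a URL or a mention, and `opted_in` is a hypothetical set of account handles that would have to be filled from wherever the bot gets its follow list.

```python
import re

# Hypothetical patterns; the thread does not say how the bot detects these.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+(?:@[\w.-]+)?")


def scrub(text: str) -> str:
    """Remove URLs and @ mentions, then collapse leftover whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return " ".join(text.split())


def keep_post(author: str, opted_in: set[str]) -> bool:
    """Only keep posts from accounts in the (hypothetical) opt-in set."""
    return author in opted_in


if __name__ == "__main__":
    opted_in = {"alice@example.social"}  # placeholder handle
    post = "@bob@example.social look here https://example.org/x ok"
    if keep_post("alice@example.social", opted_in):
        print(scrub(post))
```

Checking the author before anything is stored means posts from people who never opted in are dropped up front instead of being cleaned up later.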
You're not going to be able to escape the heritage of the Emotional Contagion study or Cambridge Analytica et al unless you behave differently here.
Gaining informed consent is the first task.
@Segebodo@chaos.social @IceWolf@meow.social @email@example.com the tricky part with that is i still would have to try and figure out how to read that information.
I don't go to other people's accounts and scan them for details. i only use what happens to come across the global timeline (at least that is what the bot currently does)
once i set up the following limitation (i was probably going to anyway at some point, this post just kind of sped me up on it)
i can just ask people to go follow my bot if they don't mind having their posts gathered and ripped apart for their individual words.
@Segebodo@chaos.social @firstname.lastname@example.org because i would have to manually input the data, which goes against the point of a self-learning chat bot.
all the data it collects is smashed into 1 large array of words, not even entire sentences. and then it randomly chooses words from this list.
a word in this case is anything separated by a space character.
@robertwgehl Thanks for the reminder for me to add "All public or unlisted posts on this profile may be cited, embedded or otherwise shared freely, unless the intent is for me to be harassed." to my bio.
Scholar Social is a microblogging platform for researchers, grad students, librarians, archivists, undergrads, academically inclined high schoolers, educators of all levels, journal editors, research assistants, professors, administrators—anyone involved in academia who is willing to engage with others respectfully.