Decided to move my dev blog over to my website. Will try to keep writing a post here whenever I post over there, though!


In any event I think it'll be helpful for me to keep thinking through this stuff somewhere, and here is as good as most places! Talking to myself about what I've done and what I'm going to do usually makes me a better developer.

So now we have a central source of data that we can check for validity (very quickly). Today's goal was a super-quick-and-dirty way to mirror the contents of the MongoDB server over to Solr (more Go). It only checks version numbers and presence/absence, but it'll be good enough to keep the Solr server current.
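For anyone curious what a "version numbers and presence/absence" check can look like, here's a minimal sketch in Go. It is not the actual tool: the names (`PlanSync`, `SyncPlan`) and the integer version field are assumptions made up for illustration, and the maps stand in for whatever ID/version listing you'd really pull from MongoDB and Solr.

```go
package main

import (
	"fmt"
	"sort"
)

// SyncPlan lists the document IDs that need work on the Solr side.
// (Hypothetical type, for illustration only.)
type SyncPlan struct {
	Upsert []string // missing from Solr, or version differs from MongoDB
	Delete []string // present in Solr but no longer in MongoDB
}

// PlanSync compares two id -> version maps, treating MongoDB as canonical.
// It looks only at presence/absence and version equality, never at contents.
func PlanSync(mongo, solr map[string]int) SyncPlan {
	var plan SyncPlan
	for id, mongoVersion := range mongo {
		solrVersion, ok := solr[id]
		if !ok || solrVersion != mongoVersion {
			plan.Upsert = append(plan.Upsert, id)
		}
	}
	for id := range solr {
		if _, ok := mongo[id]; !ok {
			plan.Delete = append(plan.Delete, id)
		}
	}
	// Map iteration order is random in Go, so sort for stable output.
	sort.Strings(plan.Upsert)
	sort.Strings(plan.Delete)
	return plan
}

func main() {
	mongo := map[string]int{"a": 1, "b": 2, "c": 1}
	solr := map[string]int{"a": 1, "b": 1, "d": 4}

	plan := PlanSync(mongo, solr)
	fmt.Println("upsert:", plan.Upsert) // b is stale, c is missing
	fmt.Println("delete:", plan.Delete) // d is gone from MongoDB
}
```

The trade-off is the obvious one: a document edited without a version bump never gets re-indexed, which is the price of not comparing contents.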

And hey: in the intervening years since I last redid our schema, Solr now has decent nested-document support! Woohoo!

Now, we've had real data integration problems in the project over the years. The "central data store" was XML files on my NAS. That's cool and all for long-term archiving (and it works for feeding stuff into ), but it *sucks* for long-term project maintenance.

The fix: baptize our MongoDB instance as the "canonical" data store. I've got a JSON schema for our data and a tool to verify Mongo against it.

I'm going to try to pick up an idea that I had on here before about using my Mastodon account to replace my (very!) old Advogato account. (Anybody else remember it? Good times.) So here comes the first of many development posts.

So first, for those of you who don't know me, I'm the head developer on the project formerly known as RLetters and evoText, now known as Sciveyor. It's a system for performing textual analysis on journal articles.

Looks like via a WordPress plugin exploit, literally all the data from Parler has been breached, including DMs, the photo of a driver's license you needed to verify your account, and even some deleted posts. They weren't taking out EXIF metadata, so also you've got loads of GPS-tagged photos.


I still feel like I'm settling into this community, but happy new year, all. As a colleague just put it: at this point, I'd be ecstatic for a normal year.

You can even wind up with a pretty decent looking output if you invest some time into the Beamer customization:

I finally finished updating the latest version of my lecture notes scripts, which build nice PDFs of slides and notes from a single, combined, easy-to-write Markdown file.

Does that sound like something you'd like? Check it out on @codeberg. All you need is Pandoc, and you can customize it to your heart's content:

Belated birthday present from BJPS – project that took a year's work with two of my lab members finally coming out soon.

Will post a summary here when it's actually online, but short version: What kinds of philosophy-of-science inferences can be justified on the basis of text analysis of scientific journals?

No, but really, though, the only project to create a CLI version of Notational Velocity was written in Python and is now defunct. I'm about 2/3 done with a new one on a day's work, and pretty stoked about it...

me: I should look at the way I handle notes again, I've been having trouble with information retrieval

me: so I've written an app for searching and filtering notes by content and tags but I'm waiting on a pull request to get merged in the CLI toolkit for Ruby—

Okay, so Salesforce is probably buying Slack. I know at the very least I should try out Mattermost, that's the main OSS alternative. Anything else worth looking at? My lab mostly uses the chat and file sharing features; is there an open Discord alternative that might fit the bill?

