/dev/log
I'm going to pick up an idea I floated on here before: using my Mastodon account to replace my (very!) old Advogato account. (Anybody else remember #Advogato? Good times.) So here comes the first of many development posts about #dighum #digitalhumanities.
So first, for those of you who don't know me, I'm the head developer on the project formerly known as RLetters and evoText, now known as Sciveyor. It's a system for performing textual analysis on journal articles.
/dev/log
So now we have a central source of data that we can check for validity (very quickly in #golang). Today's goal was a super-quick-and-dirty way to mirror the contents of the MongoDB server over to Solr (https://codeberg.org/sciveyor/mongo-solr, more Go). It only compares version numbers and checks which documents are present or absent, but that'll be good enough to keep the Solr server current.
And hey: in the years since I last redid our schema, Solr has gained decent nested-document support! Woohoo!
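For the curious, a version-and-presence comparison like the one described above can be sketched in a few lines of Go. This is a minimal illustration and not the mongo-solr code: the `Plan` type, the `PlanSync` function, the document IDs, and the idea of holding each side as an in-memory ID-to-version map are all assumptions made up for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Plan lists the work needed to bring the mirror in line with the source.
// (Hypothetical type, for illustration only.)
type Plan struct {
	Upsert []string // IDs missing from the mirror, or stored there at a different version
	Delete []string // IDs present in the mirror but gone from the source
}

// PlanSync compares two ID -> version maps. Both maps are stand-ins for
// whatever a real tool would read from MongoDB (source) and Solr (mirror).
func PlanSync(source, mirror map[string]int) Plan {
	var p Plan
	for id, v := range source {
		if mv, ok := mirror[id]; !ok || mv != v {
			p.Upsert = append(p.Upsert, id)
		}
	}
	for id := range mirror {
		if _, ok := source[id]; !ok {
			p.Delete = append(p.Delete, id)
		}
	}
	// Map iteration order is random in Go, so sort for stable output.
	sort.Strings(p.Upsert)
	sort.Strings(p.Delete)
	return p
}

func main() {
	mongo := map[string]int{"doc-a": 2, "doc-b": 1, "doc-c": 1}
	solr := map[string]int{"doc-a": 1, "doc-b": 1, "doc-d": 4}

	p := PlanSync(mongo, solr)
	fmt.Println("upsert:", p.Upsert) // upsert: [doc-a doc-c]
	fmt.Println("delete:", p.Delete) // delete: [doc-d]
}
```

Note that nothing here looks at document contents: a document whose body changed without a version bump would be skipped, which is the trade-off of a check this cheap.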
/dev/log
Now, we've had real data integration problems in the project over the years. The "central data store" was XML files on my NAS. That's cool and all for long-term archiving (and it works for feeding stuff into #Solr), but it *sucks* for long-term project maintenance.
The fix: baptize our #MongoDB instance as the "canonical" data store. I've got a JSON schema for our data (https://data.sciveyor.com/schema/ + https://codeberg.org/sciveyor/json-schema) and a tool to verify Mongo against it (https://codeberg.org/sciveyor/schema-tool).
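To give a flavor of what "verify against a schema" means, here is a deliberately tiny stand-in in Go. It is not schema-tool and it is not real JSON Schema validation: it only checks that a few required fields exist and have the expected JSON type. The `FieldKinds` type, the `Check` function, and the field names `id`, `version`, and `title` are all hypothetical, chosen for the example and not taken from the Sciveyor schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// FieldKinds maps a required field name to its expected JSON type:
// "string", "number", "boolean", "array" or "object".
type FieldKinds map[string]string

// kindOf names the JSON type of a value decoded by encoding/json.
func kindOf(v interface{}) string {
	switch v.(type) {
	case string:
		return "string"
	case float64:
		return "number"
	case bool:
		return "boolean"
	case []interface{}:
		return "array"
	case map[string]interface{}:
		return "object"
	default:
		return "null"
	}
}

// Check returns one message per missing or mistyped required field, sorted.
func Check(raw []byte, required FieldKinds) ([]string, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal(raw, &doc); err != nil {
		return nil, err
	}
	var problems []string
	for name, want := range required {
		v, ok := doc[name]
		if !ok {
			problems = append(problems, name+": missing")
			continue
		}
		if got := kindOf(v); got != want {
			problems = append(problems, fmt.Sprintf("%s: want %s, got %s", name, want, got))
		}
	}
	sort.Strings(problems)
	return problems, nil
}

func main() {
	required := FieldKinds{"id": "string", "version": "number", "title": "string"}
	doc := []byte(`{"id": "doc-a", "version": "3"}`)

	problems, err := Check(doc, required)
	if err != nil {
		fmt.Println("invalid JSON:", err)
		return
	}
	for _, p := range problems {
		fmt.Println(p)
	}
	// Prints:
	// title: missing
	// version: want number, got string
}
```

A real validator handles nesting, formats, enums and the rest of the JSON Schema vocabulary; the point of the sketch is only that the check is mechanical and cheap enough to run over the whole store.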