Thomas Hodgson is a user on scholar.social.
Thomas Hodgson @twsh

What can I read to learn about best practice for storing data in a digital humanities project? In particular, a database of historical records.

@twsh Hi! I'm a data management librarian. It's not specific to humanities, but there's a good rule for storing *any* research data called the 3-2-1 rule.

Basically, keep *3* copies of any important file (primary + 2 backups). Your backups should be on *2* different storage media (e.g. 1 cloud & 1 external hard drive). Lastly, *1* of your backups should be in a different place from you geographically (typically cloud but check) bookwitty.social/media/YbyPuUw
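
The rule above can be turned into a quick self-check. What follows is only a minimal sketch: the `Copy` record, its field names, and the sample media and location labels are all hypothetical, invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    """One stored copy of a file (all field values here are illustrative)."""
    label: str
    medium: str      # e.g. "internal-ssd", "external-hdd", "cloud"
    location: str    # e.g. a site or city name
    is_primary: bool = False


def satisfies_3_2_1(copies, home_location):
    """Return True if the copies meet the 3-2-1 rule as described above."""
    backups = [c for c in copies if not c.is_primary]
    has_primary = any(c.is_primary for c in copies)
    # 3 copies: the primary plus at least 2 backups
    three_copies = has_primary and len(backups) >= 2
    # 2 media: the backups sit on at least 2 different kinds of storage
    two_media = len({c.medium for c in backups}) >= 2
    # 1 offsite: at least one backup is somewhere other than where you are
    one_offsite = any(c.location != home_location for c in backups)
    return three_copies and two_media and one_offsite


copies = [
    Copy("working copy", "internal-ssd", "office", is_primary=True),
    Copy("external drive", "external-hdd", "office"),
    Copy("cloud backup", "cloud", "provider-datacentre"),
]
print(satisfies_3_2_1(copies, home_location="office"))  # prints True
```

The "but check" caveat matters here: the sketch simply trusts whatever `location` you record for the cloud copy, so it is only as good as what you know about where the provider actually stores it.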

@twsh we haven't really kept it up in any way but a while back i was part of an ILEAD Ohio team that put together this website: preservedigitalohio.com/

if you click through to the standards and good practices pages they're both basically just a collection of resources for digital preservation

hope it helps!

@ebeth That looks very useful. Thanks.

@twsh

could you be a bit more specific? are you asking about backup best practices, or designing a project so that the data is future proof and accessible?

I know some things about the latter, but most of what I've learned I've picked up from people in the field.

@omniadisce I was thinking more about best practice than backups. Not that backups aren't good.

@twsh @omniadisce I can think of worst practices for data sets that I've seen in the medical literature :P

@twsh

if what you're asking about is the way to take a set of historical records, edit them, and publish them in a robust and futureproof digital format, I think most people would agree that TEI-XML is currently the standard for scholarly textual markup, which can then be served to users in a variety of ways

detailed examples here:
teibyexample.org
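
To give a rough idea of what this looks like in practice, here is a small sketch of a TEI-encoded record and one way to pull structured data back out of it with Python's standard library. The record itself (names, date, place) is invented for illustration, and `extract_entities` is a hypothetical helper, not part of any TEI tool. The element names and the namespace are standard TEI.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# An invented record, encoded with standard TEI elements.
SAMPLE = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample register entry</title></titleStmt>
      <publicationStmt><p>Unpublished illustration</p></publicationStmt>
      <sourceDesc><p>Invented for this example</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>On <date when="1787-03-14">14 March 1787</date>,
         <persName>Mary Ashdown</persName> was baptised at
         <placeName>Lewes</placeName>.</p>
    </body>
  </text>
</TEI>
""".strip()


def extract_entities(tei_xml):
    """Collect marked-up people, places, and normalised dates from a TEI string."""
    root = ET.fromstring(tei_xml)

    def texts(tag):
        # Elements live in the TEI namespace, so the tag must be qualified.
        return ["".join(el.itertext()).strip()
                for el in root.iter(f"{{{TEI_NS}}}{tag}")]

    dates = [el.get("when") for el in root.iter(f"{{{TEI_NS}}}date")]
    return {"people": texts("persName"), "places": texts("placeName"), "dates": dates}


record = extract_entities(SAMPLE)
print(record)
# prints {'people': ['Mary Ashdown'], 'places': ['Lewes'], 'dates': ['1787-03-14']}
```

The same marked-up file can feed a reading view, a search index, or a table of people and dates, which is roughly what "served to users in a variety of ways" amounts to.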

XML presupposes you're OK with data in a hierarchical data model. if you're working with literary texts this can be problematic
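
One common way this shows up (offered as an assumed illustration of the point, not necessarily the exact case meant above) is overlap: a sentence that runs across a verse-line boundary cannot be tagged as both a sentence and two lines, because XML elements have to nest. The sketch below uses a hypothetical `is_well_formed` helper to show that the overlapping version is rejected by a parser, while a version using TEI's empty `<lb/>` milestone element parses fine.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The sentence <s> starts inside one line <l> and ends inside the next:
# the two hierarchies overlap, so the tags cannot nest properly.
OVERLAPPING = (
    "<lg><l>First line, <s>a sentence begins</l>"
    "<l>and ends here.</s></l></lg>"
)

# Workaround: keep one hierarchy (the sentence) as real elements and mark
# the line break with an empty milestone element instead.
MILESTONES = "<p><s>First line, a sentence begins<lb/>and ends here.</s></p>"

print(is_well_formed(OVERLAPPING))  # prints False
print(is_well_formed(MILESTONES))   # prints True
```

The milestone version works, but the line is no longer an element you can select directly, which is the kind of tradeoff involved.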

@omniadisce Thanks! That's going on my reading list.

@twsh

do let me know if you want specific advice about tools, programs, or further reading

as it happens, I'm in the early stages of planning a TEI project for some historical materials myself

@omniadisce I might well do that when I have a clearer idea of what it is I'm doing.

@twsh

all of the replacements for TEI are pretty bleeding edge and/or experimental at this point

but for historians, the issues with the XML data model may be unimportant (and in fact, many many lit scholars are fine with the tradeoff)