I found what appears to be a major data entry error in the Drugs@FDA database

Or rather, about 1400 data entry errors

Then in R:

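library(readr)

# The file path and delimiter are assumptions: the original arguments were lost.
# Drugs@FDA data files are tab-delimited text, so "\t" is used here.
ApplicationDocs <- read_delim(
  "ApplicationDocs.txt",
  delim = "\t",
  escape_double = FALSE,
  col_types = cols(
    ApplicationDocsDate = col_date(format = "%Y-%m-%d 00:00:00"),
    ApplicationDocsID = col_integer(),
    ApplicationDocsTitle = col_character(),
    ApplicationDocsTypeID = col_integer(),
    SubmissionNo = col_integer()
  ),
  trim_ws = TRUE
)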

And also in R:

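library(ggplot2)

# Histogram of document dates in 5-year bins (binwidth is in days for Date data)
ggplot(
  mapping = aes(x = ApplicationDocsDate),
  data = ApplicationDocs
) + geom_histogram(
  binwidth = 365.25 * 5
) + labs(
  title = "Histogram of Drugs@FDA application document dates",
  x = "Application document date",
  y = "Frequency"
)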

@bgcarlisle gotta love incorrect handling of missing data!
