
I found what appears to be a major data entry error in the Drugs@FDA database

Or rather, about 1400 data entry errors


Then in :rstats:

library(readr)

# Read the tab-delimited ApplicationDocs table from the Drugs@FDA download
ApplicationDocs <- read_delim(
  "~/Downloads/fda/ApplicationDocs.txt",
  delim = "\t",
  escape_double = FALSE,
  col_types = cols(
    # Dates appear as timestamps with a zeroed time component
    ApplicationDocsDate = col_date(format = "%Y-%m-%d 00:00:00"),
    ApplicationDocsID = col_integer(),
    ApplicationDocsTitle = col_character(),
    ApplicationDocsTypeID = col_integer(),
    SubmissionNo = col_integer()
  ),
  trim_ws = TRUE
)

And also in :rstats:

library(ggplot2)

# Histogram of document dates in 5-year bins (binwidth is in days)
ggplot(
  data = ApplicationDocs,
  mapping = aes(x = ApplicationDocsDate)
) + geom_histogram(
  binwidth = 365.25 * 5
) + labs(
  title = "Histogram of Drugs@FDA application document dates",
  x = "Application document date",
  y = "Frequency"
)

@bgcarlisle gotta love incorrect handling of missing data!
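
One rough way to check the missing-data guess is to count how many documents share each date and see whether a single placeholder date dominates. The sketch below uses a made-up toy data frame rather than the real file. The `1900-01-01` placeholder and the `1940-01-01` cutoff are assumptions chosen for illustration, since the thread does not say which dates are affected.

```r
# Toy stand-in for ApplicationDocs; these rows are invented for illustration.
docs <- data.frame(
  ApplicationDocsID = 1:6,
  ApplicationDocsDate = as.Date(c(
    "1900-01-01", "1900-01-01", "1900-01-01",  # hypothetical placeholder date
    "2004-03-15", "2011-07-02", "2016-11-30"
  ))
)

# Count documents per date, most frequent first. A placeholder used for
# missing values would show up as one date with an outsized count.
date_counts <- sort(table(docs$ApplicationDocsDate), decreasing = TRUE)
print(head(date_counts))

# Pull out rows dated before a hypothetical plausibility cutoff.
cutoff <- as.Date("1940-01-01")
suspicious <- docs[docs$ApplicationDocsDate < cutoff, ]
print(nrow(suspicious))
```

Against the real table, the same two steps would be run on `ApplicationDocs` instead of `docs`, with the cutoff picked after looking at the histogram.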

Scholar Social
