This is a summary of the drugs on the DEA website. It includes drug classes (e.g., stimulants, depressants, hallucinogens), categories (e.g., amphetamines, cocaine, barbiturates), and drug synonyms. Synonyms include brand/generic names (Adderall®, Valium®, clonazepam) and street names (Apache, White Girl). Eventually the ontology needs to be expanded with a grouping variable for related synonyms. For example, the generic clonazepam, the brand name Klonopin®, and the street name k-pin should all be grouped under the general name clonazepam.
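The proposed grouping variable can be pictured with a small sketch. This is an illustration only, written in Python rather than taken from the package: the column name `general_name`, the row layout, and the helper `group_synonyms` are all assumptions, and the three rows are just the clonazepam example from the paragraph above.

```python
from collections import defaultdict

# Hypothetical rows: (drug class, category, synonym, general_name).
# general_name is the proposed grouping variable; it is an assumption and
# does not exist in the current files.
rows = [
    ("depressant", "benzodiazepine", "clonazepam", "clonazepam"),  # generic
    ("depressant", "benzodiazepine", "Klonopin", "clonazepam"),    # brand
    ("depressant", "benzodiazepine", "k-pin", "clonazepam"),       # street
]


def group_synonyms(rows):
    """Collect every synonym under its general name, in input order."""
    groups = defaultdict(list)
    for _drug_class, _category, synonym, general_name in rows:
        groups[general_name].append(synonym)
    return dict(groups)


synonym_groups = group_synonyms(rows)
print(synonym_groups)
```

With a column like this in place, a lookup from any synonym to its general name is a single join or dictionary access, regardless of whether the synonym is a generic, brand, or street name.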
Two files, dea_factsheets.rda and dea_brands.rda, were created by scraping the “fact sheets” on the DEA website on September 10th, 2020. The fact sheet data had 12 records added because the DEA slang file has additional “categories” (e.g., crack cocaine, mushrooms, PCP). These extra “categories” are problematic for the DEA ontology because they also include a specific brand name (i.e., Ritalin) and two benzodiazepines (i.e., alprazolam and clonazepam). Eventually these need to be added as part of a “generic name” level in the ontology.
|drugs of concern|
dea_street_names.rda (N = 1731 records) contains 26 drug categories and 3 brands (Klonopin, Percocet, and Xanax).
substance: Chemical and/or brand names
schedule: I, II, III, IV or V
narcotic: Y or N
synonym: Chemical and/or brand names
Parsing the names variables will be very difficult because the delimiters between drugs are not at all consistent.
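One way to approach inconsistent delimiters is to split on a set of candidate characters in a single pass. The sketch below is in Python and is only an illustration: the delimiter set (semicolon, comma, slash), the helper name `split_names`, and the sample string are all assumptions, since the actual delimiters in the data are not listed here.

```python
import re

# Assumed delimiter set: semicolon, comma, or slash, with optional
# surrounding whitespace. The real data may need more (or fewer) of these.
DELIMITERS = re.compile(r"\s*[;,/]\s*")


def split_names(raw):
    """Split a raw names field into individual names, dropping empty pieces."""
    parts = DELIMITERS.split(raw.strip())
    return [part for part in parts if part]


# Made-up input that mixes three delimiters in one field.
names = split_names("Apache; White Girl, k-pin / Klonopin")
print(names)
```

The hyphen is deliberately left out of the delimiter set, because names such as k-pin contain one. Any character that can also appear inside a name has to be handled case by case rather than by a blanket split.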
Several items (in the black box) still need to be placed in the tree. Items with a red callout are inconsistencies.
This is a summary of the drugs on the no slang website.
One file, noslang_street_names.rda, was created by scraping the Drug Slang Dictionary on October 12th, 2020.
Additional drugs were found while processing data from IQVIA. Thanks to Edward Nunes, M.D., for providing notes on how to better classify these drugs. This data introduced new classes such as reversal agents and treatment drugs.
This is a vector of drug-specific stop words that have been observed while processing data from the aforementioned data sources. This vector is primarily used with the parse() function to remove irrelevant words (e.g., “pills”, “syringe”) and strings such as units or dosages of drugs (e.g., “mg”).
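The effect of a stop-word vector can be sketched as a simple token filter. The Python below is an illustration under stated assumptions, not the package's parse() function: the helper `remove_stop_words`, the tokenizing rule, and the three-word stop list (taken from the examples above) are all hypothetical.

```python
import re

# Hypothetical stop-word set built from the examples in the text;
# the real vector is longer.
DRUG_STOP_WORDS = {"pills", "syringe", "mg"}

# A token is a run of letters, optionally joined by hyphens (so "k-pin"
# stays whole). Digits are not matched, so bare dosage numbers are dropped.
TOKEN = re.compile(r"[a-z]+(?:-[a-z]+)*")


def remove_stop_words(text, stop_words=DRUG_STOP_WORDS):
    """Lowercase the text, tokenize it, and drop any stop words."""
    tokens = TOKEN.findall(text.lower())
    return [token for token in tokens if token not in stop_words]


cleaned = remove_stop_words("2 mg Klonopin pills")
print(cleaned)
```

Filtering happens after lowercasing, so the stop-word list only needs lowercase entries. What remains can then be looked up against the ontology.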