update sections

2023-05-27 16:46:49 -04:00 · 2023-05-27 16:46:49 -04:00 · 213c2ef790
commit 213c2ef790
parent 1a9dcc571d
4 changed files with 167 additions and 42 deletions
--- a/ML/3_model_outputs.R
+++ b/ML/3_model_outputs.R
@@ -56,7 +56,27 @@ ggplot2::autoplot(
                     ) +
  ggplot2::theme_bw() +
  ggplot2::scale_color_manual(values = rep("black", times = 5)) +
-  ggplot2::theme(legend.position = "none")
+  ggplot2::theme(legend.position = "none") +
+  ggplot2::labs(
+    title = "Regression Model Screening"
+    ,y = "RMSE"
+  ) +
+  ggplot2::theme(plot.title = ggplot2::element_text(hjust = 0.5))
+
+
+gp2$ggsave(
+  here("figures","reg_screen.emf")
+  ,width  = 7
+  ,height = 7
+  ,dpi    = 300
+  ,device = devEMF::emf
+)
+gp2$ggsave(
+  here("figures","reg_screen.png")
+  ,width  = 7
+  ,height = 7
+  ,dpi    = 300
+)

 class_results <- screen_workflows_class %>%
  workflowsets::rank_results()
@@ -74,8 +94,26 @@ ggplot2::autoplot(
  ) +
  ggplot2::theme_bw() +
  ggplot2::scale_color_manual(values = rep("black", times = 5)) +
-  ggplot2::theme(legend.position = "none")
+  ggplot2::theme(legend.position = "none") +
+  ggplot2::labs(
+    title = "Classification Model Screening"
+    ,y = "Accuracy"
+  ) +
+  ggplot2::theme(plot.title = ggplot2::element_text(hjust = 0.5))

+gp2$ggsave(
+  here("figures","class_screen.emf")
+  ,width  = 7
+  ,height = 7
+  ,dpi    = 300
+  ,device = devEMF::emf
+)
+gp2$ggsave(
+  here("figures","class_screen.png")
+  ,width  = 7
+  ,height = 7
+  ,dpi    = 300
+)


 # best results ------------------------------------------------------------
--- a/chapter1.qmd
+++ b/chapter1.qmd
@@ -11,8 +11,8 @@ American Medical Association survey later showed that 48% of U.S.
 hospitals had clinical laboratories by 1923 [@berger1999]. Before 1960,
 almost all testing in the laboratory was performed using manual methods.
 In the mid-1960s, a limited amount of automated analyzers became
-available, allowing for more rapid testing and running multiple tests at
-the same time [@park2017].
+available, allowing for more rapid testing and running multiple tests
+simultaneously [@park2017].

 Since these early days of automation in the last fifty years, the
 clinical laboratory has rapidly expanded automation techniques. These
@@ -31,17 +31,17 @@ to go along with the advances in instrument technology.

 Over the past few decades, Laboratory Information Systems (LIS) have
 evolved from relatively narrow, often arcane, or home-grown systems into
-sophisticated systems that are more user-friendly and support a broader
-range of functions and integration with other technologies that
-laboratories deploy [@henricks2015]. Modern LISs consist of complex,
-interrelated computer programs and infrastructure that support
-laboratories' vast array of information-processing needs. LISs have
-functions in all phases of patient testing, including specimen and test
-order intake, specimen processing and tracking, support of analysis and
-interpretation, and report creation and distribution. In addition, LISs
-provide management reports and other data that laboratories need to run
-their operations and to support continuous improvement and quality
-initiatives [@henricks2015].
+sophisticated, more user-friendly systems that support a broader range
+of functions and integration with other technologies that laboratories
+deploy [@henricks2015]. Modern LISs consist of complex, interrelated
+computer programs and infrastructure that support laboratories' vast
+array of information-processing needs. LISs have functions in all phases
+of patient testing, including specimen and test order intake, specimen
+processing and tracking, support of analysis and interpretation, and
+report creation and distribution. In addition, LISs provide management
+reports and other data that laboratories need to run their operations
+and to support continuous improvement and quality initiatives
+[@henricks2015].

 The clinical laboratory's primary business purpose is to provide testing
 results requested by physicians and other healthcare professionals. In a
@@ -72,21 +72,19 @@ testing using machine learning algorithms.

 ## Purpose and Research Statement

-<!--# need to rewrite this question -->
-
 Develop and test a machine learning algorithm to establish if said
-algorithm can perform better than current hard-coded rules to reduce
-unnecessary patient testing.
+algorithm can predict either the Free T4 result or the laboratory
+diagnosis of hyper- or hypothyroidism.

 ## Significance

-Health spending in the U.S. increased by 4.6% in 2019 to \$3.8 trillion
-or \$11,582 per capita. This growth rate is in line with 2018 (4.7
-percent) and slightly faster than what was observed in 2017 (4.3
-percent) [@americanmedicalassociation2021]. Although laboratory costs
-comprise only about 5% of the healthcare budget in the United States, it
-is estimated that laboratory services drive up to 70% of all downstream
-medical decisions, which encompass a substantial portion of the budget
+U.S. health spending increased by 4.6% in 2019 to \$3.8 trillion or
+\$11,582 per capita. This growth rate is in line with 2018 (4.7%)
+and slightly faster than what was observed in 2017 (4.3%)
+[@americanmedicalassociation2021]. Although laboratory costs comprise
+only about 5% of the healthcare budget in the United States, it is
+estimated that laboratory services drive up to 70% of all downstream
+medical decisions, encompassing a substantial portion of the budget
 [@ma2019]. As healthcare budgets increase, payers, including Medicare,
 commercial insurers, and employers, will demand accountability and
 eliminate the abuse and misuse of ineffective testing strategies
@@ -110,7 +108,7 @@ thyroid dysfunction in most clinical settings [@plebani2020].
 Traditionally the cutoff for reflex testing was simply the reference
 range for a patient's sex and race. However, recent studies have
 suggested that widening these ranges reduces reflex testing by up to 34%
-[@plebani2020]. In an additional study, the authors concluded that the
+[@plebani2020]. In additional research, the authors concluded that the
 TSH reference range leading to reflex Free T4 testing could likely be
 widened to decrease the number of unnecessary Free T4 measurements
 performed. This reduction would reduce overall costs to the medical
--- a/chapter2.qmd
+++ b/chapter2.qmd
@@ -117,7 +117,7 @@ computed by aggregating the errors across the entire validation data set
 <!--# Table needs to be modified -->

 | **Author and Year** | **Objective and Machine Learning Task**                                                                         | **Best Model** | **Major Themes** |
-|-------------|---------------------------------|-------------|-------------|
+|---------------------|-----------------------------------------------------------------------------------------------------------------|----------------|------------------|
 | Azarkhish (2012)    | Predict iron deficiency anemia and serum iron levels from CBC indices                                           | Neural Network | Prediction       |
 | Cao (2012)          | Triage manual review for urinalysis samples                                                                     | Tree-based     | Automation       |
 | Yang (2013)         | Predict normal reference ranges of ESR for various laboratories based on geographic and other clinical features | Neural Network | Interpretation   |
--- a/chapter3.qmd
+++ b/chapter3.qmd
@@ -1,10 +1,23 @@
 # Methods

-need brief description and IRB statement
+## IRB
+
+Based on the information submitted for this project, the Campbell
+University Institutional Review Board (Campbell IRB) determined that
+this study is Not Human Subjects Research as defined by 45 CFR
+46.102(e).

 ## Population and Data

-This study was designed using the Medical Information Mart for Intensive Care (MIMIC) database [@johnsonalistair]. MIMIC (Medical Information Mart for Intensive Care) is an extensive, freely-available database comprising de-identified health-related data from patients who were admitted to the critical care units of the Beth Israel Deaconess Medical Center. The database contains many different types of information, but only data from the patients and laboratory events table are used in this study. The study uses version IV of the database, comprising data from 2008 - 2019.
+This study used the Medical Information Mart for Intensive Care
+(MIMIC) database [@johnsonalistair]. MIMIC is an extensive, freely
+available database comprising de-identified health-related data
+from patients who were admitted to the critical care units of the
+Beth Israel Deaconess Medical Center. The database contains many
+different types of information, but only data from the patients
+and laboratory events tables are used in this study. The study
+uses version IV of the database, comprising data from 2008 to
+2019.

 ## Data Variables and Outcomes

@@ -15,19 +28,37 @@ source(here::here("ML","1-data-exploration.R"))

 ```

-A total of 18 variables were chosen for this study. The age and gender of the patient were pulled from the patient table in the MIMIC database. While this database contains some additional demographic information, it is incomplete and thus unusable for this study. 15 lab values were selected for this study, this includes:
+A total of 18 variables were chosen for this study. The age and gender
+of the patient were pulled from the patient table in the MIMIC database.
+While this database contains some additional demographic information, it
+is incomplete and thus unusable for this study. Fifteen lab values were
+selected for this study:

-   **BMP**: BUN, bicarbonate, calcium, chloride, creatinine, glucose, potassium, sodium
+-   **BMP**: BUN, bicarbonate, calcium, chloride, creatinine, glucose,
+    potassium, sodium

-   **CBC**: Hematocrit, hemoglobin, platelet count, red blood cell count, white blood cell count
+-   **CBC**: Hematocrit, hemoglobin, platelet count, red blood cell
+    count, white blood cell count

 -   TSH

 -   Free T4

-The unique patient id and chart time were also retained for identifying each sample. Each sample contains one set of 16 lab values for each patient. Patients may have several samples in the data set run at different times. Rows were retained as long as they had less than three missing results. These missing results can be filled in by imputation later in the process. Samples were also filtered for those with TSH above or below the reference range of 0.27 - 4.2 uIU/mL. These represent samples that would have reflexed for Free T4 testing. After filtering, the final data set contained `r nrow(ds1)` rows.
+The unique patient ID and chart time were also retained to identify
+each sample. Each sample contains one set of 15 lab values for each
+patient. Patients may have several samples in the data set run at
+different times. Rows were retained as long as they had fewer than three
+missing results. These missing results can be filled in by imputation
+later in the process. Samples were also filtered to retain only those
+with TSH outside the reference range of 0.27 to 4.2 uIU/mL. These
+represent samples that would have reflexed for Free T4 testing. After
+filtering, the final data set contained `r nrow(ds1)` rows.

-Once the final data set was collected, an additional column was created for the outcome variable to determine if the Free T4 value was diagnostic. This outcome variable was used for building classification models. The classification variable was not used in regression models. @tbl-outcome_var shows how the outcomes were added
+Once the final data set was collected, an additional column was created
+for the outcome variable to determine if the Free T4 value was
+diagnostic. This outcome variable was used for building classification
+models. The classification variable was not used in regression models.
+@tbl-outcome_var shows how the outcomes were assigned.

 | TSH Value     | Free T4 Value | Outcome             |
 |---------------|---------------|---------------------|
@@ -38,7 +69,14 @@ Once the final data set was collected, an additional column was created for the

 : Outcome Variable {#tbl-outcome_var}

-. @tbl-data_summary shows the summary statistics of each variable selected for the study. Each numeric variable is listed with the percent missing, median, and interquartile range (IQR). The data set is weighted toward elevated TSH levels, with 80% of values falling into that category. Glucose and Calcium have several missing values at `r gtsummary::inline_text(summary_tbl, variable = GLU, column = n)` and `r gtsummary::inline_text(summary_tbl, variable = CA, column = n)`, respectively.
+@tbl-data_summary shows the summary statistics of each variable
+selected for the study. Each numeric variable is listed with the percent
+missing, median, and interquartile range (IQR). The data set is weighted
+toward elevated TSH levels, with 80% of values falling into that
+category. Glucose and calcium have several missing values at
+`r gtsummary::inline_text(summary_tbl, variable = GLU, column = n)` and
+`r gtsummary::inline_text(summary_tbl, variable = CA, column = n)`,
+respectively.

 ```{r}
 #| label: tbl-data_summary
@@ -50,19 +88,40 @@ summary_tbl %>% gtsummary$as_kable()

 ## Data Inspection

-By examining @tbl-data_summary several important data set characteristics quickly come to light without explanation. The median age across the data set, as a whole, is quite similar, with a median age across all categories of 62.5. Females are better represented in the data set, with higher percentages in all categories. Across all categories, the median values for each lab result are quite similar. The expectation for this is Red Blood cells, which show larger variation across the various categories.
+By examining @tbl-data_summary, several important data set
+characteristics quickly come to light. The median age is quite
+similar across categories, with an overall median age of 62.5.
+Females are better represented in the data set, with higher
+percentages in all categories. Across all categories, the median
+values for each lab result are quite similar. The exception is
+red blood cell count, which shows larger variation across the
+categories.

-![Distribution of Variables](figures/distrubution_histo){#fig-distro_histo}
+![Distribution of
+Variables](figures/distrubution_histo){#fig-distro_histo}

-When examining @fig-distro_histo, many clinical chemistry values do not show a standard distribution. However, the hematology results typically do appear to follow a standard distribution. While not a problem for most tree-based classification models, many regression models perform better with standard variables. Standardizing variables provides a common comparable unit of measure across all the variables [@boehmke2020]. Since lab values do not contain negative numbers, all numeric values will be log-transformed to bring them to normal distributions.
+When examining @fig-distro_histo, many clinical chemistry values do not
+show a normal distribution. However, the hematology results typically
+do appear to follow a normal distribution. While not a problem for
+most tree-based classification models, many regression models perform
+better with standardized variables. Standardizing variables provides a
+common comparable unit of measure across all the variables
+[@boehmke2020]. Since lab values do not contain negative numbers, all
+numeric values will be log-transformed to bring them closer to normal
+distributions.

 ![Variable Correlation Plot](figures/corr_plot){#fig-corr_plot}

-@fig-corr_plot shows a high correlation between better Hemoglobin, hematocrit, and Red Blood Cell values (as would be expected). While high correlation does not lead to model issues, it can cause unnecessary computations with little value. However, due to the small about of variables to begin with
+@fig-corr_plot shows a high correlation between hemoglobin,
+hematocrit, and red blood cell values (as would be expected). While high
+correlation does not lead to model issues, it can cause unnecessary
+computations with little value. However, due to the small number of
+variables to begin with

 ## Data Tools

-All data handling and modeling were performed using R and R Studio. The current report was rendered in the following environment.
+All data handling and modeling were performed using R and RStudio. The
+current report was rendered in the following environment.

 ```{r}
 #| label: tbl-platform-info
@@ -116,4 +175,34 @@ knitr::kable(

 ## Model Selection

-This section will be updated to explain how both the final classification and regression model were chosen. The model building is still in progress. Also will be updating the research question to reflect the final product and just better wording.
+Both classification and regression models were screened using a random
+grid search to tune hyperparameters. The models were tested against the
+training data set to find the best-fit model. @fig-reg-screen shows the
+results of the model screening for regression models, using root mean
+square error (RMSE) as the ranking method. Random Forest models and
+boosted trees performed similarly and were selected for further testing.
+A full grid search was performed on both models, with a Random Forest
+model as the final selection. The final hyperparameters selected were:
+
+-   mtry: 8
+
+-   trees: 1000
+
+-   minimum nodes: 2
+
+![Regression Model Screen](figures/reg_screen){#fig-reg-screen}
+
+@fig-class-screen shows the results of the model screen for
+classification models using accuracy as the ranking method. As with
+regression models, boosted trees and random forest models performed the
+best. After completing a full grid search of both model types, a random
+forest model was again chosen as the final model. The final
+hyperparameters for the model selected were:
+
+-   mtry: 8
+
+-   trees: 2000
+
+-   minimum nodes: 2
+
+![Classification Model Screen](figures/class_screen){#fig-class-screen}
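
The hyperparameters reported in the Model Selection section could be written as tidymodels model specifications. This is a minimal sketch, not the project's actual modeling code. It assumes the `parsnip` package, assumes the `ranger` engine (the commit does not name an engine), and assumes that "minimum nodes" corresponds to parsnip's `min_n` argument. The object names `reg_spec` and `class_spec` are hypothetical.

```r
library(parsnip)

# Regression specification with the reported values:
# mtry = 8, trees = 1000, minimum nodes = 2.
# Assumption: "minimum nodes" maps to parsnip's min_n argument.
reg_spec <- rand_forest(
  mode   = "regression",
  engine = "ranger",   # assumed engine; not stated in the commit
  mtry   = 8,
  trees  = 1000,
  min_n  = 2
)

# Classification specification with the reported values:
# mtry = 8, trees = 2000, minimum nodes = 2.
class_spec <- rand_forest(
  mode   = "classification",
  engine = "ranger",   # assumed engine; not stated in the commit
  mtry   = 8,
  trees  = 2000,
  min_n  = 2
)
```

Either specification would then be combined with a preprocessing recipe in a workflow and fit to the training data; the fitting step is omitted here because the commit does not show it.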