# Discussion Intro Paragraph - In progress ## Summary of Results ## Study Limitations Section overview - In progress ### MIMIC Database While the MIMIC-IV database allowed for a first run of the study, it does suffer from some issues compared to other patient results. The MIMIC-IV database only contains results from ICU patients. Thus the result may not represent normal results for patients typically screened for hyper or hypothyroidism. In a study by Tyler et al., they found that laboratory value ranges from critically ill patients deviate significantly from those of healthy controls [-@tyler2018]. In their study, distribution curves based on ICU data differed significantly from the hospital standard range (mean \[SD\] overlapping coefficient, 0.51 \[0.32-0.69\]) [@tyler2018]. The data ranges from 2008 to 2019. During this time, there could have been several unknown laboratory changes. Often laboratories change methods, reference ranges, or even vendors. None of this data is available in the MIMIC database. A change in method or vendor could cause a shift in results, thus causing the algorithm to assign incorrect outcomes. The dataset also sufferers from incompleteness. Due to the fact the database was not explicitly designed for this study, many patients do not have complete sets of lab results. The study also had to pick and choose lab tests to allow for as many sets of TSH and Free T4 results as possible. For instance, in a study by Luo et al., a total of 42 different lab tests were selected for a Machine Learning study, compared to only 16 selected for this study [-@luo2016]. The patient demographic data also suffered from the same incompleteness. Due to this fact, only the age and gender of the patient were used in developing the algorithm. An early study by Schectman et al. found the mean TSH level of Blacks was 0.4 (SE .053) mU/L lower than that for Whites after age and sex adjustment, race explaining 6.5 percent of the variation in TSH levels [-@schectman1991]. This variation in results should potentially be included in developing a future algorithm. However, as it stands, the current data set has incomplete data for patient race and ethnicity. ### Other Limitations Should I write about my computer? - It is not capable of running the more powerful algorithm ### Future Studies Explain how to fix these issues. ## Real World Applications While the current algorithm did not quite achieve an accuracy ready for deployment, it is hypothesized that a system like this could be implemented in clinical decision-making systems. As stated previously, current practice is a physician (or other care providers) orders a TSH, and if the value is outside laboratory-established reference ranges, the Free T4 is added on. In the current study database, this reflex testing was non-diagnostic 76% of the time for elevated TSH values and 67% for decreased TSH values. Using clinical decision support first to predict whether the Free T4 would be diagnostic, the care provider can use this prediction and other patient signs and symptoms to determine if running a Free T4 lab test is needed. Similarly to Luo et al., the idea that the diagnostic information offered by Free T4 often duplicates what other diagnostic tests provide suggests a notion of "informationally" redundant testing [-@luo2016]. It is speculated that informationally redundant testing occurs in various diagnostic settings and diagnostic workups. It is much more frequent than the more traditionally defined and narrowly framed notion of redundant testing, which most often includes unintended duplications of the same or similar tests. Under this narrow definition, redundant laboratory testing is estimated to waste more than \$5 billion annually in the United States, potentially dwarfed by the waste from informationally redundant testing [@luo2016]. However, since Free T4 and all other tests used in this study are performed on automated instruments, the cost savings to the lab and patient may be minimal.