# Discussion

Intro Paragraph - In progress<!--# Write after I write everything else -->

## Summary of Results

The findings of this study indicate that, using other commonly ordered
laboratory tests, the diagnostic value of Free T4 can be predicted
accurately 80% of the time. For elevated TSH results alone, the
algorithm had a false positive rate of 2% and a false negative rate of
16%; in the original data, the Free T4 result was non-diagnostic for
hypothyroidism 76% of the time. For decreased TSH results, the algorithm
had a false positive rate of 8% and a false negative rate of 20%; in the
original data, the Free T4 result was non-diagnostic for hyperthyroidism
67% of the time.
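The error rates reported above can be read against a confusion matrix. The following Python sketch assumes the conventional definitions (false positive rate = FP / (FP + TN), false negative rate = FN / (FN + TP)); the study does not state which denominators it used. The class name, the function names, and the counts are illustrative and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary 'Free T4 will be diagnostic' prediction (hypothetical layout)."""
    true_pos: int
    false_pos: int
    true_neg: int
    false_neg: int


def false_positive_rate(c: ConfusionCounts) -> float:
    # Share of truly non-diagnostic results that were predicted diagnostic.
    return c.false_pos / (c.false_pos + c.true_neg)


def false_negative_rate(c: ConfusionCounts) -> float:
    # Share of truly diagnostic results that were predicted non-diagnostic.
    return c.false_neg / (c.false_neg + c.true_pos)


def accuracy(c: ConfusionCounts) -> float:
    # Share of all predictions that matched the observed outcome.
    total = c.true_pos + c.false_pos + c.true_neg + c.false_neg
    return (c.true_pos + c.true_neg) / total


# Made-up counts for illustration only; these are NOT the study's results.
example = ConfusionCounts(true_pos=40, false_pos=5, true_neg=145, false_neg=10)

print(f"FPR: {false_positive_rate(example):.3f}")
print(f"FNR: {false_negative_rate(example):.3f}")
print(f"Accuracy: {accuracy(example):.3f}")
```

Reporting the two error rates separately for the elevated and decreased TSH groups, as done above, keeps a strong result in one group from masking a weaker result in the other.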

While TSH was expected to be the most important variable in the random
forest models, it was unexpected that the next three most important
variables were hematology results. In the clinical laboratory, TSH and
complete blood counts (CBCs) are often run on different analyzers, and
often in completely different departments. This slight correlation could
be valuable in building further algorithms.

## Real World Applications

While the current algorithm did not achieve an accuracy sufficient for
deployment, it is hypothesized that a system like this could be
implemented in clinical decision support systems. As stated previously,
in current practice a physician (or other care provider) orders a TSH,
and if the value is outside the laboratory-established reference range,
a Free T4 is added on. In the current study database, this reflex
testing was non-diagnostic 76% of the time for elevated TSH values and
67% of the time for decreased TSH values. If clinical decision support
first predicted whether the Free T4 would be diagnostic, the care
provider could use this prediction, together with the patient's signs
and symptoms, to determine whether a Free T4 test is needed.
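The workflow described above can be sketched in Python as a minimal illustration. It is not the study's implementation: the reference range limits, the `prob_diagnostic` input (standing in for a model's predicted probability), the 0.5 threshold, and all names are assumptions made for this example. The output is advisory only, since the decision remains with the care provider.

```python
from dataclasses import dataclass

# Hypothetical TSH reference range in mIU/L; each laboratory establishes its own.
TSH_LOW = 0.4
TSH_HIGH = 4.0


@dataclass(frozen=True)
class Advice:
    tsh_status: str        # "normal", "elevated", or "decreased"
    suggest_free_t4: bool  # advisory flag, not an order
    note: str


def classify_tsh(tsh: float, low: float = TSH_LOW, high: float = TSH_HIGH) -> str:
    """Compare a TSH value against the reference range."""
    if tsh > high:
        return "elevated"
    if tsh < low:
        return "decreased"
    return "normal"


def advise(tsh: float, prob_diagnostic: float, threshold: float = 0.5) -> Advice:
    """Combine the TSH status with a model's predicted probability
    that a reflex Free T4 would be diagnostic (hypothetical input)."""
    if not 0.0 <= prob_diagnostic <= 1.0:
        raise ValueError("prob_diagnostic must be between 0 and 1")
    status = classify_tsh(tsh)
    if status == "normal":
        return Advice(status, False, "TSH within reference range; no reflex Free T4.")
    if prob_diagnostic >= threshold:
        return Advice(status, True, "Free T4 predicted to be diagnostic; consider ordering.")
    return Advice(
        status,
        False,
        "Free T4 predicted to be non-diagnostic; weigh signs and symptoms before ordering.",
    )


print(advise(6.5, 0.8))
print(advise(0.1, 0.2))
```

Lowering the threshold would flag more cases for Free T4 testing, trading fewer false negatives for more false positives. Where to set it is a clinical judgment rather than a purely technical one.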

As in Luo et al., the finding that the diagnostic information offered by
Free T4 often duplicates what other diagnostic tests provide suggests a
notion of "informationally" redundant testing [-@luo2016]. It is
speculated that informationally redundant testing occurs in various
diagnostic settings and workups, and that it is much more frequent than
the traditionally defined, narrowly framed notion of redundant testing,
which most often covers unintended duplications of the same or similar
tests. Under this narrow definition, redundant laboratory testing is
estimated to waste more than \$5 billion annually in the United States,
a figure potentially dwarfed by the waste from informationally redundant
testing [@luo2016]. However, since Free T4 and all other tests used in
this study are performed on automated instruments, the cost savings to
the laboratory and the patient may be minimal.

As Rabbani et al. showed, machine learning in the clinical laboratory is
an emerging field, but few existing studies address predicting
laboratory values from other results [-@rabbani2022]. The few studies
that do exist follow a similar premise: all aim to reduce redundant
laboratory testing and thereby lower the patient's cost burden.

## Study Limitations

While the MIMIC-IV database allowed for a first run of the study, it has
limitations compared with other sources of patient results. The MIMIC-IV
database contains results only from ICU patients, so the results may not
be representative of patients typically screened for hyperthyroidism or
hypothyroidism. Tyler et al. found that laboratory value ranges from
critically ill patients deviate significantly from those of healthy
controls [-@tyler2018]. In their study, distribution curves based on ICU
data differed significantly from the hospital standard range (mean
\[SD\] overlapping coefficient, 0.51 \[0.32-0.69\]) [@tyler2018]. The
data also span 2008 to 2019, and during this time there could have been
several unknown laboratory changes: laboratories often change methods,
reference ranges, or even vendors. None of this information is available
in the MIMIC database. A change in method or vendor could cause a shift
in results, causing the algorithm to assign incorrect outcomes.

The dataset also suffers from incompleteness. Because the database was
not explicitly designed for this study, many patients do not have
complete sets of lab results. The study therefore had to select lab
tests so as to retain as many paired TSH and Free T4 results as
possible. For instance, Luo et al. selected a total of 42 different lab
tests for their machine learning study, compared with only 16 selected
for this study [-@luo2016]. The patient demographic data suffered from
the same incompleteness, so only the age and gender of the patient were
used in developing the algorithm. An early study by Schectman et al.
found that the mean TSH level of Blacks was 0.4 (SE 0.053) mU/L lower
than that of Whites after age and sex adjustment, with race explaining
6.5 percent of the variation in TSH levels [-@schectman1991]. This
variation should potentially be accounted for in a future algorithm.
However, as it stands, the current dataset has incomplete data for
patient race and ethnicity.
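The trade-off between the number of lab tests selected and the number of usable result sets can be illustrated with a small Python sketch. It assumes complete-case selection, meaning a record is kept only if every selected test has a value; the study does not describe its exact selection procedure. The test names and values below are invented for the example.

```python
from typing import List, Mapping, Optional, Sequence

Record = Mapping[str, Optional[float]]


def complete_cases(records: Sequence[Record], tests: Sequence[str]) -> List[Record]:
    """Keep only records with a non-missing value for every requested test."""
    return [r for r in records if all(r.get(t) is not None for t in tests)]


# Toy records with made-up values; None marks a test that was not run.
records = [
    {"tsh": 5.2, "free_t4": 0.9, "hemoglobin": 13.1, "ferritin": None},
    {"tsh": 0.2, "free_t4": 1.9, "hemoglobin": None, "ferritin": 80.0},
    {"tsh": 6.8, "free_t4": 0.6, "hemoglobin": 11.4, "ferritin": 45.0},
    {"tsh": 7.1, "free_t4": None, "hemoglobin": 12.0, "ferritin": 60.0},
]

small_panel = ["tsh", "free_t4", "hemoglobin"]
large_panel = small_panel + ["ferritin"]

print(len(complete_cases(records, small_panel)))  # 2
print(len(complete_cases(records, large_panel)))  # 1
```

In this toy data, adding a single test to the panel halves the number of complete records. This is the same pressure that limits the number of tests a study can include when many patients lack complete sets of results.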

As machine learning algorithms become more powerful, it is also
important from an infrastructure standpoint to have processing power
capable of running them. This becomes even more important when putting
the algorithm into practice, as the system must be able to process
results within milliseconds.

## Future Studies

Explain how to fix these issues.