chapter 2 updates

Kyle Belanger 2022-09-10 18:06:12 -04:00
parent 1f96e35379
commit 1c2d279910
3 changed files with 232 additions and 10 deletions

@@ -1 +1,90 @@
# Literature Review
The application of machine learning in medicine has garnered enormous
attention over the past decade [@rabbani2022]. Artificial intelligence
(AI) and especially the subdiscipline of machine learning (ML) have
become hot topics that are generating increasing interest among
laboratory professionals. AI is a rather broad term and can be defined
as the theory and development of computer systems to perform complex
tasks normally requiring human intelligence, such as decision-making,
visual perception, speech recognition, and translation between
languages. ML is the science of programming computers so that they can
learn from data without being explicitly programmed [@debruyne2021]. The
ever wider use of ML in clinical and basic medical research is reflected
in the number of titles and abstracts of papers indexed on PubMed: those
published in 2007--2017, compared with those published until 2006, show
a nearly 10-fold increase, from 1000 to slightly more than 9000 articles
[@cabitza2018]. A literature review by
Rabbani et al. found 39 articles pertaining to the field of clinical
chemistry in laboratory medicine between 2011 and 2021 [-@rabbani2022].
## A Brief Primer on Machine Learning
While the aim of this literature review is not to provide an extensive
representation of the mathematics behind ML algorithms, some basic
concepts will be introduced to allow a sufficient understanding of the
topics discussed in the paper. ML models can be classified into broad
categories based on several criteria, such as the type of supervision,
whether or not the algorithm can learn incrementally from an incoming
stream of data (batch and online learning), and how they generalize
(instance-based versus model-based learning) [@debruyne2021]. Rabbani et
al. further classified the specific clinical chemistry uses into five
broad categories: predicting laboratory test values, improving
laboratory utilization, automating laboratory processes, promoting
precision laboratory test interpretation, and improving laboratory
medicine information systems [-@rabbani2022].
### Supervised vs Unsupervised Learning
Four important categories can be distinguished based on the amount and
type of supervision the models receive during training: supervised,
unsupervised, semi-supervised, and reinforcement learning. In supervised
learning, training data are labeled and data samples are predicted with
knowledge about the desired solutions [@debruyne2021]. Supervised models
are typically used for classification and regression. Some of the
most important supervised algorithms are Linear Regression, Logistic
Regression, K-Nearest Neighbors (KNN), Support Vector Machines (SVMs),
Decision Trees (DTs), Random Forests (RFs), and supervised neural
networks. In unsupervised learning, training data are unlabeled. In
other words, observations are classified without any prior data sample
knowledge [@debruyne2021]. Unsupervised algorithms can be used for
clustering (e.g. k-means clustering, density-based spatial clustering of
applications with noise, hierarchical cluster analysis), visualization
and dimensionality reduction (e.g. principal component analysis (PCA),
kernel PCA, locally linear embedding, t-distributed stochastic neighbor
embedding), anomaly detection and novelty detection (e.g. one-class SVM,
isolation forest) and association rule learning (e.g. apriori, eclat).
Some models can also deal with partially labeled training data (i.e.
semi-supervised learning). Finally, in reinforcement learning, an agent
(i.e. the learning system) learns what actions to take to optimize the
outcome of a strategy (i.e. a policy) or to get the maximum cumulative
reward [@debruyne2021]. This system resembles humans learning to ride a
bike and can typically be used in learning games, such as Go, chess, or
even poker, or settings where the outcome is continuous rather than
dichotomous (i.e. right or wrong) [@debruyne2021]. The proposed study
will use supervised learning, as the data are labeled and a particular
outcome is expected.
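To make the idea of supervised learning concrete, the sketch below
implements a small k-nearest neighbors classifier, one of the supervised
algorithms listed above. It is an illustration only: the function name
`knn_predict`, the two-feature toy points, and the labels are
assumptions made for this example and are not drawn from the study data.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest neighbors."""
    # Pair each labeled training point with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical labeled training data: two features per observation.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["normal", "normal", "normal", "abnormal", "abnormal", "abnormal"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

The labels are what make this supervised: the algorithm is told the
desired answer for every training observation and uses those answers to
classify new ones.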
### Machine Learning Workflow
Since this study will use supervised learning, this section focuses on
the supervised workflow. Machine learning can be broken into three broad
steps: data cleaning and processing, training and testing the model, and
finally evaluating, deploying, and monitoring the model [@debruyne2021]. In the
first phase data is collected, cleaned, and labeled. Data cleaning or
pre-processing is one of the most important steps in designing a
reliable model [@debruyne2021]. Some examples of common pre-processing
steps are handling of missing data, detection of outliers, and encoding
of categorical data. Data at this stage are also split into training and
testing sets, typically in a ratio near 70-30. These
two data sets are used for different portions of the rest of model
building. The training set is used to develop feature sets, train
algorithms, tune hyperparameters, compare models, and carry out all of
the other activities required to choose a final model (e.g., the model
to be put into production) [@boehmke2020]. Once the final model is
chosen, the test set is used to obtain an unbiased estimate of
the model's performance, referred to as the generalization error
[@boehmke2020]. Most of the effort (as much as 80%) is invested in the
data processing stage.
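The 70-30 split described above can be sketched as follows. This is a
minimal illustration in Python using only the standard library; the
function name `train_test_split`, the seed, and the toy records are
assumptions made for the example, not part of the proposed study's code.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=42):
    """Shuffle the records and split them into training and testing sets."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical toy data: ten patient identifiers.
patients = list(range(10))
train, test = train_test_split(patients)
print(len(train), len(test))
```

Shuffling before cutting avoids a split that simply follows the order in
which records were collected, and the test set is left untouched until a
final model has been chosen.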
####

@@ -1,31 +1,108 @@
# Introduction
The primary business purpose of the clinical laboratory is to provide
results of testing requested by physicians and other healthcare
professionals. This testing in a broad sense is used to help solve
diagnostic problems [@verboeket-vandevenne2012]. Laboratory
professionals can add value to this purpose beyond just running the
requested tests, through both reflective and reflex testing. Automated
analyzers add most tests based on rules
(algorithms) established by laboratory professionals; this is defined as
'reflex testing.' Clinical biochemists add the remainder of tests after
considering a more comprehensive range of information than can readily
be incorporated into reflex testing algorithms; this is defined as
'reflective testing' [@srivastava2010]. Both reflex and reflective
testing became possible with the advent of laboratory information
systems (LIS) that were sufficiently flexible to permit modification of
existing test requests at various stages of the analytical process
[@srivastava2010]. This research study will focus specifically on reflex
testing, those tests added automatically by a set of rules established
in each laboratory. In most current clinical laboratories, reflex
testing is performed with a 'hard' cutoff, using a specifically
established range with no means of flexibility [@murphy2021].
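A hard cutoff of this kind amounts to a single fixed comparison. The
sketch below is illustrative only: the function name
`reflex_ft4_needed` and the default limits are placeholders assumed for
the example, not a validated reference range or any laboratory's actual
rule.

```python
def reflex_ft4_needed(tsh, lower=0.4, upper=4.0):
    """Hard-cutoff rule: reflex to Free T4 when TSH is outside [lower, upper].

    The default limits are illustrative placeholders, not a validated
    reference range.
    """
    return tsh < lower or tsh > upper


# A result just above the cutoff triggers the reflex; one just below does not.
for tsh in (3.99, 4.01):
    print(tsh, reflex_ft4_needed(tsh))
```

The rule considers nothing but the one result and the fixed limits, which
is the lack of flexibility referred to above.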
<!--# Rewrite this section -->
This study will examine the use of machine learning to develop
algorithms that allow flexibility in automatic reflex testing in
clinical chemistry. The goal is to use machine learning algorithms to
fill the gap between hard-coded reflex testing and fully manual
reflective testing.
<!--# -->
## Statement of Problem
## Purpose and Research Question
## Purpose and Research Statement
Develop and test a machine learning algorithm to further reduce unnecessary reflex testing.
Develop and test a machine learning algorithm to establish whether it
can perform better than current hard-coded rules at reducing
unnecessary patient testing.
## Significance
Health spending in the U.S. increased by 4.6% in 2019 to \$3.8 trillion
or \$11,582 per capita. This growth rate is in line with 2018 (4.7
percent) and slightly faster than what was observed in 2017 (4.3
percent) [@americanmedicalassociation2021]. Although laboratory costs
comprise only about 5% of the healthcare budget in the United States, it
is estimated that laboratory services drive up to 70% of all downstream
medical decisions, which encompass a substantial portion of the budget
[@ma2019]. As healthcare budgets increase, payers, including Medicare,
commercial insurers, and employers, will demand accountability and
eliminate the abuse and misuse of ineffective testing strategies
[@hernandez2003]. Increasingly, payers demand to know the value of the
tests, with value equaling quality per unit of cost. Payers want
laboratories to prove that tests are cost-effective; as reimbursement
rates decline for many standard laboratory tests, the incentives for
automated reflex testing rise for many clinical laboratories
[@hernandez2003]. Unnecessary laboratory tests are a significant source
of waste in the United States healthcare system. Prior studies suggest
that 20% of labs performed are unnecessary, wasting 200 billion dollars
each year [@li2022].
A typical example of reflex testing is thyrotropin (TSH) reflexing to
free thyroxine (Free T4 or FT4). TSH measurement is a sensitive
screening test for thyroid dysfunction. Guidelines from the American
Thyroid Association, the American Association of Clinical
Endocrinologists, and the National Academy of Clinical Biochemistry have
endorsed TSH measurement as the best first-line strategy for detecting
thyroid dysfunction in most clinical settings [@plebani2020].
Traditionally the cutoff for reflex testing was simply the reference
range for a patient's sex and race. However, recent studies have
suggested that widening these ranges reduces reflex testing by up to 34%
[@plebani2020]. In an additional study, the authors concluded that the
TSH reference range leading to reflex Free T4 testing could likely be
widened to decrease the number of unnecessary Free T4 measurements
performed. This reduction would lower overall costs to the medical
system, likely without the negative consequence of missing
people with thyroid hormone abnormalities
[@whitneyw.woodmansee2018].
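The effect of widening a cutoff range can be illustrated with a short
calculation. Everything in this sketch is assumed for the example: the
TSH values are made up, and both ranges are hypothetical, so the
resulting percentage does not reproduce the figures reported in the
cited studies.

```python
def count_reflexed(tsh_values, lower, upper):
    """Count results that fall outside the cutoff range and would reflex."""
    return sum(1 for tsh in tsh_values if tsh < lower or tsh > upper)


# Made-up TSH results (mIU/L) for illustration only; not patient data.
tsh_results = [0.2, 0.35, 0.5, 1.2, 2.5, 3.9, 4.2, 4.8, 6.5, 12.0]

narrow = count_reflexed(tsh_results, lower=0.4, upper=4.0)  # hypothetical original range
wide = count_reflexed(tsh_results, lower=0.3, upper=5.0)    # hypothetical widened range
reduction = (narrow - wide) / narrow

print(f"narrow range: {narrow} reflexed, wide range: {wide} reflexed")
print(f"relative reduction: {reduction:.0%}")
```

Results that sit just outside the narrower limits are the ones that stop
reflexing when the range is widened, while markedly abnormal results
still trigger a Free T4 measurement.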
<!--# This paragraph should be written -->
<!--# This paragraph should be written and most removed-->
The reduction in testing aside, the hard-coded rule still exists.
Additionally, machine learning may predict missing values in a patient's
record or even suggest further testing on a particular patient. In a
study at Massachusetts General Hospital, researchers predicted ferritin
results based on laboratory tests that had already been run
[@charnaalbert2020].
## Proposed Study Setup
Using the Medical Information Mart for Intensive Care (MIMIC) IV
Database, this study will develop and test a machine learning algorithm
to determine whether TSH reflex testing can be further reduced.
The MIMIC-IV database contains patient records from 2008 to 2019 for
patients admitted to the critical care units of Beth Israel Deaconess
Medical Center. It is a common database used for various studies. The
data will be cleaned and tidied to contain various patient demographics
and all available laboratory testing for each patient. The exact
structure of the cleaned data will be determined later. Once cleaned, the
data will be split into training and testing data sets. The training
data will be used to develop various machine learning algorithms, with
the aim of finding one that can perform better than the hard-coded
rules in place today. The study will primarily focus on TSH reflex
testing, as this is the most common reflex test used in most
laboratories. The hypothesis, however, is that this model could be used
for many different types of reflex testing in the lab.

@@ -135,3 +135,59 @@ PMCID: PMC7961679}
url = {https://www.captodayonline.com/can-machine-learning-algorithms-predict-lab-values/},
langid = {canadian}
}
@article{rabbani2022,
title = {Applications of machine learning in routine laboratory medicine: Current state and future directions},
author = {Rabbani, Naveed and Kim, Grace Y. E. and Suarez, Carlos J. and Chen, Jonathan H.},
year = {2022},
month = {05},
date = {2022-05-01},
journal = {Clinical Biochemistry},
pages = {1--7},
volume = {103},
doi = {10.1016/j.clinbiochem.2022.02.011},
url = {https://www.sciencedirect.com/science/article/pii/S0009912022000595},
langid = {en}
}
@article{cabitza2018,
title = {Machine learning in laboratory medicine: waiting for the flood?},
author = {Cabitza, Federico and Banfi, Giuseppe},
year = {2018},
month = {04},
date = {2018-04-01},
journal = {Clinical Chemistry and Laboratory Medicine (CCLM)},
pages = {516--524},
volume = {56},
number = {4},
doi = {10.1515/cclm-2017-0287},
url = {https://www.degruyter.com/document/doi/10.1515/cclm-2017-0287/html},
note = {Publisher: De Gruyter},
langid = {en}
}
@article{debruyne2021,
title = {Recent evolutions of machine learning applications in clinical laboratory medicine},
author = {De Bruyne, Sander and Speeckaert, Marijn M. and Van Biesen, Wim and Delanghe, Joris R.},
year = {2021},
month = {02},
date = {2021-02-17},
journal = {Critical Reviews in Clinical Laboratory Sciences},
pages = {131--152},
volume = {58},
number = {2},
doi = {10.1080/10408363.2020.1828811},
url = {https://doi.org/10.1080/10408363.2020.1828811},
note = {Publisher: Taylor & Francis
{\_}eprint: https://doi.org/10.1080/10408363.2020.1828811
PMID: 33045173}
}
@book{boehmke2020,
title = {Hands-On Machine Learning with R},
author = {Boehmke, Bradley and Greenwell, Brandon},
year = {2020},
month = {02},
date = {2020-02-01},
url = {https://bradleyboehmke.github.io/HOML/}
}