chapter 2 updates
parent
1f96e35379
commit
1c2d279910
3 changed files with 232 additions and 10 deletions
89
chapter2.qmd
|
@ -1 +1,90 @@
|
|||
# Literature Review
|
||||
|
||||
The application of machine learning in medicine has garnered enormous
attention over the past decade [@rabbani2022]. Artificial intelligence
(AI), and especially its subdiscipline machine learning (ML), has become
a topic of increasing interest among laboratory professionals. AI is a
broad term that can be defined as the theory and development of computer
systems that perform complex tasks normally requiring human
intelligence, such as decision-making, visual perception, speech
recognition, and translation between languages. ML is the science of
programming computers so that they can learn from data without being
explicitly programmed [@debruyne2021]. The ever wider use of ML in
clinical and basic medical research is reflected in the number of titles
and abstracts of papers indexed on PubMed: comparing papers published
until 2006 with those published in 2007--2017 shows a nearly 10-fold
increase, from 1000 to slightly more than 9000 articles [@cabitza2018].
A literature review by Rabbani et al. found 39 articles pertaining to
the field of clinical chemistry in laboratory medicine between 2011 and
2021 [-@rabbani2022].
|
||||
|
||||
## A Brief Primer on Machine Learning
|
||||
|
||||
While the aim of this literature review is not to provide an extensive
treatment of the mathematics behind ML algorithms, some basic concepts
are introduced to allow a sufficient understanding of the topics
discussed in this paper. ML models can be classified into broad
categories based on several criteria, such as the type of supervision,
whether or not the algorithm can learn incrementally from an incoming
stream of data (batch versus online learning), and how they generalize
(instance-based versus model-based learning) [@debruyne2021]. Rabbani et
al. further classified the specific clinical chemistry uses into five
broad categories: predicting laboratory test values, improving
laboratory utilization, automating laboratory processes, promoting
precision laboratory test interpretation, and improving laboratory
medicine information systems [-@rabbani2022].
|
||||
|
||||
### Supervised vs Unsupervised Learning
|
||||
|
||||
Four important categories can be distinguished based on the amount and
type of supervision the models receive during training: supervised,
unsupervised, semi-supervised, and reinforcement learning. In supervised
learning, the training data are labeled, so the model learns to predict
new samples with knowledge of the desired solutions [@debruyne2021].
Supervised models are typically used for classification and regression.
Some of the most important supervised algorithms are linear regression,
logistic regression, k-nearest neighbors (KNN), support vector machines
(SVMs), decision trees (DTs), random forests (RFs), and supervised
neural networks. In unsupervised learning, the training data are
unlabeled. In other words, observations are grouped without any prior
knowledge of the data samples [@debruyne2021]. Unsupervised algorithms
can be used for clustering (e.g. k-means clustering, density-based
spatial clustering of applications with noise, hierarchical cluster
analysis), visualization and dimensionality reduction (e.g. principal
component analysis (PCA), kernel PCA, locally linear embedding,
t-distributed stochastic neighbor embedding), anomaly detection and
novelty detection (e.g. one-class SVM, isolation forest), and
association rule learning (e.g. apriori, eclat). Some models can also
deal with partially labeled training data (i.e. semi-supervised
learning). Finally, in reinforcement learning, an agent (i.e. the
learning system) learns what actions to take to optimize the outcome of
a strategy (i.e. a policy) or to get the maximum cumulative reward
[@debruyne2021]. This process resembles a human learning to ride a bike
and is typically used for learning games, such as Go, chess, or even
poker, or in settings where the outcome is continuous rather than
dichotomous (i.e. right or wrong) [@debruyne2021]. The proposed study
will use supervised learning, as the data are labeled and a particular
outcome is expected.
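The difference between the supervised and unsupervised settings can be
illustrated with a minimal sketch. It is written in Python using only
the standard library, and the measurements, labels, and function names
are hypothetical and chosen purely for illustration. The first function
uses known labels (a one-nearest-neighbor classifier), while the second
groups the same values without labels (k-means with two clusters).

```python
from statistics import mean

# Hypothetical one-dimensional measurements (arbitrary units).
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
labels = ["low", "low", "low", "high", "high", "high"]  # known outcomes


def nearest_neighbor_predict(train_x, train_y, x):
    """Supervised: return the label of the closest labeled training point."""
    closest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[closest]


def two_means(xs, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (k-means, k=2)."""
    centers = [min(xs), max(xs)]  # simple deterministic initialization
    groups = [[], []]
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Keep the old center if a cluster happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


prediction = nearest_neighbor_predict(values, labels, 4.5)
centers, clusters = two_means(values)
print(prediction)
print(clusters)
```

The classifier needs the `labels` list to make a prediction, whereas
the clustering function never sees it and recovers the two groups from
the values alone.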
|
||||
|
||||
### Machine Learning Workflow
|
||||
|
||||
Because this study will use supervised learning, the rest of this review
focuses on that setting. Machine learning can be broken into three broad
steps: data cleaning and processing; training and testing the model; and
evaluating, deploying, and monitoring the model [@debruyne2021]. In the
first phase, data are collected, cleaned, and labeled. Data cleaning, or
pre-processing, is one of the most important steps in designing a
reliable model [@debruyne2021]. Some examples of common pre-processing
steps are handling of missing data, detection of outliers, and encoding
of categorical data. At this stage the data are also split into training
and testing sets, typically following something near a 70-30 split.
These two data sets are used for different portions of the rest of model
building. The training set is used to develop feature sets, train the
algorithms, tune hyperparameters, compare models, and carry out all of
the other activities required to choose a final model (e.g., the model
we want to put into production) [@boehmke2020]. Once the final model is
chosen, the test set is used to obtain an unbiased estimate of the
model's performance, which we refer to as the generalization error
[@boehmke2020]. Most of the time spent on a project (as much as 80%) is
invested in the data processing stage.
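The splitting step described above can be sketched as follows. This is
a minimal illustration in Python using only the standard library. The
function name `train_test_split`, the seed, and the integer records
standing in for cleaned patient rows are assumptions made for the
example, not part of the proposed study.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and split it into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    cutoff = round(len(shuffled) * train_fraction)
    return shuffled[:cutoff], shuffled[cutoff:]


# Hypothetical record identifiers standing in for cleaned patient rows.
records = list(range(100))
train, test = train_test_split(records)
print(len(train), len(test))
```

Shuffling before splitting avoids an ordering effect (for example,
records sorted by admission date), and fixing the seed keeps the split
reproducible so that the test set stays untouched during model
selection.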
|
||||
|
||||
####
|
||||
|
|
97
index.qmd
|
@ -1,31 +1,108 @@
|
|||
# Introduction
|
||||
|
||||
The primary business purpose of the clinical laboratory is to provide
results of testing requested by physicians and other healthcare
professionals. This testing, in a broad sense, is used to help solve
diagnostic problems [@verboeket-vandevenne2012]. Laboratory
professionals can add value beyond running the requested tests through
both reflex and reflective testing. Automated analyzers add most tests
based on rules (algorithms) established by laboratory professionals;
this is defined as 'reflex testing.' Clinical biochemists add the
remainder of tests after considering a more comprehensive range of
information than can readily be incorporated into reflex testing
algorithms; this is defined as 'reflective testing' [@srivastava2010].
Both reflex and reflective testing became possible with the advent of
laboratory information systems (LIS) that were sufficiently flexible to
permit modification of existing test requests at various stages of the
analytical process [@srivastava2010]. This research study will focus
specifically on reflex testing, that is, tests added automatically by a
set of rules established in each laboratory. In most current clinical
laboratories, reflex testing is performed with a 'hard' cutoff, using a
specifically established range with no means of flexibility
[@murphy2021].
|
||||
|
||||
<!--# Rewrite this section -->
|
||||
|
||||
This study will examine the use of machine learning to develop
algorithms that allow flexibility in automatic reflex testing in
clinical chemistry. The goal is to fill the gap between hard-coded
reflex testing and fully manual reflective testing.
|
||||
|
||||
<!--# -->
|
||||
|
||||
## Statement of Problem
|
||||
|
||||
## Purpose and Research Question
|
||||
## Purpose and Research Statement
|
||||
|
||||
Develop and test a machine learning algorithm to further reduce unnecessary reflex testing.
|
||||
Develop and test a machine learning algorithm to establish whether it
can perform better than the current hard-coded rules at reducing
unnecessary patient testing.
|
||||
|
||||
## Significance
|
||||
|
||||
Health spending in the U.S. increased by 4.6% in 2019 to \$3.8 trillion,
or \$11,582 per capita. This growth rate is in line with 2018 (4.7%) and
slightly faster than what was observed in 2017 (4.3%)
[@americanmedicalassociation2021]. Although laboratory costs comprise
only about 5% of the healthcare budget in the United States, it is
estimated that laboratory services drive up to 70% of all downstream
medical decisions, which encompass a substantial portion of the budget
[@ma2019]. As healthcare budgets increase, payers, including Medicare,
commercial insurers, and employers, will demand accountability and
eliminate the abuse and misuse of ineffective testing strategies
[@hernandez2003]. Increasingly, payers demand to know the value of the
tests, with value equaling quality per unit of cost. Payers want
laboratories to prove that tests are cost-effective; as reimbursement
rates decline for many standard laboratory tests, the incentives for
automated reflex testing rise for many clinical laboratories
[@hernandez2003]. Unnecessary laboratory tests are a significant source
of waste in the United States healthcare system. Prior studies suggest
that 20% of laboratory tests performed are unnecessary, wasting \$200
billion each year [@li2022].
|
||||
|
||||
A typical example of reflex testing is thyrotropin (TSH) reflexing to
free thyroxine (Free T4 or FT4). TSH measurement is a sensitive
screening test for thyroid dysfunction. Guidelines from the American
Thyroid Association, the American Association of Clinical
Endocrinologists, and the National Academy of Clinical Biochemistry have
endorsed TSH measurement as the best first-line strategy for detecting
thyroid dysfunction in most clinical settings [@plebani2020].
Traditionally, the cutoff for reflex testing was simply the reference
range for a patient's sex and race. However, recent studies have
suggested that widening these ranges reduces reflex testing by up to 34%
[@plebani2020]. In an additional study, the authors concluded that the
TSH reference range leading to reflex Free T4 testing could likely be
widened to decrease the number of unnecessary Free T4 measurements
performed. This reduction would lower overall costs to the medical
system, likely without the negative consequence of missing people with
thyroid hormone abnormalities [@whitneyw.woodmansee2018].
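A hard-cutoff reflex rule of this kind, and the effect of widening its
range, can be sketched in a few lines of Python. The TSH values and both
ranges below are hypothetical numbers chosen for illustration; they are
not taken from the cited studies, and `needs_reflex_ft4` is an
illustrative name rather than a function from any laboratory system.

```python
def needs_reflex_ft4(tsh, lower, upper):
    """Hard-cutoff rule: reflex to Free T4 when TSH falls outside [lower, upper]."""
    return tsh < lower or tsh > upper


# Hypothetical TSH results (mIU/L), not real patient data.
tsh_results = [0.2, 0.35, 0.9, 2.1, 3.8, 4.3, 5.5, 7.2, 12.0]

# Both ranges are illustrative assumptions, not values from the cited studies.
reference_range = (0.4, 4.0)
widened_range = (0.3, 6.0)

reflexed_reference = sum(needs_reflex_ft4(t, *reference_range) for t in tsh_results)
reflexed_widened = sum(needs_reflex_ft4(t, *widened_range) for t in tsh_results)
print(reflexed_reference, reflexed_widened)
```

Widening the range lowers the number of reflexed samples, but the rule
itself is still a fixed threshold on a single value.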
|
||||
|
||||
<!--# This paragraph should be written -->
|
||||
<!--# This paragraph should be written and most removed-->
|
||||
|
||||
Even with this reduction in testing, the hard-coded rule still exists.
Additionally, machine learning may predict missing values in a patient's
record or even suggest further testing on a particular patient. In a
study at Massachusetts General Hospital, researchers predicted ferritin
results based on laboratory tests that had already been run
[@charnaalbert2020].
|
||||
|
||||
## Proposed Study Setup
|
||||
|
||||
Using the Medical Information Mart for Intensive Care (MIMIC) IV
database, this study will develop and test a machine learning algorithm
to determine whether TSH reflex testing can be further reduced.
|
||||
|
||||
The MIMIC-IV database contains patient records from 2008 to 2019 for
patients admitted to the critical care units of Beth Israel Deaconess
Medical Center. It is a common database used for various studies. The
data will be cleaned and tidied to contain various patient demographics
and all available laboratory testing for each patient. The exact
structure of the cleaned data will be determined later. Once cleaned,
the data will be split into training and testing data sets. The training
data will be used to develop various machine learning algorithms, with
the aim of finding one that performs better than the hard-coded rules in
place today. The study will primarily focus on TSH reflex testing, as
this is the most common reflex test used in most laboratories. The
hypothesis, however, is that this model could be used for many different
types of reflex testing in the lab.
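One way the comparison between a candidate algorithm and the hard-coded
rule might be scored on the held-out test set is sketched below. This is
only an illustration of the idea, not the study's actual evaluation
plan. The records, the cutoffs, and the `candidate_rule` function (a
stand-in for a trained model's decision) are all hypothetical.

```python
def evaluate_rule(rule, test_set):
    """Count reflex tests ordered and abnormal Free T4 results missed by a rule."""
    ordered = sum(1 for tsh, _ in test_set if rule(tsh))
    missed = sum(1 for tsh, ft4_abnormal in test_set
                 if ft4_abnormal and not rule(tsh))
    return {"ordered": ordered, "missed": missed}


# Hypothetical held-out records: (TSH in mIU/L, whether Free T4 was abnormal).
test_set = [(0.1, True), (0.35, False), (1.5, False), (4.5, False),
            (5.2, False), (8.0, True), (15.0, True)]


def hard_coded_rule(tsh):
    return tsh < 0.4 or tsh > 4.0  # illustrative reference-range cutoff


def candidate_rule(tsh):
    return tsh < 0.3 or tsh > 6.0  # stand-in for a trained model's decision


baseline = evaluate_rule(hard_coded_rule, test_set)
candidate = evaluate_rule(candidate_rule, test_set)
print(baseline, candidate)
```

Reporting both counts matters: a rule that orders fewer reflex tests is
only an improvement if it does not also miss more abnormal results.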
|
||||
|
|
|
@ -135,3 +135,59 @@ PMCID: PMC7961679}
|
|||
url = {https://www.captodayonline.com/can-machine-learning-algorithms-predict-lab-values/},
|
||||
langid = {canadian}
|
||||
}
|
||||
|
||||
@article{rabbani2022,
|
||||
title = {Applications of machine learning in routine laboratory medicine: Current state and future directions},
|
||||
author = {Rabbani, Naveed and Kim, Grace Y. E. and Suarez, Carlos J. and Chen, Jonathan H.},
|
||||
year = {2022},
|
||||
month = {05},
|
||||
date = {2022-05-01},
|
||||
journal = {Clinical Biochemistry},
|
||||
pages = {1--7},
|
||||
volume = {103},
|
||||
doi = {10.1016/j.clinbiochem.2022.02.011},
|
||||
url = {https://www.sciencedirect.com/science/article/pii/S0009912022000595},
|
||||
langid = {en}
|
||||
}
|
||||
|
||||
@article{cabitza2018,
|
||||
title = {Machine learning in laboratory medicine: waiting for the flood?},
|
||||
author = {Cabitza, Federico and Banfi, Giuseppe},
|
||||
year = {2018},
|
||||
month = {04},
|
||||
date = {2018-04-01},
|
||||
journal = {Clinical Chemistry and Laboratory Medicine (CCLM)},
|
||||
pages = {516--524},
|
||||
volume = {56},
|
||||
number = {4},
|
||||
doi = {10.1515/cclm-2017-0287},
|
||||
url = {https://www.degruyter.com/document/doi/10.1515/cclm-2017-0287/html},
|
||||
note = {Publisher: De Gruyter},
|
||||
langid = {en}
|
||||
}
|
||||
|
||||
@article{debruyne2021,
|
||||
title = {Recent evolutions of machine learning applications in clinical laboratory medicine},
|
||||
author = {De Bruyne, Sander and Speeckaert, Marijn M. and Van Biesen, Wim and Delanghe, Joris R.},
|
||||
year = {2021},
|
||||
month = {02},
|
||||
date = {2021-02-17},
|
||||
journal = {Critical Reviews in Clinical Laboratory Sciences},
|
||||
pages = {131--152},
|
||||
volume = {58},
|
||||
number = {2},
|
||||
doi = {10.1080/10408363.2020.1828811},
|
||||
url = {https://doi.org/10.1080/10408363.2020.1828811},
|
||||
note = {Publisher: Taylor \& Francis. PMID: 33045173}
|
||||
}
|
||||
|
||||
@book{boehmke2020,
|
||||
title = {Hands-On Machine Learning with R},
|
||||
author = {Boehmke, Bradley and Greenwell, Brandon},
|
||||
year = {2020},
|
||||
month = {02},
|
||||
date = {2020-02-01},
|
||||
url = {https://bradleyboehmke.github.io/HOML/}
|
||||
}
|
||||
|
|