Medical clinical trial site identification
US-10515099-B2 · Dec 24, 2019 · US
US11494680B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11494680-B2 |
| Application number | US-201815980532-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 15, 2018 |
| Priority date | May 15, 2018 |
| Publication date | Nov 8, 2022 |
| Grant date | Nov 8, 2022 |
Official abstract text for this publication.
A system for predicting subject enrollment for a study includes a time-to-first-enrollment (TTFE) model and a first-enrollment-to-last-enrollment (FELE) model for each site in the study. The TTFE model includes a Gaussian distribution with a generalized linear mixed effects model solved with maximum likelihood point estimation or with Bayesian regression, and the FELE model includes a negative binomial distribution with a generalized linear mixed effects model solved with maximum likelihood point estimation or with Bayesian regression estimation.
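The way the two per-site models fit together can be illustrated with a minimal sketch. It assumes the fitted models have already produced, for each site, a predicted time to first enrollment and a mean enrollment count per period. The names `SitePrediction`, `site_curve`, and `study_cumulative`, the use of whole periods, and the constant per-period rate are all illustrative assumptions, not part of the patent text. The sketch works with expected values only and does not fit the Gaussian or negative binomial mixed effects models themselves.

```python
from __future__ import annotations

import math
from dataclasses import dataclass
from itertools import accumulate


@dataclass
class SitePrediction:
    """Hypothetical per-site outputs of the two fitted models."""
    site_id: str
    ttfe_periods: float     # predicted time to first enrollment, in periods
    rate_per_period: float  # predicted mean enrollment per period afterwards


def site_curve(pred: SitePrediction, horizon: int) -> list[float]:
    """Expected enrollment per period for one site.

    Periods before the predicted first enrollment contribute zero; the
    count prediction applies only from that starting point onward.
    """
    start = math.ceil(pred.ttfe_periods)
    return [0.0 if t < start else pred.rate_per_period for t in range(horizon)]


def study_cumulative(preds: list[SitePrediction], horizon: int) -> list[float]:
    """Sum the per-period site predictions, then accumulate over time."""
    curves = [site_curve(p, horizon) for p in preds]
    per_period = [sum(curve[t] for curve in curves) for t in range(horizon)]
    return list(accumulate(per_period))


# Made-up example values, for illustration only.
preds = [
    SitePrediction("A", ttfe_periods=1.0, rate_per_period=2.0),
    SitePrediction("B", ttfe_periods=2.5, rate_per_period=1.0),
]
cumulative = study_cumulative(preds, horizon=5)
print(cumulative)  # [0.0, 2.0, 4.0, 7.0, 10.0]
```

Re-running `study_cumulative` with refreshed `SitePrediction` values is one simple way to picture a revised prediction after new enrollment history arrives.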
Opening claim text (preview).
The invention claimed is:

1. A method for predicting subject enrollment for a clinical study, comprising:
generating a database of unique healthcare sites, the database including data regarding site enrollment history for at least some of the sites and having no duplicated sites;
splitting the database of unique healthcare sites into a training set and a testing set;
determining training data from the training set based on time to first subject enrollment and enrollment count from the time of first subject enrollment to the time of last subject enrollment;
training a first statistical model to predict a time to first enrollment for each site in the unique healthcare site database using the training data based on time to first subject enrollment;
training a second statistical model to predict enrollment count for periods of time after the time of first subject enrollment for each site in the unique healthcare site database using the training data based on enrollment count from the time of first subject enrollment to the time of last subject enrollment;
generating a clinical study model for predicting subject enrollment by: combining the first and second statistical models for each site by using the predicted time to first enrollment as a starting point for generating the predicted enrollment count for the periods of time after the time of first subject enrollment; and aggregating the predicted enrollment count for each period of time for each site to predict cumulative enrollment for the clinical study for each period of time;
using the clinical study model to generate an initial prediction of subject enrollment for each site in the unique healthcare database;
receiving updated site enrollment history;
using the clinical study model to generate a revised prediction of subject enrollment for each site in the unique healthcare database, wherein the revised prediction improves as site enrollment history increases; and
using at least one of the initial prediction or the revised prediction for each site to improve the efficiency of the clinical study;
wherein:
the first statistical model comprises a Gaussian distribution for the time to first subject enrollment in the training data and a generalized linear mixed effects model for each random effect variable;
the first statistical model converges using maximum likelihood point estimation;
the second statistical model comprises a gamma-Poisson distribution for the enrollment count from the time of first subject enrollment to the time of last subject enrollment in the training data and a generalized linear mixed effects model for each random effect variable; and
the second statistical model converges using Bayesian regression estimation.

2. The method of claim 1, wherein generating a database of unique healthcare sites comprises:
receiving a database of entities;
determining which of the entities is related to healthcare;
applying a gradient boosting model to pairs of healthcare-related entities that have a common geographic characteristic;
calculating a matching probability for each pair of healthcare-related entities;
when the matching probability for a pair of healthcare-related entities at least equals a pre-determined threshold, manually reviewing the pair of healthcare-related entities to determine whether they are a single healthcare site;
when the pair of healthcare-related entities is determined to be a single healthcare site, adding the single healthcare site to the database of unique healthcare sites;
when the matching probability for the pair of healthcare-related entities is less than the pre-determined threshold, adding the healthcare-related entities to the database of unique healthcare sites; and
adding sites from a site master managed database to the database of unique healthcare sites.

3. The method of claim 2, wherein sites from the site master managed database and the database of unique healthcare sites are compared to eliminate duplicate sites and integrate the data about each site.

4. The method of claim 2, wherein the common geographic characteristic is selected from a group consisting of country, state, and zip code.

5. The method of claim 2, wherein the site master managed database is generated by:
receiving a database of study sites;
preparing the information for the study sites;
applying a gradient boosting model to pairs of study sites that have a common geographic characteristic;
calculating a matching probability for each pair of study sites;
when the matching probability for a pair of study sites at least equals a pre-determined second threshold, manually reviewing the pair of study sites to determine whether they are a single study site; and
when the pair of study sites is determined to be a single study site, adding the single study site to the site master managed database.

6. The method of claim 5, wherein when the matching probability for the pair of study sites is less than the pre-determined second threshold, adding the study sites to the database of unique healthcare sites when the names and addresses for the study sites exist and are recognizable.

7. The method of claim 5, wherein after the information for the study sites is prepared, when a first study site is not matched with a second study site having a common geographic characteristic, adding the first study site to the database of unique healthcare sites when the name and address for the first study site exists and is recognizable.

8. The method of claim 5, wherein the common geographic characteristic is selected from a group consisting of country, state, and zip code.
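The pair-matching flow in claims 2 and 5 can be sketched as follows. This is a rough illustration under stated assumptions, not the patented implementation: `route_pairs`, the record layout, and the `0.5` threshold are hypothetical, and `toy_match_prob` is a simple token-overlap score standing in for the gradient boosting model, which the claims do not specify in detail. Only pairs sharing a geographic key (here, zip code) are compared. Pairs at or above the threshold go to manual review, and the rest are treated as distinct sites.

```python
from itertools import combinations


def toy_match_prob(a: dict, b: dict) -> float:
    """Placeholder scorer (Jaccard overlap of name tokens).

    Stands in for a trained gradient boosting model's matching probability.
    """
    ta, tb = set(a["name"].lower().split()), set(b["name"].lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def route_pairs(sites, match_prob, threshold=0.5, block_key="zip"):
    """Split same-block site pairs into manual-review and distinct groups."""
    # Group sites by the shared geographic characteristic.
    blocks = {}
    for site in sites:
        blocks.setdefault(site[block_key], []).append(site)

    review, distinct = [], []
    for members in blocks.values():
        for a, b in combinations(members, 2):
            p = match_prob(a, b)
            target = review if p >= threshold else distinct
            target.append((a["name"], b["name"], p))
    return review, distinct


# Made-up records, for illustration only.
sites = [
    {"name": "City Hospital", "zip": "10001"},
    {"name": "City Hospital Center", "zip": "10001"},
    {"name": "Lakeside Clinic", "zip": "10001"},
    {"name": "City Hospital", "zip": "94105"},
]
review, distinct = route_pairs(sites, toy_match_prob)
for name_a, name_b, p in review:
    print(f"manual review: {name_a!r} vs {name_b!r} (p={p:.2f})")
```

Grouping by the geographic key before pairing keeps the number of comparisons well below the all-pairs count, which is presumably why the claims restrict matching to pairs with a common geographic characteristic.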