Leveraging genetics and feature engineering to boost placement predictability for seed product selection and recommendation by field

US12039463B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12039463-B2
Application number: US-202318114936-A
Country: US
Kind code: B2
Filing date: Feb 27, 2023
Priority date: Oct 24, 2018
Publication date: Jul 16, 2024
Grant date: Jul 16, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An example computer-implemented method includes receiving agricultural data records comprising a first set of yield properties for a first set of seeds grown in a first set of environments, and receiving genetic feature data related to a second set of seeds. The method further includes generating a second set of yield properties for the second set of seeds associated with a second set of environments by applying a model using the genetic feature data and the agricultural data records. In addition, the method includes determining predicted yield performance for a third set of seeds associated with one or more target environments by applying the second set of yield properties, and generating seed recommendations for the one or more target environments based on the predicted yield performance for the third set of seeds. In the present example, the method also includes causing display of the seed recommendations.
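The data flow in the abstract can be illustrated with a small sketch. It is not the patented model: the similarity-weighted average used to estimate yields for untested seeds is an assumed stand-in, and every name and number below (`observed_yield`, `markers`, `impute_yield`, `recommend`, the seed and environment labels) is hypothetical.

```python
from math import sqrt

# Hypothetical yield records for the first set of seeds: (seed, environment) -> yield.
observed_yield = {
    ("A", "dry"): 150.0, ("A", "wet"): 180.0,
    ("B", "dry"): 170.0, ("B", "wet"): 160.0,
}

# Hypothetical genetic feature data: marker allele counts (0, 1, or 2) per seed.
markers = {
    "A": [0, 1, 2, 0],
    "B": [2, 1, 0, 2],
    "C": [0, 1, 2, 1],  # untested seed, genetically close to A
    "D": [2, 0, 0, 2],  # untested seed, genetically close to B
}


def similarity(u, v):
    """Cosine similarity between two marker vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def impute_yield(seed, env, tested):
    """Estimate yield for an untested seed as a similarity-weighted
    average of the tested seeds' yields in the same environment."""
    weights = {t: similarity(markers[seed], markers[t]) for t in tested}
    total = sum(weights.values())
    return sum(w * observed_yield[(t, env)] for t, w in weights.items()) / total


def recommend(target_env, candidates, tested, top_n=1):
    """Rank candidate seeds by estimated yield in the target environment."""
    scored = {s: impute_yield(s, target_env, tested) for s in candidates}
    return sorted(scored, key=scored.get, reverse=True)[:top_n]


print(recommend("dry", ["C", "D"], ["A", "B"]))
print(recommend("wet", ["C", "D"], ["A", "B"]))
```

With this toy data, seed D ranks first for the dry environment and seed C for the wet one, because each inherits most of its estimate from its closest tested relative.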

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: receiving, over a digital data communication network at a server computer system, agricultural data records comprising genetic feature data and environmental data for a set of seeds grown in a set of environments; determining, by the server computer system, using a machine learning model, predicted yield performance for a target set of seeds associated with one or more target environments based on the genetic feature data and the environmental data included in the received agricultural data records and on at least one genomic-by-environmental (GxE) feature(s) for the set of seeds and the set of environments; generating, by the server computer system, seed recommendations for the one or more target environments based on the predicted yield performance for the target set of seeds; and causing display, on a display device communicatively coupled to the server computer system, the seed recommendations.

2. The computer-implemented method of claim 1, wherein the genetic feature data includes genomic marker data.

3. The computer-implemented method of claim 1, wherein the genetic feature data includes a pedigree-based kinship matrix or a gene marker-based kinship matrix.

4. The computer-implemented method of claim 1, wherein the at least one GxE feature(s) includes a non-additive interaction between the genetic feature data and the environmental data for the set of seeds and set of environments.

5. The computer-implemented method of claim 4, wherein determining the predicted yield performance for the target set of seeds includes first applying feature engineering to develop the at least one GxE feature(s).

6. The computer-implemented method of claim 5, wherein applying the feature engineering includes: transforming at least one continuous environmental feature of the environmental data into one or more of multiple distinct environmental categorical features defined by classification criteria; and using the transformed environmental categorical features to generate the predicted yield performance.

7. The computer-implemented method of claim 6, wherein the classification criteria defines the multiple categorical features based on values of the at least one continuous environmental feature.

8. The computer-implemented method of claim 6, wherein the at least one continuous environmental feature includes one or more of: pH, day length, temperature, slope, soil drainage, soil texture, and/or elevation.

9. The computer-implemented method of claim 1, wherein generating the seed recommendations for the one or more target environments is further based on one or more of hybrid or inbred genetic heterotic groups, genetic markers associated with biotech traits or quantitative trait loci, whole genome genetics markers, long-shaped haplotype, inbred BLUP-GCA (best linear unbiased predication—general combining ability) yield, yield related phenotypes, and/or hybrid or inbred disease characteristics.

10. The computer-implemented method of claim 1, wherein predictor variables of the machine learning model include one or more of genomic marker data, genetic cluster data, inbred encoding, genetic kinship matrixes, the at least one GxE feature(s), and/or the environmental data; and wherein a target variable of the machine learning model indicates a probabilistic value ranging from 0 to 1.

11. One or more non-transitory computer-readable storage media storing instructions which when executed by one or more processors of a server computer system, cause the one or more processors to perform the steps of: receiving, over a digital data communication network, agricultural data records comprising genetic feature data and environmental data for a set of seeds grown in a set of environments; determining, using a machine learning model, predicted yield performance for a target set of seeds associated with one or more target environments based on the genetic feature data and the environmental data included in the received agricultural data records and on at least one genomic-by-environmental (GxE) feature(s) for the set of seeds and the set of environments; generating seed recommendations for the one or more target environments based on the predicted yield performance for the target set of seeds, and causing display, on a display device communicatively coupled to the server computer system, the seed recommendations.

12. The one or more non-transitory computer-readable storage media of claim 11, wherein the genetic feature data includes genomic marker data.

13. The one or more non-transitory computer-readable storage media of claim 11, wherein the genetic feature data includes a pedigree-based kinship matrix or a gene marker-based kinship matrix.

14. The one or more non-transitory computer-readable storage media of claim 11, wherein the at least one GxE feature(s) includes a non-additive interaction between the genetic feature data and the environmental data for the set of seeds and set of environments.

15. The one or more non-transitory computer-readable storage media of claim 14, wherein the instructions, when executed by one or more processors, cause the one or more processors, in determining the predicted yield performance for the target set of seeds, to perform the step of first applying feature engineering to develop the at least one GxE feature(s).

16. The one or more non-transitory computer-readable storage media of claim 15, wherein the instructions, when executed by one or more processors, cause the one or more processors, in applying the feature engineering further, to perform the steps of: transforming at least one continuous environmental feature of the environmental data into one or more of multiple distinct environmental categorical features defined by classification criteria; and using the transformed environmental categorical features to generate the predicted yield performance.

17. The one or more non-transitory computer-readable storage media of claim 16, wherein the classification criteria defines the multiple categorical features based on values of the at least one continuous environmental feature.

18. The one or more non-transitory computer-readable storage media of claim 16, wherein the at least one continuous environmental feature includes one or more of: pH, day length, temperature, slope, soil drainage, soil texture, and/or elevation.

19. The one or more non-transitory computer-readable storage media of claim 11, wherein generating the seed recommendations for the one or more target environments is further based on one or more of hybrid or inbred genetic heterotic groups, genetic markers associated with biotech traits or quantitative trait loci, whole genome genetics markers, long-shaped haplotype, inbred BLUP-GCA (best linear unbiased predication—general combining ability) yield, yield related phenotypes, and/or hybrid or inbred disease characteristics.

20. The one or more non-transitory computer-readable storage media of claim 11, wherein predictor variables of the machine learning model include one or more of genomic marker data, genetic cluster data, inbred encoding, genetic kinship matrixes, the at least one GxE feature(s), and/or the environmental data; and wherein a target variable of the machine learning model indicates a probabilistic value ranging from 0 to 1.
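The feature engineering recited in the dependent claims (binning a continuous environmental feature into categories, then forming genotype-by-environment interaction terms, with a model output between 0 and 1) might look roughly like the sketch below. The pH cut-offs, the marker names, the weights, and the choice of a logistic function are all assumptions made for illustration; the claims do not specify them.

```python
from math import exp, inf

# Hypothetical classification criteria: (class name, lower bound, upper bound).
PH_CLASSES = [
    ("acidic", 0.0, 6.0),
    ("neutral", 6.0, 7.5),
    ("alkaline", 7.5, inf),
]


def categorize_ph(ph):
    """Map a continuous soil pH value to one categorical class."""
    for name, low, high in PH_CLASSES:
        if low <= ph < high:
            return name
    raise ValueError(f"pH out of range: {ph}")


def gxe_features(marker_counts, ph):
    """Cross each marker count with the one-hot pH class, so a marker's
    effect can differ by environment class (a non-additive interaction)."""
    ph_class = categorize_ph(ph)
    features = {}
    for marker, count in marker_counts.items():
        for name, _, _ in PH_CLASSES:
            features[f"{marker}x{name}"] = count if name == ph_class else 0
    return features


def success_probability(features, weights, bias=0.0):
    """Logistic score in (0, 1); the weights here are hypothetical, not fitted."""
    z = bias + sum(weights.get(k, 0.0) * v for k, v in features.items())
    return 1.0 / (1.0 + exp(-z))


feats = gxe_features({"m1": 2, "m2": 0}, ph=6.8)
print(feats)
print(success_probability(feats, {"m1xneutral": 0.5}))
```

Only the interaction terms for the field's own pH class are non-zero, which is what lets a model assign the same marker a different effect in acidic, neutral, and alkaline soils.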

Assignees

  • Climate LLC

Inventors

Classifications

  • Machine learning · CPC title

  • ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding · CPC title

  • ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations · CPC title

  • G16B10/00 · Primary

    ICT specially adapted for evolutionary bioinformatics, e.g. phylogenetic tree construction or analysis · CPC title

  • Design optimisation, verification or simulation (optimisation, verification or simulation of circuit designs G06F30/30) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12039463B2 cover?
An example computer-implemented method includes receiving agricultural data records comprising a first set of yield properties for a first set of seeds grown in a first set of environments, and receiving genetic feature data related to a second set of seeds. The method further includes generating a second set of yield properties for the second set of seeds associated with a second set of enviro…
Who is the assignee on this patent?
Climate LLC
What technology area does this patent fall under?
Primary CPC classification G16B10/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 16, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).