Shift invariant loss for deep learning based image segmentation

US11200676B2 · US · B2

Patent metadata
Publication number: US-11200676-B2
Application number: US-202016746340-A
Country: US
Kind code: B2
Filing date: Jan 17, 2020
Priority date: Jan 17, 2020
Publication date: Dec 14, 2021
Grant date: Dec 14, 2021

Abstract

Official abstract text for this publication.

Systems and methods of improving alignment in dense prediction neural networks are disclosed. A method includes identifying, at a computing system, an input data set and a label data set with one or more first parts of the input data set corresponding to a label. The computing system processes the input data set using a neural network to generate a predicted label data set that identifies one or more second parts of the input data set predicted to correspond to the label. The computing system determines an alignment result using the predicted label data set and the label data set and a transformation of the one or more first parts, including a shift, rotation, scaling, and/or deformation, based on the alignment result. The computing system computes a loss score using the transformation, label data and the predicted label data set and updates the neural network based on the loss score.
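The alignment-then-loss sequence in the abstract can be illustrated with a minimal sketch. It handles only the simplest transformation the abstract names (an integer shift), estimates that shift with a circular cross-correlation, and scores the prediction with binary cross-entropy. The function names `best_shift` and `shift_invariant_loss`, the use of NumPy FFTs, the wrap-around shift, and the choice of cross-entropy are all assumptions made for illustration; the patent text shown here does not specify them.

```python
import numpy as np


def best_shift(pred, label):
    """Estimate the integer (dy, dx) shift that best aligns `label` to `pred`.

    Uses circular cross-correlation computed with FFTs:
    corr[s] = sum_x pred[x + s] * label[x].
    Illustrative helper, not taken from the patent.
    """
    corr = np.fft.ifft2(np.fft.fft2(pred) * np.conj(np.fft.fft2(label))).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = corr.shape
    # Map wrap-around indices to signed shifts (e.g. index h - 2 means -2).
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)


def shift_invariant_loss(pred, label, eps=1e-7):
    """Binary cross-entropy between `pred` and the label after shifting it.

    Assumption: the loss is plain BCE and the shift wraps around the borders.
    Returns the loss value and the shift that was applied to the label.
    """
    dy, dx = best_shift(pred, label)
    aligned = np.roll(label, shift=(dy, dx), axis=(0, 1))
    p = np.clip(pred, eps, 1.0 - eps)
    loss = -(aligned * np.log(p) + (1.0 - aligned) * np.log(1.0 - p)).mean()
    return float(loss), (dy, dx)


# Usage: a label mask and a prediction that is correct up to a small offset.
label = np.zeros((32, 32))
label[10:20, 12:22] = 1.0
pred = 0.1 + 0.8 * np.roll(label, shift=(3, -2), axis=(0, 1))
loss, shift = shift_invariant_loss(pred, label)
```

In a training loop, the loss would be computed in a framework that supports automatic differentiation so the network can be updated from it. The shift estimate itself is a discrete argmax and would typically be treated as a constant during that update.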

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: identifying, at a computing system, an input data set; identifying, at the computing system, a label data set that identifies one or more first parts of the input data set that correspond to a particular label; processing, by the computing system, the input data set using a neural network to generate predicted label data set that identifies one or more second parts of the input data set predicted to correspond to the particular label; determining, by the computing system, an alignment result using the predicted label data set and the label data set; determining, by the computing system and based on the alignment result, a transformation that results in a shift, rotation, scaling, and/or deformation of the one or more first parts of the input data set relative to the one or more second parts of the input data set; computing, by the computing system, a loss score using the transformation, label data and the predicted label data set; and updating, by the computing system, the neural network based on the loss score.

2. The method of claim 1, further comprising: receiving an image; and parsing the image into a set of patches, wherein the input data set corresponds to the set of patches.

3. The method of claim 1, further comprising: receiving an audio recording; and parsing the audio recording into a set of clips, wherein the input data set corresponds to the set of clips.

4. The method of claim 1, wherein determining the alignment result includes performing a cross-correlation technique using the predicted label data set and the label data set or a mutual-information technique using the predicted label data set and the label data set.

5. The method of claim 1, wherein computing the loss score includes: co-registering the label data set and the predicted label data set using the alignment result.

6. The method of claim 1, further comprising: cropping the predicted label data set; and padding the predicted label data set with one or more average values, wherein the alignment result is computed with the padded predicted label data set.

7. The method of claim 1, further comprising repeating the steps of processing the data, determining an alignment result, determining a transformation, computing a loss score, and updating the neural network until the loss score converges.

8. A system comprising: one or more processors; and a computer-readable medium storing a plurality of instructions that when executed cause the one or more processors to: identify an input data set; identify a label data set that identifies one or more first parts of the input data set that correspond to a particular label; process the input data set using a neural network to generate predicted label data set that identifies one or more second parts of the input data set predicted to correspond to the particular label; determine an alignment result using the predicted label data set and the label data set; determine, based on the alignment result, a transformation that results in a shift, rotation, scaling, and/or deformation of the one or more first parts of the input data set relative to the one or more second parts of the input data set; compute a loss score using the transformation, label data and the predicted label data set; and update the neural network based on the loss score.

9. The system of claim 8, wherein the plurality of instructions that when executed further cause the one or more processors to: receive an image; and parse the image into a set of patches, wherein the input data set corresponds to a patch of the set of patches.

10. The system of claim 8, wherein the plurality of instructions that when executed further cause the one or more processors to: receive an audio recording; and parse the audio recording into a set of clips, wherein the input data set corresponds to a clip of the set of clips.

11. The system of claim 8, wherein determining the alignment result includes performing a cross-correlation technique using the predicted label data set and the label data set or a mutual-information technique using the predicted label data set and the label data set.

12. The system of claim 8, wherein computing the loss score includes: co-registering the label data set and the predicted label data set using the alignment result.

13. The system of claim 8, wherein the plurality of instructions that when executed further cause the one or more processors to: crop the predicted label data set; and pad the predicted label data set with one or more average values, wherein the alignment result is computed with the padded predicted label data set.

14. The system of claim 8, wherein the plurality of instructions that when executed further cause the one or more processors to repeat the steps of processing the data, determining an alignment result, determining a transformation, computing a loss score, and updating the neural network until the loss score converges.

15. A non-transitory computer-readable medium storing a plurality of instructions that when executed by one or more processors perform a method comprising: identifying an input data set; identifying a label data set that identifies one or more first parts of the input data set that correspond to a particular label; processing the input data set using a neural network to generate predicted label data set that identifies one or more second parts of the input data set predicted to correspond to the particular label; determining an alignment result using the predicted label data set and the label data set; determining, based on the alignment result, a transformation that results in a shift, rotation, scaling, and/or deformation of the one or more first parts of the input data set relative to the one or more second parts of the input data set; computing a loss score using the transformation, label data and the predicted label data set; and updating the neural network based on the loss score.

16. The non-transitory computer-readable medium of claim 15, wherein the method further comprises: receiving an image; and parsing the image into a set of patches, wherein the input data set corresponds to a patch of the set of patches.

17. The non-transitory computer-readable medium of claim 15, wherein the method further comprises: receiving an audio recording; and parsing the audio recording into a set of clips, wherein the input data set corresponds to a clip of the set of clips.

18. The non-transitory computer-readable medium of claim 15, wherein determining the alignment result includes performing a cross-correlation technique using the predicted label data set and the label data set or a mutual-information technique using the predicted label data set and the label data set.

19. The non-transitory computer-readable medium of claim 15, wherein computing the loss score includes: co-registering the label data set and the predicted label data set using the alignment result.

20. The non-transitory computer-readable medium of claim 15, wherein the method further comprises: cropping the predicted label data set; and padding the predicted label data set with one or more average values, wherein the alignment result is computed with the padded predicted label data set.
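Two of the dependent claims describe preprocessing steps that are easy to picture in code: parsing an image into patches, and cropping then padding the predicted label data with average values before alignment. The following is a minimal sketch under stated assumptions. The helper names `image_to_patches` and `crop_and_pad_mean`, the non-overlapping tiling, the symmetric crop margin, and the reading of "average values" as the mean of the cropped region are all illustrative choices that the claim text does not spell out.

```python
import numpy as np


def image_to_patches(image, patch):
    """Split a 2-D image into non-overlapping patch x patch tiles.

    Assumption: rows and columns that do not fill a whole tile are dropped.
    Returns an array of shape (num_patches, patch, patch) in row-major order.
    """
    h, w = image.shape
    rows, cols = h // patch, w // patch
    trimmed = image[: rows * patch, : cols * patch]
    tiles = trimmed.reshape(rows, patch, cols, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch, patch)


def crop_and_pad_mean(pred, margin):
    """Crop `margin` pixels from every border, then pad back to the original
    size using the mean of the cropped region as the fill value.

    Assumption: one scalar average is used for the whole border.
    """
    if margin == 0:
        return pred.copy()
    core = pred[margin:-margin, margin:-margin]
    return np.pad(core, margin, mode="constant", constant_values=core.mean())


# Usage: tile a small image, then neutralize the border of a prediction map.
image = np.arange(36, dtype=float).reshape(6, 6)
patches = image_to_patches(image, 3)
prediction = np.random.default_rng(0).random((6, 6))
padded = crop_and_pad_mean(prediction, 1)
```

One plausible motivation for a mean-valued border is that it keeps edge pixels from dominating a cross-correlation score when the prediction and label are compared at different offsets. That rationale is an inference, not a statement from the claims.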

Assignees

Verily Life Sciences LLC

Inventors

Classifications

  • G06V20/695 · Primary

    Preprocessing, e.g. image segmentation · CPC title

  • involving a deformation of the sample pattern or of the reference pattern; Elastic matching · CPC title

  • Shifting the patterns to accommodate for positional errors · CPC title

  • G06T7/11 · Primary

    Region-based segmentation · CPC title

  • Supervised learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11200676B2 cover?
Systems and methods of improving alignment in dense prediction neural networks are disclosed. A method includes identifying, at a computing system, an input data set and a label data set with one or more first parts of the input data set corresponding to a label. The computing system processes the input data set using a neural network to generate a predicted label data set that identifies one o…
Who is the assignee on this patent?
Verily Life Sciences LLC
What technology area does this patent fall under?
Primary CPC classification G06V20/695. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 14, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).