Automated design of primer sets for nucleic acid amplification
US-2024336954-A1 · Oct 10, 2024 · US
US10586609B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10586609-B2 |
| Application number | US-201514926051-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 29, 2015 |
| Priority date | Oct 30, 2014 |
| Publication date | Mar 10, 2020 |
| Grant date | Mar 10, 2020 |
Official abstract text for this publication.
A method and apparatus for determining similarity among gene sequences, for compressing a gene sequence, and for decompressing a gene sequence. The method for determining similarity between a first gene sequence and a second gene sequence includes: moving a sliding window of a predefined length on the first gene sequence and the second gene sequence respectively; extracting a first part $\mathrm{String1}_i$ of the first gene sequence within the sliding window, and a second part $\mathrm{String2}_i$ of the second gene sequence within the sliding window during the $i$-th movement of the sliding window; and determining similarity between the first gene sequence and the second gene sequence based on the first part $\mathrm{String1}_i$ and the second part $\mathrm{String2}_i$. Also provided is an apparatus for the above method.
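The sliding-window comparison in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the default window length and step size, and in particular the mapping from edit distance to a local similarity in [0, 1] are assumptions made here, since the text only says that the local similarity is based on an edit distance and that the overall similarity is the sum of the local values.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a two-row dynamic programming table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def windows(seq: str, length: int, step: int):
    """Yield the part of seq under the window at each movement."""
    for start in range(0, max(len(seq) - length, 0) + 1, step):
        yield seq[start:start + length]


def local_similarity(s1: str, s2: str) -> float:
    """Assumed mapping from edit distance d_i to similarity_i (not from the source)."""
    longest = max(len(s1), len(s2))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(s1, s2) / longest


def sequence_similarity(seq1: str, seq2: str, length: int = 4, step: int = 4) -> float:
    """Sum the local similarities over the N window movements.

    Both sequences are traversed in lockstep; zip stops at the shorter one.
    """
    pairs = zip(windows(seq1, length, step), windows(seq2, length, step))
    return sum(local_similarity(a, b) for a, b in pairs)


print(sequence_similarity("ACGTACGT", "ACGTACGA"))
```

With a window length of 4 and a step size of 4, the two 8-character sequences produce two window pairs. Choosing a step size smaller than the window length makes consecutive windows overlap, which is the case covered by the dependent claims on step size.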
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for determining similarity between a first gene sequence of a first sample and a second gene sequence of a second sample, wherein the first sample is taken from a first organism and the second sample is taken from a second organism, the method comprising: moving a sliding window having a predefined length on the first gene sequence; moving the sliding window on the second gene sequence, simultaneously with moving the sliding window on the first gene sequence; extracting a first part $\mathrm{String1}_i$ of the first gene sequence that is present within the sliding window during an $i$-th movement of the sliding window; extracting a second part $\mathrm{String2}_i$ of the second gene sequence that is present within the sliding window during the $i$-th movement of the sliding window; determining the similarity between the first gene sequence and the second gene sequence based on a similarity between the first part $\mathrm{String1}_i$ and the second part $\mathrm{String2}_i$, thereby identifying the similarity of the first gene sequence and second gene sequence; selecting the first gene sequence as a reference gene sequence for the second gene sequence, based on the similarity between the first gene sequence and the second gene sequence meeting a predefined threshold; storing the reference gene sequence in a memory device; and storing the second gene sequence in the memory device as an identifier for the reference gene sequence plus a difference between the reference gene sequence and the second gene sequence, thereby minimizing an amount of data from the second gene sequence that is stored in the memory device.

2. The computer-implemented method according to claim 1, wherein moving the sliding window on the first gene sequence and moving the sliding window on the second gene sequence further comprises: moving the sliding window at a stepsize that is less than or equal to the predefined length.

3. The computer-implemented method according to claim 1, wherein determining the similarity between the first gene sequence and the second gene sequence based on the similarity between the first part $\mathrm{String1}_i$ and the second part $\mathrm{String2}_i$ during the $i$-th movement of the sliding window, comprises: calculating local similarity, $\mathrm{similarity}_i$, between the first part $\mathrm{String1}_i$ and the second part $\mathrm{String2}_i$; and determining the similarity between the first gene sequence and the second gene sequence based on the local similarity, $\mathrm{similarity}_i$.

4. The computer-implemented method according to claim 3, wherein calculating local similarity, $\mathrm{similarity}_i$, between the first part $\mathrm{String1}_i$ and the second part $\mathrm{String2}_i$ comprises: calculating the local similarity, $\mathrm{similarity}_i$, based on an edit distance, $d_i$, between the first part $\mathrm{String1}_i$ and the second part $\mathrm{String2}_i$.

5. The computer-implemented method according to claim 3, wherein determining similarity between the first gene sequence and the second gene sequence based on the local similarity, $\mathrm{similarity}_i$, comprises: calculating the similarity between the first gene sequence and the second gene sequence based on a formula $\mathrm{similarity} = \sum_{i=1}^{N} \mathrm{similarity}_i$, wherein $N$ is the number of movements of the sliding window.

6. An apparatus for determining similarity between a first gene sequence of a first sample and a second gene sequence of a second sample, wherein the first sample is taken from a first organism and the second sample is taken from a second organism, comprising: a processor device; and a memory communicatively coupled to the processor device, the memory storing a program product that, when executed by the processor device, causes the processor device to carry out steps for determining a similarity between a first gene sequence and a second gene sequence, the steps comprising: moving a sliding window of a predefined length on the first gene sequence; moving the sliding window on the second gene sequence, simultaneously with moving the sliding window on the first gene sequence; extracting a first part $\mathrm{String1}_i$ of the first gene sequence that is present within the sliding window during an $i$-th movement of the sliding window; extracting a second part $\mathrm{String2}_i$ of the second gene sequence that is present within the sliding window during the $i$-th movement of the sliding window; determining the similarity between the first gene sequence and the second gene sequence based on a similarity between the first part $\mathrm{String1}_i$ and the second part $\mathrm{String2}_i$; selecting the first gene sequence as a reference gene sequence for the second gene sequence, based on the similarity between the first gene sequence and the second gene sequence meeting a predefined threshold; storing the reference gene sequence in the memory; and storing the second gene sequence in the memory as an identifier for the reference gene sequence plus a difference between the reference gene sequence and the second gene sequence, thereby minimizing an amount of data from the second gene sequence that is stored in the memory.

7. The apparatus according to claim 6, wherein the moving the sliding window on the first gene sequence and the moving the sliding window on the second gene sequence further comprises: moving the sliding window at a stepsize that is less than or equal to the predefined length.

8. The apparatus according to claim 6, wherein the determining step of the computer-implemented method further comprises: calculating local similarity, $\mathrm{similarity}_i$, between the first part $\mathrm{String1}_i$ and the second part $\mathrm{String2}_i$ during the $i$-th movement of the sliding window; and determining the similarity between the first gene sequence and the second gene sequence based on the local similarity, $\mathrm{similarity}_i$.

9. The apparatus according to claim 8, wherein the calculating step of the computer-implemented method further comprises: calculating the local similarity, $\mathrm{similarity}_i$, based on an edit distance, $d_i$, between the first part $\mathrm{String1}_i$ and the second part $\mathrm{String2}_i$.

10. The apparatus according to claim 8, wherein the calculating step of the computer-implemented method further comprises: calculating the similarity between the first gene sequence and the second gene sequence based on a formula $\mathrm{similarity} = \sum_{i=1}^{N} \mathrm{similarity}_i$, wherein $N$ is the number of movements of the sliding window.

11. The apparatus according to claim 7, wherein the $i$-th movement of the sliding window is one of a plurality of movements of the sliding window, and wherein each movement of the plurality of movements is made at the same stepsize.

12. The apparatus according to claim 11, wherein the stepsize comprises a number of characters of the first gene sequence and of the second gene sequence by which the sliding window is shifted.

13. The apparatus according to claim 6, wherein the computer-implemented method further comprises: extracting a third part of the first gene sequence that is present within the sliding window during an $(i+1)$-th movement of the sliding window, wherein the third part of the first gene sequence contains a different character string from the first part of the first gene sequence; extracting a fourth part of the second gene sequence that is present within the sliding window during the $(i+1)$-th movement of the sliding window, wherein the fourth part of the second gene sequence contains a different character string from the second part of the second gene sequence, wherein the determining the similarity between the first gene sequence and the second gene sequence is further based on a similarity between the third part of the firs
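
The storage step in claims 1 and 6, which keeps the second sequence as a reference identifier plus a difference, might look like the sketch below. The claims do not specify how the difference is encoded, so everything here is an assumption for illustration: the record layout, the function names `encode_against_reference` and `decode`, and the use of Python's standard `difflib.SequenceMatcher` to derive the edits. The similarity threshold check that selects the reference is left out.

```python
from difflib import SequenceMatcher


def encode_against_reference(ref_id: str, ref: str, seq: str) -> dict:
    """Represent seq as a reference identifier plus the edits that turn ref into seq.

    Each edit is (start, end, text): replace ref[start:end] with text.
    Deletions carry an empty text; insertions have start == end.
    """
    matcher = SequenceMatcher(None, ref, seq, autojunk=False)
    diff = [(i1, i2, seq[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]
    return {"ref": ref_id, "diff": diff}


def decode(record: dict, references: dict) -> str:
    """Rebuild the original sequence from its reference and stored edits."""
    ref = references[record["ref"]]
    parts, pos = [], 0
    for start, end, text in record["diff"]:
        parts.append(ref[pos:start])  # unchanged stretch copied from the reference
        parts.append(text)            # replacement or inserted characters
        pos = end
    parts.append(ref[pos:])           # unchanged tail of the reference
    return "".join(parts)


# Hypothetical in-memory store holding the reference sequence once.
references = {"ref-1": "ACGTACGTAC"}
record = encode_against_reference("ref-1", references["ref-1"], "ACGTTCGTACG")
print(record)
print(decode(record, references))
```

The closer the second sequence is to the reference, the shorter the edit list, which is why the claims tie the choice of reference to a similarity threshold.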
ICT specially adapted for sequence analysis involving nucleotides or amino acids · CPC title