US9430463B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9430463-B2 |
| Application number | US-201414503128-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 30, 2014 |
| Priority date | May 30, 2014 |
| Publication date | Aug 30, 2016 |
| Grant date | Aug 30, 2016 |
Official abstract text for this publication.
Systems and processes for exemplar-based natural language processing are provided. In one example process, a first text phrase can be received. It can be determined whether editing the first text phrase to match a second text phrase requires one or more of inserting, deleting, and substituting a word of the first text phrase. In response to determining that editing the first text phrase to match the second text phrase requires one or more of inserting, deleting, and substituting a word of the first text phrase, one or more of an insertion cost, a deletion cost, and a substitution cost can be determined. A semantic edit distance between the first text phrase and the second text phrase in a semantic space can be determined based on one or more of the insertion cost, the deletion cost, and the substitution cost.
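The abstract describes a word-level edit distance in which each insertion, deletion, and substitution carries its own cost. The following is a minimal Python sketch of that idea, using a standard dynamic-programming recurrence. The cost formulas, the constants `INSERT_BASE` and `DELETE_BASE`, and the toy `SALIENCE` and `SIMILARITY` tables are illustrative assumptions; the text shown here does not give the patent's actual formulas.

```python
from typing import Callable, Sequence


def semantic_edit_distance(
    first: Sequence[str],
    second: Sequence[str],
    insertion_cost: Callable[[str], float],
    deletion_cost: Callable[[str], float],
    substitution_cost: Callable[[str, str], float],
) -> float:
    """Minimum total cost of word-level edits that turn `first` into `second`."""
    rows, cols = len(first) + 1, len(second) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    # First column: delete every word of `first`.
    for i in range(1, rows):
        dist[i][0] = dist[i - 1][0] + deletion_cost(first[i - 1])
    # First row: insert every word of `second`.
    for j in range(1, cols):
        dist[0][j] = dist[0][j - 1] + insertion_cost(second[j - 1])
    for i in range(1, rows):
        for j in range(1, cols):
            a, b = first[i - 1], second[j - 1]
            sub = 0.0 if a == b else substitution_cost(a, b)
            dist[i][j] = min(
                dist[i - 1][j] + deletion_cost(a),   # delete a
                dist[i][j - 1] + insertion_cost(b),  # insert b
                dist[i - 1][j - 1] + sub,            # keep or substitute
            )
    return dist[-1][-1]


# Hypothetical constants and lookup tables, chosen only for illustration.
INSERT_BASE = 1.0  # stands in for the "first predetermined semantic cost"
DELETE_BASE = 0.5  # stands in for the "second predetermined semantic cost"
SALIENCE = {"the": 0.1, "call": 1.0, "phone": 1.0, "mom": 0.9}
SIMILARITY = {frozenset(("call", "phone")): 0.8}


def salience(word: str) -> float:
    return SALIENCE.get(word, 0.5)


def similarity(a: str, b: str) -> float:
    return 1.0 if a == b else SIMILARITY.get(frozenset((a, b)), 0.0)


def ins_cost(word: str) -> float:
    # Assumed form: base cost scaled by the salience of the inserted word.
    return INSERT_BASE * salience(word)


def del_cost(word: str) -> float:
    # Assumed form: base cost scaled by the salience of the deleted word.
    return DELETE_BASE * salience(word)


def sub_cost(a: str, b: str) -> float:
    # Assumed form: a delete plus an insert, discounted by word similarity.
    return (1.0 - similarity(a, b)) * (del_cost(a) + ins_cost(b))


def distance(first: str, second: str) -> float:
    return semantic_edit_distance(
        first.split(), second.split(), ins_cost, del_cost, sub_cost
    )


if __name__ == "__main__":
    print(distance("call mom", "phone mom"))
    print(distance("call the mom", "call mom"))
```

Under these assumed costs, replacing "call" with the similar word "phone" is cheaper than deleting one and inserting the other, and dropping a low-salience word such as "the" costs very little. Because the insertion base is set higher than the deletion base, the distance is not symmetric.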
Opening claim text (preview).
What is claimed is:

1. A method for processing natural language comprising: at an electronic device: receiving a first text phrase; determining whether editing the first text phrase to match a second text phrase requires one or more of: inserting a first word into the first text phrase, wherein the second text phrase includes the first word; deleting a second word from the first text phrase; wherein the first text phrase includes the second word; and substituting a third word of the first text phrase with a fourth word, wherein the second text phrase includes the fourth word; in response to determining that editing the first text phrase to match the second text phrase requires one or more of inserting the first word into the first text phrase, deleting the second word from the first text phrase, and substituting the third word of the first text phrase with the fourth word, determining one or more of: an insertion cost associated with inserting the first word into the first text phrase; a deletion cost associated with deleting the second word from the first text phrase; and a substitution cost associated with substituting the third word of the first text phrase with the fourth word; determining, based on the one or more of the insertion cost, the deletion cost, and the substitution cost, a semantic edit distance between the first text phrase and the second text phrase in a semantic space, wherein a degree of semantic similarity between the first text phrase and the second text phrase is based on the semantic edit distance; determining, based on the degree of semantic similarity between the first text phrase and the second text phrase, a first intent associated with the first text phrase; and performing, based on the first intent, a task associated with the first text phrase.
2. The method according to claim 1, wherein: the insertion cost associated with inserting the first word into the first text phrase is determined in response to determining that editing the first text phrase to match the second text phrase requires inserting the first word into the first text phrase; and the insertion cost is determined based on a first predetermined semantic cost and a salience of the first word.
3. The method according to claim 1, wherein: the deletion cost associated with deleting the second word from the first text phrase is determined in response to determining that editing the first text phrase to match the second text phrase requires deleting the second word from the first text phrase; and the deletion cost is determined based on a second predetermined semantic cost and a salience of the second word.
4. The method according to claim 1, wherein: the substitution cost associated with substituting the third word of the first text phrase with the fourth word is determined in response to determining that editing the first text phrase to match the second text phrase requires substituting the third word of the first text phrase with the fourth word; and the substitution cost is determined based on a salience of the third word, a salience of the fourth word, a semantic similarity between the third word and the fourth word in the semantic space, a first predetermined semantic cost, and a second predetermined semantic cost.
5. The method according to claim 1, wherein: the insertion cost is determined based on a first predetermined semantic cost and a salience of the first word; the deletion cost is determined based on a second predetermined semantic cost and a salience of the second word; and the first predetermined semantic cost is higher than the second predetermined semantic cost.
6. The method according to claim 1, wherein: the insertion cost is determined based on a first predetermined semantic cost and a salience of the first word; the deletion cost is determined based on a second predetermined semantic cost and a salience of the second word; and the substitution cost is determined based on a salience of the third word, a salience of the fourth word, a semantic similarity between the third word and the fourth word in the semantic space, the first predetermined semantic cost, and the second predetermined semantic cost.
7. The method according to claim 6, wherein: the salience of the first word is based on a frequency of occurrence of the first word in a first corpus; the salience of the second word is based on a frequency of occurrence of the second word in the first corpus; the salience of the third word is based on a frequency of occurrence of the third word in the first corpus; and the salience of the fourth word is based on a frequency of occurrence of the fourth word in the first corpus.
8. The method according to claim 7, wherein the first corpus comprises a plurality of categories that includes a plurality of text phrases, and wherein: the salience of the first word is based on a proportion of the plurality of categories that include the first word; the salience of the second word is based on a proportion of the plurality of categories that include the second word; the salience of the third word is based on a proportion of the plurality of categories that include the third word; and the salience of the fourth word is based on a proportion of the plurality of categories that include the fourth word.
9. The method according to claim 6, wherein: the salience of the first word is based on a normalized entropy of the first word in a second corpus; the salience of the second word is based on a normalized entropy of the second word in the second corpus; the salience of the third word is based on a normalized entropy of the third word in the second corpus; and the salience of the fourth word is based on a normalized entropy of the fourth word in the second corpus.
10. The method according to claim 6, wherein: the salience of the first word is based on whether a first predetermined list of sensitive words includes the first word; the salience of the second word is based on whether a second predetermined list of sensitive words includes the second word; the salience of the third word is based on whether a third predetermined list of sensitive words includes the third word; and the salience of the fourth word is based on whether a fourth predetermined list of sensitive words includes the fourth word.
11. The method according to claim 1, further comprising: determining a centroid distance between a centroid position of the first text phrase in the semantic space and a centroid position of the second text phrase in the semantic space, wherein the degree of semantic similarity between the first text phrase and the second text phrase is based on the centroid distance.
12. The method according to claim 11, wherein the centroid position of the first text phrase is determined based on a semantic position of one or more words of the first text phrase in the semantic space and the centroid position of the second text phrase is determined based on a semantic position of one or more words of the second text phrase in the semantic space.
13. The method according to claim 11, wherein the centroid position of the first text phrase is determined based on a salience of one or more words of the first text phrase and the centroid position of the second text phrase is determined based on a salience of one or more words of the second text phrase.
14. The method according to claim 11, wherein the degree of semantic similarity is based on a linear combination of the semantic edit distance and the centroid distance.
15. The method according to claim 1, wherein the degree of
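
Claims 8 and 11 to 14 mention a salience based on the proportion of categories containing a word, a centroid position for each phrase in the semantic space, and a linear combination of the semantic edit distance with the centroid distance. The Python sketch below illustrates one way these pieces could fit together. The specific formulas (salience as one minus the category proportion, a salience-weighted mean for the centroid, Euclidean distance, and the weight `alpha`) and all data in `CATEGORIES` and `VECTORS` are assumptions made for illustration, not values taken from the patent.

```python
import math
from typing import Dict, List, Sequence


def category_salience(word: str, categories: Dict[str, List[str]]) -> float:
    """Assumed form: 1 - (fraction of categories whose phrases contain the word)."""
    containing = sum(
        any(word in phrase.split() for phrase in phrases)
        for phrases in categories.values()
    )
    return 1.0 - containing / len(categories)


def centroid(
    words: Sequence[str],
    vectors: Dict[str, List[float]],
    salience: Dict[str, float],
) -> List[float]:
    """Salience-weighted mean of the vectors of the words that have a vector."""
    dim = len(next(iter(vectors.values())))
    known = [w for w in words if w in vectors]
    total = sum(salience[w] for w in known)
    if total == 0:
        return [0.0] * dim  # no salient known words: fall back to the origin
    return [
        sum(salience[w] * vectors[w][k] for w in known) / total
        for k in range(dim)
    ]


def combined_distance(
    edit_distance: float, centroid_distance: float, alpha: float = 0.5
) -> float:
    """Linear combination of the two distances; `alpha` is a hypothetical weight."""
    return alpha * edit_distance + (1.0 - alpha) * centroid_distance


# Hypothetical toy corpus (category -> example phrases) and 2-D word vectors.
CATEGORIES = {
    "phone": ["call mom", "call the office"],
    "music": ["play the song"],
    "weather": ["what is the weather"],
}
VECTORS = {"call": [1.0, 0.0], "the": [0.5, 0.5], "mom": [0.0, 1.0]}
WORD_SALIENCE = {w: category_salience(w, CATEGORIES) for w in VECTORS}

if __name__ == "__main__":
    c1 = centroid("call the mom".split(), VECTORS, WORD_SALIENCE)
    c2 = centroid("call mom".split(), VECTORS, WORD_SALIENCE)
    # The edit distance value here is a placeholder input, not a computed result.
    print(combined_distance(edit_distance=0.05, centroid_distance=math.dist(c1, c2)))
```

In this toy corpus "the" appears in every category, so its assumed salience is zero and it does not move the centroid; "call the mom" and "call mom" therefore share a centroid under these assumptions.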
Semantic analysis · CPC title
Calculation of difference between files · CPC title
Physics · mapped topic