Detection of data in a sequence of characters

US9454522B2 · US · B2

Patent metadata
Publication number: US-9454522-B2
Application number: US-201414286838-A
Country: US
Kind code: B2
Filing date: May 23, 2014
Priority date: Jun 6, 2008
Publication date: Sep 27, 2016
Grant date: Sep 27, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of processing a sequence of characters, the method comprising converting the sequence of characters into a sequence of tokens so that each token comprises a lexeme and one of a plurality of token types. Each of the plurality of token types relates to at least one of a plurality of predetermined functions, wherein at least one said token type relates to multiple functions of the plurality of predetermined functions.
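
The idea in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `TokenType` names, the `MONTHS` set, and the `lex` function are all hypothetical, chosen only to show a lexer emitting (lexeme, token type) pairs in which one token type can stand for several functions at once.

```python
import re
from enum import Flag, auto


class TokenType(Flag):
    """Hypothetical token types. Combining two members with `|` models
    a token type that relates to multiple functions."""
    DIGITS = auto()
    WORD = auto()
    MONTH = auto()
    PUNCT = auto()


# Assumption for illustration: a word that is also a month name is ambiguous.
MONTHS = {
    "january", "february", "march", "april", "may", "june", "july",
    "august", "september", "october", "november", "december",
}

# Runs of digits, runs of letters, or a single other non-space character.
TOKEN_RE = re.compile(r"[0-9]+|[A-Za-z]+|[^\sA-Za-z0-9]")


def lex(text):
    """Convert a character sequence into a list of (lexeme, token type) pairs."""
    tokens = []
    for match in TOKEN_RE.finditer(text):
        lexeme = match.group()
        if lexeme.isdigit():
            kind = TokenType.DIGITS
        elif lexeme.isalpha():
            kind = TokenType.WORD
            if lexeme.lower() in MONTHS:
                # One token type covering two functions: plain word or month.
                kind = TokenType.WORD | TokenType.MONTH
        else:
            kind = TokenType.PUNCT
        tokens.append((lexeme, kind))
    return tokens
```

In this sketch, lexing "See you May 5." leaves "May" tagged as both a word and a month, so the decision between the two is deferred to a later stage rather than forced inside the lexer.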

First claim

Opening claim text (preview).

The invention claimed is:

1. A method to process a sequence of characters representing natural or computer language, the method comprising: receiving, by a lexer executing on a processor, the sequence of characters in a communication between users or in output from one of a compiler and an interpreter; converting, by a lexer, the sequence of characters into a sequence of tokens, at least one token comprising a proto-lexeme and a proto-token type, the proto-token type representing multiple token types, each of the multiple token types describing one of a plurality of predetermined classes ascribed to the proto-lexeme within the sequence of characters during the conversion, wherein a parser, executing on the processor, resolves the proto-lexeme into a single lexeme having a single class that is one of the multiple classes represented by the proto-token type, and further resolves the proto-token type into a single token type describing the single class; and when the single class is associated with a user application, presenting, by the processor, an option to update the user application with the single lexeme, wherein the user application includes one of a calendar and an address book.

2. A method according to claim 1, wherein the proto-token type indicates that the proto-lexeme relates to the single class.

3. A method according to claim 1, wherein the predetermined classes describe classes of the lexeme in the context of the sequence of characters.

4. A method according to claim 1, wherein the parser further parses the tokens to detect predetermined types of data in the sequence of characters.

5. A method according to claim 4, wherein each said type of data corresponds to at least one of said multiple classes and a predetermined combination of said multiple classes.

6. A method according to claim 4, wherein the predetermined types of data include at least one of a physical address, an IP address, an e-mail address, a time, a day, a date, and a contact number.

7. A method according to claim 4, wherein the parser further provides a single parsing path in a decision tree for each token type and each proto-token type in the sequence of token.

8. A method of processing data in a sequence of characters representing natural or computer language, comprising: receiving, by a parser executing on a processor, a sequence of tokens, at least one token comprising a proto-lexeme and a proto-token type, the proto-token type representing multiple token types, each of the multiple token types describing one of a plurality of predetermined classes previously ascribed to the proto-lexeme within the sequence of characters received by a lexer in a communication between users or in output from one of a compiler and an interpreter; parsing, by the parser, the sequence of tokens to detect predetermined types of data; resolving, by the parser, the proto-lexeme into a single lexeme having a single class that is one of the multiple classes represented by the proto-token type, and further resolving the proto-token type into a single token type describing the single class; and when the single class is associated with a user application, presenting, by the processor, an option to update the user application with the single lexeme, wherein the user application includes one of a calendar and an address book.

9. A method according to claim 8, the method comprising: providing, by the parser, a single path for each token type and each proto-token type in the sequence of tokens.

10. A method of processing a sequence of characters representing natural or computer language, the method comprising: receiving, by a lexer executing on a processor, the sequence of characters in a communication between users or in output from one of a compiler and an interpreter; converting, by the lexer, the sequence of characters into a sequence of tokens comprising one or more proto-lexemes and one or more corresponding proto-token types, wherein each proto-lexeme is defined as belonging to one of: a first set comprising one class, wherein the proto-token type represents the one class, and a second set comprising a combination of classes, wherein the proto-token type represents the classes in the combination; resolving, by a parser executing on the processor, each proto-lexeme into a single lexeme and the corresponding proto-token type into a single token type; when the single class is associated with a user application, presenting, by the processor, an option to update the user application with the single lexeme, wherein the user application includes one of a calendar and an address book.

11. A method according to claim 10, wherein the resolving comprises providing a single analysis path for each said proto-lexeme in a decision tree.

12. A method according to claim 10, wherein the classes in the combination may include the single class.

13. A method according to claim 10, further comprising converting the sequence of characters into lexemes and proto-lexemes.

14. An apparatus to detect predetermined data in a sequence of characters representing natural or computer language, the apparatus comprising: a processor; a network interface coupled to the processor to receive the sequence of characters in a communication between users or in output from one of a compiler and an interpreter; a lexer executing on the processor to convert the sequence of characters into a sequence of tokens, at least one token comprising a proto-lexeme and a proto-token type that represents multiple token types, each of the multiple token types describing one of a plurality of predetermined classes ascribed to the proto-lexeme within the sequence of characters during the conversion, wherein a parser, executing the processor, resolves the proto-lexeme into a single lexeme having a single class that is one of the multiple classes represented by the proto-token type, and further resolves the proto-token type into a single token type describing the single class; and when the single class is associated with a user application, presenting, by the processor, an option to update the user application with the single lexeme, wherein the user application includes one of a calendar and an address book.

15. An apparatus according to claim 14, wherein the predetermined classes describe classes of the lexeme in the context of the sequence of characters.

16. An apparatus according to claim 14, wherein the parser further parses the tokens to detect predetermined types of data in the sequence of characters.

17. An apparatus according to claim 16, wherein each type of data corresponds to at least one said multiple classes and a predetermined combination said multiple classes.

18. An apparatus according to claim 16, wherein the predetermined types of data include at least one of a physical address, an IP address, an e-mail address, a time, a day, a date, and a contact number.

19. An apparatus according to claim 16, wherein the parser comprises a decision tree having a single parsing path for each token type and each proto-token type in the sequence of tokens.

20. A data processing system for processing a sequence of characters representing natural or computer language comprising: means for receiving the sequence of characters in a communication between users or in output from one of a compiler and an interpreter, means for converting the sequence of characters into a sequence of one or more proto-lexemes and one or more proto-token types, wherein each proto-lexeme is defined as belonging to o
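
The resolution step recited in the independent claims can be sketched roughly as follows. This is an illustration under stated assumptions, not the claimed method: the `Token` class, the class names ("WORD", "MONTH", "NUMBER"), the `APP_FOR_CLASS` mapping, and the one-token lookahead rule are all hypothetical. A token carrying more than one candidate class plays the role of a proto-token, and the parser settles it into a single class before offering an update to a calendar or address book.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Token:
    lexeme: str
    # One class: an ordinary token. Several classes: a proto-token.
    types: FrozenSet[str]


# Hypothetical mapping from a resolved class to a user application.
APP_FOR_CLASS = {"MONTH": "calendar", "PHONE": "address book"}


def resolve_one(tok: Token, nxt: Optional[Token]) -> str:
    """Pick a single class for one token, looking at the next token only."""
    if len(tok.types) == 1:
        return next(iter(tok.types))
    # Assumed rule: a possible month followed by a number is a month.
    if "MONTH" in tok.types and nxt is not None and "NUMBER" in nxt.types:
        return "MONTH"
    # Deterministic fallback for this sketch: first remaining class by name.
    return sorted(tok.types - {"MONTH"})[0]


def resolve(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Resolve every token into a (lexeme, single class) pair."""
    resolved = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        resolved.append((tok.lexeme, resolve_one(tok, nxt)))
    return resolved


def update_options(resolved: List[Tuple[str, str]]) -> List[str]:
    """Build an option prompt for each lexeme whose class maps to an application."""
    return [
        f"Add '{lexeme}' to {APP_FOR_CLASS[kind]}?"
        for lexeme, kind in resolved
        if kind in APP_FOR_CLASS
    ]
```

With these assumptions, "May" followed by "5" resolves to a month and yields a calendar prompt, while "May" followed by an ordinary word resolves to a plain word and yields none. Because each token follows exactly one branch in `resolve_one`, the sketch also loosely mirrors the single parsing path mentioned in claims 7, 9, 11, and 19.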

Assignees

Apple Inc

Inventors

Classifications

  • Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30)

  • Parsing

  • Relational databases

  • Phrasal analysis, e.g. finite state techniques or chunking

  • Lexical analysis, e.g. tokenisation or collocates

Frequently asked questions

What does patent US9454522B2 cover?
A method of processing a sequence of characters, the method comprising converting the sequence of characters into a sequence of tokens so that each token comprises a lexeme and one of a plurality of token types. Each of the plurality of token types relates to at least one of a plurality of predetermined functions, wherein at least one said token type relates to multiple functions of the plurali…
Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/205. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 27, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).