Voice-enabled dialog interaction with web pages

US9690854B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9690854-B2
Application number: US-201314092033-A
Country: US
Kind code: B2
Filing date: Nov 27, 2013
Priority date: Nov 27, 2013
Publication date: Jun 27, 2017
Grant date: Jun 27, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Voice enabled dialog with web pages is provided. An Internet address of a web page is received including an area with which a user of a client device can specify information. The web page is loaded using the received Internet address of the web page. A task structure of the web page is then extracted. An abstract representation of the web is then generated. A dialog script, based on the abstract representation of the web page is then provided. Spoken information received from the user is converted into text and the converted text is inserted into the area.
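
The steps in the abstract (extract the input areas, build an abstract representation, derive a dialog script, insert recognized text) can be illustrated with a minimal sketch. This is not the patented implementation: the names `Field`, `FormExtractor`, `build_dialog_script`, and `fill_form`, the sample page, and the prompt wording are all hypothetical, and speech recognition is replaced by a dictionary of already-recognized text.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Field:
    """One input area of the page (hypothetical abstract representation)."""
    name: str
    kind: str                                   # "text" or "select"
    required: bool = False                      # a simple form constraint
    options: list = field(default_factory=list)


class FormExtractor(HTMLParser):
    """Collects the input areas of a page as a flat list of Field objects."""

    def __init__(self):
        super().__init__()
        self.fields = []
        self._select = None                     # <select> currently open, if any

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)                         # boolean attributes map to None
        if tag == "input" and a.get("type", "text") in ("text", "email", "tel"):
            self.fields.append(Field(a.get("name", ""), "text", "required" in a))
        elif tag == "select":
            self._select = Field(a.get("name", ""), "select", "required" in a)
            self.fields.append(self._select)
        elif tag == "option" and self._select is not None:
            self._select.options.append(a.get("value", ""))

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None


def build_dialog_script(fields):
    """Turns the abstract representation into one (name, prompt) pair per field."""
    script = []
    for f in fields:
        if f.kind == "select":
            prompt = f"Please say your {f.name}. Options are: {', '.join(f.options)}."
        else:
            prompt = f"Please say your {f.name}."
        script.append((f.name, prompt))
    return script


def fill_form(script, recognized_text):
    """Inserts already-recognized text into each area, keyed by field name."""
    return {name: recognized_text[name] for name, _ in script if name in recognized_text}


# Assumed sample page; a real system would load it from the received address.
PAGE = """
<form>
  <input type="text" name="city" required>
  <select name="cabin">
    <option value="economy">Economy</option>
    <option value="business">Business</option>
  </select>
</form>
"""

extractor = FormExtractor()
extractor.feed(PAGE)
script = build_dialog_script(extractor.fields)
values = fill_form(script, {"city": "Boston", "cabin": "economy"})
```

Here `script` holds one prompt per form field and `values` maps each field name to the text that would be inserted into the corresponding area. A flat list of fields was chosen only for brevity; the claims also mention dependencies between fields, which this sketch does not model.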

First claim

Opening claim text (preview).

What is claimed is:
1. A method of enabling voice interaction with a web page, comprising: receiving, at a processor, an Internet address of the web page, the web page including an area with which a user of a client device can specify information; determining that the client device does not have an alphabetic input device and a displayable keypad; loading the web page using the Internet address; extracting a task structure of the web page; generating a representation of the web page to facilitate the voice interaction with the web page; and providing a dialog script based on the representation to provide the voice interaction with the user, wherein spoken information of the user is converted into text and the text is inserted into the area.
2. The method of claim 1, further comprising: extracting forms, form constraints, and dependencies from the web page using a browser object model and a document object model.
3. The method of claim 1, further comprising: generating a browser object model of the web page.
4. The method of claim 3, wherein the browser object model comprises a state of hypertext markup language elements as the hypertext markup language elements would appear in a web browser.
5. The method of claim 3, further comprising: generating a document object model of the web page.
6. The method of claim 5, wherein the document object model comprises a tree of hypertext markup language elements.
7. The method of claim 1, further comprising: detecting that the user of the client device prefers to specify information in the web page by voice.
8. The method of claim 1, further comprising: determining that the client device is a portable communications device.
9. The method of claim 8, wherein the determining that the client device is a portable communications device is based upon a received user agent string in a hypertext transfer protocol address header received with the Internet address.
10. A system for enabling voice interaction with a web page, comprising: a processor; a memory that stores executable instructions that, when executed by the processor, cause the processor to perform operations comprising: receiving an Internet address of the web page, the web page including an area with which a user of a client device can specify information; determining that the client device does not have an alphabetic input device and a displayable keypad; loading the web page using the Internet address; extracting a task structure of the web page; generating a representation of the web page to facilitate the voice interaction with the web page; and providing a dialog script based on the representation to provide the voice interaction with the user, and wherein spoken information of the user is converted into text and the text is inserted into the area.
11. A non-transitory computer readable storage medium encoded with an executable computer program that enables voice interaction with a web page and that, when executed by a processor, causes the processor to perform operations comprising: receiving an Internet address of the web page, the web page including an area with which a user of a client device can specify information; determining that the client device does not have an alphabetic input device and a displayable keypad; loading the web page using the Internet address; extracting a task structure of the web page; generating a representation of the web page to facilitate the voice interaction with the web page; and providing a dialog script based on the representation to provide the voice interaction with the user, wherein spoken information of the user is converted into text and the text is inserted into the area.
12. The non-transitory computer readable storage medium of claim 11, wherein the operations further comprise: extracting forms, form constraints, and dependencies from the web page using a browser object model and a document object model.
13. The non-transitory computer readable storage medium of claim 11, wherein the operations further comprise: generating a browser object model of the web page.
14. The non-transitory computer readable storage medium of claim 13, wherein the browser object model comprises a state of hypertext markup language elements as the hypertext markup language elements would appear in a web browser.
15. The non-transitory computer readable storage medium of claim 13, wherein the operations further comprise: generating a document object model of the web page.
16. The non-transitory computer readable storage medium of claim 11, wherein the operations further comprise: detecting that the user of the client device prefers to specify information in the web page by voice.
17. The non-transitory computer readable storage medium of claim 11, wherein the operations further comprise: determining that the client device is a portable communications device.
18. The non-transitory computer readable storage medium of claim 17, wherein the determining that the client device is a portable communications device is based upon a received user agent string in a hypertext transfer protocol address header received with the Internet address.
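
Claims 9 and 18 base the portable-device determination on a user agent string received in an HTTP header. A rough sketch of such a check follows, assuming the request headers are available as a dictionary. The function name `is_portable_device` and the token list `MOBILE_TOKENS` are hypothetical and deliberately non-exhaustive; the patent does not specify how the string is matched.

```python
# Assumed, non-exhaustive substrings that often appear in mobile user agents.
MOBILE_TOKENS = ("mobile", "android", "iphone", "ipad")


def is_portable_device(headers):
    """Returns True if the User-Agent header contains a known mobile token."""
    user_agent = ""
    for key, value in headers.items():
        # HTTP header names are case-insensitive, so compare in lower case.
        if key.lower() == "user-agent":
            user_agent = value.lower()
    return any(token in user_agent for token in MOBILE_TOKENS)


phone = {"User-Agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) Mobile/11A465"}
desktop = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64)"}

phone_result = is_portable_device(phone)
desktop_result = is_portable_device(desktop)
```

Substring matching of this kind is only a heuristic: user agent strings vary widely and can be spoofed, so a deployed system would likely combine it with other signals, such as the stated user preference for voice input in claims 7 and 16.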

Assignees

Nuance Communications Inc

Inventors

Classifications

  • G06F40/174 (Primary)

    Form filling; Merging · CPC title

  • of access to content, e.g. by caching · CPC title

  • Document structures and storage, e.g. HTML extensions · CPC title

  • G10L15/26 (Primary)

    Speech to text systems (G10L15/08 takes precedence) · CPC title

  • Procedures used during a speech recognition process, e.g. man-machine dialogue · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9690854B2 cover?
Voice enabled dialog with web pages is provided. An Internet address of a web page is received including an area with which a user of a client device can specify information. The web page is loaded using the received Internet address of the web page. A task structure of the web page is then extracted. An abstract representation of the web is then generated. A dialog script, based on the abstrac…
Who is the assignee on this patent?
Nuance Communications Inc
What technology area does this patent fall under?
Primary CPC classification G06F40/174. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 27, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).