System and method for auto-curation of Q and A websites for search engine optimization

US10191985B1 · US · B1

Patent metadata
Publication number: US-10191985-B1
Application number: US-201414283112-A
Country: US
Kind code: B1
Filing date: May 20, 2014
Priority date: May 20, 2014
Publication date: Jan 29, 2019
Grant date: Jan 29, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of generating rich content webpages from a question and answer (Q&A) library includes providing a topic and one or more seed questions related to the topic. The computing device searches the one or more seed questions against all questions in the Q&A library and identifies questions related to the topic. The computing device clusters the text of the questions related to the topic into a plurality of clusters and then removes substantial duplicates from the plurality of clusters. The computing device generates a rich content webpage by aggregating a question from each cluster onto a single webpage containing the topic.
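The flow described in the abstract (search the seed questions against the library, cluster the matches, drop substantial duplicates, then take one question per cluster) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the stopword list, the Jaccard word-overlap measure, the greedy clustering rule, the use of `difflib.SequenceMatcher` for textual similarity, and all function names and threshold values are assumptions chosen for the example.

```python
import re
from difflib import SequenceMatcher

# Hypothetical stopword list; the patent text does not specify one.
STOPWORDS = {"a", "an", "the", "is", "are", "do", "does", "i", "my",
             "to", "of", "for", "in", "on", "how", "what", "can"}


def tokens(text):
    """Return the set of lowercase words in text, minus stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Word-overlap similarity between two token sets (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_related(seeds, library, min_shared=1):
    """Keep library questions that share words with any seed question."""
    seed_words = set().union(*(tokens(s) for s in seeds))
    return [q for q in library if len(tokens(q) & seed_words) >= min_shared]


def cluster(questions, threshold=0.5):
    """Greedy clustering: join the first cluster whose first member is
    similar enough, otherwise start a new cluster (assumed rule)."""
    clusters = []
    for q in questions:
        for c in clusters:
            if jaccard(tokens(q), tokens(c[0])) >= threshold:
                c.append(q)
                break
        else:
            clusters.append([q])
    return clusters


def remove_duplicates(questions, threshold=0.9):
    """Drop questions whose text similarity to an already kept question
    meets or exceeds the (assumed) threshold."""
    kept = []
    for q in questions:
        if all(SequenceMatcher(None, q.lower(), k.lower()).ratio() < threshold
               for k in kept):
            kept.append(q)
    return kept


def build_page(topic, seeds, library):
    """Aggregate one question per de-duplicated cluster under the topic."""
    related = find_related(seeds, library)
    clusters = [remove_duplicates(c) for c in cluster(related)]
    return {"topic": topic, "questions": [c[0] for c in clusters]}
```

Calling `build_page` with a topic, a list of seed questions, and the library returns a dictionary that a template could render as a single page. Taking the first surviving question of each cluster is a placeholder for whatever selection rule a real system would apply.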

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method of automatically generating rich content landing webpage for a question and answer (Q&A) website, the computer-implemented method comprising:

    analyzing, by a computing device including a processor executing computer-executable instructions and in communication with a database hosting a Q&A library and in communication through respective networks with a server that serves respective webpages to respective computers of respective users in response to respective requests submitted through a search engine, respective user click histories of questions contained within the Q&A library;

    generating, by the computing device, a click graph comprising related questions obtained from co-clicks by users as determined by analyzing the click history;

    clustering, by the computing device, the click graph;

    selecting, by the computing device, a plurality of seed questions from the clustered click graph, wherein said selection corresponds to a plurality of seed questions related to a common topic;

    searching, by the computing device, the seed questions against a set of questions stored in the Q&A library;

    identifying, by the computing device, questions related to the common topic based on common words found in the one or more seed questions;

    clustering, by the computing device, the text of the questions related to the common topic into a plurality of clusters under the common topic, wherein each cluster represents a different sub-topic of the common topic;

    comparing, by the computing device, the text of at least one of the questions to the text of at least another one of the questions;

    determining, by the computing device, that the at least one of the questions is a substantial duplicate of the at least another one of the questions based on the comparing indicating that the at least one of the questions has a textual similarity to the at least another one of the questions above a predetermined textual similarity threshold;

    removing, by the computing device, the at least one substantial duplicate from the plurality of clusters;

    selecting, by the computing device, a separate question from each cluster after the at least one substantial duplicate has been removed; and

    automatically generating, by the computing device, a rich content landing webpage for the Q&A website, the automatically generated rich content landing webpage displaying each selected separate question on a single webpage containing the common topic, the automatically generated rich content landing webpage being presented through a display of a user computer in response to a user search request associated with the common topic.

2. The computer-implemented method of claim 1, wherein the Q&A library comprises a plurality of tax-related questions.

3. The computer-implemented method of claim 1, wherein the seed questions are selected at least in part on user votes.

4. The computer-implemented method of claim 1, wherein the method of generating rich content webpages is performed periodically on the Q&A library.

5. The computer-implemented method of claim 1, wherein the method of generating rich content webpages is performed dynamically on the Q&A library.

6. The computer-implemented method of claim 1, further comprising tagging each of the plurality of clusters under the common topic as product-related or tax-related.

7. The computer-implemented method of claim 6, wherein the rich content webpage contains only product-related questions.

8. The computer-implemented method of claim 6, wherein the rich content webpage contains only tax-related questions.

9. The computer-implemented method of claim 1, wherein after the computing device searches the seed questions against the set of questions in the Q&A library and identifies questions related to the common topic, the computing device ranks the identified questions related to the common topic based on top matches.

10. The computer-implemented method of claim 1, wherein the plurality of clusters under the common topic includes five to fifteen clusters.

11. The computer-implemented method of claim 1, wherein the single webpage containing the common topic includes five to thirty questions.

12. The computer-implemented method of claim 1, wherein the automatically generated rich content landing webpage is more likely to have a higher ranking by the search engine.

13. The computer-implemented method of claim 12, wherein the automatically generated rich content landing webpage is more prominently displayed to users of the search engine by appearing higher on a list of search results generated in response to a query submitted through the search engine.
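The first steps of claim 1 (building a click graph from co-clicks, clustering it, and selecting seed questions) might look roughly like the sketch below. The claim does not name a graph representation or a clustering algorithm, so the choices here are assumptions: questions are linked when at least `min_coclicks` users clicked both, clusters are taken to be connected components, and seeds are ranked by a hypothetical `votes` mapping in the spirit of claim 3. All function and parameter names are illustrative.

```python
from collections import defaultdict
from itertools import combinations


def build_click_graph(click_histories, min_coclicks=2):
    """Build an undirected graph linking questions co-clicked by users.

    click_histories maps a user id to the question ids that user clicked.
    An edge is added when at least min_coclicks users clicked both
    questions (the cutoff is an assumption, not taken from the claim).
    """
    counts = defaultdict(int)
    for clicked in click_histories.values():
        for a, b in combinations(sorted(set(clicked)), 2):
            counts[(a, b)] += 1

    graph = defaultdict(set)
    for (a, b), n in counts.items():
        if n >= min_coclicks:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Cluster the click graph by connected components (assumed method)."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


def select_seeds(component, votes, k=3):
    """Pick up to k seed questions from one cluster, highest votes first.

    votes is a hypothetical mapping of question id to user vote count;
    ties are broken by question id so the result is deterministic.
    """
    ranked = sorted(component, key=lambda q: (-votes.get(q, 0), q))
    return ranked[:k]
```

Each component stands in for one common topic, and the seeds selected from it would then be searched against the Q&A library in the later steps of the claim.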

Assignees

Inventors

Classifications

  • Physics · mapped topic

  • G06F16/355 · Primary

    Creation or modification of classes or clusters · CPC title

  • Natural language query formulation · CPC title

  • G06F16/951 · Primary

    Indexing; Web crawling techniques · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10191985B1 cover?
A computer-implemented method of generating rich content webpages from a question and answer (Q&A) library includes providing a topic and one or more seed questions related to the topic. The computing device searches the one or more seed questions against all questions in the Q&A library and identifies questions related to the topic. The computing device clusters the text of the questions relat…
Who is the assignee on this patent?
Intuit Inc
What technology area does this patent fall under?
Primary CPC classification G06F17/30864. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 29, 2019 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).