Who is the assignee on this patent?

Wu Xinli, Yang Jianwu, Univ Peking Founder Group Co, and 3 more

What technology area does this patent fall under?

Primary CPC classification G06Q10/10. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jan 24, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Method and system for incremental collection of forum replies

Patent metadata
Field	Value
Publication number	US-9552435-B2
Application number	US-201113997257-A
Country	US
Kind code	B2
Filing date	Dec 22, 2011
Priority date	Dec 22, 2010
Publication date	Jan 24, 2017
Grant date	Jan 24, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present application discloses methods and systems for incrementally collecting replies in a forum and belongs to the technical field of collecting network information. The method comprises periodically determining whether there is a newly-established post and a post having new replies in all forum list pages needed to be collected: if yes, extracting a main post and reply information from the newly-established post, and extracting the information of the new replies from the post having new replies. The system comprises a determining device (11) for periodically determining whether there is a newly-established post and a post having new replies in all forum list pages needed to be collected; and an extracting device (12) for extracting a main post and reply information from the newly-established post, and extracting the information of the new replies from the post having new replies. The present application can quickly, accurately and completely collect all main post/replies of a post, so that the drawback that the information of turned pages of a post are missed to be searched or cannot be searched through a general search engine may be overcome.
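The determination step in the abstract can be illustrated with a minimal Python sketch. It assumes, as the claims describe, that the collector keeps a record of each post's first-page URL and the reply count seen at the last visit. The names `ListEntry`, `classify_posts`, and `collected` are hypothetical and do not come from the patent; this is an illustration, not the patented implementation.

```python
from dataclasses import dataclass


@dataclass
class ListEntry:
    """One row parsed from a forum list page (hypothetical structure)."""
    first_page_url: str
    reply_count: int


def classify_posts(entries, collected):
    """Split list-page entries into new posts and posts with new replies.

    `collected` maps a post's first-page URL to the reply count recorded
    at the last visit (an assumed stand-in for the "information list of
    collected posts"). It is updated in place.
    """
    new_posts, updated_posts = [], []
    for entry in entries:
        recorded = collected.get(entry.first_page_url)
        if recorded is None:
            # URL never seen before: treat as a newly-established post.
            new_posts.append(entry)
        elif entry.reply_count > recorded:
            # Known post whose reply count grew: it has new replies.
            updated_posts.append((entry, recorded))
        else:
            # Nothing changed since the last visit.
            continue
        collected[entry.first_page_url] = entry.reply_count
    return new_posts, updated_posts
```

Calling `classify_posts` once per list page on every collection pass means only posts that are unseen or whose reply count increased are queued for content extraction.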

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for incrementally collecting replies in a forum on a computer comprising a processor, the method comprising: determining, using the processor, whether there is a newly-established post or a post with new replies in a forum list page, according to a URL of a first page of the post and number of replies to the post; if it is determined that there is a newly-established post, extracting, using the processor, a main post of the newly-established post and reply information from the newly-established post; if it is determined that there is a post with new replies, calculating, using the processor, an origination and a number of the new replies to, based on the calculated origination and the calculated number, extract the new replies, wherein the determining further comprises: acquiring, using the processor, each URL of list page from a collection queue of list pages recording URLs of the at least one forum list page; retrieving, using the processor, the URL of the first page of each post and the number of current replies from webpage contents corresponding to the acquired URL; and determining, using the processor, if the post has been recorded in an information list of collected posts according to the retrieved URL of the first page, if not, determining, using the processor, that the post is a newly-established post, and the method further comprises: adding, using the processor, the retrieved URL of the first page and the retrieved number of current replies into an information list of collected posts.

2. The method according to claim 1, wherein the determining further comprises: retrieving a URL of the first page of each post and the number of current replies from webpage contents corresponding to URLs of the forum list page; determining whether the post exists in an information list of collected posts according to the retrieved URL of the first page, and whether the retrieved current number of replies is larger than a number of present replies recorded in said information list, if yes, it is determined that the post has a new reply.

3. The method according to claim 2, further comprising: adding the URL of the forum list page into a collection queue of forum list pages if a collection interval for the forum list page expires; retrieving URLs of list pages from the collection queue of forum list pages in a First-In-First-Out order.

4. The method according to claim 3, wherein the collection interval is dynamically adjustable according to an update frequency of the forum of the URLs of list pages.

5. The method according to claim 3, wherein the URLs retrieved from the collection queue of list pages meet a friendly access condition of the website of the retrieved URLs of list pages.

6. The method according to claim 2, further comprising: adding the URL of the first page of the newly-established post or the URL of the post with new replies into a collection queue of content pages; extracting the main post and/or reply and/or URLs of turned pages from the webpage contents corresponding to URLs of the forum list page.

7. The method according to claim 6, wherein, for the new-established post, if the URL of the first page of the post exists in the collection queue of content pages, the method further comprises: extracting the URL of the first page of the post; replacing a record of a number of present replies of the post in the information list of collected posts with the number of current replies; inserting the URL of the first page of the post into the collection queue of content pages.

8. The method according to claim 6, wherein the retrieving of URLs of list pages from the collection queue of forum list pages comprises: acquiring the URLs of list pages from the collection queue of list pages in order of FIFO, the acquired URLs meeting a friendly access condition of the website of the URLs of list pages.

9. The method according to claim 6, wherein extracting the main post and/or reply information from the webpage contents in step (iv) comprises: if the URL is the URL of the first page of the post and is collected for the first time, extracting the main post and reply information from the webpage contents corresponding to the URL; if the URL is the URL of the first page of the post but is not collected for the first time, calculating an origination of new replies $S'_{From}$ and the number of new replies $C'_{ParseCount}$ according to the following formulae, and extracting $C'_{ParseCount}$ new replies from the origination of new replies $S'_{From}$.

$$S'_{From} = \begin{cases} R_{PreNum}, & N_{PerPage} \text{ includes main post} \\ R_{PreNum} + 1, & N_{PerPage} \text{ does not include main post} \end{cases}$$
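The scheduling in claim 3 and the origination formula in claim 9 can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: `ListPageScheduler`, `enqueue_due`, `next_url`, and `new_reply_origin` are hypothetical names, intervals are taken to be fixed numbers of seconds, and `r_pre_num` is treated as an opaque integer because the preview does not define R_PreNum. The formula for the number of new replies is cut off in the preview, so it is not implemented here.

```python
from collections import deque


class ListPageScheduler:
    """Queues list-page URLs once their collection interval has expired."""

    def __init__(self, intervals):
        # intervals: list-page URL -> collection interval in seconds (assumed unit)
        self.intervals = dict(intervals)
        # Every list page is due immediately on the first pass.
        self.next_due = {url: 0.0 for url in self.intervals}
        self.queue = deque()

    def enqueue_due(self, now):
        """Append each list page whose interval has expired to the queue."""
        for url, due in self.next_due.items():
            if now >= due and url not in self.queue:
                self.queue.append(url)
                self.next_due[url] = now + self.intervals[url]

    def next_url(self):
        """Return the oldest queued URL (first in, first out), or None."""
        return self.queue.popleft() if self.queue else None


def new_reply_origin(r_pre_num, per_page_includes_main_post):
    """Origination of new replies, following the case formula in claim 9."""
    if per_page_includes_main_post:
        return r_pre_num
    return r_pre_num + 1
```

A dynamically adjustable interval, as in claim 4, could be modelled by updating `intervals[url]` between passes; how the update frequency is measured is not specified in the preview.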

Classifications

G06F16/955
using information identifiers, e.g. uniform resource locators [URL] · CPC title
G06F16/958
Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking · CPC title
G06Q10/10 (Primary)
Office automation; Time management · CPC title
G06F16/9566 (Primary)
URL specific, e.g. using aliases, detecting broken or misspelled links · CPC title
G06F17/30876
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 46313183

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9552435B2 cover?: The present application discloses methods and systems for incrementally collecting replies in a forum and belongs to the technical field of collecting network information. The method comprises periodically determining whether there is a newly-established post and a post having new replies in all forum list pages needed to be collected: if yes, extracting a main post and reply information from t…