Automating multilingual indexing

US9715490B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9715490-B2
Application number  US-201514934250-A
Country             US
Kind code           B2
Filing date         Nov 6, 2015
Priority date       Nov 6, 2015
Publication date    Jul 25, 2017
Grant date          Jul 25, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In an approach to automating multilingual indexing, a computer receives text of a conversation between at least two users. The computer detects at least one language associated with the text. The computer determines whether the language associated with the text is detected with a confidence level that exceeds a threshold. The computer retrieves text from one or more previous conversations between the two users. The computer detects at least one language associated with the text. The computer determines whether the at least one language associated with the text is detected with a confidence level that exceeds a pre-defined threshold. The computer analyzes the text using at least one of the detected languages to create one or more terms. The computer indexes the one or more terms and stores a boost value associated with each of the one or more indexed terms corresponding to confidence level of the detected language.
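The flow in the abstract (detect a language, check the confidence against a threshold, then index terms with a boost tied to that confidence) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and not taken from the patent: the stopword-based `detect_language` heuristic, the `STOPWORDS` lists, the `index_conversation` helper, and the 0.6 threshold.

```python
from collections import defaultdict

# Hypothetical stopword lists standing in for a real language detector.
STOPWORDS = {
    "en": {"the", "and", "is", "are", "you", "to"},
    "de": {"der", "und", "ist", "sind", "du", "zu"},
}

def detect_language(text):
    """Return (language, confidence) from stopword hits; a toy heuristic."""
    tokens = text.lower().split()
    if not tokens:
        return None, 0.0
    hits = {lang: sum(t in words for t in tokens) for lang, words in STOPWORDS.items()}
    best = max(hits, key=hits.get)
    total = sum(hits.values())
    # Confidence is the winning language's share of all stopword hits.
    confidence = hits[best] / total if total else 0.0
    return best, confidence

def index_conversation(conv_id, text, index, threshold=0.6):
    """Index the terms of `text`, storing the detection confidence as a boost."""
    lang, confidence = detect_language(text)
    if lang is None or confidence <= threshold:
        return False  # a caller would fall back to earlier conversations here
    # Analyze with the detected language: drop that language's stopwords.
    for term in set(text.lower().split()) - STOPWORDS[lang]:
        index[term].append((conv_id, lang, confidence))
    return True

index = defaultdict(list)
ok = index_conversation("c1", "the meeting is moved to friday", index)
```

In this sketch each posting carries the language and the confidence value, so a search layer could weight matches from confidently detected languages more heavily.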

First claim

Opening claim text (preview).

What is claimed is:

1. A method for automating multilingual indexing, the method comprising: receiving, by one or more computer processors, text of a first conversation between a first user and at least one second user; detecting, by the one or more computer processors, at least one language associated with the text of the first conversation; determining, by the one or more computer processors, whether the at least one language associated with the text of the first conversation is detected with a first confidence level that exceeds a first pre-defined threshold; responsive to determining the at least one language associated with the text of the first conversation is not detected with the first confidence level that exceeds the first pre-defined threshold, retrieving, by the one or more computer processors, text from one or more previous conversations between the first user and the at least one second user; detecting, by the one or more computer processors, at least one language associated with the text of the one or more previous conversations between the first user and the at least one second user; determining, by the one or more computer processors, whether the at least one language associated with the text of the one or more previous conversations between the first user and the at least one second user is detected with a second confidence level that exceeds a second pre-defined threshold; responsive to determining the at least one language associated with the text of the one or more previous conversations between the first user and the at least one second user is detected with the second confidence level that exceeds the second pre-defined threshold, analyzing, by the one or more computer processors, the text of the first conversation using the at least one detected language associated with the text of the one or more previous conversations between the first user and the at least one second user to create one or more index terms, wherein index terms are included in the text of the first conversation; indexing, by the one or more computer processors, the one or more index terms, wherein indexing serves as a mapping from the index terms to the text of the first conversation as automating multilingual indexing; and storing, by the one or more computer processors, the second confidence level of the at least one detected language associated with the text of the one or more previous conversations between the first user and the at least one second user associated with each of the one or more index terms.

2. The method of claim 1, further comprising: responsive to determining the at least one language associated with the text of the one or more previous conversations between the first user and the at least one second user is not detected with the second confidence level that exceeds the second pre-defined threshold, retrieving, by the one or more computer processors, text from one or more previous conversations by at least one of the first user and the at least one second user; detecting, by the one or more computer processors, at least one language associated with the text of the one or more previous conversations by at least one of the first user and the at least one second user; and determining, by the one or more computer processors, the at least one language associated with the text of the one or more previous conversations by at least one of the first user and the at least one second user is detected with a third confidence level that exceeds a third pre-defined threshold.

3. The method of claim 2, further comprising, responsive to determining the at least one language associated with the text of the one or more previous conversations by at least one of the first user and the at least one second user is not detected with the second confidence level that exceeds the second pre-defined threshold, retrieving, by the one or more computer processors, one or more preferred languages of the first user and the at least one second user.

4. The method of claim 3, wherein retrieving one or more preferred languages further comprises: retrieving, by the one or more computer processors, text input in one or more applications by a user; determining, by the one or more computer processors, at least one of: a language setting of the one or more applications, a language setting of the user's one or more devices that include the one or more applications, and a language of the text; and determining, by the one or more computer processors, based, at least in part, on the at least one of the language setting of the one or more applications, the language setting of the one or more devices, and the language of the text, one or more preferred languages of the user.

5. The method of claim 4, further comprising: determining, by the one or more computer processors, a third confidence level of the language of the text input in one or more applications by the user; determining, by the one or more computer processors, whether the third confidence level exceeds a third pre-defined threshold; responsive to determining the third confidence level exceeds the third pre-defined threshold, determining, by the one or more computer processors, based, at least in part, on the at least one of the language setting of the one or more applications, the language setting of the one or more devices, and the language of the text input in one or more applications by the user, one or more preferred languages of the user; and storing, by the one or more computer processors, the one or more preferred languages.

6. The method of claim 5, further comprising, responsive to determining the third confidence level does not exceed the third pre-defined threshold, determining, by the one or more computer processors, based, at least in part, on the at least one of the language setting of the one or more applications and the language setting of the one or more devices, one or more preferred languages of the user.

7. The method of claim 5, wherein the third confidence level is based, at least in part, on at least one of: a quantity of text created in each language, and a degree of matching between the text input and a base form of the text input.

8. A computer program product for automating multilingual indexing, the computer program product comprising: one or more computer readable storage device and program instructions stored on the one or more computer readable storage device and executed by one or more computer processors, the stored program instructions comprising: program instructions to receive text of a first conversation between a first user and at least one second user; program instructions to detect at least one language associated with the text of the first conversation; program instructions to determine whether the at least one language associated with the text of the first conversation is detected with a first confidence level that exceeds a first pre-defined threshold; responsive to determining the at least one language associated with the text of the first conversation is not detected with the first confidence level that exceeds the first pre-defined threshold, program instructions to retrieve text from one or more previous conversations between the first user and the at least one second user; program instructions to detect at least one language associated with the text of the one or more previous conversations between the first user and the at least one second user; program instructions to determine whether the at least one language associated with the text of the one or more previous conversations between the first user and the at least one second user is detected with a second confidence level that exceeds a second pre-defined threshold; responsive to determining the at l…
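Claims 1 through 3 describe a fallback order: the current conversation first, then earlier conversations between the same users, then earlier conversations by either user, and finally the users' preferred languages. A rough sketch of that cascade follows. The detector `toy_detect`, its word lists, the function `resolve_language`, the stage names, and the 0.5 thresholds are all hypothetical stand-ins chosen for illustration, not details from the patent.

```python
from typing import List, Optional, Sequence, Tuple

Detection = Tuple[Optional[str], float]

def toy_detect(text: str) -> Detection:
    """Hypothetical detector: share of tokens found in a tiny per-language word list."""
    words = {"en": {"the", "is", "you", "see"}, "fr": {"le", "est", "vous", "oui"}}
    tokens = text.lower().split()
    if not tokens:
        return None, 0.0
    scores = {lang: sum(t in vocab for t in tokens) / len(tokens)
              for lang, vocab in words.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

def resolve_language(
    current: str,
    pair_history: str,
    user_history: str,
    preferred: Sequence[str],
    detect=toy_detect,
    thresholds: Tuple[float, float, float] = (0.5, 0.5, 0.5),
) -> Tuple[str, List[str], float]:
    """Walk the fallback stages in order and return (stage, languages, confidence)."""
    stages = [
        ("current", current),            # text of the first conversation
        ("pair_history", pair_history),  # earlier conversations between the same users
        ("user_history", user_history),  # earlier conversations by either user
    ]
    for (stage, text), threshold in zip(stages, thresholds):
        lang, confidence = detect(text)
        if lang is not None and confidence > threshold:
            return stage, [lang], confidence
    # No stage was confident enough: fall back to stored preferred languages.
    return "preferred", list(preferred), 0.0

stage, langs, conf = resolve_language(
    current="ok 👍",
    pair_history="oui le train est la",
    user_history="",
    preferred=["en", "de"],
)
```

Here the short current message is too ambiguous to classify, so the language comes from the pair's earlier conversation, and the confidence returned is the one from that stage, which is the value a caller would store with the index terms.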

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9715490B2 cover?
In an approach to automating multilingual indexing, a computer receives text of a conversation between at least two users. The computer detects at least one language associated with the text. The computer determines whether the language associated with the text is detected with a confidence level that exceeds a threshold. The computer retrieves text from one or more previous conversations betwe…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/337. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 25, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).