What technology area does this patent fall under?

Primary CPC classification G06F40/263. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 2, 2018 (kind code B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Automatic locale determination for electronic documents

US9858258B1 · US · B1

Patent metadata
Field	Value
Publication number	US-9858258-B1
Application number	US-201615282350-A
Country	US
Kind code	B1
Filing date	Sep 30, 2016
Priority date	Sep 30, 2016
Publication date	Jan 2, 2018
Grant date	Jan 2, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Automatic locale determination for documents is described. In an embodiment, a computer server receives an electronic document comprising a plurality of unknown-language data elements each associated with one or more types. Based on a document schema of the document, the computer system selects one or more unknown-language data elements from the plurality of unknown-language data elements and assigning to each of the one or more unknown-language data elements a corresponding weight value based on a respective type of the unknown-language data element. The computer system compares the one or more unknown-language data elements with a plurality of known-language data elements that are associated with the document schema and based on the comparing, determines a number of unknown-language data elements in the one or more unknown-language data elements that matched any in a subset of the plurality of known-language data elements, wherein the subset of known-language data elements corresponds to a particular language. Based on the number of data elements that matched to the subset of known-language data elements and based on the corresponding weight assigned to each unknown-language data element in the number of unknown-language data elements, the computer system determines a language confidence level value specifying a level of machine confidence that the document is expressed in the particular language and based on the language confidence value for the particular language exceeding a language threshold value, automatically processes the document using the particular language.
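The weighted matching described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the weight table `TYPE_WEIGHTS`, the vocabulary `KNOWN_ELEMENTS`, the function names, exact lowercase matching as the comparison, and a threshold set to a fraction of the maximum score possible for the document.

```python
from collections import defaultdict

# Hypothetical weights per element type (assumption: field names count more than values).
TYPE_WEIGHTS = {"field_name": 2.0, "field_value": 1.0}

# Hypothetical known-language elements associated with one document schema.
KNOWN_ELEMENTS = {
    "en": {"invoice", "amount", "date", "total"},
    "de": {"rechnung", "betrag", "datum", "summe"},
}


def language_confidence(elements, known=KNOWN_ELEMENTS, weights=TYPE_WEIGHTS):
    """Return {language: sum of weights of the elements that matched that language}.

    `elements` is a list of (text, element_type) pairs selected from the document.
    """
    scores = defaultdict(float)
    for text, elem_type in elements:
        token = text.strip().lower()
        for lang, vocab in known.items():
            if token in vocab:
                scores[lang] += weights.get(elem_type, 0.0)
    return dict(scores)


def pick_language(elements, threshold_ratio=0.5):
    """Return the best-scoring language if it exceeds the threshold, else None."""
    scores = language_confidence(elements)
    # Highest score the document could reach if every selected element matched.
    max_possible = sum(TYPE_WEIGHTS.get(t, 0.0) for _, t in elements)
    if not scores or max_possible == 0:
        return None
    lang, score = max(scores.items(), key=lambda kv: kv[1])
    return lang if score > threshold_ratio * max_possible else None


sample = [
    ("Rechnung", "field_name"),
    ("Betrag", "field_name"),
    ("Datum", "field_name"),
    ("12345", "field_value"),
]
print(pick_language(sample))
```

In this sketch a document that matches no known element yields no language at all, so a caller would need its own fallback, such as a default language or manual review.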

First claim

Opening claim text (preview).

What is claimed is:

1. A data processing method comprising: receiving, at a server computer, an electronic document comprising a plurality of unknown-language data elements each associated with one or more types; based on a document schema of the document, selecting one or more unknown-language data elements from the plurality of unknown-language data elements; assigning to each of the one or more unknown-language data elements a corresponding weight value based on a respective type of the unknown-language data element; comparing the one or more unknown-language data elements with a plurality of known-language data elements that are associated with the document schema; based on the comparing, determining a number of unknown-language data elements in the one or more unknown-language data elements that matched any in a subset of the plurality of known-language data elements, wherein the subset of known-language data elements corresponds to a particular language; based on the number of unknown-language data elements in the one or more unknown-language data elements that matched to the subset of known-language data elements and based on the corresponding weight value assigned to each unknown-language data element in the number of unknown-language data elements, determining a language confidence level value specifying a level of machine confidence that the document is expressed in the particular language; based on the language confidence level value for the particular language exceeding a language threshold value, automatically processing the document using the particular language.

2. The method of claim 1, further comprising: receiving the document as part of receiving a request to process the document, the request comprising one or more additional data elements; selecting an additional data element that indicates possible language for the request, the additional data element assigned to a particular weight; based on a data value of the additional data element and the particular weight, adjusting the language confidence level value for the document.

3. The method of claim 1, wherein the respective type of the unknown-language data element is a data field name of the unknown-language data element or a data value of the unknown-language data element of the document.

4. The method of claim 1, wherein selecting one or more unknown-language data elements from the plurality of unknown-language data elements is further based on a document type of the document.

5. The method of claim 1, wherein the document schema of the document depends on a type of structured data included in the document, and wherein the type of the structured data is one or more of XML (Extensible Markup Language), JSON (JavaScript Object Notation), cXML (commerce eXtensible Markup Language), IDoc (Intermediate Document), CSV (Comma Separated values), or ODF (Open Document).

6. The method of claim 1, further comprising: storing the plurality of known-language data elements associated with the document schema of the document in a data store in a plurality of language sets of known-language data elements, each set of known-language data elements corresponding to a supported language in a plurality of supported languages that includes the particular language; comparing the one or more unknown-language data elements with one or more known-language data elements in said each set of known-language data elements to determine corresponding number of unknown-language data elements that matched for the corresponding supported language.

7. The method of claim 1, wherein the comparing further comprises stemming the one or more unknown-language data elements to match with the plurality of known-language data elements.

8. The method of claim 1, further comprising: based on the document schema of the document, selecting at least one unknown-language data element of the plurality of unknown-language data elements such that the at least one unknown-language data element has a data value that can vary in formats based on a locale of the document; based on a format of the data value, determining a locale confidence level value for the document.

9. The method of claim 8, wherein the format of the data value is based at least on one of the following: a date format, a number format, or a currency value format.

10. The method of claim 1, further comprising determining the threshold language value based on a maximum language confidence value possible for the document.

11. The method of claim 1, further comprising determining the language threshold value based on a plurality of language confidence level values, for a plurality of languages, determined for the document that includes the language confidence level value.

12. The method of claim 1, further comprising: automatically determining that a file that includes the document is compressed; in response to automatically determining that the file that includes the document is compressed, automatically decompressing the file to extract the document.

13. The method of claim 1, further comprising: automatically determining that the document is encrypted; in response to automatically determining that the document is encrypted, automatically decrypting the document.

14. A data-processing method comprising: using a first computer, obtaining from one or more non-transitory computer-readable data storage media a copy of one or more sequences of instructions that are stored on the media and are arranged, when executed using a second computer among a plurality of other computers to cause the second computer to perform: using a computer, receiving an electronic document comprising a plurality of unknown-language data elements each associated with one or more types; using the computer, based on a document schema of the document, selecting one or more unknown-language data elements from the plurality of unknown-language data elements; using the computer, assigning to each of the one or more unknown-language data elements a corresponding weight value based on a respective type of the unknown-language data element; using the computer, comparing the one or more unknown-language data elements with a plurality of known-language data elements that are associated with the document schema; using the computer, based on the comparing, determining a number of unknown-language data elements in the one or more unknown-language data elements that matched any in a subset of the plurality of known-language data elements, wherein the subset of known-language data elements corresponds to a particular language; using the computer, based on the number of unknown-language data elements in the one or more unknown-language data elements that matched to the subset of known-language data elements and based on the corresponding weight value assigned to each unknown-language data element in the number of unknown-language data elements, determining a language confidence level value specifying a level of machine confidence that the document is expressed in the particular language; using the computer, based on the language confidence level value for the particular language exceeding a language threshold value, automatically processing the document using the particular language.

15. The method of claim 14, further comprising: receiving the document as part of receiving a request to process the document, the request comprising one or more additional data elements; selecting an additional data element that indicates possible language for the request, the additional data element assigned to a particular weight; based on a data value
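Claims 8 and 9 describe deriving a locale confidence level from the format of a locale-dependent data value such as a date. The sketch below is one hypothetical way to do that for dates only. The pattern table `DATE_PATTERNS`, the three example locales, and the scoring rule (the fraction of values matching each pattern) are all assumptions made for illustration. The claims do not specify any of them.

```python
import re

# Hypothetical locale -> date pattern table (an assumption, not from the patent).
DATE_PATTERNS = {
    "en_US": re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$"),    # e.g. 12/31/2017
    "de_DE": re.compile(r"^\d{1,2}\.\d{1,2}\.\d{4}$"),  # e.g. 31.12.2017
    "ja_JP": re.compile(r"^\d{4}/\d{1,2}/\d{1,2}$"),    # e.g. 2017/12/31
}


def locale_confidence(date_values, patterns=DATE_PATTERNS):
    """Return {locale: fraction of date values whose format matches that locale}."""
    if not date_values:
        return {}
    total = len(date_values)
    return {
        locale: sum(1 for value in date_values if pattern.match(value.strip())) / total
        for locale, pattern in patterns.items()
    }


print(locale_confidence(["31.12.2017", "01.02.2018"]))
```

Format alone cannot separate every locale. A value such as 01/02/2018 is valid in both month-first and day-first conventions, so a fuller version would likely combine this score with other signals, such as the number and currency formats named in claim 9.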

Assignees

Coupa Software Inc

Inventors

Pasquini Matthew

Classifications

G06F16/9535
Search customisation based on user profiles and personalisation · CPC title
G06F40/279
Recognition of textual entities · CPC title
G06F16/81
Indexing, e.g. XML tags; Data structures therefor; Storage structures · CPC title
G06F40/146
Coding or compression of tree-structured data · CPC title
G06F16/182
Distributed file systems · CPC title

Patent family

Related publications grouped by family.

View patent family 60788784

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9858258B1 cover?: Automatic locale determination for documents is described. In an embodiment, a computer server receives an electronic document comprising a plurality of unknown-language data elements each associated with one or more types. Based on a document schema of the document, the computer system selects one or more unknown-language data elements from the plurality of unknown-language data elements and a…
Who is the assignee on this patent?: Coupa Software Inc

Related patents

Systems and methods for multilingual metadata

Server-side internationalization and localization of web applications using a scripting language

Method and system for determining device settings at device initialization
