Method and system of audio input bit-size conversion for audio processing

US11875783B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11875783-B2
Application number  US-202016892080-A
Country             US
Kind code           B2
Filing date         Jun 3, 2020
Priority date       Jun 3, 2020
Publication date    Jan 16, 2024
Grant date          Jan 16, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, system, and device are directed to audio input bit-size conversion for compatibility to audio processing systems with an expected input sample bit-size.

First claim

Opening claim text (preview).

What is claimed is:

1. An audio processing device comprising: memory storing audio input including human speech and in a form of initial samples with a first bit-size; and at least one processor communicatively coupled to the memory to operate by: dividing at least one of the initial samples into multiple sample parts; generating at least one gain formed by at least one neural network accelerator; applying the at least one gain to at least one of the sample parts to form at least one scaled sample part; and generating a scaled output sample in a second bit-size comprising combining at least portions of the multiple sample parts including the at least one scaled sample part, and wherein a portion of one of the sample parts being combined has most significant bits (MSBs) of the initial sample and a portion of another one of the sample parts being combined has least significant bits (LSBs) of the initial sample.

2. The device of claim 1, wherein the sample parts each have a size so that the sample parts cooperatively hold all of the bits from the initial sample.

3. The device of claim 1, wherein the sample parts are of the second bit-size.

4. The device of claim 1, wherein the sample parts comprise at least a high sample part filled with the most significant bits and other bits from the initial sample and a low sample part having the least significant bits from the initial sample and remaining bit spaces filled with zeros.

5. The device of claim 1, wherein the dividing comprises storing the initial sample in a container of a transition sample with a third bit-size that is larger than the first bit-size of the initial sample and evenly divisible into the sample parts.

6. The device of claim 5, wherein the first bit-size is 24 bits, the second bit-size is 16 bits, and the third bit-size is 32 bits.

7. The device of claim 5, wherein the at least one processor is arranged to operate by deinterleaving a sequence of the transition samples, wherein each transition sample has a high sample part and a low sample part, and the deinterleaving to generate a high sample vector of high sample parts separate from a low sample vector of low sample parts to separately input the high and low sample vectors into a neural network accelerator.

8. The device of claim 7, wherein the at least one processor to shift the low sample parts having the least significant bits (LSBs) of the initial samples to reserve a bit space in the low sample part for a sign bit using at least one neural network accelerator.

9. The device of claim 1, wherein the at least one processor operates by determining absolute value versions of the sample parts and a separate sign vector maintaining a sign of at least one of the sample parts to use to generate the scaled output sample.

10. A method of audio processing comprising: obtaining audio input including human speech and in a form of initial samples with a first bit-size; dividing at least one of the initial samples into multiple sample parts; generating, by at least one neural network accelerator, at least one gain; applying the at least one gain to at least one of the sample parts to form at least one scaled sample part; and generating a scaled output sample in a second bit-size comprising combining at least portions of the multiple sample parts and including the at least one scaled sample part, and wherein a portion of one of the sample parts being combined has most significant bits (MSBs) of the initial sample and a portion of another one of the sample parts being combined has least significant bits (LSBs) of the initial sample.

11. The method of claim 10, wherein the at least one gain is computed dynamically depending on the sample parts.

12. The method of claim 10, wherein the at least one gain is computed by using a count of a number of bit spaces occupied by one of the sample parts.

13. The method of claim 10, wherein the same at least one gain is used for multiple sample parts of a same sample set of multiple parts of multiple initial samples regardless of which sample part was used to form the gain.

14. The method of claim 10, wherein multiple initial samples of a sample set of initial samples are divided into sample parts, and wherein the at least one gain is generated by using only data of a high sample part with the highest value among all high sample parts of the set.

15. The method of claim 14, comprising determining the high sample part with the highest value by using max pooling layers of a neural network.

16. A computer-implemented system for audio processing comprising: at least one microphone to capture audio input including human speech; memory to store the audio input in of initial samples of a first bit-size; at least one processor communicatively coupled to the at least one microphone and at least one memory, and to operate by: dividing at least one of the initial samples into multiple sample parts; generating at least one gain formed by at least one neural network accelerator; applying the at least one gain to at least one of the sample parts to form at least one scaled sample part; and generating a scaled output sample in a second bit-size comprising combining at least portions of the multiple sample parts and including the at least one scaled sample part, and wherein a portion of one of the sample parts being combined has most significant bits (MSBs) of the initial sample and a portion of another one of the sample parts being combined has least significant bits (LSBs) of the initial sample.

17. The system of claim 16, wherein the at least one gain is arranged so that applying the at least one gain causes a bit-shift in the sample part to place a most significant bit of the sample part at the highest available bit space of a scaled sample part to be used to form the scaled output sample.

18. The system of claim 17, wherein the bit-shift provides empty bit spaces on the scaled sample part to receive bits of a scaled low sample part associated with the least significant bits of the initial sample.

19. The system of claim 16, wherein the scaled output sample is formed by combining at least portions of a scaled high sample part and a scaled low sample part.

20. At least one non-transitory machine readable medium comprising instructions that, in response to being executed on a computing device, cause the computing device to operate by: obtaining audio input including human speech and in a form of initial samples with a first bit-size; dividing at least one of the initial samples into multiple sample parts; generating, by at least one neural network accelerator, at least one gain; applying the at least one gain to at least one of the sample parts to form at least one scaled sample part; and generating a scaled output sample in a second bit-size comprising combining at least portions of the multiple sample parts and including the at least one scaled sample part, and wherein a portion of one of the sample parts being combined has most significant bits (MSBs) of the initial sample and a portion of another one of the sample parts being combined has least significant bits (LSBs) of the initial sample.

21. The machine readable medium of claim 20, wherein at least one of the dividing, applying the at least one gain, and generating a scaled output sample are performed by one or more neural network accelerators without the use of a digital signal processor (DSP).

22. The machine readable medium of claim 20, wherein the instructi
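
Illustrative sketch (not part of the patent text). The flow recited in claims 1, 4 to 6, 9, 12, 14, 17, and 18 can be approximated in plain Python to make the bit manipulation concrete. This is a minimal sketch under stated assumptions, not the patented implementation: it runs on ordinary integers rather than a neural network accelerator, the function names (split_sample, block_shift, convert_block) are hypothetical, the 24-bit sample is assumed to be left-aligned in the 32-bit container, and the gain is modeled as a single left shift shared by the whole block.

```python
INPUT_BITS = 24      # first bit-size (claim 6)
OUTPUT_BITS = 16     # second bit-size (claim 6)
CONTAINER_BITS = 32  # third bit-size of the transition sample (claim 6)


def split_sample(sample):
    """Split one signed 24-bit sample into (sign, high part, low part).

    Assumption: the magnitude is left-aligned in the 32-bit container, so the
    high part holds the MSBs and the low part holds the LSBs padded with zeros.
    """
    sign = -1 if sample < 0 else 1
    # Assumption: clamp the magnitude to 23 bits so -2**23 stays representable.
    magnitude = min(abs(sample), (1 << (INPUT_BITS - 1)) - 1)
    container = magnitude << (CONTAINER_BITS - INPUT_BITS)
    high = container >> OUTPUT_BITS
    low = container & ((1 << OUTPUT_BITS) - 1)
    return sign, high, low


def block_shift(high_parts):
    """Derive the shared gain, expressed as a left shift.

    Counts the bit spaces occupied by the largest high part and returns the
    shift that moves its top bit to the highest non-sign bit of the output.
    """
    occupied = max(high_parts).bit_length()
    return (OUTPUT_BITS - 1) - occupied


def convert_block(samples):
    """Convert a block of signed 24-bit samples to signed 16-bit samples."""
    parts = [split_sample(s) for s in samples]
    shift = block_shift([high for _, high, _ in parts])
    output = []
    for sign, high, low in parts:
        scaled_high = high << shift                # frees `shift` low bit spaces
        scaled_low = low >> (OUTPUT_BITS - shift)  # top bits of the low part
        output.append(sign * (scaled_high | scaled_low))
    return output, shift


if __name__ == "__main__":
    quiet_block = [0x000123, -0x000456, 0x00007F]
    print(convert_block(quiet_block))  # ([4656, -17760, 2032], 12)
```

In this example the block is quiet, so plain truncation to the top 16 bits would keep only the values 1, 4, and 0. The shared shift of 12 instead preserves every input bit, and each output equals the input multiplied by 16. For a full-scale block the shift drops to 0 and the result matches plain truncation.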

Assignees

Intel Corp

Inventors

Classifications

  • G10L15/16 (Primary)

    using artificial neural networks · CPC title

  • Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs · CPC title

  • Management of the audio stream, e.g. setting of volume, audio stream path · CPC title

  • H03M7/28 (Primary)

    Programmable structures, i.e. where the code converter contains apparatus which is operator-changeable to modify the conversion process · CPC title

  • the radix thereof being two · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11875783B2 cover?
A method, system, and device are directed to audio input bit-size conversion for compatibility to audio processing systems with an expected input sample bit-size.
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G10L15/16. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 16, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).