Audio coding device, audio coding method, audio coding program, audio decoding device, audio decoding method, and audio decoding program
US-2015262588-A1 · Sep 17, 2015 · US
US9767812B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9767812-B2 |
| Application number | US-201414226788-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 26, 2014 |
| Priority date | Sep 26, 2011 |
| Publication date | Sep 19, 2017 |
| Grant date | Sep 19, 2017 |
Official abstract text for this publication.
Systems and methods for increasing transmission bandwidth efficiency by the analysis and synthesis of the ultimate components of transmitted content are presented. To implement such a system, a dictionary or database of elemental codewords can be generated from a set of audio clips. Using such a database, a given arbitrary song or other audio file can be expressed as a series of such codewords, where each given codeword in the series is a compressed audio packet that can be used as is, or, for example, can be tagged to be modified to better match the corresponding portion of the original audio file. Each codeword in the database has an index number or unique identifier. For a relatively small number of bits used in a unique ID, e.g., 27-30, several hundreds of millions of codewords can be uniquely identified. By providing the database of codewords to receivers of a broadcast or content delivery system in advance, instead of broadcasting or streaming the actual compressed audio signal, all that need be transmitted is the series of identifiers along with any modification instructions to the identified codewords. After reception, intelligence on the receiver having access to a locally stored copy of the dictionary can reconstruct the original audio clip by accessing the codewords via the received IDs, modifying them as instructed by the modification instructions, further modifying the codewords either individually or in groups using the audio profile of the original audio file (also sent by the encoder), and playing back a generated sequence of phase corrected codewords and modified codewords as instructed. In exemplary embodiments of the present invention, such modification can extend into neighboring codewords, and can utilize either or both (i) cross correlation based time alignment and (ii) phase continuity between harmonics, to achieve higher fidelity to the original audio clip.
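The ID arithmetic and the receiver-side lookup described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the patent: `addressable_codewords`, `reconstruct`, the representation of a packet as a short list of numbers, and the choice to model a modification instruction as an additive error vector. Profile-based modification, time alignment, and phase correction are left out.

```python
from typing import Dict, List, Optional, Tuple

Packet = List[int]                      # toy stand-in for a compressed audio packet
ErrorVector = Optional[List[int]]       # assumption: modification = additive error vector


def addressable_codewords(id_bits: int) -> int:
    """Number of distinct codewords an ID of id_bits bits can identify."""
    return 2 ** id_bits


def reconstruct(
    stream: List[Tuple[int, ErrorVector]],
    dictionary: Dict[int, Packet],
) -> List[Packet]:
    """Rebuild packets from (ID, error vector) pairs using a local dictionary."""
    output: List[Packet] = []
    for codeword_id, error in stream:
        packet = list(dictionary[codeword_id])  # copy so the dictionary is untouched
        if error is not None:
            # Apply the correction element by element.
            packet = [p + e for p, e in zip(packet, error)]
        output.append(packet)
    return output


if __name__ == "__main__":
    for bits in (27, 30):
        print(bits, "bits ->", addressable_codewords(bits), "codewords")
    local_dictionary = {7: [1, 2, 3], 9: [4, 5, 6]}
    received = [(7, None), (9, [1, 0, -1])]
    print(reconstruct(received, local_dictionary))
```

A 27-bit ID addresses 134,217,728 codewords and a 30-bit ID addresses 1,073,741,824, which is consistent with the abstract's "several hundreds of millions".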
Opening claim text (preview).
What is claimed:
1. A method of transmitting an audio content stream, comprising: encoding the audio content using a perceptual encoder to obtain a first series of compressed audio packets; comparing each of the compressed audio packets in said first series of compressed packets with a database of compressed audio packets created using the same perceptual encoder, each of which has a unique identifier, and identifying a close match database packet for each first series compressed audio packet; generating a sequence of said unique identifiers of said close match database packets to represent said first series of compressed audio packets and, if the close match database packet is not an exact match, a modification instruction or an error vector for each identified close match database packet; and transmitting the sequence of (i) unique identifiers and (ii) associated modification instructions or error vectors across a communications channel to one or more receivers as part of a broadcast, in a form that at least one of the receivers can process to play to a user the audio content stream.
2. The method of claim 1, further comprising one of: generating a modification instruction or an error vector for each identified close match database packet for each first series compressed audio packet, and sending said modification instruction or error vector with each of said unique identifiers in said sequence of unique identifiers; or generating a modification instruction or an error vector for each identified close match database packet for each first series compressed audio packet, and sending said modification instruction or error vector with each of said unique identifiers in said sequence of unique identifiers, wherein the unique identifiers and modification instructions or error vectors are grouped and the bit length of each of said unique identifier and modification instruction or error vector grouping is 46 bits.
3. The method of claim 1, wherein said database of compressed audio packets is generated as follows: obtain original audio content for a set of audio files; encode a first audio file from said set using a perceptual encoder to obtain a series of compressed audio packets for said first audio file, and store said series of compressed audio packets in the database, each with a unique identifier; for each additional audio file in the set of audio files: encode the audio file using the perceptual encoder to obtain a series of compressed audio packets for the audio file; compare each of the series of compressed audio packets for the additional audio file with the compressed audio packets stored in the database; remove any of the compressed packets for the additional audio file that are similar by a defined metric to a compressed audio packet already stored in the database; store the non-removed compressed packets for said additional audio file in the database, each with a unique identifier.
4. The method of claim 3, wherein at least one of: said unique identifier is a unique identification number of between 20-30 bits; said comparing each of the series of compressed audio packets for the additional audio file with the compressed audio packets stored in the database includes assigning a similarity score having at least 20 similarity gradations to each of said compressed audio packets for the additional audio file as regards each packet already stored in the database; and said comparing each of the series of compressed audio packets for the additional audio file with the compressed audio packets stored in the database includes assigning a similarity score having at least 20 similarity gradations to each of said compressed audio packets for the additional audio file as regards each packet already stored in the database; wherein said similarity score is a number between 1-5, with increments every 0.1 and with 1 being the most similar.
5. The method of claim 3, further comprising one of: (i) following the storage of said series of compressed audio packets in the database for said first audio file, comparing said series of compressed audio packets stored in the database amongst each other, and removing ones of said series of compressed audio packets in the database for said first audio file that are similar to another compressed audio packet of said first audio file by a defined metric; and (ii) following the storage of said series of compressed audio packets in the database for said first audio file, comparing said series of compressed audio packets stored in the database amongst each other, and removing ones of said series of compressed audio packets in the database for said first audio file that are similar to another compressed audio packet of said first audio file by a defined metric; wherein said comparing each of the series of compressed audio packets for the first audio file amongst each other includes assigning a similarity score having at least 20 similarity gradations to each pair of said compressed audio packets for the first audio file.
6. The method of claim 5, wherein packets being determined to be similar is defined by a metric which includes having a similarity score of between 1-1.4.
7. The method of claim 5, further comprising: following the storage of said series of compressed audio packets in the database for said first audio file, comparing said series of compressed audio packets stored in the database amongst each other, and removing ones of said series of compressed audio packets in the database for said first audio file that are similar to another compressed audio packet of said first audio file by a defined metric, wherein said comparing each of the series of compressed packets for the additional audio file with those compressed packets stored in the database includes assigning a similarity score having at least 10 similarity gradations to each of said compressed packets for the additional audio file as regards each packet already stored in the database.
8. The method of claim 7, wherein said similarity score is a number between 1-5, with increments every 0.1 and with 1 being the most similar.
9. The method of claim 8, wherein packets being determined to be similar is defined by a metric which includes having a similarity score of between 1-1.4.
10. The method of claim 1, wherein each of the compressed audio packets in the database of compressed audio packets was generated by: encoding an audio file using a perceptual encoder to obtain a series of compressed packets for said first audio file, and storing one or more of the compressed packets.
11. The method of claim 1, wherein the unique identifier for each compressed packet in the database is a unique identification number of between 20-30 bits.
12. The method of claim 1, wherein each of the compressed audio packets in the database of compressed audio packets was generated by: sampling a full length audio clip, and dividing it into segments of 2048 samples; calculating an Odd Discrete Frequency Transform for each RMS normalized time domain segment; performing psychoacoustic analysis over each segment to calculate masking thresholds corresponding to N quality indices; analyzing each segment with other segments present in the database to identify the uniqueness of the segment; removing any segment that is not unique by a defined metric; storing the unique segments in the database as the compressed audio packets.
13. The method of claim 12, wherein each of said segments was considered as an examine frame, and each of said other segments present in the database was considered as a reference frame, and each examine frame was allocated a similarit
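The database-building procedure of claim 3, together with the 1-5 similarity scale (0.1 steps, 1 being most similar) and the 1-1.4 "similar" band of claims 4, 6, 8 and 9, might be sketched in Python as follows. This is an illustration under stated assumptions, not the patented method: the claims do not say how a score is computed, so `similarity_score` uses a made-up mapping from mean absolute difference to the 1-5 scale, packets are toy lists of floats rather than perceptually encoded data, and IDs are simply assigned in insertion order.

```python
from typing import Dict, List, Sequence

Packet = Sequence[float]   # toy stand-in for a compressed audio packet
SIMILAR_MAX = 1.4          # claims 6 and 9: scores from 1 to 1.4 count as similar


def similarity_score(a: Packet, b: Packet) -> float:
    """Hypothetical score on the 1-5 scale in 0.1 steps (1 = most similar).

    Assumption: the mean absolute difference, clipped to [0, 1], is mapped
    linearly onto [1, 5]. The patent text shown here does not define this.
    """
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return round(1.0 + 4.0 * min(mean_abs_diff, 1.0), 1)


def build_database(files: List[List[Packet]]) -> Dict[int, Packet]:
    """Store the first file's packets, then only dissimilar packets of later files."""
    database: Dict[int, Packet] = {}
    if not files:
        return database
    # First audio file: every packet is stored with its own identifier.
    for packet in files[0]:
        database[len(database)] = packet
    # Additional files: drop packets similar to anything already stored.
    for packets in files[1:]:
        kept = [
            p for p in packets
            if not any(similarity_score(p, q) <= SIMILAR_MAX
                       for q in database.values())
        ]
        for packet in kept:
            database[len(database)] = packet
    return database


if __name__ == "__main__":
    first = [[0.0, 0.0], [1.0, 1.0]]
    second = [[0.05, 0.05], [0.5, 0.5]]  # first packet is close to [0.0, 0.0]
    print(build_database([first, second]))
```

In this toy run the packet `[0.05, 0.05]` scores 1.2 against `[0.0, 0.0]` and is dropped, while `[0.5, 0.5]` scores 3.0 against both stored packets and is added. The optional self-comparison of the first file's packets (claim 5) is not implemented in this sketch.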
Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing · CPC title
Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis (in musical instruments G10H) · CPC title
in band on channel [IBOC] · CPC title
using orthogonal transformation · CPC title
of audio {(determination or detection of speech characteristics in general G10L25/00; speech recognition in general G10L15/00)} · CPC title