MPEG

xiaoxiao2021-02-27 237

一、实验原理

· 多相滤波器组：将PCM样本变换到32个子带的频域信号

· 心理声学模型:计算信号中不可听觉感知的部分，计算出噪声的遮蔽效应，对这部分被掩蔽的听不见的信号不进行传输

· 比特分配器：根据心理声学模型的计算结果，为每个子带信号分配比特数

· 装帧：产生出一个数据帧，帧要求与MPEG-l兼容

· 心理声学模型：

计算信号中不可听觉感知的部分。

时频分析的矛盾：

1、通过子带分析滤波器组使信号具有高的时间分辨率，确保在短暂冲击信号情况下，编码的声音信号具有足够高的质量 2、又可以使信号通过FFT运算具有高的频率分辨率，因为掩蔽阈值是从功率谱密度推出来的。

在低频子带中，为了保护音调和共振峰的结构，就要求用较小的量化阶、较多的量化级数，即分配较多的位数来表示样本值。而话音中的摩擦音和类似噪声的声音，通常出现在高频子带中，对它分配较少的位数。n

步骤：

将样本变换到频域->确定声压级别->考虑安静时阈值->将音频信号分解成“乐音(tones)” 和“非乐音/噪声”部分：因为两种信号的掩蔽能力不同->音调和非音调掩蔽成分的消除->单个掩蔽阈值的计算->全局掩蔽阈值的计算->每个子带的掩蔽阈值->计算每个子带信号掩蔽比(signal-to-maskratio, SMR)

比特分配过程：

使整帧和每个子带的总噪声—掩蔽比最小。

算法：循环，直到没有比特可用：

1、对每个子带计算掩蔽-噪声比MNR，MNR = SNR –SMR (dB) 2、对最低MNR的子带分配比特，使获益最大的子带的量化级别增加一级 3、重新计算分配了更多比特子带的MNR

实验代码

· #include <stdio.h> #include <stdlib.h> #include <string.h> #include <time.h> #include "common.h" #include "encoder.h" #include "musicin.h" #include "options.h" #include "audio_read.h" #include "bitstream.h" #include "mem.h" #include "crc.h" #include "psycho_n1.h" #include "psycho_0.h" #include "psycho_1.h" #include "psycho_2.h" #include "psycho_3.h" #include "psycho_4.h" #include "encode.h" #include "availbits.h" #include "subband.h" #include "encode_new.h" #include "m2aenc.h" #include <assert.h> FILE *musicin; Bit_stream_struc bs; char *programName; char toolameversion[10] = "0.2l"; void global_init (void) { glopts.usepsy = TRUE; glopts.usepadbit = TRUE; glopts.quickmode = FALSE; glopts.quickcount = 10; glopts.downmix = FALSE; glopts.byteswap = FALSE; glopts.channelswap = FALSE; glopts.vbr = FALSE; glopts.vbrlevel = 0; glopts.athlevel = 0; glopts.verbosity = 2; } /************************************************************************ * * main * * PURPOSE: MPEG II Encoder with * psychoacoustic models 1 (MUSICAM) and 2 (AT&T) * * SEMANTICS: One overlapping frame of audio of up to 2 channels are * processed at a time in the following order: * (associated routines are in parentheses) * * 1. Filter sliding window of data to get 32 subband * samples per channel. * (window_subband,filter_subband) * * 2. If joint stereo mode, combine left and right channels * for subbands above #jsbound#. * (combine_LR) * * 3. Calculate scalefactors for the frame, and * also calculate scalefactor select information. * (*_scale_factor_calc) * * 4. Calculate psychoacoustic masking levels using selected * psychoacoustic model. * (psycho_i, psycho_ii) * * 5. Perform iterative bit allocation for subbands with low * mask_to_noise ratios using masking levels from step 4. * (*_main_bit_allocation) * * 6. If error protection flag is active, add redundancy for * error protection. * (*_CRC_calc) * * 7. Pack bit allocation, scalefactors, and scalefactor select *headerrmation onto bitstream. * (*_encode_bit_alloc,*_encode_scale,transmission_pattern) * * 8. Quantize subbands and pack them into bitstream * (*_subband_quantization, *_sample_encoding) * ************************************************************************/ int frameNum = 0; int main (int argc, char **argv) { typedef double SBS[2][3][SCALE_BLOCK][SBLIMIT]; SBS *sb_sample; typedef double JSBS[3][SCALE_BLOCK][SBLIMIT]; JSBS *j_sample; typedef double IN[2][HAN_SIZE]; IN *win_que; typedef unsigned int SUB[2][3][SCALE_BLOCK][SBLIMIT]; SUB *subband; frame_info frame; frame_header header; char original_file_name[MAX_NAME_SIZE]; char encoded_file_name[MAX_NAME_SIZE]; short **win_buf; static short buffer[2][1152]; static unsigned int bit_alloc[2][SBLIMIT], scfsi[2][SBLIMIT]; static unsigned int scalar[2][3][SBLIMIT], j_scale[3][SBLIMIT]; static double smr[2][SBLIMIT], lgmin[2][SBLIMIT], max_sc[2][SBLIMIT]; // FLOAT snr32[32]; short sam[2][1344]; /* was [1056]; */ int model, nch, error_protection; static unsigned int crc; int sb, ch, adb; unsigned long frameBits, sentBits = 0; unsigned long num_samples; int lg_frame; int i; /* Used to keep the SNR values for the fast/quick psy models */ static FLOAT smrdef[2][32]; static int psycount = 0; extern int minimum; time_t start_time, end_time; int total_time; C_TRACE = fopen("C_trace.txt", "w"); sb_sample = (SBS *) mem_alloc (sizeof (SBS), "sb_sample"); j_sample = (JSBS *) mem_alloc (sizeof (JSBS), "j_sample"); win_que = (IN *) mem_alloc (sizeof (IN), "Win_que"); subband = (SUB *) mem_alloc (sizeof (SUB), "subband"); win_buf = (short **) mem_alloc (sizeof (short *) * 2,"win_buf"); /* clear buffers */ memset ((char *) buffer, 0, sizeof (buffer)); memset ((char *) bit_alloc, 0, sizeof (bit_alloc)); memset ((char *) scalar, 0, sizeof (scalar)); memset ((char *) j_scale, 0, sizeof (j_scale)); memset ((char *) scfsi, 0, sizeof (scfsi)); memset ((char *) smr, 0, sizeof (smr)); memset ((char *) lgmin, 0, sizeof (lgmin)); memset ((char *) max_sc, 0, sizeof (max_sc)); //memset ((char *) snr32, 0, sizeof (snr32)); memset ((char *) sam, 0, sizeof (sam)); global_init (); header.extension = 0; frame.header = &header; frame.tab_num = -1; /* no table loaded */ frame.alloc = NULL; header.version = MPEG_AUDIO_ID; /* Default: MPEG-1 */ total_time = 0; time(&start_time); programName = argv[0]; if (argc == 1) /* no command-line args */ short_usage (); else parse_args (argc, argv, &frame, &model, &num_samples,original_file_name, encoded_file_name); print_config (&frame, &model, original_file_name,encoded_file_name); /* this will load the alloc tables and do some other stuff */ hdr_to_frps (&frame); nch = frame.nch; error_protection = header.error_protection; while (get_audio (musicin, buffer, num_samples, nch, &header) >0) { if (glopts.verbosity > 1) if (++frameNum % 10 == 0) fprintf (stderr, "[%4u]\r", frameNum); fflush (stderr); win_buf[0] = &buffer[0][0]; win_buf[1] = &buffer[1][0]; adb = available_bits (&header, &glopts); // #if CTRACE if (frameNum == 1) fprintf(C_TRACE, "avilibable bits for No.%d:%d\n", frameNum,adb); fprintf(C_TRACE, "\n"); #endif lg_frame = adb / 8; if (header.dab_extension) { /* in 24 kHz we always have 4 bytes */ if (header.sampling_frequency == 1) header.dab_extension = 4; /* You must have one frame in memory if you are in DAB mode */ /* in conformity of the norme ETS 300 401 http://www.etsi.org */ /* see bitstream.c */ if (frameNum == 1) minimum = lg_frame + MINIMUM; adb -= header.dab_extension * 8 + header.dab_length * 8 +16; } { int gr, bl, ch; /* New polyphase filter Combines windowing and filtering. Ricardo Feb'03 */ for( gr = 0; gr < 3; gr++ ) for ( bl = 0; bl < 12; bl++ ) for ( ch = 0; ch < nch; ch++ ) WindowFilterSubband( &buffer[ch][gr * 12 * 32 + 32 * bl], ch,

· // &(*sb_sample)[ch][gr][bl][0] ); } #ifdef REFERENCECODE { /* Old code. left here for reference */ int gr, bl, ch; for (gr = 0; gr < 3; gr++) for (bl = 0; bl < SCALE_BLOCK; bl++) for (ch = 0; ch < nch; ch++) { window_subband (&win_buf[ch], &(*win_que)[ch][0], ch); filter_subband (&(*win_que)[ch][0],&(*sb_sample)[ch][gr][bl][0]); } } #endif #ifdef NEWENCODE scalefactor_calc_new(*sb_sample, scalar, nch, frame.sblimit); find_sf_max (scalar, &frame, max_sc); if (frame.actual_mode == MPG_MD_JOINT_STEREO) { /* this way we calculate more mono than we need */ /* but it is cheap */ combine_LR_new (*sb_sample, *j_sample, frame.sblimit); scalefactor_calc_new (j_sample, &j_scale, 1, frame.sblimit); } #else scale_factor_calc (*sb_sample, scalar, nch, frame.sblimit);

· // pick_scale (scalar, &frame, max_sc); if (frame.actual_mode == MPG_MD_JOINT_STEREO) { /* this way we calculate more mono than we need */ /* but it is cheap */ combine_LR (*sb_sample, *j_sample, frame.sblimit); scale_factor_calc (j_sample, &j_scale, 1,frame.sblimit); } #endif if ((glopts.quickmode == TRUE) && (++psycount %glopts.quickcount != 0)) { /* We're using quick mode, so we're only calculating themodel every 'quickcount' frames. Otherwise, just copy theold ones across */ for (ch = 0; ch < nch; ch++) { for (sb = 0; sb < SBLIMIT; sb++) smr[ch][sb] = smrdef[ch][sb]; } } else { /* calculate the psymodel */ switch (model) { case -1: psycho_n1 (smr, nch); break; case 0: /* Psy Model A */ psycho_0 (smr, nch, scalar, (FLOAT)s_freq[header.version][header.sampling_frequency] * 1000);

· break; case 1: psycho_1 (buffer, max_sc, smr, &frame); break; case 2: for (ch = 0; ch < nch; ch++) { psycho_2 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0],//snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); } break; case 3: /* Modified psy model 1 */ psycho_3 (buffer, max_sc, smr, &frame, &glopts); break; case 4: /* Modified Psycho Model 2 */ for (ch = 0; ch < nch; ch++) { psycho_4 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0], //snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); } break; case 5: /* Model 5 comparse model 1 and 3 */ psycho_1 (buffer, max_sc, smr, &frame); fprintf(stdout,"1 "); smr_dump(smr,nch); psycho_3 (buffer, max_sc, smr, &frame, &glopts); fprintf(stdout,"3 "); smr_dump(smr,nch); break; case 6: /* Model 6 compares model 2 and 4 */ for (ch = 0; ch < nch; ch++) psycho_2 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0],//snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"2 "); smr_dump(smr,nch); for (ch = 0; ch < nch; ch++) psycho_4 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0], //snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"4 "); smr_dump(smr,nch); break; case 7: fprintf(stdout,"Frame: %i\n",frameNum); /* Dump the SMRs for all models */

· psycho_1 (buffer, max_sc, smr, &frame); fprintf(stdout,"1"); smr_dump(smr, nch); psycho_3 (buffer, max_sc, smr, &frame, &glopts); fprintf(stdout,"3"); smr_dump(smr,nch); for (ch = 0; ch < nch; ch++) psycho_2 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0],//snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"2"); smr_dump(smr,nch); for (ch = 0; ch < nch; ch++) psycho_4 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0], //snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"4"); smr_dump(smr,nch); break; case 8: /* Compare 0 and 4 */

· psycho_n1 (smr, nch); fprintf(stdout,"0"); smr_dump(smr,nch); for (ch = 0; ch < nch; ch++) psycho_4 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0], //snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"4"); smr_dump(smr,nch); break; default: fprintf (stderr, "Invalid psy model specification: %i\n", model); exit (0); } if (glopts.quickmode == TRUE) /* copy the smr values and reuse them later */ for (ch = 0; ch < nch; ch++) { for (sb = 0; sb < SBLIMIT; sb++) smrdef[ch][sb] = smr[ch][sb]; } if (glopts.verbosity > 4) smr_dump(smr, nch); } #ifdef NEWENCODE sf_transmission_pattern (scalar, scfsi, &frame); main_bit_allocation_new (smr, scfsi, bit_alloc, &adb,&frame, &glopts); //main_bit_allocation (smr, scfsi, bit_alloc, &adb,&frame, &glopts); if (error_protection) CRC_calc (&frame, bit_alloc, scfsi, &crc); write_header (&frame, &bs); //encode_info (&frame, &bs); if (error_protection) putbits (&bs, crc, 16); write_bit_alloc (bit_alloc, &frame, &bs); //encode_bit_alloc (bit_alloc, &frame, &bs); write_scalefactors(bit_alloc, scfsi, scalar, &frame,&bs); //encode_scale (bit_alloc, scfsi, scalar, &frame, &bs); subband_quantization_new (scalar, *sb_sample, j_scale, *j_sample,bit_alloc, *subband, &frame); //subband_quantization (scalar, *sb_sample, j_scale, *j_sample,bit_alloc, // *subband, &frame); write_samples_new(*subband, bit_alloc, &frame, &bs); //sample_encoding (*subband, bit_alloc, &frame, &bs); #else transmission_pattern (scalar, scfsi, &frame); main_bit_allocation (smr, scfsi, bit_alloc, &adb, &frame,&glopts); if (error_protection) CRC_calc (&frame, bit_alloc, scfsi, &crc); encode_info (&frame, &bs); if (error_protection) encode_CRC (crc, &bs); encode_bit_alloc (bit_alloc, &frame, &bs); #if CTRACE if (frameNum == 1) { fprintf(C_TRACE, "下面输出比特分配：\n"); for (int i = 0; i < frame.sblimit; i++) fprintf(C_TRACE, "ch[0].subband[%d]: %d bits\n",i,bit_alloc[0][i]); //putbits(bs, bit_alloc[k][i], (*alloc)[i][0].bits); fprintf(C_TRACE, "\n"); } #endif encode_scale (bit_alloc, scfsi, scalar, &frame, &bs); #if CTRACE if (frameNum == 1) { fprintf(C_TRACE, "下面输出比例因子选择：\n"); for (int i = 0; i < frame.sblimit; i++) fprintf(C_TRACE, "Ch[0].subband[%d] scfsi: %d\n",i,scfsi[0][i]); fprintf(C_TRACE, "\n"); } #endif #if CTRACE if (frameNum == 1) { fprintf(C_TRACE, "下面输出比例因子：\n"); for (int i = 0; i < frame.sblimit; i++) { fprintf(C_TRACE, "Ch[0].subband[%d] scalar: %d\t %d\t %d\n", i,scalar[0][0][i], scalar[0][1][i], scalar[0][2][i]); } fprintf(C_TRACE, "\n"); } #endif subband_quantization (scalar, *sb_sample, j_scale, *j_sample,bit_alloc, *subband, &frame); sample_encoding (*subband, bit_alloc, &frame, &bs); #endif /* If not all the bits were used, write out a stack of zeros */ for (i = 0; i < adb; i++) put1bit (&bs, 0); if (header.dab_extension) { /* Reserve some bytes for X-PAD in DAB mode */ putbits (&bs, 0, header.dab_length * 8); for (i = header.dab_extension - 1; i >= 0; i--) { CRC_calcDAB (&frame, bit_alloc, scfsi, scalar, &crc, i); /* this crc is for the previous frame in DAB mode */ if (bs.buf_byte_idx + lg_frame < bs.buf_size) bs.buf[bs.buf_byte_idx + lg_frame] = crc; /* reserved 2 bytes for F-PAD in DAB mode */ putbits (&bs, crc, 8); } putbits (&bs, 0, 16); } frameBits = sstell (&bs) - sentBits; if (frameBits % 8) { /* a program failure */ fprintf (stderr, "Sent %ld bits = %ld slots plus%ld\n", frameBits, frameBits / 8, frameBits % 8); fprintf (stderr, "If you are reading this, theprogram is broken\n"); fprintf (stderr, "email [mfc at NOTplanckenerg.com]without the NOT\n"); fprintf (stderr, "with the command line arguments andother info\n"); exit (0); } sentBits += frameBits; } close_bit_stream_w (&bs); if ((glopts.verbosity > 1) && (glopts.vbr == TRUE)) { int i; #ifdef NEWENCODE extern int vbrstats_new[15]; #else extern int vbrstats[15]; #endif fprintf (stdout, "VBR stats:\n"); for (i = 1; i < 15; i++) fprintf (stdout, "%4i ",bitrate[header.version][i]); fprintf (stdout, "\n"); for (i = 1; i < 15; i++) #ifdef NEWENCODE fprintf (stdout,"%4i ",vbrstats_new[i]); #else fprintf (stdout, "%4i ", vbrstats[i]); #endif fprintf (stdout, "\n"); } fprintf (stderr, "Avg slots/frame = %.3f; b/smp = %.2f; bitrate = %.3f kbps\n", (FLOAT) sentBits / (frameNum * 8), (FLOAT) sentBits / (frameNum * 1152), (FLOAT) sentBits / (frameNum * 1152) * s_freq[header.version][header.sampling_frequency]); if (fclose (musicin) != 0) { fprintf (stderr, "Could not close \"%s\".\n",original_file_name); exit (2); } fprintf (stderr, "\nDone\n"); time(&end_time); total_time = end_time - start_time; printf("total time is %d\n", total_time); exit (0); } /************************************************************************ * * print_config * * PURPOSE: Prints the encoding parameters used * ************************************************************************/ void print_config (frame_info * frame, int *psy, char *inPath, char *outPath) { frame_header *header = frame->header; if (glopts.verbosity == 0) return; fprintf (stderr,"--------------------------------------------\n"); fprintf (stderr, "Input File : '%s' %.1f kHz\n", (strcmp (inPath, "-") ? inPath : "stdin"), s_freq[header->version][header->sampling_frequency]); fprintf (stderr, "Output File: '%s'\n", (strcmp (outPath, "-") ? outPath : "stdout")); fprintf (stderr, "%d kbps ",bitrate[header->version][header->bitrate_index]); fprintf (stderr, "%s ", version_names[header->version]); if (header->mode != MPG_MD_JOINT_STEREO) fprintf (stderr, "Layer II %s Psycho model=%d (Mode_Extension=%d)\n", mode_names[header->mode], *psy, header->mode_ext); else fprintf (stderr, "Layer II %s Psy model %d \n",mode_names[header->mode], *psy); fprintf (stderr,"[De-emph:%s\tCopyright:%s\tOriginal:%s\tCRC:%s]\n", ((header->emphasis) ? "On" : "Off"), ((header->copyright) ? "Yes" : "No"), ((header->original) ? "Yes" : "No"), ((header->error_protection) ? "On" : "Off")); fprintf (stderr,"[Padding:%s\tByte-swap:%s\tChanswap:%s\tDAB:%s]\n", ((glopts.usepadbit) ? "Normal" : "Off"), ((glopts.byteswap) ? "On" : "Off"), ((glopts.channelswap) ? "On" : "Off"), ((glopts.dab) ? "On" : "Off")); if (glopts.vbr == TRUE) fprintf (stderr, "VBR Enabled. Using MNR boost of%f\n", glopts.vbrlevel); fprintf(stderr,"ATH adjustment %f\n",glopts.athlevel); fprintf (stderr,"--------------------------------------------\n"); } /************************************************************************ * * usage * * PURPOSE: Writes command line syntax to the file specified by #stderr# * ************************************************************************/ void usage (void) { /* print syntax & exit */ /* FIXME: maybe have an option to display better definitions of helpcodes, and long equivalents of the flags */ fprintf (stdout, "\ntooLAME version %s(http://toolame.sourceforge.net)\n", toolameversion); fprintf (stdout, "MPEG Audio Layer II encoder\n\n"); fprintf (stdout, "usage: \n"); fprintf (stdout, "\t%s [options] <input><output>\n\n", programName); fprintf (stdout, "Options:\n"); fprintf (stdout, "Input\n"); fprintf (stdout, "\t-s sfrq input smpl rate in kHz (dflt %4.1f)\n", DFLT_SFQ); fprintf (stdout, "\t-a downmix from stereo tomono\n"); fprintf (stdout, "\t-x force byte-swapping ofinput\n"); fprintf (stdout, "\t-g swap channels of inputfile\n"); fprintf (stdout, "Output\n"); fprintf (stdout, "\t-m mode channel mode : s/d/j/m (dflt L)\n", DFLT_MOD); fprintf (stdout, "\t-p psy psychoacoustic model 0/1/2/3(dflt %4u)\n", DFLT_PSY); fprintf (stdout, "\t-b br total bitrate in kbps (dflt 192)\n"); fprintf (stdout, "\t-v lev vbr mode\n"); fprintf (stdout, "\t-l lev ATH level (dflt 0)\n"); fprintf (stdout, "Operation\n"); // fprintf (stdout, "\t-f fast mode (turns offpsy model)\n"); // deprecate the -f switch. use "-p 0" instead. fprintf (stdout, "\t-q num quick mode. only calculate psy model every numframes\n"); fprintf (stdout, "Misc\n"); fprintf (stdout, "\t-d emp de-emphasis n/5/c (dflt L)\n", DFLT_EMP); fprintf (stdout, "\t-c mark ascopyright\n"); fprintf (stdout, "\t-o mark asoriginal\n"); fprintf (stdout, "\t-e add errorprotection\n"); fprintf (stdout, "\t-r force padding bit/frameoff\n"); fprintf (stdout, "\t-D len add DAB extensions of length[len]\n"); fprintf (stdout, "\t-t talkativity 0=nomessages (dflt 2)"); fprintf (stdout, "Files\n"); fprintf (stdout, "\tinput input sound file. (WAV,AIFF,PCM or use'/dev/stdin')\n"); fprintf (stdout, "\toutput output bit stream of encodedaudio\n"); fprintf (stdout, "\n\tAllowable bitrates for 16, 22.05 and 24kHz sampleinput\n"); fprintf (stdout, "\t8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144,160\n"); fprintf (stdout, "\n\tAllowable bitrates for 32, 44.1 and 48kHz sampleinput\n"); fprintf (stdout, "\t32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320,384\n"); exit (1); } /********************************************* * void short_usage(void) ********************************************/ void short_usage (void) { /* print a bit of info about the program */ fprintf (stderr, "tooLAME version %s\n(http://toolame.sourceforge.net)\n", toolameversion); fprintf (stderr, "MPEG Audio Layer II encoder\n\n"); fprintf (stderr, "USAGE: %s [options] <infile>[outfile]\n\n", programName); fprintf (stderr, "Try \"%s -h\" for moreinformation.\n", programName); exit (0); } /********************************************* * void proginfo(void) ********************************************/ void proginfo (void) { /* print a bit of info about the program */ fprintf (stderr, "\ntooLAME version 0.2g (http://toolame.sourceforge.net)\n"); fprintf (stderr, "MPEG Audio Layer II encoder\n\n"); } /************************************************************************ * * parse_args * * PURPOSE: Sets encoding parameters to the specifications of the * command line. Default settings are used for parameters * not specified in the command line. * * SEMANTICS: The command line is parsed according to the following * syntax: * * -m is followed by the mode * -p is followed by the psychoacoustic model number * -s is followed by the sampling rate * -b is followed by the total bitrate, irrespective of the mode * -d is followed by the emphasis flag * -c is followed by the copyright/no_copyright flag * -o is followed by the original/not_original flag * -e is followed by the error_protection on/off flag * -f turns off psy model (fast mode) * -q <i> only calculate psy model every ith frame * -a downmix from stereo to mono * -r turn off padding bits in frames. * -x force byte swapping of input * -g swap the channels on an input file * -t talkativity. how verbose should the program be. 0 = nomessages. * * If the input file is in AIFF format, the sampling frequency is read * from the AIFF header. * * The input and output filenames are read into #inpath# and #outpath#. * ************************************************************************/ void parse_args (int argc, char **argv, frame_info * frame, int *psy, unsigned long *num_samples, char inPath[MAX_NAME_SIZE], char outPath[MAX_NAME_SIZE]) { FLOAT srate; int brate; frame_header *header = frame->header; int err = 0, i = 0; long samplerate; /* preset defaults */ inPath[0] = '\0'; outPath[0] = '\0'; header->lay = DFLT_LAY; switch (DFLT_MOD) { case 's': header->mode = MPG_MD_STEREO; header->mode_ext = 0; break; case 'd': header->mode = MPG_MD_DUAL_CHANNEL; header->mode_ext = 0; break; /* in j-stereo mode, no default header->mode_ext was defined,gave error.. now default = 2 added by MFC 14 Dec1999. */ case 'j': header->mode = MPG_MD_JOINT_STEREO; header->mode_ext = 2; break; case 'm': header->mode = MPG_MD_MONO; header->mode_ext = 0; break; default: fprintf (stderr, "%s: Bad mode dflt %c\n", programName,DFLT_MOD); abort (); } *psy = DFLT_PSY; if ((header->sampling_frequency = SmpFrqIndex ((long) (1000 * DFLT_SFQ),&header->version)) < 0) { fprintf (stderr, "%s: bad sfrq default %.2f\n",programName, DFLT_SFQ); abort (); } header->bitrate_index = 14; brate = 0; switch (DFLT_EMP) { case 'n': header->emphasis = 0; break; case '5': header->emphasis = 1; break; case 'c': header->emphasis = 3; break; default: fprintf (stderr, "%s: Bad emph dflt %c\n", programName,DFLT_EMP); abort (); } header->copyright = 0; header->original = 0; header->error_protection = FALSE; header->dab_extension = 0; /* process args */ while (++i < argc && err == 0) { char c, *token, *arg, *nextArg; int argUsed; token = argv[i]; if (*token++ == '-') { if (i + 1 < argc) nextArg = argv[i + 1]; else nextArg = ""; argUsed = 0; if (!*token) { /* The user wants to use stdin and/or stdout. */ if (inPath[0] == '\0') strncpy (inPath, argv[i], MAX_NAME_SIZE); else if (outPath[0] == '\0') strncpy (outPath, argv[i], MAX_NAME_SIZE); } while ((c = *token++)) { if (*token /* NumericQ(token) */ ) arg = token; else arg = nextArg; switch (c) { case 'm': argUsed = 1; if (*arg == 's') { header->mode = MPG_MD_STEREO; header->mode_ext = 0; } else if (*arg == 'd') { header->mode = MPG_MD_DUAL_CHANNEL; header->mode_ext = 0; } else if (*arg == 'j') { header->mode = MPG_MD_JOINT_STEREO; } else if (*arg == 'm') { header->mode = MPG_MD_MONO; header->mode_ext = 0; } else { fprintf (stderr, "%s: -m mode must be s/d/j/m not %s\n", programName, arg); err = 1; } break; case 'p': *psy = atoi (arg); argUsed = 1; break; case 's': argUsed = 1; srate = atof (arg); /* samplerate = rint( 1000.0 * srate ); $A */ samplerate = (long) ((1000.0 * srate) + 0.5); if ((header->sampling_frequency = SmpFrqIndex ((long) samplerate, &header->version))< 0) err = 1; break; case 'b': argUsed = 1; brate = atoi (arg); break; case 'd': argUsed = 1; if (*arg == 'n') header->emphasis = 0; else if (*arg == '5') header->emphasis = 1; else if (*arg == 'c') header->emphasis = 3; else { fprintf (stderr, "%s: -d emp must be n/5/c not %s\n",programName, arg); err = 1; } break; case 'D': argUsed = 1; header->dab_length = atoi (arg); header->error_protection = TRUE; header->dab_extension = 2; glopts.dab = TRUE; break; case 'c': header->copyright = 1; break; case 'o': header->original = 1; break; case 'e': header->error_protection = TRUE; break; case 'f': *psy = 0; /* this switch is deprecated? FIXME get rid of glopts.usepsy instead us psymodel 0, i.e. "-p 0" */ glopts.usepsy = FALSE; break; case 'r': glopts.usepadbit = FALSE; header->padding = 0; break; case 'q': argUsed = 1; glopts.quickmode = TRUE; glopts.usepsy = TRUE; glopts.quickcount = atoi (arg); if (glopts.quickcount == 0) { /* just don't use psy model */ glopts.usepsy = FALSE; glopts.quickcount = FALSE; } break; case 'a': glopts.downmix = TRUE; header->mode = MPG_MD_MONO; header->mode_ext = 0; break; case 'x': glopts.byteswap = TRUE; break; case 'v': argUsed = 1; glopts.vbr = TRUE; glopts.vbrlevel = atof (arg); glopts.usepadbit = FALSE;

· /* don't use padding for VBR */ header->padding = 0; /* MFC Feb 2003: in VBR mode, joint stereo doesn't make any sense at the moment, as there are no noisy subbands according to bits_for_nonoise in vbr mode */ header->mode = MPG_MD_STEREO; /* force stereo mode */ header->mode_ext = 0; break; case 'l': argUsed = 1; glopts.athlevel = atof(arg); break; case 'h': usage (); break; case 'g': glopts.channelswap = TRUE; break; case 't': argUsed = 1; glopts.verbosity = atoi (arg); break; default: fprintf (stderr, "%s: unrec option %c\n", programName, c); err = 1; break; } if (argUsed) { if (arg == token) token = "";

· /* no more from token */ else ++i; /* skip arg we used */ arg = ""; argUsed = 0; } } } else { if (inPath[0] == '\0') strcpy (inPath, argv[i]); else if (outPath[0] == '\0') strcpy (outPath, argv[i]); else { fprintf (stderr, "%s: excess arg %s\n", programName, argv[i]); err = 1; } } } if (header->dab_extension) { /* in 48 kHz */ /* if the bit rate per channel is less then 56 kbit/s, we have 2scf-crc */ /* else we have 4 scf-crc */ /* in 24 kHz, we have 4 scf-crc, see main loop */ if (brate / (header->mode == MPG_MD_MONO ? 1 : 2) >= 56) header->dab_extension = 4; } if (err || inPath[0] == '\0') usage (); /* If no infile defined, or err has occured, then callusage() */ if (outPath[0] == '\0') { /* replace old extension with new one, 1992-08-19, 1995-06-12 shn*/ new_ext (inPath, DFLT_EXT, outPath); } if (!strcmp (inPath, "-")) { musicin = stdin; /* read from stdin */ *num_samples = MAX_U_32_NUM; } else { if ((musicin = fopen (inPath, "rb")) == NULL) { fprintf (stderr, "Could not find\"%s\".\n", inPath); exit (1); } parse_input_file (musicin, inPath, header, num_samples); } /* check for a valid bitrate */ if (brate == 0) brate = bitrate[header->version][10]; /* Check to see we have a sane value for the bitrate for this version */ if ((header->bitrate_index = BitrateIndex (brate,header->version)) < 0) err = 1; /* All options are hunky dory, open the input audio file and return to the main drag */ open_bit_stream_w (&bs, outPath, BUFFER_SIZE); } void smr_dump(double smr[2][SBLIMIT], int nch) { int ch, sb; fprintf(stdout,"SMR:"); for (ch = 0;ch<nch; ch++) { if (ch==1) fprintf(stdout," "); for (sb=0;sb<SBLIMIT;sb++) fprintf(stdout,"%3.0f ",smr[ch][sb]); fprintf(stdout,"\n"); } }

#include <stdio.h> #include <stdlib.h> #include <string.h> #include <time.h> #include "common.h" #include "encoder.h" #include "musicin.h" #include "options.h" #include "audio_read.h" #include "bitstream.h" #include "mem.h" #include "crc.h" #include "psycho_n1.h" #include "psycho_0.h" #include "psycho_1.h" #include "psycho_2.h" #include "psycho_3.h" #include "psycho_4.h" #include "encode.h" #include "availbits.h" #include "subband.h" #include "encode_new.h" #include "m2aenc.h" #include <assert.h> FILE *musicin; Bit_stream_struc bs; char *programName; char toolameversion[10] = "0.2l"; void global_init (void) { glopts.usepsy = TRUE; glopts.usepadbit = TRUE; glopts.quickmode = FALSE; glopts.quickcount = 10; glopts.downmix = FALSE; glopts.byteswap = FALSE; glopts.channelswap = FALSE; glopts.vbr = FALSE; glopts.vbrlevel = 0; glopts.athlevel = 0; glopts.verbosity = 2; } /************************************************************************ * * main * * PURPOSE: MPEG II Encoder with * psychoacoustic models 1 (MUSICAM) and 2 (AT&T) * * SEMANTICS: One overlapping frame of audio of up to 2 channels are * processed at a time in the following order: * (associated routines are in parentheses) * * 1. Filter sliding window of data to get 32 subband * samples per channel. * (window_subband,filter_subband) * * 2. If joint stereo mode, combine left and right channels * for subbands above #jsbound#. * (combine_LR) * * 3. Calculate scalefactors for the frame, and * also calculate scalefactor select information. * (*_scale_factor_calc) * * 4. Calculate psychoacoustic masking levels using selected * psychoacoustic model. * (psycho_i, psycho_ii) * * 5. Perform iterative bit allocation for subbands with low * mask_to_noise ratios using masking levels from step 4. * (*_main_bit_allocation) * * 6. If error protection flag is active, add redundancy for * error protection. * (*_CRC_calc) * * 7. Pack bit allocation, scalefactors, and scalefactor select *headerrmation onto bitstream. * (*_encode_bit_alloc,*_encode_scale,transmission_pattern) * * 8. Quantize subbands and pack them into bitstream * (*_subband_quantization, *_sample_encoding) * ************************************************************************/ int frameNum = 0; int main (int argc, char **argv) { typedef double SBS[2][3][SCALE_BLOCK][SBLIMIT]; SBS *sb_sample; typedef double JSBS[3][SCALE_BLOCK][SBLIMIT]; JSBS *j_sample; typedef double IN[2][HAN_SIZE]; IN *win_que; typedef unsigned int SUB[2][3][SCALE_BLOCK][SBLIMIT]; SUB *subband; frame_info frame; frame_header header; char original_file_name[MAX_NAME_SIZE]; char encoded_file_name[MAX_NAME_SIZE]; short **win_buf; static short buffer[2][1152]; static unsigned int bit_alloc[2][SBLIMIT], scfsi[2][SBLIMIT]; static unsigned int scalar[2][3][SBLIMIT], j_scale[3][SBLIMIT]; static double smr[2][SBLIMIT], lgmin[2][SBLIMIT], max_sc[2][SBLIMIT]; // FLOAT snr32[32]; short sam[2][1344]; /* was [1056]; */ int model, nch, error_protection; static unsigned int crc; int sb, ch, adb; unsigned long frameBits, sentBits = 0; unsigned long num_samples; int lg_frame; int i; /* Used to keep the SNR values for the fast/quick psy models */ static FLOAT smrdef[2][32]; static int psycount = 0; extern int minimum; time_t start_time, end_time; int total_time; C_TRACE = fopen("C_trace.txt", "w"); sb_sample = (SBS *) mem_alloc (sizeof (SBS), "sb_sample"); j_sample = (JSBS *) mem_alloc (sizeof (JSBS), "j_sample"); win_que = (IN *) mem_alloc (sizeof (IN), "Win_que"); subband = (SUB *) mem_alloc (sizeof (SUB), "subband"); win_buf = (short **) mem_alloc (sizeof (short *) * 2, "win_buf"); /* clear buffers */ memset ((char *) buffer, 0, sizeof (buffer)); memset ((char *) bit_alloc, 0, sizeof (bit_alloc)); memset ((char *) scalar, 0, sizeof (scalar)); memset ((char *) j_scale, 0, sizeof (j_scale)); memset ((char *) scfsi, 0, sizeof (scfsi)); memset ((char *) smr, 0, sizeof (smr)); memset ((char *) lgmin, 0, sizeof (lgmin)); memset ((char *) max_sc, 0, sizeof (max_sc)); //memset ((char *) snr32, 0, sizeof (snr32)); memset ((char *) sam, 0, sizeof (sam)); global_init (); header.extension = 0; frame.header = &header; frame.tab_num = -1; /* no table loaded */ frame.alloc = NULL; header.version = MPEG_AUDIO_ID; /* Default: MPEG-1 */ total_time = 0; time(&start_time); programName = argv[0]; if (argc == 1) /* no command-line args */ short_usage (); else parse_args (argc, argv, &frame, &model, &num_samples,original_file_name, encoded_file_name); print_config (&frame, &model, original_file_name,encoded_file_name); /* this will load the alloc tables and do some other stuff */ hdr_to_frps (&frame); nch = frame.nch; error_protection = header.error_protection; while (get_audio (musicin, buffer, num_samples, nch, &header) >0) { if (glopts.verbosity > 1) if (++frameNum % 10 == 0) fprintf (stderr, "[%4u]\r", frameNum); fflush (stderr); win_buf[0] = &buffer[0][0]; win_buf[1] = &buffer[1][0]; adb = available_bits (&header, &glopts); // #if CTRACE if (frameNum == 1) fprintf(C_TRACE, "avilibable bits for No.%d:%d\n", frameNum,adb); fprintf(C_TRACE, "\n"); #endif lg_frame = adb / 8; if (header.dab_extension) { /* in 24 kHz we always have 4 bytes */ if (header.sampling_frequency == 1) header.dab_extension = 4; /* You must have one frame in memory if you are in DAB mode */ /* in conformity of the norme ETS 300 401 http://www.etsi.org */ /* see bitstream.c */ if (frameNum == 1) minimum = lg_frame + MINIMUM; adb -= header.dab_extension * 8 + header.dab_length * 8 +16; } { int gr, bl, ch; /* New polyphase filter Combines windowing and filtering. Ricardo Feb'03 */ for( gr = 0; gr < 3; gr++ ) for ( bl = 0; bl < 12; bl++ ) for ( ch = 0; ch < nch; ch++ ) WindowFilterSubband( &buffer[ch][gr * 12 * 32 + 32 * bl], ch,

// &(*sb_sample)[ch][gr][bl][0] ); } #ifdef REFERENCECODE { /* Old code. left here for reference */ int gr, bl, ch; for (gr = 0; gr < 3; gr++) for (bl = 0; bl < SCALE_BLOCK; bl++) for (ch = 0; ch < nch; ch++) { window_subband (&win_buf[ch], &(*win_que)[ch][0], ch); filter_subband (&(*win_que)[ch][0],&(*sb_sample)[ch][gr][bl][0]); } } #endif #ifdef NEWENCODE scalefactor_calc_new(*sb_sample, scalar, nch, frame.sblimit); find_sf_max (scalar, &frame, max_sc); if (frame.actual_mode == MPG_MD_JOINT_STEREO) { /* this way we calculate more mono than we need */ /* but it is cheap */ combine_LR_new (*sb_sample, *j_sample, frame.sblimit); scalefactor_calc_new (j_sample, &j_scale, 1,frame.sblimit); } #else scale_factor_calc (*sb_sample, scalar, nch, frame.sblimit);

// pick_scale (scalar, &frame, max_sc); if (frame.actual_mode == MPG_MD_JOINT_STEREO) { /* this way we calculate more mono than we need */ /* but it is cheap */ combine_LR (*sb_sample, *j_sample, frame.sblimit); scale_factor_calc (j_sample, &j_scale, 1,frame.sblimit); } #endif if ((glopts.quickmode == TRUE) && (++psycount %glopts.quickcount != 0)) { /* We're using quick mode, so we're only calculating themodel every 'quickcount' frames. Otherwise, just copy theold ones across */ for (ch = 0; ch < nch; ch++) { for (sb = 0; sb < SBLIMIT; sb++) smr[ch][sb] = smrdef[ch][sb]; } } else { /* calculate the psymodel */ switch (model) { case -1: psycho_n1 (smr, nch); break; case 0: /* Psy Model A */ psycho_0 (smr, nch, scalar, (FLOAT) s_freq[header.version][header.sampling_frequency]* 1000);

break; case 1: psycho_1 (buffer, max_sc, smr, &frame); break; case 2: for (ch = 0; ch < nch; ch++) { psycho_2 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0],//snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); } break; case 3: /* Modified psy model 1 */ psycho_3 (buffer, max_sc, smr, &frame, &glopts); break; case 4: /* Modified Psycho Model 2 */ for (ch = 0; ch < nch; ch++) { psycho_4 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0], //snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); } break; case 5: /* Model 5 comparse model 1 and 3 */ psycho_1 (buffer, max_sc, smr, &frame); fprintf(stdout,"1 "); smr_dump(smr,nch); psycho_3 (buffer, max_sc, smr, &frame, &glopts); fprintf(stdout,"3 "); smr_dump(smr,nch); break; case 6: /* Model 6 compares model 2 and 4 */ for (ch = 0; ch < nch; ch++) psycho_2 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0],//snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"2 "); smr_dump(smr,nch); for (ch = 0; ch < nch; ch++) psycho_4 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0], //snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"4 "); smr_dump(smr,nch); break; case 7: fprintf(stdout,"Frame: %i\n",frameNum); /* Dump the SMRs for all models */

psycho_1 (buffer, max_sc, smr, &frame); fprintf(stdout,"1"); smr_dump(smr, nch); psycho_3 (buffer, max_sc, smr, &frame, &glopts); fprintf(stdout,"3"); smr_dump(smr,nch); for (ch = 0; ch < nch; ch++) psycho_2 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0],//snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"2"); smr_dump(smr,nch); for (ch = 0; ch < nch; ch++) psycho_4 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0], //snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"4"); smr_dump(smr,nch); break; case 8: /* Compare 0 and 4 */

psycho_n1 (smr, nch); fprintf(stdout,"0"); smr_dump(smr,nch); for (ch = 0; ch < nch; ch++) psycho_4 (&buffer[ch][0], &sam[ch][0], ch, &smr[ch][0], //snr32, (FLOAT) s_freq[header.version][header.sampling_frequency] * 1000, &glopts); fprintf(stdout,"4"); smr_dump(smr,nch); break; default: fprintf (stderr, "Invalid psy model specification: %i\n", model); exit (0); } if (glopts.quickmode == TRUE) /* copy the smr values and reuse them later */ for (ch = 0; ch < nch; ch++) { for (sb = 0; sb < SBLIMIT; sb++) smrdef[ch][sb] = smr[ch][sb]; } if (glopts.verbosity > 4) smr_dump(smr, nch); } #ifdef NEWENCODE sf_transmission_pattern (scalar, scfsi, &frame); main_bit_allocation_new (smr, scfsi, bit_alloc, &adb,&frame, &glopts); //main_bit_allocation (smr, scfsi, bit_alloc, &adb,&frame, &glopts); if (error_protection) CRC_calc (&frame, bit_alloc, scfsi, &crc); write_header (&frame, &bs); //encode_info (&frame, &bs); if (error_protection) putbits (&bs, crc, 16); write_bit_alloc (bit_alloc, &frame, &bs); //encode_bit_alloc (bit_alloc, &frame, &bs); write_scalefactors(bit_alloc, scfsi, scalar, &frame,&bs); //encode_scale (bit_alloc, scfsi, scalar, &frame, &bs); subband_quantization_new (scalar, *sb_sample, j_scale, *j_sample,bit_alloc, *subband, &frame); //subband_quantization (scalar, *sb_sample, j_scale, *j_sample,bit_alloc, // *subband, &frame); write_samples_new(*subband, bit_alloc, &frame, &bs); //sample_encoding (*subband, bit_alloc, &frame, &bs); #else transmission_pattern (scalar, scfsi, &frame); main_bit_allocation (smr, scfsi, bit_alloc, &adb, &frame,&glopts); if (error_protection) CRC_calc (&frame, bit_alloc, scfsi, &crc); encode_info (&frame, &bs); if (error_protection) encode_CRC (crc, &bs); encode_bit_alloc (bit_alloc, &frame, &bs); #if CTRACE if (frameNum == 1) { fprintf(C_TRACE, "下面输出比特分配：\n"); for (int i = 0; i < frame.sblimit; i++) fprintf(C_TRACE, "ch[0].subband[%d]: %d bits\n",i,bit_alloc[0][i]); //putbits(bs, bit_alloc[k][i], (*alloc)[i][0].bits); fprintf(C_TRACE, "\n"); } #endif encode_scale (bit_alloc, scfsi, scalar, &frame, &bs); #if CTRACE if (frameNum == 1) { fprintf(C_TRACE, "下面输出比例因子选择：\n"); for (int i = 0; i < frame.sblimit; i++) fprintf(C_TRACE, "Ch[0].subband[%d] scfsi: %d\n",i,scfsi[0][i]); fprintf(C_TRACE, "\n"); } #endif #if CTRACE if (frameNum == 1) { fprintf(C_TRACE, "下面输出比例因子：\n"); for (int i = 0; i < frame.sblimit; i++) { fprintf(C_TRACE, "Ch[0].subband[%d] scalar: %d\t %d\t %d\n", i,scalar[0][0][i], scalar[0][1][i], scalar[0][2][i]); } fprintf(C_TRACE, "\n"); } #endif subband_quantization (scalar, *sb_sample, j_scale, *j_sample,bit_alloc, *subband, &frame); sample_encoding (*subband, bit_alloc, &frame, &bs); #endif /* If not all the bits were used, write out a stack of zeros */ for (i = 0; i < adb; i++) put1bit (&bs, 0); if (header.dab_extension) { /* Reserve some bytes for X-PAD in DAB mode */ putbits (&bs, 0, header.dab_length * 8); for (i = header.dab_extension - 1; i >= 0; i--) { CRC_calcDAB (&frame, bit_alloc, scfsi, scalar, &crc, i); /* this crc is for the previous frame in DAB mode */ if (bs.buf_byte_idx + lg_frame < bs.buf_size) bs.buf[bs.buf_byte_idx + lg_frame] = crc; /* reserved 2 bytes for F-PAD in DAB mode */ putbits (&bs, crc, 8); } putbits (&bs, 0, 16); } frameBits = sstell (&bs) - sentBits; if (frameBits % 8) { /* a program failure */ fprintf (stderr, "Sent %ld bits = %ld slots plus%ld\n", frameBits, frameBits / 8, frameBits % 8); fprintf (stderr, "If you are reading this, theprogram is broken\n"); fprintf (stderr, "email [mfc at NOTplanckenerg.com]without the NOT\n"); fprintf (stderr, "with the command line arguments andother info\n"); exit (0); } sentBits += frameBits; } close_bit_stream_w (&bs); if ((glopts.verbosity > 1) && (glopts.vbr == TRUE)) { int i; #ifdef NEWENCODE extern int vbrstats_new[15]; #else extern int vbrstats[15]; #endif fprintf (stdout, "VBR stats:\n"); for (i = 1; i < 15; i++) fprintf (stdout, "%4i ",bitrate[header.version][i]); fprintf (stdout, "\n"); for (i = 1; i < 15; i++) #ifdef NEWENCODE fprintf (stdout,"%4i ",vbrstats_new[i]); #else fprintf (stdout, "%4i ", vbrstats[i]); #endif fprintf (stdout, "\n"); } fprintf (stderr, "Avg slots/frame = %.3f; b/smp = %.2f; bitrate = %.3f kbps\n", (FLOAT) sentBits / (frameNum * 8), (FLOAT) sentBits / (frameNum * 1152), (FLOAT) sentBits / (frameNum * 1152) * s_freq[header.version][header.sampling_frequency]); if (fclose (musicin) != 0) { fprintf (stderr, "Could not close \"%s\".\n",original_file_name); exit (2); } fprintf (stderr, "\nDone\n"); time(&end_time); total_time = end_time - start_time; printf("total time is %d\n", total_time); exit (0); } /************************************************************************ * * print_config * * PURPOSE: Prints the encoding parameters used * ************************************************************************/ void print_config (frame_info * frame, int *psy, char *inPath, char *outPath) { frame_header *header = frame->header; if (glopts.verbosity == 0) return; fprintf (stderr,"--------------------------------------------\n"); fprintf (stderr, "Input File : '%s' %.1f kHz\n", (strcmp (inPath, "-") ? inPath : "stdin"), s_freq[header->version][header->sampling_frequency]); fprintf (stderr, "Output File: '%s'\n", (strcmp (outPath, "-") ? outPath : "stdout")); fprintf (stderr, "%d kbps ",bitrate[header->version][header->bitrate_index]); fprintf (stderr, "%s ", version_names[header->version]); if (header->mode != MPG_MD_JOINT_STEREO) fprintf (stderr, "Layer II %s Psycho model=%d (Mode_Extension=%d)\n", mode_names[header->mode], *psy, header->mode_ext); else fprintf (stderr, "Layer II %s Psy model %d \n",mode_names[header->mode], *psy); fprintf (stderr,"[De-emph:%s\tCopyright:%s\tOriginal:%s\tCRC:%s]\n", ((header->emphasis) ? "On" : "Off"), ((header->copyright) ? "Yes" : "No"), ((header->original) ? "Yes" : "No"), ((header->error_protection) ? "On" : "Off")); fprintf (stderr,"[Padding:%s\tByte-swap:%s\tChanswap:%s\tDAB:%s]\n", ((glopts.usepadbit) ? "Normal" : "Off"), ((glopts.byteswap) ? "On" : "Off"), ((glopts.channelswap) ? "On" : "Off"), ((glopts.dab) ? "On" : "Off")); if (glopts.vbr == TRUE) fprintf (stderr, "VBR Enabled. Using MNR boost of%f\n", glopts.vbrlevel); fprintf(stderr,"ATH adjustment %f\n",glopts.athlevel); fprintf (stderr,"--------------------------------------------\n"); } /************************************************************************ * * usage * * PURPOSE: Writes command line syntax to the file specified by #stderr# * ************************************************************************/ void usage (void) { /* print syntax & exit */ /* FIXME: maybe have an option to display better definitions of helpcodes, and long equivalents of the flags */ fprintf (stdout, "\ntooLAME version %s(http://toolame.sourceforge.net)\n", toolameversion); fprintf (stdout, "MPEG Audio Layer II encoder\n\n"); fprintf (stdout, "usage: \n"); fprintf (stdout, "\t%s [options] <input><output>\n\n", programName); fprintf (stdout, "Options:\n"); fprintf (stdout, "Input\n"); fprintf (stdout, "\t-s sfrq input smpl rate in kHz (dflt %4.1f)\n", DFLT_SFQ); fprintf (stdout, "\t-a downmix from stereo tomono\n"); fprintf (stdout, "\t-x force byte-swapping ofinput\n"); fprintf (stdout, "\t-g swap channels of inputfile\n"); fprintf (stdout, "Output\n"); fprintf (stdout, "\t-m mode channel mode : s/d/j/m (dflt L)\n", DFLT_MOD); fprintf (stdout, "\t-p psy psychoacoustic model 0/1/2/3(dflt %4u)\n", DFLT_PSY); fprintf (stdout, "\t-b br total bitrate in kbps (dflt 192)\n"); fprintf (stdout, "\t-v lev vbr mode\n"); fprintf (stdout, "\t-l lev ATH level (dflt 0)\n"); fprintf (stdout, "Operation\n"); // fprintf (stdout, "\t-f fast mode (turns offpsy model)\n"); // deprecate the -f switch. use "-p 0" instead. fprintf (stdout, "\t-q num quick mode. only calculate psy model every numframes\n"); fprintf (stdout, "Misc\n"); fprintf (stdout, "\t-d emp de-emphasis n/5/c (dflt L)\n", DFLT_EMP); fprintf (stdout, "\t-c mark ascopyright\n"); fprintf (stdout, "\t-o mark asoriginal\n"); fprintf (stdout, "\t-e add errorprotection\n"); fprintf (stdout, "\t-r force padding bit/frameoff\n"); fprintf (stdout, "\t-D len add DAB extensions of length[len]\n"); fprintf (stdout, "\t-t talkativity 0=nomessages (dflt 2)"); fprintf (stdout, "Files\n"); fprintf (stdout, "\tinput input sound file. (WAV,AIFF,PCM or use'/dev/stdin')\n"); fprintf (stdout, "\toutput output bit stream of encodedaudio\n"); fprintf (stdout, "\n\tAllowable bitrates for 16, 22.05 and 24kHz sampleinput\n"); fprintf (stdout, "\t8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144,160\n"); fprintf (stdout, "\n\tAllowable bitrates for 32, 44.1 and 48kHz sampleinput\n"); fprintf (stdout, "\t32, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320,384\n"); exit (1); } /********************************************* * void short_usage(void) ********************************************/ void short_usage (void) { /* print a bit of info about the program */ fprintf (stderr, "tooLAME version %s\n(http://toolame.sourceforge.net)\n", toolameversion); fprintf (stderr, "MPEG Audio Layer II encoder\n\n"); fprintf (stderr, "USAGE: %s [options] <infile>[outfile]\n\n", programName); fprintf (stderr, "Try \"%s -h\" for moreinformation.\n", programName); exit (0); } /********************************************* * void proginfo(void) ********************************************/ void proginfo (void) { /* print a bit of info about the program */ fprintf (stderr, "\ntooLAME version 0.2g (http://toolame.sourceforge.net)\n"); fprintf (stderr, "MPEG Audio Layer II encoder\n\n"); } /************************************************************************ * * parse_args * * PURPOSE: Sets encoding parameters to the specifications of the * command line. Default settings are used for parameters * not specified in the command line. * * SEMANTICS: The command line is parsed according to the following * syntax: * * -m is followed by the mode * -p is followed by the psychoacoustic model number * -s is followed by the sampling rate * -b is followed by the total bitrate, irrespective of the mode * -d is followed by the emphasis flag * -c is followed by the copyright/no_copyright flag * -o is followed by the original/not_original flag * -e is followed by the error_protection on/off flag * -f turns off psy model (fast mode) * -q <i> only calculate psy model every ith frame * -a downmix from stereo to mono * -r turn off padding bits in frames. * -x force byte swapping of input * -g swap the channels on an input file * -t talkativity. how verbose should the program be. 0 = no messages. * * If the input file is in AIFF format, the sampling frequency is read * from the AIFF header. * * The input and output filenames are read into #inpath# and #outpath#. * ************************************************************************/ void parse_args (int argc, char **argv, frame_info * frame, int *psy, unsigned long *num_samples, char inPath[MAX_NAME_SIZE], char outPath[MAX_NAME_SIZE]) { FLOAT srate; int brate; frame_header *header = frame->header; int err = 0, i = 0; long samplerate; /* preset defaults */ inPath[0] = '\0'; outPath[0] = '\0'; header->lay = DFLT_LAY; switch (DFLT_MOD) { case 's': header->mode = MPG_MD_STEREO; header->mode_ext = 0; break; case 'd': header->mode = MPG_MD_DUAL_CHANNEL; header->mode_ext = 0; break; /* in j-stereo mode, no default header->mode_ext was defined,gave error.. now default = 2 added by MFC 14 Dec1999. */ case 'j': header->mode = MPG_MD_JOINT_STEREO; header->mode_ext = 2; break; case 'm': header->mode = MPG_MD_MONO; header->mode_ext = 0; break; default: fprintf (stderr, "%s: Bad mode dflt %c\n", programName,DFLT_MOD); abort (); } *psy = DFLT_PSY; if ((header->sampling_frequency = SmpFrqIndex ((long) (1000 * DFLT_SFQ),&header->version)) < 0) { fprintf (stderr, "%s: bad sfrq default %.2f\n",programName, DFLT_SFQ); abort (); } header->bitrate_index = 14; brate = 0; switch (DFLT_EMP) { case 'n': header->emphasis = 0; break; case '5': header->emphasis = 1; break; case 'c': header->emphasis = 3; break; default: fprintf (stderr, "%s: Bad emph dflt %c\n", programName,DFLT_EMP); abort (); } header->copyright = 0; header->original = 0; header->error_protection = FALSE; header->dab_extension = 0; /* process args */ while (++i < argc && err == 0) { char c, *token, *arg, *nextArg; int argUsed; token = argv[i]; if (*token++ == '-') { if (i + 1 < argc) nextArg = argv[i + 1]; else nextArg = ""; argUsed = 0; if (!*token) { /* The user wants to use stdin and/or stdout. */ if (inPath[0] == '\0') strncpy (inPath, argv[i], MAX_NAME_SIZE); else if (outPath[0] == '\0') strncpy (outPath, argv[i], MAX_NAME_SIZE); } while ((c = *token++)) { if (*token /* NumericQ(token) */ ) arg = token; else arg = nextArg; switch (c) { case 'm': argUsed = 1; if (*arg == 's') { header->mode = MPG_MD_STEREO; header->mode_ext = 0; } else if (*arg == 'd') { header->mode = MPG_MD_DUAL_CHANNEL; header->mode_ext = 0; } else if (*arg == 'j') { header->mode = MPG_MD_JOINT_STEREO; } else if (*arg == 'm') { header->mode = MPG_MD_MONO; header->mode_ext = 0; } else { fprintf (stderr, "%s: -m mode must be s/d/j/m not %s\n", programName, arg); err = 1; } break; case 'p': *psy = atoi (arg); argUsed = 1; break; case 's': argUsed = 1; srate = atof (arg); /* samplerate = rint( 1000.0 * srate ); $A */ samplerate = (long) ((1000.0 * srate) + 0.5); if ((header->sampling_frequency = SmpFrqIndex ((long) samplerate, &header->version))< 0) err = 1; break; case 'b': argUsed = 1; brate = atoi (arg); break; case 'd': argUsed = 1; if (*arg == 'n') header->emphasis = 0; else if (*arg == '5') header->emphasis = 1; else if (*arg == 'c') header->emphasis = 3; else { fprintf (stderr, "%s: -d emp must be n/5/c not %s\n",programName, arg); err = 1; } break; case 'D': argUsed = 1; header->dab_length = atoi (arg); header->error_protection = TRUE; header->dab_extension = 2; glopts.dab = TRUE; break; case 'c': header->copyright = 1; break; case 'o': header->original = 1; break; case 'e': header->error_protection = TRUE; break; case 'f': *psy = 0; /* this switch is deprecated? FIXME get rid of glopts.usepsy instead us psymodel 0, i.e. "-p 0" */ glopts.usepsy = FALSE; break; case 'r': glopts.usepadbit = FALSE; header->padding = 0; break; case 'q': argUsed = 1; glopts.quickmode = TRUE; glopts.usepsy = TRUE; glopts.quickcount = atoi (arg); if (glopts.quickcount == 0) { /* just don't use psy model */ glopts.usepsy = FALSE; glopts.quickcount = FALSE; } break; case 'a': glopts.downmix = TRUE; header->mode = MPG_MD_MONO; header->mode_ext = 0; break; case 'x': glopts.byteswap = TRUE; break; case 'v': argUsed = 1; glopts.vbr = TRUE; glopts.vbrlevel = atof (arg); glopts.usepadbit = FALSE;

/* don't use padding for VBR */ header->padding = 0; /* MFC Feb 2003: in VBR mode, joint stereo doesn't make any sense at the moment, as there are no noisy subbands according to bits_for_nonoise in vbr mode */ header->mode = MPG_MD_STEREO; /* force stereo mode */ header->mode_ext = 0; break; case 'l': argUsed = 1; glopts.athlevel = atof(arg); break; case 'h': usage (); break; case 'g': glopts.channelswap = TRUE; break; case 't': argUsed = 1; glopts.verbosity = atoi (arg); break; default: fprintf (stderr, "%s: unrec option %c\n", programName, c); err = 1; break; } if (argUsed) { if (arg == token) token = "";

/* no more from token */ else ++i; /* skip arg we used */ arg = ""; argUsed = 0; } } } else { if (inPath[0] == '\0') strcpy (inPath, argv[i]); else if (outPath[0] == '\0') strcpy (outPath, argv[i]); else { fprintf (stderr, "%s: excess arg %s\n", programName, argv[i]); err = 1; } } } if (header->dab_extension) { /* in 48 kHz */ /* if the bit rate per channel is less then 56 kbit/s, we have 2scf-crc */ /* else we have 4 scf-crc */ /* in 24 kHz, we have 4 scf-crc, see main loop */ if (brate / (header->mode == MPG_MD_MONO ? 1 : 2) >= 56) header->dab_extension = 4; } if (err || inPath[0] == '\0') usage (); /* If no infile defined, or err has occured, then callusage() */ if (outPath[0] == '\0') { /* replace old extension with new one, 1992-08-19, 1995-06-12 shn*/ new_ext (inPath, DFLT_EXT, outPath); } if (!strcmp (inPath, "-")) { musicin = stdin; /* read from stdin */ *num_samples = MAX_U_32_NUM; } else { if ((musicin = fopen (inPath, "rb")) == NULL) { fprintf (stderr, "Could not find\"%s\".\n", inPath); exit (1); } parse_input_file (musicin, inPath, header, num_samples); } /* check for a valid bitrate */ if (brate == 0) brate = bitrate[header->version][10]; /* Check to see we have a sane value for the bitrate for this version */ if ((header->bitrate_index = BitrateIndex (brate,header->version)) < 0) err = 1; /* All options are hunky dory, open the input audio file and return to the main drag */ open_bit_stream_w (&bs, outPath, BUFFER_SIZE); } void smr_dump(double smr[2][SBLIMIT], int nch) { int ch, sb; fprintf(stdout,"SMR:"); for (ch = 0;ch<nch; ch++) { if (ch==1) fprintf(stdout," "); for (sb=0;sb<SBLIMIT;sb++) fprintf(stdout,"%3.0f ",smr[ch][sb]); fprintf(stdout,"\n"); } }

下面输出比特分配： ch[0].subband[0]: 8 bits ch[0].subband[1]: 8 bits ch[0].subband[2]: 6 bits ch[0].subband[3]: 8 bits ch[0].subband[4]: 7 bits ch[0].subband[5]: 8 bits ch[0].subband[6]: 8 bits ch[0].subband[7]: 6 bits ch[0].subband[8]: 5 bits ch[0].subband[9]: 6 bits ch[0].subband[10]: 6 bits ch[0].subband[11]: 7 bits ch[0].subband[12]: 6 bits ch[0].subband[13]: 6 bits ch[0].subband[14]: 6 bits ch[0].subband[15]: 5 bits ch[0].subband[16]: 5 bits ch[0].subband[17]: 5 bits ch[0].subband[18]: 4 bits ch[0].subband[19]: 6 bits ch[0].subband[20]: 3 bits ch[0].subband[21]: 3 bits ch[0].subband[22]: 0 bits ch[0].subband[23]: 0 bits ch[0].subband[24]: 0 bits ch[0].subband[25]: 0 bits ch[0].subband[26]: 0 bits ch[0].subband[27]: 0 bits ch[0].subband[28]: 0 bits ch[0].subband[29]: 0 bits 下面输出比例因子选择： Ch[0].subband[0] scfsi: 2 Ch[0].subband[1] scfsi: 2 Ch[0].subband[2] scfsi: 3 Ch[0].subband[3] scfsi: 2 Ch[0].subband[4] scfsi: 2 Ch[0].subband[5] scfsi: 0 Ch[0].subband[6] scfsi: 2 Ch[0].subband[7] scfsi: 2 Ch[0].subband[8] scfsi: 3 Ch[0].subband[9] scfsi: 3 Ch[0].subband[10] scfsi: 2 Ch[0].subband[11] scfsi: 1 Ch[0].subband[12] scfsi: 3 Ch[0].subband[13] scfsi: 2 Ch[0].subband[14] scfsi: 0 Ch[0].subband[15] scfsi: 3 Ch[0].subband[16] scfsi: 2 Ch[0].subband[17] scfsi: 2 Ch[0].subband[18] scfsi: 1 Ch[0].subband[19] scfsi: 2 Ch[0].subband[20] scfsi: 1 Ch[0].subband[21] scfsi: 3 Ch[0].subband[22] scfsi: 2 Ch[0].subband[23] scfsi: 3 Ch[0].subband[24] scfsi: 0 Ch[0].subband[25] scfsi: 3 Ch[0].subband[26] scfsi: 3 Ch[0].subband[27] scfsi: 3 Ch[0].subband[28] scfsi: 3 Ch[0].subband[29] scfsi: 3 下面输出比例因子： Ch[0].subband[0] scalar: 11 11

11 Ch[0].subband[1] scalar: 12 12

12 Ch[0].subband[2] scalar: 21 18

18 Ch[0].subband[3] scalar: 25 25

25 Ch[0].subband[4] scalar: 29 29

29 Ch[0].subband[5] scalar: 28 23

26 Ch[0].subband[6] scalar: 22 22

22 Ch[0].subband[7] scalar: 21 21

21 Ch[0].subband[8] scalar: 32 28

28 Ch[0].subband[9] scalar: 34 30

30 Ch[0].subband[10] scalar: 31 31

31 Ch[0].subband[11] scalar: 30 30

26 Ch[0].subband[12] scalar: 27 24

24 Ch[0].subband[13] scalar: 23 23

23 Ch[0].subband[14] scalar: 26 22

25 Ch[0].subband[15] scalar: 30 25

25 Ch[0].subband[16] scalar: 26 26

26 Ch[0].subband[17] scalar: 29 29

29 Ch[0].subband[18] scalar: 31 31

30 Ch[0].subband[19] scalar: 26 26

26 Ch[0].subband[20] scalar: 34 34

31 Ch[0].subband[21] scalar: 34 31

31 Ch[0].subband[22] scalar: 38 38

38 Ch[0].subband[23] scalar: 39 50

50 Ch[0].subband[24] scalar: 43 51

57 Ch[0].subband[25] scalar: 41 54

54 Ch[0].subband[26] scalar: 45 52

52 Ch[0].subband[27] scalar: 42 54

54 Ch[0].subband[28] scalar: 44 52

52 Ch[0].subband[29] scalar: 43 52

转载请注明原文地址: https://www.6miu.com/read-11727.html

技术

最新回复(0)