For delivering multimedia data correctly at the user interface, synchronization is essential. It is not possible to provide an objective measurement of synchronization from the viewpoint of subjective human perception. As human perception varies from person to person, only heuristic criteria can determine whether a stream presentation is correct or not.

Presentation requirements

15.3.1 Lip synchronization requirements

Lip synchronization refers to the temporal relationship between an audio and a video stream for the particular case of humans speaking. The time difference between related audio and video LDUs (logical data units) is known as the skew. Streams that are perfectly in sync have no skew.

Figure 15.17: Left: head view; middle: shoulder view; right: body view.

Figure 15.18 provides an overview of the results. The vertical axis denotes the relative number of test candidates who detected a synchronization error, regardless of whether they could determine if the audio was ahead of or behind the video. The experimenters initially assumed that the three curves for the different views would differ considerably, but Figure 15.18 shows that this is not the case.

Figure 15.18: Detection of synchronization errors for the three different views.

Pointer synchronization requirements

In a Computer-Supported Co-operative Work (CSCW) environment, cameras and microphones are usually attached to the users' workstations. In the next experiment, the experimenters looked at a business report that contained some data with accompanying graphics. All participants had a window with these graphics on their desktop, where a shared pointer was used in the discussion. Using this pointer, speakers pointed out individual elements of the graphics that were relevant to the discussion taking place. This required synchronization of the audio and the remote telepointer.

The experimenters conducted two experiments:

The first was to explain some technical parts of a sailing boat while a pointer located the area under discussion (Figure 15.21, right side). The shorter the explanation, the more crucial the synchronization; therefore, the experimenters selected a fast-speaking person who used fairly short words.

The second was to explain a traveling route on a map (Figure 15.21, left side). This involved continuous movement of the pointer.

From the point of view of human perception, pointer synchronization is very different from lip synchronization: it is much more difficult to detect the "out of sync" error at skew values near the error-free case. While a lip synchronization error is a matter of discussion for skews between 40 ms and 160 ms, for a pointer ...

Elementary media synchronization

Lip synchronization and pointer synchronization were investigated due to inconsistent results from available sources. The following summarizes other synchronization results to give a complete picture of synchronization requirements.

Since the beginning of digital audio, the jitter to be tolerated by dedicated hardware has been studied. Dannenberg provided some references and explanations of these studies. In Ble78, the maximum allowable jitter for 16-bit quality audio in a sample period is 200 ps, which is the error equivalent to the magnitude of the LSB (Least-Significant Bit) of a full-level maximum-frequency 20 kHz signal. In Sto72, perception experiments recommended an allowable jitter in an audio sample period between 5 and 10 ns. Further perception experiments were carried out by Lic51 and Woo51; the maximum spacing of short clicks to obtain fusion into one continuous tone was given as 2 ms (as cited by RM80).

For the correlation between a noisy event and its visual representation (for example, a collision of two cars), which is the most challenging case, the same constraint as for lip synchronization applies, i.e., 80 ms. Two audio tracks can be coupled either tightly or loosely, and the effect of combining them depends strongly on their content.

The combination of audio and animation is usually not as stringent as lip synchronization. A multimedia course on dancing, for example, could show the dancing steps as animated sequences with accompanying music. By making use of the interactive capabilities, individual sequences can be viewed over and over again. In this particular example, the synchronization between music and animation is particularly important. Experience showed that a skew of +/-80 ms fulfills the user demands despite some possible jitte