java.lang.Object
    edu.ucsb.nmsl.autocap.NaiveCaptionEstimator
public class NaiveCaptionEstimator
The responsibility of this class is to estimate time-stamps for captions that have none after the Recognition and Alignment phases of the AutoCap process. The estimation of un-time-stamped captions is important because each caption must have a time-stamp in order to create a usable caption file. These estimations are based on the assumption that there are recognized segments of the media file before and after each segment, except for the first and last captions.
The estimation technique used in the NaiveCaptionEstimator is very simple. It uses the average speaking rate of the speaker over the entire video, on the assumption that the speaker speaks at a consistent rate throughout and that this metric therefore yields accurate estimates for un-time-stamped captions. The naive technique begins at the first word of an un-time-stamped caption and counts the number of words to the nearest time-stamped caption chunk, searching forward and backward simultaneously from the starting word until a time-stamped caption chunk is found. The time-stamp of the nearest recognized word is recorded along with its distance, in words, from the start word. The distance is multiplied by the global speaking rate and added to or subtracted from the recorded time-stamp to produce an estimated time-stamp, which is then recorded as the time-stamp for the un-time-stamped caption. This process is executed for each un-time-stamped caption in a given run of AutoCap.
As its name implies, this technique is quite naive and fairly inaccurate. The NaiveCaptionEstimator is included only for research purposes: another technique, the inter-coverage caption estimation technique, performs much better and produces much more accurate time-stamps.
See Also:
    InterCoverageCaptionEstimator
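The search-and-extrapolate step described above can be sketched in a few lines. AutoCap's actual Transcript and Caption types are not shown on this page, so this sketch uses a simplified stand-in: an array of per-word start times in seconds, with a negative value marking an unrecognized word, and a hypothetical rate expressed as seconds per word.

```java
public class NaiveEstimateSketch {

    /**
     * Estimate a time-stamp for the word at index start by scanning
     * outward in both directions for the nearest time-stamped word,
     * then extrapolating with the global speaking rate
     * (here assumed to be in seconds per word).
     */
    static double estimate(double[] wordTimes, int start, double secondsPerWord) {
        for (int d = 1; d < wordTimes.length; d++) {
            int back = start - d;
            int fwd = start + d;
            if (back >= 0 && wordTimes[back] >= 0) {
                // Nearest stamp is d words earlier: add d * rate.
                return wordTimes[back] + d * secondsPerWord;
            }
            if (fwd < wordTimes.length && wordTimes[fwd] >= 0) {
                // Nearest stamp is d words later: subtract d * rate.
                return wordTimes[fwd] - d * secondsPerWord;
            }
        }
        return -1; // no time-stamped word anywhere in the transcript
    }

    public static void main(String[] args) {
        // Words 0-1 recognized at 0.0s and 0.5s; words 2-3 unrecognized;
        // word 4 recognized at 2.0s.
        double[] times = {0.0, 0.5, -1, -1, 2.0};
        double rate = 0.5; // assumed global rate: 0.5 seconds per word
        System.out.println(estimate(times, 2, rate)); // prints 1.0
    }
}
```

For word 2 the nearest stamp is word 1 at 0.5s, one word away, so the estimate is 0.5 + 1 × 0.5 = 1.0s. This illustrates why the technique degrades when the speaker's rate varies: the extrapolation is purely linear in word distance.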
Field Summary

protected java.util.Vector covered
    Vector that holds all the captions that have at least some of their words recognized by the speech recognition phase, but not necessarily the first word.

protected double SpeakingRate
    Global speaking rate of the speaker, recorded during the recognition phase of the AutoCap process.
Constructor Summary

NaiveCaptionEstimator(double r)
    This constructor creates an instance of the NaiveCaptionEstimator with the given speaking rate.
Method Summary

protected void collectStatistics(Transcript t)
    This method is called at the end of the estimation process to collect statistics about how well the speech recognition system performed while transcribing the input media.

Transcript completeTranscriptTimes(Transcript t)
    This method performs the caption estimation for a given transcript.
Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
protected double SpeakingRate
protected java.util.Vector covered
Constructor Detail

public NaiveCaptionEstimator(double r)
    Parameters:
        r - The global speaking rate of the speaker throughout the video.

Method Detail
public Transcript completeTranscriptTimes(Transcript t)
This method performs the caption estimation for a given transcript. Each caption of the transcript is investigated; if the first word of the caption has a time-stamp, then no estimation is performed. Otherwise, the time-stamp for the caption is estimated as specified in the class documentation.
    Specified by:
        completeTranscriptTimes in interface CaptionEstimator
    Parameters:
        t - The Transcript for which the estimation technique will be applied.
protected void collectStatistics(Transcript t)
    Parameters:
        t - The Transcript of the transcribed input media for which statistics are being collected.