edu.ucsb.nmsl.autocap
Class InterCoverageCaptionEstimator

java.lang.Object
  extended by edu.ucsb.nmsl.autocap.InterCoverageCaptionEstimator
All Implemented Interfaces:
CaptionEstimator

public class InterCoverageCaptionEstimator
extends java.lang.Object
implements CaptionEstimator

The responsibility of this class is to estimate time-stamps for captions that have none after the Recognition and Alignment phases of the AutoCap process. Estimating untime-stamped captions is important because each caption must have a time-stamp in order to create a usable caption file. The estimates assume that every untime-stamped segment, other than the first and last captions, has recognized segments of the media file before and after it.

The estimation technique used in the InterCoverageCaptionEstimator is more accurate than that of the NaiveCaptionEstimator. This technique uses the local speaking rate rather than the global speaking rate. The assumption here is that the speaker does not speak at a consistent rate throughout the video, so the rate measured around an untime-stamped caption gives a better estimate than the global rate. The inter-coverage technique begins at the first word of an untime-stamped caption and counts the total number of untime-stamped words in the surrounding unrecognized burst, including the first word of the untime-stamped caption. The time-stamp of the caption chunk at each end of the unrecognized burst is recorded, as well as the total number of unrecognized words. The difference between the two recorded time-stamps is divided by the number of unrecognized words to calculate the local speaking rate, expressed as time per word. This local speaking rate is then multiplied by the number of words between the first word of the untime-stamped caption and the nearest recognized word. The result is added to the nearest time-stamp to calculate the estimated time-stamp for the untime-stamped caption. This process is executed for each untime-stamped caption in a given run of AutoCap.
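
The calculation described above can be sketched roughly as follows. This is an illustration only, not the actual AutoCap source: the class InterCoverageSketch, the method estimate, and the representation of word time-stamps as a double array with NaN marking unrecognized words are all assumptions made for the example. The sketch also assumes the offset is measured from the recognized word that precedes the burst.

```java
/** Hypothetical sketch of the inter-coverage estimate; not part of AutoCap. */
final class InterCoverageSketch {

    /**
     * Estimates the time-stamp of the word at index {@code first}.
     *
     * @param wordTimes time-stamp of each word in seconds; NaN marks an
     *                  unrecognized word (an assumed representation)
     * @param first     index of the first word of the untime-stamped caption
     * @return the existing time-stamp if the word was recognized, otherwise
     *         the inter-coverage estimate
     */
    static double estimate(double[] wordTimes, int first) {
        if (!Double.isNaN(wordTimes[first])) {
            return wordTimes[first]; // already time-stamped, nothing to estimate
        }

        // Walk outward to the recognized words at each end of the burst.
        int before = first - 1;
        while (before >= 0 && Double.isNaN(wordTimes[before])) {
            before--;
        }
        int after = first + 1;
        while (after < wordTimes.length && Double.isNaN(wordTimes[after])) {
            after++;
        }
        if (before < 0 || after >= wordTimes.length) {
            throw new IllegalArgumentException(
                "burst must have a recognized word on both sides");
        }

        // Number of unrecognized words in the burst, including wordTimes[first].
        int unrecognized = after - before - 1;

        // Local speaking rate in seconds per word, as the class description states:
        // time-stamp difference divided by the number of unrecognized words.
        double localRate = (wordTimes[after] - wordTimes[before]) / unrecognized;

        // Offset from the preceding recognized word (an assumption of this sketch).
        return wordTimes[before] + localRate * (first - before);
    }
}
```

For example, with time-stamps {10.0, NaN, NaN, NaN, NaN, 20.0} and a caption whose first word is at index 2, the burst holds four unrecognized words, the local rate is 2.5 seconds per word, and the estimate is 15.0 seconds.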

Version:
1.0
Author:
Allan Knight

Field Summary
protected  double EstimatedBeginning
          Estimated beginning time of first caption.
protected  double EstimatedEnd
          Estimated finish time of last caption.
protected  double SpeakingRate
          Global speaking rate of speaker recorded during the recognition phase of the AutoCap process.
 
Constructor Summary
InterCoverageCaptionEstimator(double rate)
          This constructor creates an instance of the InterCoverageCaptionEstimator with the given global speaking rate.
InterCoverageCaptionEstimator(double rate, double estBeg, double estEnd)
          This constructor creates an instance of InterCoverageCaptionEstimator with the given speaking rate and the given estimated start time and end time of the media.
 
Method Summary
protected  void collectStatistics(Transcript t)
          This method is called at the end of the estimation process in order to collect statistics about how well the speech recognition system performed while transcribing the input media.
 Transcript completeTranscriptTimes(Transcript t)
          This method performs the caption estimation for a given transcript.
private  void estimateEnds(Transcript t)
          This method estimates the starting time for the first and last captions if no time-stamp has been determined for either.
static void main(java.lang.String[] args)
          This method is used for testing purposes only, and is not necessary for the normal operation of AutoCap.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SpeakingRate

protected double SpeakingRate
Global speaking rate of speaker recorded during the recognition phase of the AutoCap process.


EstimatedBeginning

protected double EstimatedBeginning
Estimated beginning time of first caption. Used in case the first caption has no time-stamp.


EstimatedEnd

protected double EstimatedEnd
Estimated finish time of last caption. Used in case the last caption has no time-stamp.

Constructor Detail

InterCoverageCaptionEstimator

public InterCoverageCaptionEstimator(double rate)
This constructor creates an instance of the InterCoverageCaptionEstimator with the given global speaking rate. This global speaking rate is only used for recording statistical information and is not needed for normal operation of AutoCap.

Parameters:
rate - The global speaking rate of the speaker throughout the entire video.

InterCoverageCaptionEstimator

public InterCoverageCaptionEstimator(double rate,
                                     double estBeg,
                                     double estEnd)
This constructor creates an instance of InterCoverageCaptionEstimator with the given speaking rate and the given estimated start time and end time of the media. This global speaking rate is only used for recording statistical information and is not needed for normal operation of AutoCap.

Parameters:
rate - The global speaking rate of the speaker throughout the entire video.
estBeg - The estimated start time of speaking within the video.
estEnd - The estimated finish time of speaking within the video.
Method Detail

completeTranscriptTimes

public Transcript completeTranscriptTimes(Transcript t)

This method performs the caption estimation for a given transcript. Each caption of the transcript is inspected: if the first word of the caption has a time-stamp, no estimation is performed. Otherwise, the time-stamp for the caption is estimated as specified in the class documentation.

Specified by:
completeTranscriptTimes in interface CaptionEstimator
Parameters:
t - The Transcript for which the estimation technique will be applied.
Returns:
A Transcript with all captions time-stamped.

estimateEnds

private void estimateEnds(Transcript t)
This method estimates the starting time for the first and last captions if no time-stamp has been determined for either. The ends of the transcript must be estimated before any other captions, because the other estimates may depend on these time-stamps already existing.

Parameters:
t - The Transcript for which captions are being estimated.
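
A minimal sketch of this fallback is shown below, under stated assumptions. The class EstimateEndsSketch, the array of caption start times, and the use of NaN for a missing time-stamp are hypothetical stand-ins for the Transcript type, whose API is not documented here. The sketch simply substitutes the values corresponding to the EstimatedBeginning and EstimatedEnd fields; how the real method derives a starting time for the last caption from the estimated finish time is not specified in this documentation.

```java
/** Hypothetical sketch of the end-caption fallback; not part of AutoCap. */
final class EstimateEndsSketch {

    /**
     * Fills in the first and last caption times when they are missing.
     *
     * @param captionTimes       time-stamp of each caption in seconds; NaN marks
     *                           a caption with no time-stamp (assumed representation)
     * @param estimatedBeginning fallback for the first caption
     * @param estimatedEnd       fallback for the last caption
     */
    static void estimateEnds(double[] captionTimes,
                             double estimatedBeginning,
                             double estimatedEnd) {
        if (captionTimes.length == 0) {
            return; // nothing to estimate
        }
        if (Double.isNaN(captionTimes[0])) {
            captionTimes[0] = estimatedBeginning;
        }
        int last = captionTimes.length - 1;
        if (Double.isNaN(captionTimes[last])) {
            captionTimes[last] = estimatedEnd;
        }
    }
}
```

Once both ends carry a time-stamp, every remaining untime-stamped caption lies between two time-stamped positions, which is the precondition the class description states for the inter-coverage estimate.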

collectStatistics

protected void collectStatistics(Transcript t)
This method is called at the end of the estimation process in order to collect statistics about how well the speech recognition system performed while transcribing the input media. This method is not necessary for the normal operation of AutoCap.

Parameters:
t - The Transcript of the transcribed input media for which statistics are being collected.

main

public static void main(java.lang.String[] args)
This method is used for testing purposes only, and is not necessary for the normal operation of AutoCap.