         Flite: a small run-time speech synthesis engine
                      version 1.2-release
         Copyright Carnegie Mellon University 1999-2003
                      All rights reserved
                      http://cmuflite.org

Flite is a small, fast run-time speech synthesis engine.  It is the
latest addition to the suite of free software synthesis tools
including University of Edinburgh's Festival Speech Synthesis System
and Carnegie Mellon University's FestVox project, tools, scripts and
documentation for building synthetic voices.  However, flite itself
does not require either of these systems to compile and run.

The core Flite library was developed by Alan W Black (mostly in his
so-called spare time) while employed in the Language Technologies
Institute at Carnegie Mellon University.  The name "flite", originally
chosen to mean "festival-lite", is perhaps doubly appropriate as a
substantial part of the design and coding was done above 30,000ft
while awb was travelling.

The voices, lexicon and language components of flite, both their
compression techniques and their actual contents, were developed by
Kevin A. Lenzo and Alan W Black.

Flite is the answer to the complaint that Festival is too big, too
slow, and not portable enough.

o Flite is designed for very small devices, such as PDAs, and also
  for large server machines with lots of ports.

o Flite is not a replacement for Festival but an alternative run time
  engine for voices developed in the FestVox framework where size and
  speed are crucial.

o Flite is written entirely in ANSI C; it contains no C++ or Scheme,
  and thus requires more care in programming and is harder to
  customize at run time.

o It is thread safe.

o Voices, lexicons and language descriptions can be compiled
  (mostly automatically) into C representations from their FestVox
  formats.

o All voices, lexicons and language model data are const and in the
  text segment (i.e. they may be put in ROM).  As they are linked in
  at compile time, there is virtually no startup delay.
o Although the synthesized output is not exactly the same as that of
  the same voice in Festival, the two are effectively equivalent.
  That is, flite doesn't sound better or worse than the equivalent
  voice in festival, just faster, smaller and scalable.

o For standard diphone voices, maximum run time memory requirements
  are less than approximately twice the memory required for the
  generated waveform.  For 32bit architectures this effectively means
  under 1M.  (Later versions will include a streaming option which
  will reduce this to less than one quarter.)

o The flite program supports synthesis of individual strings or files
  (utterance by utterance) to direct audio devices or to waveform
  files.

o The flite library offers simple functions suitable for use in
  specific applications.

Download:

The flite distribution is available from
   http://cmuflite.org/

See the README inside the distribution for more details.  If you
don't know how to download and unpack it, you will find compiling and
running this beta version much harder.

Flite has been released to interested parties over the last six
months, from which we have received a lot of good feedback.

This release includes
   an 8kHz diphone voice
   a 16kHz diphone voice (designed for the ipaq)
   a limited domain talking clock

New in 1.2-release

o A build process for diphone and clunit/ldom voices.
  FestVox voices can be converted (sometimes) automatically.
o Various bug fixes.
o Initial support for Mac OS X (not talking to the audio device yet,
  but it compiles and runs).
o Files can be converted to a single audio file.
o Optional shared library support (Linux).

This has been tested under many Unix systems and most versions of gcc,
as well as Sun CC and Visual C++ (for WinCE).

This version will compile for the Compaq Ipaq under Linux (just set
the compiler in config/config).  There is beginning support for WinCE
but it is not yet complete in this version.  Windows support is still
beta but is now officially supported.
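
As a rough illustration of the memory bound quoted in the feature list
above, the following C sketch estimates the size of a generated
waveform and the "less than twice" ceiling on run time memory.  It
assumes 16-bit mono samples, which this announcement does not state,
and the function names are purely illustrative; they are not part of
the flite library.

```c
#include <stddef.h>

/* Bytes needed to hold a mono PCM waveform.
   ASSUMPTION: 16-bit (2-byte) samples; the announcement does not
   specify the sample width.  Illustrative helper, not a flite call. */
size_t waveform_bytes(unsigned long sample_rate_hz,
                      unsigned long duration_ms)
{
    const unsigned long bytes_per_sample = 2;
    return (size_t)(sample_rate_hz * duration_ms / 1000UL
                    * bytes_per_sample);
}

/* Ceiling implied by "less than approximately twice the memory
   required for the generated waveform".  Illustrative helper only. */
size_t peak_memory_upper_bound(unsigned long sample_rate_hz,
                               unsigned long duration_ms)
{
    return 2 * waveform_bytes(sample_rate_hz, duration_ms);
}
```

Under these assumptions, a hypothetical 10 second utterance from the
16kHz voice needs 320,000 bytes for the waveform, so the ceiling is
640,000 bytes, which is consistent with the "under 1M" figure.  Longer
utterances or a different sample width would change the numbers.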

A mailing list for discussion has been set up at
   flite-beta@cmuflite.org
To join it, send mail to
   majordomo@cmuflite.org
with the following line in the body of your mail
   subscribe flite-beta

Alternatively, send your mail directly to Alan W Black (awb@cs.cmu.edu)
or Kevin A. Lenzo (lenzo@cs.cmu.edu).

Alan and Kevin

Alan W Black                      email: awb@cs.cmu.edu
Language Technologies Institute   http://www.cs.cmu.edu/~awb/
Kevin A. Lenzo                    email: lenzo@cs.cmu.edu
ISRI                              http://www.cs.cmu.edu/~lenzo/
Carnegie Mellon University        tel: +1-412-268-6299
5000 Forbes Ave, Pittsburgh PA, 15213, USA.
                                  fax: +1-412-268-6298