2003 - Present
Florida Institute of Technology, Electrical and Computer Engineering – Associate Professor:
- Developed a web portal with my graduate students for TRDA of Melbourne in collaboration with Nterspec.
- PPT Commander – A Voice-Only Activated PowerPoint Presentation Application
- Ported PPT Commander to Apple Mac OS
- Developed a Voice Activated Elevator Simulator.
- Developed a Nursing Call Station Voice-Only Activated interface for patients. Researching ways to extend its capability so that the patient can control the bed, TV and other devices connected to the system.
- Developed a Voice Activated Car Inspection System prototype for BMW.
- Designed and Developed a High Speed Currency Bill Reader system using Embedded Hardware.
- Organized and Hosted NIST Rich Transcription Evaluation Workshop.
- Hosted and Participated in International "NIST Rich Transcription Evaluation" 2009
- First Place in the First Annual Analog Devices & University of Massachusetts DSP Contest 2005 (Brian Ramos and Don McMann).
- Third Place in IEEE SouthEastCon 2007 Student Hardware Competition: Basketball Robot among 38 Universities (Ronald Ramdhan, Xerxes Beharry & Sean Powers). http://www.southeastcon.org/2007/students/. The robot is displayed in the Dean's Conference Room.
- Best Paper Nomination: "2006-472: A MATLAB TOOL FOR SPEECH PROCESSING, ANALYSIS AND RECOGNITION: SAR-LAB", ASEE 2006 (undergraduate co-authors Rogers N., Patel M.).
- Visual Audio (Brandon Schmitt).
- "Smart Room" Senior Design 2008 (Matt Hopkins, David Herndon, Patrick Marinelli).
- FaST Program: Calculate Potential Energy Savings from Using Mobile Smart Technologies, 2011. http://science.energy.gov/wdts/fast/project-descriptions/2011-projects/epa-calculate-potential-energy-savings-from-using-mobile-smart-technologies/
- Kerry Bruce Clark Teacher, 2009
- UML-ADI Assistive Device Competition, June 2005 Lowell MA, First Place.
Developed and Ported a Wiener-Based Noise Removal Algorithm to the Analog Devices ADSP-21161 DSP.
- Notable Presentations:
CS Dept. Curriculum Series Presentation, 2005:
“Wake-Up-Word Speech Recognition: A Missing Link toward Natural Language Understanding”.
- NSF Proposals – PI: Wrote over 20 NSF proposals.
- NSF Proposals – Co-PI: Participated in over 15 NSF proposals.
2001 - 2003
Speech Recognition Scientist - ThinkEngine Networks, Inc., 175 Maple Street, Marlborough, MA 01745. USA.
- Invented, Designed and Developed a unique solution to “Wake-Up-Word” or “OnWord™” Spotting Technology. Wake-Up-Word Spotting entails recognition of a specific word/phrase uttered in isolation or in the context of continuous speech. Currently this technology is not as widely used as other Speech Recognition Applications/Tasks because of the poor performance of the Speech Recognition Systems that offer it, whether commercially (Nuance, SpeechWorks, Philips, Conversay, ART, etc.) or as research tools from primarily research and development institutions (Byblos (BBN), Sphinx (CMU), HTK Speech Recognition Tool Kit (Cambridge University, Entropic and Microsoft), etc.). Furthermore, all those systems require computers with powerful CPUs (~1.5 GHz Pentiums) and large memory (512 Mbytes RAM), with the Speech Recognition process itself requiring tens to hundreds of Mbytes for this feature alone to even run in real time. An additional advantage of the developed system is that it is also designed to run on a Fixed Point DSP, requiring less than 36.2 Kbytes of program memory space and 2 Kbytes of Model space, and consuming less than 2 Million Cycles per Second on a TI C62xx.
- Inventor of 3 Patented Solutions – Patent Pending:
  - Voice Activity Detection Based on Cepstral Features.
  - Dynamic Time Warping (DTW) Matching using Reverse Ordered Feature Vectors.
  - Rescoring using Distribution Distortion Measurements of Dynamic Time Warping Match.
- Working on Generalized scoring using Reversed and Normal Ordered Features for any Pattern Matching Method (e.g., DTW, HMM) to be filed for patent.
- Designed and Assisted in Development of a Voice Data Collection System, necessary for research, development, testing and evaluation of the Wake-Up-Word Recognition System.
- Performed and Managed 2 data collections over various calling environments (noisy, quiet, public, car, etc.) using various calling devices (cellular, landline, speaker phone). Created 2 Corpora from the recorded data. Those Corpora are used for:
  - Research, Development, and Refinement of Wake-Up-Word Recognition System,
  - Testing and Evaluation of the System, and
  - Building Models of a particular Wake-Up-Word.
- Transcribed and/or Supervised the transcription of recorded data. Set up conventions and standards so that all tools developed to use the data of the created Corpora comply with a clear set of standards.
- Converted other (CallHome and PhoneBook) Corpora to this set of standards for easy and consistent use.
- Directed and Supervised Code Conversion and Porting from Floating Point to Fixed Point of Wake-Up-Word Spotting Technology.
- Developed an Automated Process using a combination of Perl scripts and Perl configuration files that control the various parameters affecting each step of the complex process of:
  - Generating Features from a Voice Data Corpus,
  - Building a Model of a Wake-Up-Word (e.g., “Operator”, “Help”, “MapQuest”, “Verizon”, etc.) from the features,
  - Using built Model to test and evaluate Wake-Up-Word Recognition System, and
  - Generating Performance Plots, Charts and Graphs.
Those scripts use numerous executables, gnuplot (a graph plotting tool), as well as other Perl scripts. The end result of this process is the automatic generation of a number of plots, charts, and graphs that depict the performance of the system for easy evaluation and comparison.
- Trained and Supervised a DSP engineer to port, test and evaluate Wake-Up-Word Technology.
- Worked with Application Developers to integrate Wake-Up-Word Spotting Technology into a viable Demo and potentially viable product.
- Wrote Technical Document and Manual for this Technology.
- Advised the CTO in the decision-making process regarding Speech Recognition, Text to Speech, as well as Wake-Up-Word Spotting Technologies.
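The "Dynamic Time Warping (DTW) Matching using Reverse Ordered Feature Vectors" item above can be illustrated with a minimal, textbook DTW sketch. This is not the patented method, whose details are not given in this document: the function names, the Euclidean frame distance, and the step pattern are all assumptions chosen for illustration.

```python
import math

def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of equal-dimension feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance (assumed)
            cost[i][j] = d + min(cost[i - 1][j],      # stay on b[j-1], advance a
                                 cost[i][j - 1],      # stay on a[i-1], advance b
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def reversed_dtw_distance(a, b):
    """Same match with both sequences in reverse frame order (illustrative only)."""
    return dtw_distance(a[::-1], b[::-1])

# Hypothetical one-dimensional "features": the query stretches the template in time.
template = [(0.0,), (1.0,), (2.0,)]
query = [(0.0,), (1.0,), (1.0,), (2.0,)]
forward = dtw_distance(template, query)
backward = reversed_dtw_distance(template, query)
```

With fixed endpoints and this symmetric step pattern, the reversed score equals the forward one. The two can differ only under relaxed endpoint constraints, which this sketch does not model.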
1999 – 2001
Speech Recognition Scientist – SpeechWorks International, Inc., Product Group, 695 Atlantic Ave., Boston, MA 02111. USA.
- Developed a Noise Compensation Algorithm to increase recognition robustness against varying Noise and Channel characteristics.
- Conducted Study of Wireless/Cellular vs. Wireline/Landline signal differences and their effect on recognition performance.
- Developed Nonlinear Front End Signal Processing.
- Performed Comparative Studies of various Speech Recognition Technologies (e.g., AT&T, NUANCE, SPEECHWORKS recognizers).
- Developed algorithms to investigate various features (confidence score, acoustic score, etc.) and their optimal use for combining N-best lists produced by different features (MFCC, LPC, etc.) and different recognizers (segmental, HMM, Watson). The combining algorithm achieved significant error reduction compared to the best individual recognizer.
- Developed diphone clustering for HMM models to minimize model size.
- Involved in re-alignment of acoustic segments for Text to Speech (TTS) model building data. Developed a framework for modular expansion and refinement of the re-alignment process using Perl scripts combined with Perl configuration files. Implemented various heuristic rules to improve alignments generated by the Speech Recognizer to better fit TTS.
- Developed a data collection program for the Dialogic JCT board that supports CSP. Developed, Ran, Digested, and Processed the “Call Environment Data Collection” using this application.
1997 - 1999
Scientist - GTE, BBN Technologies, Speech Solutions Group, 70 Fawcett St., Cambridge, MA 02138. USA.
- Compiled and Analyzed BYBLOS (research speech recognition technology) and BBN HARK (commercial technology) system differences; Analyzed possible BYBLOS technologies for porting into BBN HARK; Developed and Coded a Voice Model Filter that loads BYBLOS and/or BBN HARK training files and converts them into new-format files in compliance with the designed specifications; Ran various tests (BYBLOS and BBN HARK) for Continuous Densities BBN HARK for benchmarking.
- Peer reviewed a paper for Speech Communication Journal.
1993 - 1997
Speech Scientist – Voice Processing Corporation/Voice Control Systems, Advanced Technology Development Group, One Main Street, MA 02142. USA.
- Enhanced the performance of existing Front End of Speech Recognition System, implemented in VPro line of products, by designing a non-linear smoothing algorithm based on median filtering.
- Developed and Implemented Dynamic Features that augmented existing Front End Features.
- Developed a universal preprocessing module of the Front End that enables run-time front-end configuration, decompression, and sample-rate transformations of the original wave file.
- Performed numerous tests that provided critical insights into enhancement and debugging of VProFlex Technology.
- Invented, Developed and Integrated a very efficient novel Code Book Search strategy (internally named Fickle Search).
- Compiled a condensed Internal Report of the Literature Review Study on different ways to compute fast FFTs of a real-valued sequence.
- Developed, Tested, and Integrated a Split Radix FFT algorithm. The function handles real-valued FFTs of any power-of-2 size.
- Modified Front End to take advantage of higher FFT size and increased frequency resolution:
  - Analyzed the conflicting effect of window size and type (higher frequency resolution causing breakdown of enhancement due to harmonics),
  - Analyzed several possible modifications of the enhancement algorithm to accommodate higher frequency resolution, and
  - Proposed elimination of pitch harmonics from the spectrum with Homomorphic filtering or LPC-based Spectrum.
- Implemented LPC-based spectrum, integrating it with the existing Spectral Enhancement module of the Front End.
- Initiated the study toward enhanced composition of boundary and internal acoustic phonemic features.
- Invented, Developed, Ported, and extensively Tested a novel Noise Compensation with Speech Enhancement Algorithm. Also invented several integration strategies that take further advantage of the algorithm through better interaction of the Front End with the API and that take advantage of calibration when feasible. The default mode of operation is fully unsupervised and runs in real time.
- Developed an ANN software tool currently supporting five different types of feed-forward back-propagation learning.
- Developed a Pitch Tracking Algorithm based on an enhanced Super-Resolution Pitch Determination Algorithm.
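The non-linear smoothing based on median filtering mentioned in the first item of this list can be illustrated with a generic running-median filter. This is a textbook sketch, not the VPro algorithm: the function name, window width, and edge handling are assumptions made for illustration.

```python
from statistics import median

def median_smooth(track, width=3):
    """Running median over an odd-width window centered on each sample.

    At the edges the window is truncated to the samples available, so an
    even-length edge window averages its two middle values.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    smoothed = []
    for i in range(len(track)):
        window = track[max(0, i - half): i + half + 1]
        smoothed.append(median(window))
    return smoothed

# Hypothetical feature track with one impulsive outlier (the 9).
spiky = [1, 1, 9, 1, 1]
cleaned = median_smooth(spiky)

# A step edge passes through unchanged, unlike with a moving average.
step = median_smooth([0, 0, 0, 5, 5, 5])
```

A median window removes isolated spikes while preserving sharp transitions, which is the usual reason to prefer it over linear smoothing for feature tracks.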
1990 – 1993
Post-Doctoral Research Associate - Swiss Federal Institute of Technology, IGP, ETH-Hönggerberg, CH-8093 Zürich, Switzerland.
- Swiss National Science Foundation Research Project in Image Understanding - Design and Analysis of Spatial Image Sequences
1985 – 1990
Teaching Assistant – Electrical and Computer Engineering Department, Clemson University.
- Digital Processing of the Speech Signals, Digital Systems, Digital Circuit Design and Microprocessor Applications, Electronics, Programming.
1987 - 1990
Consultant - Engineering Research and Computer Services Department, Clemson University, Electrical and Computer Engineering Department, Clemson, SC 29634-0915. USA.
- Design and Development of a database system for processing of the expenditures of the College of Engineering, Clemson University.
- Design and Development of a database system prototype for automation of:
  - Management of the repair and maintenance orders,
  - Task allocation and duty assignment,
  - Time-table management of the assigned personnel, and
  - Generation of relevant statistical data.