patents.google.com

CN112885331A - Dialect voice dialing communication equipment based on small vocabulary voice recognition technology - Google Patents

Tue Jun 01 2021

Dialect voice dialing communication equipment based on small vocabulary voice recognition technology

Info

Publication number

CN112885331A

Authority

China

Prior art keywords

voice

feature

module

feature extraction

recognition technology

Prior art date

2021-03-09

Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)

Pending

Application number

CN202110253408.XA

Other languages

Chinese (zh)

Inventor

茅伟龙

朱永明

Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)

Wuxi Keyida Electric Control Co ltd

Original Assignee

Wuxi Keyida Electric Control Co ltd

Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)

2021-03-09

Filing date

2021-03-09

Publication date

2021-06-01

2021-03-09 Application filed by Wuxi Keyida Electric Control Co ltd

2021-03-09 Priority to CN202110253408.XA

2021-06-01 Publication of CN112885331A

Status: Pending

Classifications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/005—Language recognition
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum

Landscapes

Engineering & Computer Science (AREA)
Computational Linguistics (AREA)
Health & Medical Sciences (AREA)
Audiology, Speech & Language Pathology (AREA)
Human Computer Interaction (AREA)
Physics & Mathematics (AREA)
Acoustics & Sound (AREA)
Multimedia (AREA)
Signal Processing (AREA)
Computer Vision & Pattern Recognition (AREA)
Telephonic Communication Services (AREA)
Telephone Function (AREA)

Abstract

The invention discloses dialect voice dialing communication equipment based on small vocabulary voice recognition technology, comprising a voice recording module, a voice feature extraction module, a feature storage database, a feature matching module and an operation execution module. The output end of the voice recording module is connected to the input end of the voice feature extraction module; the output end of the voice feature extraction module is connected to the input end of the feature storage database; the voice feature extraction module and the feature storage database are each connected to the feature matching module; and the output end of the feature matching module is connected to the input end of the operation execution module. The invention adopts a one-database-per-person approach: the user records speech for the high-frequency operations they are likely to use, and a voice feature database corresponding to those operations is built. When the mobile phone is operated by voice, the features of the spoken command are compared directly with the features recorded in the database, which improves the accuracy of voice operation recognition and makes the mobile phone easier to use.

Description

Dialect voice dialing communication equipment based on small vocabulary voice recognition technology

Technical Field

The invention relates to the technical field of voice recognition, in particular to dialect voice dialing communication equipment based on a small vocabulary voice recognition technology.

Background

With the rapid development of mobile internet technology, mobile payment, food delivery ordering, online shopping and the like have become indispensable parts of daily life, and smartphones have become necessities. However, in some underdeveloped areas, many elderly people with limited education or visual impairment cannot use mobile phones normally, so the mobile internet brings them inconvenience rather than convenience.

Traditional voice recognition technology converts speech into text and then performs semantic analysis on the text to obtain the instruction issued by the user. For the user group described above, this approach has several problems: 1) existing voice recognition technology can only recognize Mandarin or dialects with a large number of speakers, and cannot recognize dialects spoken by small populations; 2) the recognition functions are too numerous and varied, which does not reduce the learning cost for these users; 3) the accuracy is not high enough, and the semantics are easily misrecognized.

Moreover, because these users' Mandarin proficiency, education level and learning ability are limited, they can hardly learn the complex operations of a mobile phone. Traditional voice recognition technology therefore brings them no convenience and instead makes the phone harder for elderly people to use.

Disclosure of Invention

The technical problem to be solved by the invention is to provide dialect voice dialing communication equipment based on a small vocabulary voice recognition technology, so that the accuracy of voice operation recognition is improved, and the use difficulty is reduced.

In order to solve the technical problems, the technical scheme adopted by the invention is as follows.

The dialect voice dialing communication equipment based on small vocabulary voice recognition technology comprises a voice recording module for entering, on a mobile phone, the data related to the operation method; a voice feature extraction module for converting the voice data into voice features; a feature storage database for storing the voice features and their corresponding operations; a feature matching module for comparing the voice features recognized in the current session, one by one, with the features in the feature storage database; and an operation execution module for controlling the mobile phone to execute the voice operation. The output end of the voice recording module is connected to the input end of the voice feature extraction module; the output end of the voice feature extraction module is connected to the input end of the feature storage database; the voice feature extraction module and the feature storage database are each connected to the feature matching module; and the output end of the feature matching module is connected to the input end of the operation execution module.

In the dialect voice dialing communication equipment based on the small vocabulary voice recognition technology, the voice recording module is used by the mobile phone user to personally read aloud, one by one, the voice data for a limited set of operations and record each into the mobile phone, and a unique ID is assigned to each recording.

In the dialect voice dialing communication equipment based on the small vocabulary voice recognition technology, the voice feature extraction module adopts the MFCC feature extraction technology for extraction.

In the dialect voice dialing communication equipment based on the small vocabulary voice recognition technology, the MFCC feature extraction steps comprise preprocessing, fast Fourier transform, Mel filter bank, logarithm operation, discrete cosine transform and dynamic feature extraction.

In the dialect voice dialing communication equipment based on the small vocabulary voice recognition technology, the extracted features and the corresponding operation voice ID and operation ID are stored in the feature storage database.

In the dialect voice dialing communication equipment based on the small vocabulary voice recognition technology, the feature storage database stores the voice features locally on the mobile phone or uploads them to a server.

In the dialect voice dialing communication equipment based on the small vocabulary voice recognition technology, if the similarity of the best-matching feature found by the feature matching module is within an acceptable threshold, the operation execution module executes the corresponding operation; if the similarity is below the acceptable threshold, the user is reminded to re-enter the command, or to select the correct operation so that the feature database can be updated.

Due to the adoption of the technical scheme, the technical progress of the invention is as follows.

The invention adopts a one-database-per-person approach: the user records speech for the high-frequency operations they are likely to use, and a voice feature database corresponding to those operations is built. When the mobile phone is operated by voice, the features of the spoken command are compared directly with the features recorded in the database, which improves the accuracy of voice operation recognition and makes the mobile phone easier to use.

Drawings

FIG. 1 is a block diagram of a process for creating an operating speech feature library according to the present invention;

FIG. 2 is a block diagram of a process for extracting operational speech according to the present invention;

FIG. 3 is a block diagram of the process of matching the target operation in the present invention.

Detailed Description

The invention will be described in further detail below with reference to the figures and specific examples.

The dialect voice dialing communication equipment based on small vocabulary voice recognition technology comprises a voice recording module, a voice feature extraction module, a feature storage database, a feature matching module and an operation execution module. The voice recording module is used to enter, on the mobile phone, the data related to the operation method; the voice feature extraction module converts the voice data into voice features; the feature storage database stores the voice features and their corresponding operations; the feature matching module compares the voice features recognized in the current session, one by one, with the features in the feature storage database; and the operation execution module executes the voice operation. The output end of the voice recording module is connected to the input end of the voice feature extraction module; the output end of the voice feature extraction module is connected to the input end of the feature storage database; the voice feature extraction module and the feature storage database are each connected to the feature matching module; and the output end of the feature matching module is connected to the input end of the operation execution module.

The voice recording module records the voice data, the voice feature extraction module preprocesses the recorded data and converts it into voice features, and the feature storage database stores those features, thereby building the operation voice feature library. The flow is shown in FIG. 1. Specifically, the mobile phone user personally reads aloud, one by one, the voice data for the limited set of operations that may be needed and records each into the mobile phone, and a unique ID is assigned to each recording for ease of reference. The blank segments at the beginning and end of each recording are deleted so that only the effective data is retained, and noise reduction is applied to the operation voice data to facilitate subsequent recognition. The voice format is generally a WMA series standard format; for the WMA standard, refer to the IEC61939 standard WMA specification series documents.
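
The trimming of leading and trailing blank segments described above could be sketched as follows. This is a minimal illustration, not the patent's method: it assumes the recording is already decoded to a list of float samples, and the function name `trim_silence`, the frame length and the energy threshold are hypothetical choices. Noise reduction is not covered.

```python
import math

def trim_silence(samples, frame_len=400, threshold_ratio=0.1):
    """Drop leading/trailing frames whose RMS energy is far below the loudest frame.

    Assumption: frame_len and threshold_ratio are illustrative values only.
    Samples after the last whole frame are discarded.
    """
    n_frames = len(samples) // frame_len
    if n_frames == 0:
        return list(samples)
    # RMS energy of each non-overlapping frame
    rms = []
    for i in range(n_frames):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        rms.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    cutoff = threshold_ratio * max(rms)
    active = [i for i, r in enumerate(rms) if r > cutoff]
    if not active:
        return []  # the whole recording is silent
    # Keep everything from the first active frame to the last active frame
    return list(samples[active[0] * frame_len:(active[-1] + 1) * frame_len])
```

A simple energy gate like this keeps pauses inside the utterance, since only the outermost quiet frames are removed.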

After preprocessing, the original voice segment has been reduced to an effective segment, which is stored in a to-be-processed area of the computer system. This greatly reduces the amount of data that needs to be processed.

The voice feature extraction module converts voice data into voice features using the MFCC (Mel-frequency cepstral coefficient) feature extraction technique. The flow is shown in FIG. 2 and comprises preprocessing, fast Fourier transform, Mel filter bank, logarithm operation, discrete cosine transform, dynamic feature extraction and other steps.
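
The MFCC steps listed above can be illustrated with a small standard-library sketch. The patent does not give parameters, so everything numeric here is an assumption: pre-emphasis of 0.97, a Hamming window, 512-sample frames, 26 filters, 13 coefficients, and first-order differences as the "dynamic features". The function names are hypothetical.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale over bins 0..n_fft/2."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def mfcc(signal, sample_rate=16000, frame_len=512, hop=256, n_filters=26, n_ceps=13):
    """Return one list of n_ceps coefficients per frame (signal must be non-empty)."""
    # Preprocessing: pre-emphasis, then framing with a Hamming window
    emph = [signal[0]] + [signal[i] - 0.97 * signal[i - 1] for i in range(1, len(signal))]
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1)) for i in range(frame_len)]
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    feats = []
    for start in range(0, len(emph) - frame_len + 1, hop):
        frame = [emph[start + i] * window[i] for i in range(frame_len)]
        # Fast Fourier transform -> power spectrum of the positive frequencies
        power = [abs(c) ** 2 / frame_len for c in fft(frame)[: frame_len // 2 + 1]]
        # Mel filter bank followed by the logarithm (floored to avoid log(0))
        log_e = [math.log(max(sum(w * p for w, p in zip(row, power)), 1e-10)) for row in bank]
        # Discrete cosine transform (DCT-II), keeping the first n_ceps coefficients
        feats.append([
            sum(log_e[m] * math.cos(math.pi * c * (m + 0.5) / n_filters) for m in range(n_filters))
            for c in range(n_ceps)
        ])
    return feats

def add_deltas(feats):
    """Dynamic features: append first-order differences between neighbouring frames."""
    out = []
    for t, cur in enumerate(feats):
        prev, nxt = feats[max(t - 1, 0)], feats[min(t + 1, len(feats) - 1)]
        out.append(cur + [(b - a) / 2 for a, b in zip(prev, nxt)])
    return out
```

With these defaults, `add_deltas(mfcc(samples))` yields one 26-dimensional vector per frame. A production system would more likely rely on an optimized signal-processing library; the pure-Python version is only meant to make each step visible.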

The extracted MFCC features are stored by the feature storage database. The storage location depends on the computing power of the mobile phone. If the phone's computing power is sufficient to perform feature extraction, feature comparison and similar operations within a time that does not affect the user experience, the features are stored locally to save server resources. If the phone's local computing power is insufficient, the features are uploaded to a server and the computation is performed there.
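
One way to picture the local-versus-server decision is to time the on-device pipeline against a latency budget. The sketch below is an assumption-laden illustration: the patent names no budget, so `latency_budget_s`, the number of trials and the function name `choose_storage` are all hypothetical.

```python
import time

def choose_storage(run_pipeline, latency_budget_s=0.5, trials=3, clock=time.perf_counter):
    """Return "local" if the slowest timed run fits the budget, else "server".

    run_pipeline is any zero-argument callable that performs feature
    extraction and comparison on the device (hypothetical stand-in).
    """
    worst = 0.0
    for _ in range(trials):
        start = clock()
        run_pipeline()
        worst = max(worst, clock() - start)
    return "local" if worst <= latency_budget_s else "server"
```

The `clock` parameter exists only so the decision can be exercised with a fake timer instead of real delays.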

After the features are extracted, the features and the corresponding operation voice ID and operation ID form the content of the whole database.

In addition to the database, a separate area must be created in ROM for storing the sample voices. Each sample voice is named with its ID, which corresponds to the voice ID of the sample feature in the database.
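
The database contents described above (features plus the operation voice ID and operation ID) might be laid out as in the following sketch. The patent does not specify a storage engine or schema; SQLite, the table name `voice_features`, the column names and the JSON encoding of the features are all assumptions made for illustration.

```python
import json
import sqlite3

def open_feature_db(path=":memory:"):
    """Open (or create) the per-user feature database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS voice_features ("
        "voice_id TEXT PRIMARY KEY, "      # unique ID of the recorded sample
        "operation_id TEXT NOT NULL, "     # operation the sample triggers
        "features TEXT NOT NULL)"          # feature frames, JSON-encoded
    )
    return conn

def store_template(conn, voice_id, operation_id, features):
    """Insert a sample's features, replacing any earlier entry with the same voice ID."""
    conn.execute(
        "INSERT OR REPLACE INTO voice_features VALUES (?, ?, ?)",
        (voice_id, operation_id, json.dumps(features)),
    )
    conn.commit()

def load_templates(conn):
    """Return all stored samples as (voice_id, operation_id, features) tuples."""
    rows = conn.execute(
        "SELECT voice_id, operation_id, features FROM voice_features ORDER BY voice_id"
    )
    return [(v, o, json.loads(f)) for v, o, f in rows]
```

Keying the table on the voice ID mirrors the naming of the sample voice files, so a stored recording and its feature row can be found from the same identifier.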

The feature matching module matches the target operation based on the similarity between the features of the voice the user records while using the mobile phone and the stored features. The flow is shown in FIG. 3. First, the features are compared: the voice features of the current utterance are compared with the features in the voice database and a similarity is calculated for each. The similarities are then sorted and the voice feature with the highest similarity is selected; its operation ID is taken as the operation the user currently intends.

The result is then fed back through the operation execution module. If the similarity of the best-matching feature is within the acceptable threshold, the corresponding operation is executed. If the similarity is below the acceptable threshold, the user is reminded to re-enter the command, or to select the correct operation so that the feature database can be updated.
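
The matching and threshold logic can be sketched as a nearest-template search. The patent does not name a similarity measure; this sketch swaps in dynamic time warping (DTW) over per-frame Euclidean distances, a common choice for comparing utterances of different lengths, and expresses the threshold as a maximum distance rather than a minimum similarity. The names `dtw_distance`, `match_operation` and `max_distance` are hypothetical.

```python
import math

def frame_distance(a, b):
    """Euclidean distance between two feature frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(seq_a, seq_b):
    """Length-normalized DTW cost between two sequences of feature frames."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)

def match_operation(query, templates, max_distance):
    """Return (operation_id, distance) for the closest template.

    operation_id is None when even the best match is farther than
    max_distance, signalling that the user should re-enter the command.
    """
    best_op, best_dist = None, float("inf")
    for operation_id, features in templates:
        dist = dtw_distance(query, features)
        if dist < best_dist:
            best_op, best_dist = operation_id, dist
    if best_dist <= max_distance:
        return best_op, best_dist
    return None, best_dist
```

Because the vocabulary is small and specific to one user, an exhaustive comparison against every stored template stays cheap; the threshold would need tuning on real recordings.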

The invention discloses a method for operating a mobile phone by adopting a small vocabulary speech recognition technology, which comprises the following steps:

1) With the help of a mobile phone salesperson or another user familiar with operating the phone, the data related to the operation method is entered. The data comprises: the voice data corresponding to each high-frequency operation, the voice data for each contact, and the mobile phone number of the corresponding contact. The voice data is read aloud by the user in person under the guidance of the salesperson.

Because the user personally records the pronunciation of each contact name and high-frequency operation, steps such as text recognition and natural language understanding are avoided: the voice recognition operation does not need to first recognize the text of a particular language and then interpret its grammar, semantics and context.

2) The voice feature extraction module is started to convert the voice data into voice features, and the voice features and their corresponding operation contents are stored in the feature storage database.

3) When using the device, the user presses a dedicated button and speaks the voice data for the operation to be executed (consistent with the content recorded when the device was set up).

4) The voice feature extraction module is started to convert the voice data into voice features, extracting the features of the command recording spoken by the user.

5) The feature matching module compares the voice features recognized this time, one by one, with the features in the feature storage database and finds the feature at the shortest distance from them, that is, the closest feature.

6) The operation execution module executes the corresponding operation.
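
The six steps above can be tied together in a small control-flow sketch. It is only an outline under stated assumptions: the class name `VoiceDialer`, its fields, and the idea of returning a phone number to dial are hypothetical, and the feature extractor and distance function are passed in as parameters so that any concrete implementation can be plugged in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Features = List[float]

@dataclass
class VoiceDialer:
    extract: Callable[[List[float]], Features]      # recording -> features
    distance: Callable[[Features, Features], float]  # smaller means more similar
    max_distance: float                              # acceptance threshold (assumed)
    templates: List[Tuple[str, Features]] = field(default_factory=list)
    contacts: Dict[str, str] = field(default_factory=dict)

    def enroll(self, operation_id: str, phone_number: str, recording: List[float]) -> None:
        """Steps 1-2: record the user's own utterance and store its features."""
        self.contacts[operation_id] = phone_number
        self.templates.append((operation_id, self.extract(recording)))

    def handle(self, recording: List[float]) -> Optional[str]:
        """Steps 3-6: return the number to dial, or None to ask for a retry."""
        if not self.templates:
            return None
        query = self.extract(recording)
        op, dist = min(
            ((op, self.distance(query, feats)) for op, feats in self.templates),
            key=lambda pair: pair[1],
        )
        return self.contacts[op] if dist <= self.max_distance else None
```

A toy usage with a mean-value "feature" (purely for illustration, with made-up numbers) would be `VoiceDialer(extract=lambda r: [sum(r) / len(r)], distance=lambda a, b: abs(a[0] - b[0]), max_distance=0.5)`, followed by `enroll(...)` at setup time and `handle(...)` each time the dedicated button is pressed.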

The invention uses small vocabulary speech recognition technology and avoids the text recognition and semantic analysis of traditional speech recognition. It directly compares the features of the spoken command with the features of the recordings the user pre-recorded in the pattern library. It only requires that the command spoken by the user be the same as the one pre-recorded, so it can be applied to all languages and dialects. Applied to the mobile phone, the technology lets people with poor eyesight or limited education use the basic functions of the phone normally by voice, and avoids the barrier created by the high Mandarin proficiency that traditional voice recognition technology demands.

Claims (7)

1. Dialect voice dialing communication equipment based on small vocabulary voice recognition technology, characterized in that: the equipment comprises a voice recording module for entering, on a mobile phone, the data related to the operation method; a voice feature extraction module for converting the voice data into voice features; a feature storage database for storing the voice features and their corresponding operations; a feature matching module for comparing the voice features recognized in the current session, one by one, with the features in the feature storage database; and an operation execution module for controlling the mobile phone to execute the voice operation, wherein the output end of the voice recording module is connected to the input end of the voice feature extraction module, the output end of the voice feature extraction module is connected to the input end of the feature storage database, the voice feature extraction module and the feature storage database are each connected to the feature matching module, and the output end of the feature matching module is connected to the input end of the operation execution module.

2. The dialect voice dialing communication device of claim 1 based on small vocabulary voice recognition technology, wherein: the voice recording module is used by the mobile phone user to personally read aloud, one by one, the voice data for a limited set of operations and record each into the mobile phone, and a unique ID is assigned to each recording.

3. The dialect voice dialing communication device of claim 1 based on small vocabulary voice recognition technology, wherein: the voice feature extraction module adopts MFCC feature extraction technology for extraction.

4. The dialect voice dialing communication device of claim 3 based on small vocabulary voice recognition technology, wherein: the MFCC feature extraction steps comprise preprocessing, fast Fourier transform, Mel filter bank, logarithm operation, discrete cosine transform and dynamic feature extraction.

5. The dialect voice dialing communication device of claim 1 based on small vocabulary voice recognition technology, wherein: the feature storage database stores the extracted features and the operation voice ID and the operation ID corresponding to the extracted features.

6. The dialect voice dialing communication device of claim 1 based on small vocabulary voice recognition technology, wherein: the feature storage database stores the voice features locally in the mobile phone or uploads the voice features to a server.

7. The dialect voice dialing communication device of claim 1 based on small vocabulary voice recognition technology, wherein: the operation execution module executes the corresponding operation if the similarity of the best-matching feature found by the feature matching module is within an acceptable threshold; if the similarity is below the acceptable threshold, the user is reminded to re-enter the command, or to select the correct operation so that the feature database can be updated.

CN202110253408.XA 2021-03-09 2021-03-09 Dialect voice dialing communication equipment based on small vocabulary voice recognition technology Pending CN112885331A (en)

Priority Applications (1)

Application Number	Priority Date	Filing Date	Title
CN202110253408.XA CN112885331A (en)	2021-03-09	2021-03-09	Dialect voice dialing communication equipment based on small vocabulary voice recognition technology

Applications Claiming Priority (1)

Application Number	Priority Date	Filing Date	Title
CN202110253408.XA CN112885331A (en)	2021-03-09	2021-03-09	Dialect voice dialing communication equipment based on small vocabulary voice recognition technology

Publications (1)

Publication Number	Publication Date
CN112885331A true CN112885331A (en)	2021-06-01

Family

ID=76053841

Family Applications (1)

Application Number	Priority Date	Filing Date	Title
CN202110253408.XA Pending CN112885331A (en)	2021-03-09	2021-03-09	Dialect voice dialing communication equipment based on small vocabulary voice recognition technology

Country Status (1)

Country	Link
CN (1)	CN112885331A (en)

Citations (5)

* Cited by examiner, † Cited by third party

Publication number	Priority date	Publication date	Assignee	Title
CN102209154A (en) *	2010-03-31	2011-10-05	鸿富锦精密工业（深圳）有限公司	Voice dialing system and method thereof
CN104717339A (en) *	2013-12-13	2015-06-17	富泰华工业（深圳）有限公司	Contact query system, contact query method and communication device
US20150381787A1 (en) *	2014-06-26	2015-12-31	Anatoliy Babayev	Voice Only Phone and Method of Operation
CN105653596A (en) *	2015-12-22	2016-06-08	惠州Tcl移动通信有限公司	Quick startup method and device of specific function on the basis of voice frequency comparison
CN111798845A (en) *	2020-05-28	2020-10-20	厦门快商通科技股份有限公司	Off-line voice control method and device for smart home and storage device

2021
- 2021-03-09 CN CN202110253408.XA patent/CN112885331A/en active Pending

Legal Events

Date	Code	Title	Description
2021-06-01	PB01	Publication
2021-06-18	SE01	Entry into force of request for substantive examination
2022-12-09	RJ01	Rejection of invention patent application after publication	Application publication date: 20210601