대한민국 인공지능 여기까지, 솔트룩스

Ver 2.x

It can recognize your voice and talk in a range of human-like voices.

VOICE STUDIO is a high-quality voice recognition and voice synthesis studio that can be used in a variety of voice-based businesses. Accurate speech recognition and human-like natural speech synthesis technology enable seamless communication between people and AI.

#AI consultation extension
#Automatic subtitle generation
#Voice-based virtual assistant
#Voice search
#AI voice recording
#AI human

What makes VOICE STUDIO so special?

VOICE STUDIO provides high-quality voice recognition and synthesis service with only a small amount of domain data training through the built-in general model with more than 10,000 hours of training.

Point 01
Speech recognition rate of 95.6%

and natural synthesis
Point 02
Real-time voice recognition of

streamed content
Point 03
5 minutes worth of speaker identification

every second
Point 04
Transfer learning-based

speech synthesis
Point 05
One-click integrated

management tool

System Configuration

Real-Time STT

Real-time voice recognition boasting a 95.6% accuracy

VOICE STUDIO's voice recognition engine can be used for a variety of real-time voice services based on the voice interface. And faster and more accurate voice recognition can be achieved through end-to-end integrated learning, moving away from traditional language models, sound models, pronunciation dictionaries, etc. that are trained individually.

Features

01High performance base model with over 10,000 learning hours

To ensure adequate performance in domains with less data, we offer a high-performance baseline model trained using approximately 10,000 hours of reliable 8kh/16kh data.
02End-to-end real-time streaming speech recognition technology

The Streaming Transformer Voice Recognition technology is applied to process voice recognition in real time, enabling natural conversation processing.
03Speaker identification engine

Equipped with an i-Vector-based speaker identification engine, it is possible to process 5 minutes of voice data every second when operated in batch.
04One-click integrated management tool

We offer tools to handle voice learning, model construction and arrangement, verification, as well as services that can be used with only a few clicks, even if you are not a specialist in development.

Realistic TTS

Voice synthesis that creates real human-like voice

VOICE STUDIO's voice synthesis engine artificially creates a human voice through a voice model that is trained using text. The deep learning-based end-to-end transfer learning technology, in particular, enables high-quality speech synthesis using only a small amount of data.

Features

01Adaptive learning-based speech synthesis technology

You can create a business-grade voice TTS in the style preferred by the client by fusing a small quantity of voice data of the desired style with existing voice data using the transfer learning-based speech synthesis model.
02Voice synthesis with 30-minute audio data

Using deep learning-based end-to-end transfer learning, high-quality voice synthesis of the target entity is now achievable with far less learning data than the current speech segment synthesis approach.
03Domain-specific Hangul text conversion

In order to generate pronunciation for various non-Hangul words, our technology provides non-Hangul notation (numbers, foreign words, abbreviations, etc.) and English phonetic alphabet conversion to provide smooth voice synthesis.
04One-click integrated management tool

A voice synthesis model may be easily created using the management tool, and transfer learning is provided to incorporate limited data into existing models.

VOICE STUDIO TOOL

Professional tool for voice recognition and synthesis

VOICE STUDIO provides key features for voice service implementation, from voice recognition/synthesis model building to distribution and management.

Data management
Learning management
Verification management
Pre-management
End point management
History management

Tool Introduction

01Voice recognition Manage
- Voice recognition engine for real-time streaming
  1. Streaming transformer-based real-time speech recognition model
  2. Connected learning through domain data
  3. Register user vocabulary in real time
  4. Manage voice recognition history inquiry with the management tool
  5. Manage voice recognition learning data and voice recognition models
  6. Verify voice recognition model
  7. Monitor voice recognition server and manage resource
02Speech synthesis
- Create voice synthesis models and offer transfer learning function that merge small data with existing models
  1. Upload and manage transcription and voice data
  2. Edit meta information of learning data
  3. Manage sentence and word dictionaries for speech synthesis
  4. Monitor voice learning performance in real time
  5. Manage transfer learning status
  6. Monitor voice synthesis server and manage resources
  7. Manage voice synthesis processing history
03Voice transcription
- Learning data for implementing voice recognition/synthesis model specific to each domain
  1. Integrated monitoring for work stages
  2. Upload voice audio files and create pre-texts
  3. Manage transcription set per domain
  4. Work management function for each task officer
  5. Inspect and approve work deliverables
  6. Authorize workers with user authorization management

Success Story

LG U+
National Assembly Secretariat
Kiwoom Securities
Yogiyo

AI metahuman (METAHUMAN)
LGU+ “U+Kids Land” AI project Digital human, “AI Garami”

Two AI children, digital AI human Garami who learned refined children's content while the other learned any video content, regardless of its target audience's age.
Watch video
- Received the ‘2020 Korea Advertising Awards’ for creativity, public interest, and educational accomplishments through AI experiments

National Assembly Secretariat
National Assembly Internet broadcasting business AI live meeting subtitle system

Established a demonstration system for live broadcast subtitles with Al voice recognition on the parliamentary telecast to strengthen accessibility of disabled people's activities in accordance with the revision of the National Assembly Act (2020.12.22)
- Implemented intelligent parliament and enhanced public access to information on legislative activities

Outbound AI voice bot
Automate outbound call center AI voice bot utilization

Automated counselors’s tasks such as automatic notification of customer's delinquency, essential notice guide or beneficiary certificate happy calls through outbound callbots.
- Improved the quality of consultation by improving the quality of consultation and providing knowledge information that has been verified.

AI voice bot
AI chatbot combining advanced deep learning technology with 1st gen customer chatbot.

Utilized voice synthesis for contact center announcements and MRC technology to deliver natural speech services and respond to unanswered queries in currently existing chatbots.
- Provided optimal answers to queries that were challenging to answer, increasing chatbot satisfaction and usage at the same time, and offering voice services without heterogeneity.

LG U+
AI metahuman (METAHUMAN)
LGU+ “U+Kids Land” AI project Digital human, “AI Garami”
- Received the ‘2020 Korea Advertising Awards’ for creativity, public interest, and educational accomplishments through AI experiments
Watch video
National Assembly Secretariat
AI voice recognition
National Assembly Internet broadcasting business AI live meeting subtitle system
- Implemented intelligent parliament and enhanced public access to information on legislative activities
Kiwoom Securities
Outbound AI voice bot
Automate outbound call center AI voice bot utilization
- Automated counselors’s tasks such as automatic notification of customer's delinquency, essential notice guide or beneficiary certificate happy calls through outbound callbots.
Yogiyo
AI voice bot
AI chatbot combining advanced deep learning technology with 1st gen customer chatbot.
- Provided optimal answers to queries that were challenging to answer, increasing chatbot satisfaction and usage at the same time, and offering voice services without heterogeneity.

Reference

AI-based consultation booth
Credit Counseling and Recovery Service

Civil AI counseling system on debt settlement
AI customer voice bot
NH Nonghyup Bank

Korea's first real-time remote AI callbot service in the finance contact center
Outbound AI voice bot
NH Nonghyup Bank

Automate outbound call center AI voice bot utilization
Outbound AI callbot
Korea Investment & Securities

AI-based customer support callbot and work chatbot system for employees
Internet Broadcasting System
National Assembly of the Republic of Korea

Live broadcast subtitles with Al voice recognition for parliamentary telecast
AI voice system for civil services
Daejeon Metropolitan City

Smart Mirror, AI voice recognition system for civil services
AI human - Pyongyang Friend
Ministry of Unification

A virtual friend from Pyongyang, trained with North Korean information and voices such as people's speech and intonations.
Remote digital human civil complaint service
Gwangju Metropolitan City

Digital human version of the Mayor of Gwangju
Voice of degenerative brain disease patients
National Information Society Agency

AI learning data building tool for collecting voices of patients with degenerative brain disease
Voice data by subject matter
National Information Society Agency

Over 10,000 hours of broadcast data, over 7,000 hours of conference-related audio
Dialogic AI data of broadcast content
National Information Society Agency

17,000 hours in total of broadcasting content used for voice transcription learning data.
Korean dialect AI data
National Information Society Agency

15,000 hours in total, 2.5 million sentences Learning data of regional dialect transcription

It can recognize your voice and talk in a range of human-like voices.

What makes VOICE STUDIO so special?

Speech recognition rate of 95.6%

Real-time voice recognition of

5 minutes worth of speaker identification

Transfer learning-based

One-click integrated

System Configuration

Core Technology

Real-Time STT

Real-time voice recognition boasting a 95.6% accuracy

Features

01High performance base model with over 10,000 learning hours

02End-to-end real-time streaming speech recognition technology

03Speaker identification engine

04One-click integrated management tool

Realistic TTS

Voice synthesis that creates real human-like voice

Features

01Adaptive learning-based speech synthesis technology

02Voice synthesis with 30-minute audio data

03Domain-specific Hangul text conversion

04One-click integrated management tool

VOICE STUDIO TOOL

Tool Introduction

01Voice recognition Manage

02Speech synthesis

03Voice transcription

Success Story

LGU+ “U+Kids Land” AI project Digital human, “AI Garami”

National Assembly Internet broadcasting business AI live meeting subtitle system

Automate outbound call center AI voice bot utilization

AI chatbot combining advanced deep learning technology with 1st gen customer chatbot.

LGU+ “U+Kids Land” AI project Digital human, “AI Garami”

National Assembly Internet broadcasting business AI live meeting subtitle system

Automate outbound call center AI voice bot utilization

AI chatbot combining advanced deep learning technology with 1st gen customer chatbot.

Reference

Credit Counseling and Recovery Service

NH Nonghyup Bank

NH Nonghyup Bank

Korea Investment & Securities

National Assembly of the Republic of Korea

Daejeon Metropolitan City

Ministry of Unification

Gwangju Metropolitan City

National Information Society Agency

National Information Society Agency

National Information Society Agency

National Information Society Agency

Do you have any questions?​ Listen to SALTLUX​

Do you have any questions?
Listen to SALTLUX