text_to_speech_v1.py
# coding: utf-8
# (C) Copyright IBM Corp. 2015, 2025.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# IBM OpenAPI SDK Code Generator Version: 3.105.0-3c13b041-20250605-193116
"""
The IBM Watson™ Text to Speech service provides APIs that use IBM's speech-synthesis
capabilities to synthesize text into natural-sounding speech in a variety of languages,
dialects, and voices. The service supports at least one male or female voice, sometimes
both, for each language. The audio is streamed back to the client with minimal delay.
For speech synthesis, the service supports a synchronous HTTP Representational State
Transfer (REST) interface and a WebSocket interface. Both interfaces support plain text
and SSML input. SSML is an XML-based markup language that provides text annotation for
speech-synthesis applications. The WebSocket interface also supports the SSML
<code><mark></code> element and word timings.
The service offers a customization interface that you can use to define sounds-like or
phonetic translations for words. A sounds-like translation consists of one or more words
that, when combined, sound like the word. A phonetic translation is based on the SSML
phoneme format for representing a word. You can specify a phonetic translation in standard
International Phonetic Alphabet (IPA) representation or in the proprietary IBM Symbolic
Phonetic Representation (SPR).
The service also offers a Tune by Example feature that lets you define custom prompts. You
can also define speaker models to improve the quality of your custom prompts. The service
supports custom prompts only for US English custom models and voices.
API Version: 1.0.0
See: https://cloud.ibm.com/docs/text-to-speech
"""
from enum import Enum
from typing import BinaryIO, Dict, List, Optional
import json
from ibm_cloud_sdk_core import BaseService, DetailedResponse
from ibm_cloud_sdk_core.authenticators.authenticator import Authenticator
from ibm_cloud_sdk_core.get_authenticator import get_authenticator_from_environment
from ibm_cloud_sdk_core.utils import convert_model
from .common import get_sdk_headers
##############################################################################
# Service
##############################################################################
class TextToSpeechV1(BaseService):
    """The Text to Speech V1 service."""

    DEFAULT_SERVICE_URL = 'https://api.us-south.text-to-speech.watson.cloud.ibm.com'
    DEFAULT_SERVICE_NAME = 'text_to_speech'

    def __init__(
        self,
        authenticator: Authenticator = None,
        service_name: str = DEFAULT_SERVICE_NAME,
    ) -> None:
        """
        Construct a new client for the Text to Speech service.

        :param Authenticator authenticator: The authenticator specifies the authentication mechanism.
               Get up to date information from https://github.com/IBM/python-sdk-core/blob/main/README.md
               about initializing the authenticator of your choice.
        """
        if not authenticator:
            authenticator = get_authenticator_from_environment(service_name)
        BaseService.__init__(self,
                             service_url=self.DEFAULT_SERVICE_URL,
                             authenticator=authenticator)
        self.configure_service(service_name)
    #########################
    # Voices
    #########################

    def list_voices(
        self,
        **kwargs,
    ) -> DetailedResponse:
        """
        List voices.

        Lists all voices available for use with the service. The information includes the
        name, language, gender, and other details about the voice. The ordering of the
        list of voices can change from call to call; do not rely on an alphabetized or
        static list of voices. To see information about a specific voice, use the [Get a
        voice](#getvoice) method.
        **See also:** [Listing all
        voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices-list#list-all-voices).

        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse with `dict` result representing a `Voices` object
        """

        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='list_voices',
        )
        headers.update(sdk_headers)

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']
        headers['Accept'] = 'application/json'

        url = '/v1/voices'
        request = self.prepare_request(
            method='GET',
            url=url,
            headers=headers,
        )

        response = self.send(request, **kwargs)
        return response
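
# Illustrative sketch, not part of the generated SDK. The JSON-returning
# methods in this file assemble request headers in a fixed order: SDK headers
# first, then any caller-supplied `headers` dict, then a forced `Accept` value.
# The helper name `merge_headers` and the sample header values are hypothetical;
# the snippet only mirrors the precedence visible in `list_voices` above.

```python
from typing import Dict, Optional


def merge_headers(sdk_headers: Dict[str, str],
                  user_headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Merge headers in the same order that list_voices does."""
    headers: Dict[str, str] = {}
    headers.update(sdk_headers)              # 1. SDK-generated headers
    if user_headers:
        headers.update(user_headers)         # 2. caller headers override them
    headers['Accept'] = 'application/json'   # 3. Accept is set last and wins
    return headers


merged = merge_headers(
    {'User-Agent': 'example-sdk/0.0'},
    {'User-Agent': 'my-app/1.0', 'Accept': 'text/plain', 'X-Trace': 'abc'},
)
```

# Under this ordering a caller can replace SDK headers but cannot change
# `Accept` for these methods. `synthesize` differs: it applies the caller's
# headers last, so a caller-supplied `Accept` takes effect there.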

    def get_voice(
        self,
        voice: str,
        *,
        customization_id: Optional[str] = None,
        **kwargs,
    ) -> DetailedResponse:
        """
        Get a voice.

        Gets information about the specified voice. The information includes the name,
        language, gender, and other details about the voice. Specify a customization ID to
        obtain information for a custom model that is defined for the language of the
        specified voice. To list information about all available voices, use the [List
        voices](#listvoices) method.
        **See also:** [Listing a specific
        voice](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices-list#list-specific-voice).

        :param str voice: The voice for which information is to be returned.
        :param str customization_id: (optional) The customization ID (GUID) of a
               custom model for which information is to be returned. You must make the
               request with credentials for the instance of the service that owns the
               custom model. Omit the parameter to see information about the specified
               voice with no customization.
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse with `dict` result representing a `Voice` object
        """

        if not voice:
            raise ValueError('voice must be provided')
        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='get_voice',
        )
        headers.update(sdk_headers)

        params = {
            'customization_id': customization_id,
        }

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']
        headers['Accept'] = 'application/json'

        path_param_keys = ['voice']
        path_param_values = self.encode_path_vars(voice)
        path_param_dict = dict(zip(path_param_keys, path_param_values))
        url = '/v1/voices/{voice}'.format(**path_param_dict)
        request = self.prepare_request(
            method='GET',
            url=url,
            headers=headers,
            params=params,
        )

        response = self.send(request, **kwargs)
        return response
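
# Illustrative sketch, not part of the generated SDK. `get_voice` builds its
# URL by encoding the path parameter and substituting it into a template. The
# stand-in `encode_path_vars` below is an assumption about what the base-class
# helper does (percent-encoding each value with no "safe" characters); the
# real implementation lives in ibm_cloud_sdk_core and may differ in detail.

```python
from typing import List
from urllib.parse import quote


def encode_path_vars(*args: str) -> List[str]:
    """Percent-encode each value so it can be used as a single path segment."""
    return [quote(value, safe='') for value in args]


def voice_url(voice: str) -> str:
    """Build the request path the same way get_voice does."""
    path_param_keys = ['voice']
    path_param_values = encode_path_vars(voice)
    path_param_dict = dict(zip(path_param_keys, path_param_values))
    return '/v1/voices/{voice}'.format(**path_param_dict)


plain = voice_url('en-US_MichaelV3Voice')
escaped = voice_url('odd/name with space')
```

# Encoding with no safe characters means a slash inside a value cannot be
# mistaken for a path separator.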

    #########################
    # Synthesis
    #########################

    def synthesize(
        self,
        text: str,
        *,
        accept: Optional[str] = None,
        voice: Optional[str] = None,
        customization_id: Optional[str] = None,
        spell_out_mode: Optional[str] = None,
        rate_percentage: Optional[int] = None,
        pitch_percentage: Optional[int] = None,
        **kwargs,
    ) -> DetailedResponse:
        """
        Synthesize audio.

        Synthesizes text to audio that is spoken in the specified voice. The service bases
        its understanding of the language for the input text on the specified voice. Use a
        voice that matches the language of the input text.
        The method accepts a maximum of 5 KB of input text in the body of the request, and
        8 KB for the URL and headers. The 5 KB limit includes any SSML tags that you
        specify. The service returns the synthesized audio stream as an array of bytes.
        **See also:** [The HTTP
        interface](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-usingHTTP#usingHTTP).
        ### Audio formats (accept types)
         The service can return audio in the following formats (MIME types).
        * Where indicated, you can optionally specify the sampling rate (`rate`) of the
        audio. You must specify a sampling rate for the `audio/alaw`, `audio/l16`, and
        `audio/mulaw` formats. A specified sampling rate must lie in the range of 8 kHz to
        192 kHz. Some formats restrict the sampling rate to certain values, as noted.
        * For the `audio/l16` format, you can optionally specify the endianness
        (`endianness`) of the audio: `endianness=big-endian` or
        `endianness=little-endian`.
        Use the `Accept` header or the `accept` parameter to specify the requested format
        of the response audio. If you omit an audio format altogether, the service returns
        the audio in Ogg format with the Opus codec (`audio/ogg;codecs=opus`). The service
        always returns single-channel audio.
        * `audio/alaw` - You must specify the `rate` of the audio.
        * `audio/basic` - The service returns audio with a sampling rate of 8000 Hz.
        * `audio/flac` - You can optionally specify the `rate` of the audio. The default
        sampling rate is 24,000 Hz for Natural voices and 22,050 Hz for all other voices.
        * `audio/l16` - You must specify the `rate` of the audio. You can optionally
        specify the `endianness` of the audio. The default endianness is `little-endian`.
        * `audio/mp3` - You can optionally specify the `rate` of the audio. The default
        sampling rate is 24,000 Hz for Natural voices and 22,050 Hz for all other
        voices.
        * `audio/mpeg` - You can optionally specify the `rate` of the audio. The default
        sampling rate is 24,000 Hz for Natural voices and 22,050 Hz for all other voices.
        * `audio/mulaw` - You must specify the `rate` of the audio.
        * `audio/ogg` - The service returns the audio in the `vorbis` codec. You can
        optionally specify the `rate` of the audio. The default sampling rate is 48,000
        Hz.
        * `audio/ogg;codecs=opus` - You can optionally specify the `rate` of the audio.
        Only the following values are valid sampling rates: `48000`, `24000`, `16000`,
        `12000`, or `8000`. If you specify a value other than one of these, the service
        returns an error. The default sampling rate is 48,000 Hz.
        * `audio/ogg;codecs=vorbis` - You can optionally specify the `rate` of the audio.
        The default sampling rate is 48,000 Hz.
        * `audio/wav` - You can optionally specify the `rate` of the audio. The default
        sampling rate is 24,000 Hz for Natural voices and 22,050 Hz for all other voices.
        * `audio/webm` - The service returns the audio in the `opus` codec. The service
        returns audio with a sampling rate of 48,000 Hz.
        * `audio/webm;codecs=opus` - The service returns audio with a sampling rate of
        48,000 Hz.
        * `audio/webm;codecs=vorbis` - You can optionally specify the `rate` of the audio.
        The default sampling rate is 48,000 Hz.
        For more information about specifying an audio format, including additional
        details about some of the formats, see [Using audio
        formats](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-audio-formats).
        **Note:** By default, the service returns audio in the Ogg audio format with the
        Opus codec (`audio/ogg;codecs=opus`). However, the Ogg audio format is not
        supported with the Safari browser. If you are using the service with the Safari
        browser, you must use the `Accept` request header or the `accept` query parameter
        to specify a different format in which you want the service to return the audio.
        ### Warning messages
         If a request includes invalid query parameters, the service returns a `Warnings`
        response header that provides messages about the invalid parameters. The warning
        includes a descriptive message and a list of invalid argument strings: for
        example, a message such as `"Unknown arguments:"` or `"Unknown url query
        arguments:"` followed by a list of the form `"{invalid_arg_1}, {invalid_arg_2}."`.
        The request succeeds despite the warnings.

        :param str text: The text to synthesize.
        :param str accept: (optional) The requested format (MIME type) of the
               audio. You can use the `Accept` header or the `accept` parameter to specify
               the audio format. For more information about specifying an audio format,
               see **Audio formats (accept types)** in the method description.
        :param str voice: (optional) The voice to use for speech synthesis. If you
               omit the `voice` parameter, the service uses the US English
               `en-US_MichaelV3Voice` by default.
               _For IBM Cloud Pak for Data,_ if you do not install the
               `en-US_MichaelV3Voice`, you must either specify a voice with the request or
               specify a new default voice for your installation of the service.
               **See also:**
               * [Languages and
               voices](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices)
               * [Using the default
               voice](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices-use#specify-voice-default).
        :param str customization_id: (optional) The customization ID (GUID) of a
               custom model to use for the synthesis. If a custom model is specified, it
               works only if it matches the language of the indicated voice. You must make
               the request with credentials for the instance of the service that owns the
               custom model. Omit the parameter to use the specified voice with no
               customization.
        :param str spell_out_mode: (optional) *For German voices,* indicates how
               the service is to spell out strings of individual letters. To indicate the
               pace of the spelling, specify one of the following values:
               * `default` - The service reads the characters at the rate at which it
               synthesizes speech for the request. You can also omit the parameter
               entirely to achieve the default behavior.
               * `singles` - The service reads the characters one at a time, with a brief
               pause between each character.
               * `pairs` - The service reads the characters two at a time, with a brief
               pause between each pair.
               * `triples` - The service reads the characters three at a time, with a
               brief pause between each triplet.
               For more information, see [Specifying how strings are spelled
               out](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-synthesis-params#params-spell-out-mode).
        :param int rate_percentage: (optional) The percentage change from the
               default speaking rate of the voice that is used for speech synthesis. Each
               voice has a default speaking rate that is optimized to represent a normal
               rate of speech. The parameter accepts an integer that represents the
               percentage change from the voice's default rate:
               * Specify a signed negative integer to reduce the speaking rate by that
               percentage. For example, -10 reduces the rate by ten percent.
               * Specify an unsigned or signed positive integer to increase the speaking
               rate by that percentage. For example, 10 and +10 increase the rate by ten
               percent.
               * Specify 0 or omit the parameter to get the default speaking rate for the
               voice.
               The parameter affects the rate for an entire request.
               For more information, see [Modifying the speaking
               rate](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-synthesis-params#params-rate-percentage).
        :param int pitch_percentage: (optional) The percentage change from the
               default speaking pitch of the voice that is used for speech synthesis. Each
               voice has a default speaking pitch that is optimized to represent a normal
               tone of voice. The parameter accepts an integer that represents the
               percentage change from the voice's default tone:
               * Specify a signed negative integer to lower the voice's pitch by that
               percentage. For example, -5 reduces the tone by five percent.
               * Specify an unsigned or signed positive integer to increase the voice's
               pitch by that percentage. For example, 5 and +5 increase the tone by five
               percent.
               * Specify 0 or omit the parameter to get the default speaking pitch for the
               voice.
               The parameter affects the pitch for an entire request.
               For more information, see [Modifying the speaking
               pitch](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-synthesis-params#params-pitch-percentage).
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse with `BinaryIO` result
        """

        if text is None:
            raise ValueError('text must be provided')
        headers = {
            'Accept': accept,
        }
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='synthesize',
        )
        headers.update(sdk_headers)

        params = {
            'voice': voice,
            'customization_id': customization_id,
            'spell_out_mode': spell_out_mode,
            'rate_percentage': rate_percentage,
            'pitch_percentage': pitch_percentage,
        }

        data = {
            'text': text,
        }
        data = {k: v for (k, v) in data.items() if v is not None}
        data = json.dumps(data)
        headers['content-type'] = 'application/json'

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']

        url = '/v1/synthesize'
        request = self.prepare_request(
            method='POST',
            url=url,
            headers=headers,
            params=params,
            data=data,
        )

        response = self.send(request, **kwargs)
        return response
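
# Illustrative sketch, not part of the generated SDK. The `synthesize`
# docstring lists rules for the `accept` value: some formats require a
# sampling rate, rates must lie between 8 kHz and 192 kHz, Opus-in-Ogg accepts
# only five rates, and only `audio/l16` takes an endianness. The helpers
# `build_accept` and `check_text_size` are hypothetical, as is the
# `;rate=<hz>` spelling (the docstring shows the `endianness=...` form
# explicitly but not the rate form) and the reading of 5 KB as 5 * 1024
# UTF-8 bytes.

```python
from typing import Optional

RATE_REQUIRED = {'audio/alaw', 'audio/l16', 'audio/mulaw'}
OPUS_RATES = {48000, 24000, 16000, 12000, 8000}
MIN_RATE_HZ, MAX_RATE_HZ = 8000, 192000
MAX_TEXT_BYTES = 5 * 1024  # assumption: "5 KB" means 5 * 1024 bytes


def build_accept(mime: str, rate: Optional[int] = None,
                 endianness: Optional[str] = None) -> str:
    """Compose an accept string after checking the documented constraints."""
    if rate is None and mime in RATE_REQUIRED:
        raise ValueError(f'{mime} requires a sampling rate')
    if rate is not None:
        if not MIN_RATE_HZ <= rate <= MAX_RATE_HZ:
            raise ValueError('rate must lie between 8 kHz and 192 kHz')
        if mime == 'audio/ogg;codecs=opus' and rate not in OPUS_RATES:
            raise ValueError(f'{rate} is not a valid Opus sampling rate')
    if endianness is not None:
        if mime != 'audio/l16':
            raise ValueError('endianness applies only to audio/l16')
        if endianness not in ('big-endian', 'little-endian'):
            raise ValueError('endianness must be big-endian or little-endian')
    parts = [mime]
    if rate is not None:
        parts.append(f'rate={rate}')
    if endianness is not None:
        parts.append(f'endianness={endianness}')
    return ';'.join(parts)


def check_text_size(text: str) -> int:
    """Return the UTF-8 size of the input, rejecting text over the limit."""
    size = len(text.encode('utf-8'))
    if size > MAX_TEXT_BYTES:
        raise ValueError(f'text is {size} bytes; the limit is {MAX_TEXT_BYTES}')
    return size


l16 = build_accept('audio/l16', 16000, 'big-endian')
wav = build_accept('audio/wav')
```

# A value produced this way could then be passed as the `accept` argument.
# Checking locally only saves a round trip; the service remains the authority
# on which combinations it accepts.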

    #########################
    # Pronunciation
    #########################

    def get_pronunciation(
        self,
        text: str,
        *,
        voice: Optional[str] = None,
        format: Optional[str] = None,
        customization_id: Optional[str] = None,
        **kwargs,
    ) -> DetailedResponse:
        """
        Get pronunciation.

        Gets the phonetic pronunciation for the specified word. You can request the
        pronunciation for a specific format. You can also request the pronunciation for a
        specific voice to see the default translation for the language of that voice or
        for a specific custom model to see the translation for that model.
        **See also:** [Querying a word from a
        language](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryLanguage).

        :param str text: The word for which the pronunciation is requested.
        :param str voice: (optional) A voice that specifies the language in which
               the pronunciation is to be returned. If you omit the `voice` parameter, the
               service uses the US English `en-US_MichaelV3Voice` by default. All voices
               for the same language (for example, `en-US`) return the same translation.
               _For IBM Cloud Pak for Data,_ if you do not install the
               `en-US_MichaelV3Voice`, you must either specify a voice with the request or
               specify a new default voice for your installation of the service.
               **See also:** [Using the default
               voice](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-voices-use#specify-voice-default).
        :param str format: (optional) The phoneme format in which to return the
               pronunciation. Omit the parameter to obtain the pronunciation in the
               default format.
        :param str customization_id: (optional) The customization ID (GUID) of a
               custom model for which the pronunciation is to be returned. The language of
               a specified custom model must match the language of the specified voice. If
               the word is not defined in the specified custom model, the service returns
               the default translation for the custom model's language. You must make the
               request with credentials for the instance of the service that owns the
               custom model. Omit the parameter to see the translation for the specified
               voice with no customization.
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse with `dict` result representing a `Pronunciation` object
        """

        if not text:
            raise ValueError('text must be provided')
        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='get_pronunciation',
        )
        headers.update(sdk_headers)

        params = {
            'text': text,
            'voice': voice,
            'format': format,
            'customization_id': customization_id,
        }

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']
        headers['Accept'] = 'application/json'

        url = '/v1/pronunciation'
        request = self.prepare_request(
            method='GET',
            url=url,
            headers=headers,
            params=params,
        )

        response = self.send(request, **kwargs)
        return response
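
# Illustrative sketch, not part of the generated SDK. `get_pronunciation`
# passes every optional argument in `params`, including those left as None.
# The sketch assumes that unset parameters are omitted from the final query
# string (the `requests` library drops keys whose value is None). The function
# name `pronunciation_query` and the sample `format` value `ibm` are
# assumptions made for illustration.

```python
from typing import Optional
from urllib.parse import urlencode


def pronunciation_query(text: str,
                        voice: Optional[str] = None,
                        format: Optional[str] = None,
                        customization_id: Optional[str] = None) -> str:
    """Return the path and query string a pronunciation lookup would use."""
    if not text:
        raise ValueError('text must be provided')
    params = {
        'text': text,
        'voice': voice,
        'format': format,
        'customization_id': customization_id,
    }
    # Keep only the parameters the caller actually set.
    present = {key: value for key, value in params.items() if value is not None}
    return '/v1/pronunciation?' + urlencode(present)


only_text = pronunciation_query('IEEE')
with_voice = pronunciation_query('IEEE', voice='en-US_MichaelV3Voice', format='ibm')
```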

    #########################
    # Custom models
    #########################

    def create_custom_model(
        self,
        name: str,
        *,
        language: Optional[str] = None,
        description: Optional[str] = None,
        **kwargs,
    ) -> DetailedResponse:
        """
        Create a custom model.

        Creates a new empty custom model. You must specify a name for the new custom
        model. You can optionally specify the language and a description for the new
        model. The model is owned by the instance of the service whose credentials are
        used to create it.
        **See also:** [Creating a custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsCreate).

        :param str name: The name of the new custom model. Use a localized name
               that matches the language of the custom model. Use a name that describes
               the purpose of the custom model, such as `Medical custom model` or `Legal
               custom model`. Use a name that is unique among all custom models that you
               own.
               Include a maximum of 256 characters in the name. Do not use backslashes,
               slashes, colons, equal signs, ampersands, or question marks in the name.
        :param str language: (optional) The language of the new custom model. You
               create a custom model for a specific language, not for a specific voice. A
               custom model can be used with any voice for its specified language. Omit
               the parameter to use the default language, `en-US`.
        :param str description: (optional) A description of the new custom model.
               Specifying a description is recommended. Use a localized description that
               matches the language of the custom model. Include a maximum of 128
               characters in the description.
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse with `dict` result representing a `CustomModel` object
        """

        if name is None:
            raise ValueError('name must be provided')
        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='create_custom_model',
        )
        headers.update(sdk_headers)

        data = {
            'name': name,
            'language': language,
            'description': description,
        }
        data = {k: v for (k, v) in data.items() if v is not None}
        data = json.dumps(data)
        headers['content-type'] = 'application/json'

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']
        headers['Accept'] = 'application/json'

        url = '/v1/customizations'
        request = self.prepare_request(
            method='POST',
            url=url,
            headers=headers,
            data=data,
        )

        response = self.send(request, **kwargs)
        return response
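
# Illustrative sketch, not part of the generated SDK. The docstring of
# `create_custom_model` states limits that the method itself does not check:
# at most 256 characters in the name, none of the characters \ / : = & ? in
# the name, and at most 128 characters in the description. The helper
# `custom_model_body` is hypothetical; it applies those limits client-side and
# then builds the JSON body the same way the method does, dropping unset
# fields.

```python
import json
from typing import Optional

FORBIDDEN_NAME_CHARS = set('\\/:=&?')
MAX_NAME_LEN = 256
MAX_DESCRIPTION_LEN = 128


def custom_model_body(name: str,
                      language: Optional[str] = None,
                      description: Optional[str] = None) -> str:
    """Validate the documented limits and return the JSON request body."""
    if name is None:
        raise ValueError('name must be provided')
    if len(name) > MAX_NAME_LEN:
        raise ValueError('name exceeds 256 characters')
    bad = sorted(FORBIDDEN_NAME_CHARS & set(name))
    if bad:
        raise ValueError('name contains forbidden characters: ' + ' '.join(bad))
    if description is not None and len(description) > MAX_DESCRIPTION_LEN:
        raise ValueError('description exceeds 128 characters')
    data = {
        'name': name,
        'language': language,
        'description': description,
    }
    # Same filtering step as the SDK method: omit fields that were not set.
    data = {k: v for (k, v) in data.items() if v is not None}
    return json.dumps(data)


minimal = json.loads(custom_model_body('Medical custom model'))
full = json.loads(custom_model_body('Legal custom model', 'en-US', 'Legal terms'))
```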

    def list_custom_models(
        self,
        *,
        language: Optional[str] = None,
        **kwargs,
    ) -> DetailedResponse:
        """
        List custom models.

        Lists metadata such as the name and description for all custom models that are
        owned by an instance of the service. Specify a language to list the custom models
        for that language only. To see the words and prompts in addition to the metadata
        for a specific custom model, use the [Get a custom model](#getcustommodel) method.
        You must use credentials for the instance of the service that owns a model to list
        information about it.
        **See also:** [Querying all custom
        models](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQueryAll).

        :param str language: (optional) The language for which custom models that
               are owned by the requesting credentials are to be returned. Omit the
               parameter to see all custom models that are owned by the requester.
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse with `dict` result representing a `CustomModels` object
        """

        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='list_custom_models',
        )
        headers.update(sdk_headers)

        params = {
            'language': language,
        }

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']
        headers['Accept'] = 'application/json'

        url = '/v1/customizations'
        request = self.prepare_request(
            method='GET',
            url=url,
            headers=headers,
            params=params,
        )

        response = self.send(request, **kwargs)
        return response

    def update_custom_model(
        self,
        customization_id: str,
        *,
        name: Optional[str] = None,
        description: Optional[str] = None,
        words: Optional[List['Word']] = None,
        **kwargs,
    ) -> DetailedResponse:
        """
        Update a custom model.

        Updates information for the specified custom model. You can update metadata such
        as the name and description of the model. You can also update the words in the
        model and their translations. Adding a new translation for a word that already
        exists in a custom model overwrites the word's existing translation. A custom
        model can contain no more than 20,000 entries. You must use credentials for the
        instance of the service that owns a model to update it.
        You can define sounds-like or phonetic translations for words. A sounds-like
        translation consists of one or more words that, when combined, sound like the
        word. Phonetic translations are based on the SSML phoneme format for representing
        a word. You can specify them in standard International Phonetic Alphabet (IPA)
        representation
          <code><phoneme alphabet="ipa"
        ph="təmˈɑto"></phoneme></code>
          or in the proprietary IBM Symbolic Phonetic Representation (SPR)
          <code><phoneme alphabet="ibm"
        ph="1gAstroEntxrYFXs"></phoneme></code>
        **See also:**
        * [Updating a custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsUpdate)
        * [Adding words to a Japanese custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
        * [Understanding
        customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).

        :param str customization_id: The customization ID (GUID) of the custom
               model. You must make the request with credentials for the instance of the
               service that owns the custom model.
        :param str name: (optional) A new name for the custom model.
        :param str description: (optional) A new description for the custom model.
        :param List[Word] words: (optional) An array of `Word` objects that
               provides the words and their translations that are to be added or updated
               for the custom model. Pass an empty array to make no additions or updates.
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse
        """

        if not customization_id:
            raise ValueError('customization_id must be provided')
        if words is not None:
            words = [convert_model(x) for x in words]
        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='update_custom_model',
        )
        headers.update(sdk_headers)

        data = {
            'name': name,
            'description': description,
            'words': words,
        }
        data = {k: v for (k, v) in data.items() if v is not None}
        data = json.dumps(data)
        headers['content-type'] = 'application/json'

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']
        headers['Accept'] = 'application/json'

        path_param_keys = ['customization_id']
        path_param_values = self.encode_path_vars(customization_id)
        path_param_dict = dict(zip(path_param_keys, path_param_values))
        url = '/v1/customizations/{customization_id}'.format(**path_param_dict)
        request = self.prepare_request(
            method='POST',
            url=url,
            headers=headers,
            data=data,
        )

        response = self.send(request, **kwargs)
        return response
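
# Illustrative sketch, not part of the generated SDK. `update_custom_model`
# accepts words with either a sounds-like translation (plain words) or a
# phonetic translation (an SSML `<phoneme>` element in the `ipa` or `ibm`
# alphabet). The helpers below are hypothetical, and the dictionary keys
# `word` and `translation` are an assumption about the `Word` model, which is
# defined elsewhere in this file and not shown in this excerpt.

```python
import json
from typing import Dict, List, Optional
from xml.sax.saxutils import escape


def sounds_like(word: str, translation: str) -> Dict[str, str]:
    """A word whose translation is one or more ordinary words."""
    return {'word': word, 'translation': translation}


def phonetic(word: str, ph: str, alphabet: str = 'ipa') -> Dict[str, str]:
    """A word whose translation is an SSML phoneme element."""
    if alphabet not in ('ipa', 'ibm'):
        raise ValueError("alphabet must be 'ipa' or 'ibm'")
    # Escape XML-special characters so the attribute value stays well formed.
    ph_attr = escape(ph, {'"': '&quot;'})
    element = f'<phoneme alphabet="{alphabet}" ph="{ph_attr}"></phoneme>'
    return {'word': word, 'translation': element}


def update_body(words: List[Dict[str, str]],
                name: Optional[str] = None,
                description: Optional[str] = None) -> str:
    """Build the JSON body, omitting unset fields as the SDK method does."""
    data = {'name': name, 'description': description, 'words': words}
    data = {k: v for (k, v) in data.items() if v is not None}
    return json.dumps(data, ensure_ascii=False)


entries = [
    sounds_like('IEEE', 'I triple E'),
    phonetic('tomato', 'təmˈɑto'),
    phonetic('gastroenteritis', '1gAstroEntxrYFXs', alphabet='ibm'),
]
body = json.loads(update_body(entries, name='Medical custom model'))
```

# Because a new translation for an existing word overwrites the old one,
# resending a corrected entry is enough to fix it. The 20,000-entry ceiling
# applies to the model as a whole, so a client that adds words in batches
# would need to track the running total itself.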

    def get_custom_model(
        self,
        customization_id: str,
        **kwargs,
    ) -> DetailedResponse:
        """
        Get a custom model.

        Gets all information about a specified custom model. In addition to metadata such
        as the name and description of the custom model, the output includes the words and
        their translations that are defined for the model, as well as any prompts that are
        defined for the model. To see just the metadata for a model, use the [List custom
        models](#listcustommodels) method.
        **See also:** [Querying a custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsQuery).

        :param str customization_id: The customization ID (GUID) of the custom
               model. You must make the request with credentials for the instance of the
               service that owns the custom model.
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse with `dict` result representing a `CustomModel` object
        """

        if not customization_id:
            raise ValueError('customization_id must be provided')
        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='get_custom_model',
        )
        headers.update(sdk_headers)

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']
        headers['Accept'] = 'application/json'

        path_param_keys = ['customization_id']
        path_param_values = self.encode_path_vars(customization_id)
        path_param_dict = dict(zip(path_param_keys, path_param_values))
        url = '/v1/customizations/{customization_id}'.format(**path_param_dict)
        request = self.prepare_request(
            method='GET',
            url=url,
            headers=headers,
        )

        response = self.send(request, **kwargs)
        return response

    def delete_custom_model(
        self,
        customization_id: str,
        **kwargs,
    ) -> DetailedResponse:
        """
        Delete a custom model.

        Deletes the specified custom model. You must use credentials for the instance of
        the service that owns a model to delete it.
        **See also:** [Deleting a custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customModels#cuModelsDelete).

        :param str customization_id: The customization ID (GUID) of the custom
               model. You must make the request with credentials for the instance of the
               service that owns the custom model.
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse
        """

        if not customization_id:
            raise ValueError('customization_id must be provided')
        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='delete_custom_model',
        )
        headers.update(sdk_headers)

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']

        path_param_keys = ['customization_id']
        path_param_values = self.encode_path_vars(customization_id)
        path_param_dict = dict(zip(path_param_keys, path_param_values))
        url = '/v1/customizations/{customization_id}'.format(**path_param_dict)
        request = self.prepare_request(
            method='DELETE',
            url=url,
            headers=headers,
        )

        response = self.send(request, **kwargs)
        return response

    #########################
    # Custom words
    #########################

    def add_words(
        self,
        customization_id: str,
        words: List['Word'],
        **kwargs,
    ) -> DetailedResponse:
        """
        Add custom words.

        Adds one or more words and their translations to the specified custom model.
        Adding a new translation for a word that already exists in a custom model
        overwrites the word's existing translation. A custom model can contain no more
        than 20,000 entries. You must use credentials for the instance of the service that
        owns a model to add words to it.
        You can define sounds-like or phonetic translations for words. A sounds-like
        translation consists of one or more words that, when combined, sound like the
        word. Phonetic translations are based on the SSML phoneme format for representing
        a word. You can specify them in standard International Phonetic Alphabet (IPA)
        representation
        <code><phoneme alphabet="ipa" ph="təmˈɑto"></phoneme></code>
        or in the proprietary IBM Symbolic Phonetic Representation (SPR)
        <code><phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme></code>
        **See also:**
        * [Adding multiple words to a custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsAdd)
        * [Adding words to a Japanese custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
        * [Understanding
        customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).
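
To make the two translation styles concrete, here is a minimal sketch that builds a `words` list from plain dictionaries. The helper `phoneme` and the sample word pairings are hypothetical, and the sketch assumes each entry uses the keys `word` and `translation`; it only constructs data and does not call the service.

```python
def phoneme(ph, alphabet='ipa'):
    # Hypothetical helper: wrap a phonetic string in an SSML phoneme element.
    return '<phoneme alphabet="{}" ph="{}"></phoneme>'.format(alphabet, ph)


words = [
    # Sounds-like translation: ordinary words that approximate the sound.
    {'word': 'NCAA', 'translation': 'N C double A'},
    # Phonetic translation in IPA, reusing the string from the description.
    {'word': 'tomato', 'translation': phoneme('təmˈɑto')},
    # Phonetic translation in IBM SPR, reusing the string from the description.
    {
        'word': 'gastroenteritis',
        'translation': phoneme('1gAstroEntxrYFXs', alphabet='ibm'),
    },
]

for entry in words:
    print(entry['word'], '->', entry['translation'])
```

A list like this could then be supplied as the `words` argument, provided the client accepts plain dictionaries in place of `Word` objects.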

        :param str customization_id: The customization ID (GUID) of the custom
               model. You must make the request with credentials for the instance of the
               service that owns the custom model.
        :param List[Word] words: An array of `Word` objects. Each object provides
               a word that is to be added or updated for the custom model and the
               word's translation. For comparison, the [List custom
               words](#listwords) method returns an array of `Word` objects in
               alphabetical order, with uppercase letters listed before lowercase
               letters; that array is empty if the custom model contains no words.
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse
        """

        if not customization_id:
            raise ValueError('customization_id must be provided')
        if words is None:
            raise ValueError('words must be provided')
        words = [convert_model(x) for x in words]
        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='add_words',
        )
        headers.update(sdk_headers)

        data = {
            'words': words,
        }
        data = {k: v for (k, v) in data.items() if v is not None}
        data = json.dumps(data)
        headers['content-type'] = 'application/json'

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']
        headers['Accept'] = 'application/json'

        path_param_keys = ['customization_id']
        path_param_values = self.encode_path_vars(customization_id)
        path_param_dict = dict(zip(path_param_keys, path_param_values))
        url = '/v1/customizations/{customization_id}/words'.format(
            **path_param_dict)
        request = self.prepare_request(
            method='POST',
            url=url,
            headers=headers,
            data=data,
        )

        response = self.send(request, **kwargs)
        return response

    def list_words(
        self,
        customization_id: str,
        **kwargs,
    ) -> DetailedResponse:
        """
        List custom words.

        Lists all of the words and their translations for the specified custom model. The
        output shows the translations as they are defined in the model. You must use
        credentials for the instance of the service that owns a model to list its words.
        **See also:** [Querying all words from a custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordsQueryModel).
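
One way to work with the listing is to separate phonetic entries from sounds-like ones. The sketch below does so on a hand-written dictionary shaped like a `Words` object. The sample data, the helper `split_by_kind`, and the rule that a phonetic translation starts with `<phoneme` are assumptions for illustration; no request is sent.

```python
# Hypothetical stand-in for the `result` of list_words.
sample_words = {
    'words': [
        {'word': 'NCAA', 'translation': 'N C double A'},
        {
            'word': 'tomato',
            'translation': '<phoneme alphabet="ipa" ph="təmˈɑto"></phoneme>',
        },
    ]
}


def split_by_kind(result):
    # Assumed rule: a translation written as an SSML phoneme element is
    # phonetic; anything else is treated as sounds-like.
    phonetic, sounds_like = [], []
    for entry in result.get('words', []):
        if entry['translation'].lstrip().startswith('<phoneme'):
            phonetic.append(entry['word'])
        else:
            sounds_like.append(entry['word'])
    return phonetic, sounds_like


phonetic, sounds_like = split_by_kind(sample_words)
print('phonetic:', phonetic)
print('sounds-like:', sounds_like)
```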

        :param str customization_id: The customization ID (GUID) of the custom
               model. You must make the request with credentials for the instance of the
               service that owns the custom model.
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse with `dict` result representing a `Words` object
        """

        if not customization_id:
            raise ValueError('customization_id must be provided')
        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,
            service_version='V1',
            operation_id='list_words',
        )
        headers.update(sdk_headers)

        if 'headers' in kwargs:
            headers.update(kwargs.get('headers'))
            del kwargs['headers']
        headers['Accept'] = 'application/json'

        path_param_keys = ['customization_id']
        path_param_values = self.encode_path_vars(customization_id)
        path_param_dict = dict(zip(path_param_keys, path_param_values))
        url = '/v1/customizations/{customization_id}/words'.format(
            **path_param_dict)
        request = self.prepare_request(
            method='GET',
            url=url,
            headers=headers,
        )

        response = self.send(request, **kwargs)
        return response

    def add_word(
        self,
        customization_id: str,
        word: str,
        translation: str,
        *,
        part_of_speech: Optional[str] = None,
        **kwargs,
    ) -> DetailedResponse:
        """
        Add a custom word.

        Adds a single word and its translation to the specified custom model. Adding a new
        translation for a word that already exists in a custom model overwrites the word's
        existing translation. A custom model can contain no more than 20,000 entries. You
        must use credentials for the instance of the service that owns a model to add a
        word to it.
        You can define sounds-like or phonetic translations for words. A sounds-like
        translation consists of one or more words that, when combined, sound like the
        word. Phonetic translations are based on the SSML phoneme format for representing
        a word. You can specify them in standard International Phonetic Alphabet (IPA)
        representation
        <code><phoneme alphabet="ipa" ph="təmˈɑto"></phoneme></code>
        or in the proprietary IBM Symbolic Phonetic Representation (SPR)
        <code><phoneme alphabet="ibm" ph="1gAstroEntxrYFXs"></phoneme></code>
        **See also:**
        * [Adding a single word to a custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuWordAdd)
        * [Adding words to a Japanese custom
        model](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customWords#cuJapaneseAdd)
        * [Understanding
        customization](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-customIntro#customIntro).
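
The overwrite rule and the 20,000-entry limit described above can be modeled locally. The class below is a hypothetical in-memory stand-in, not part of this SDK. How the service itself reports a full model is not covered by this text, so the `ValueError` is an assumption of the sketch.

```python
MAX_ENTRIES = 20000  # limit stated in the description above


class LocalWordTable:
    # Hypothetical in-memory model of a custom model's word entries.

    def __init__(self, max_entries=MAX_ENTRIES):
        self.max_entries = max_entries
        self.entries = {}

    def add_word(self, word, translation):
        # Only a brand-new word counts against the limit; an existing word
        # keeps its slot and has its translation replaced.
        if word not in self.entries and len(self.entries) >= self.max_entries:
            raise ValueError('model already holds the maximum number of entries')
        self.entries[word] = translation


# A tiny limit keeps the demonstration short.
table = LocalWordTable(max_entries=2)
table.add_word('NCAA', 'N C A A')
table.add_word('NCAA', 'N C double A')  # overwrites; still one entry
table.add_word('IEEE', 'I triple E')
print(len(table.entries), table.entries['NCAA'])
```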

        :param str customization_id: The customization ID (GUID) of the custom
               model. You must make the request with credentials for the instance of the
               service that owns the custom model.
        :param str word: The word that is to be added or updated for the custom
               model.
        :param str translation: The phonetic or sounds-like translation for the
               word. A phonetic translation is based on the SSML format for representing
               the phonetic string of a word either as an IPA translation or as an IBM SPR
               translation. A sounds-like translation is one or more words that, when
               combined, sound like the word.
        :param str part_of_speech: (optional) **Japanese only.** The part of speech
               for the word. The service uses the value to produce the correct intonation
               for the word. You can create only a single entry, with or without a single
               part of speech, for any word; you cannot create multiple entries with
               different parts of speech for the same word. For more information, see
               [Working with Japanese
               entries](https://cloud.ibm.com/docs/text-to-speech?topic=text-to-speech-rules#jaNotes).
        :param dict headers: A `dict` containing the request headers
        :return: A `DetailedResponse` containing the result, headers and HTTP status code.
        :rtype: DetailedResponse
        """

        if not customization_id:
            raise ValueError('customization_id must be provided')
        if not word:
            raise ValueError('word must be provided')
        if translation is None:
            raise ValueError('translation must be provided')
        headers = {}
        sdk_headers = get_sdk_headers(
            service_name=self.DEFAULT_SERVICE_NAME,