Unable to convert ASR Conformer CTC from Nvidia NGC to Riva
Steps to reproduce:
- Create a conda environment and install the dependencies:
conda create --name nemo python==3.10.12
conda activate nemo
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
apt-get update && apt-get install -y libsndfile1 ffmpeg
pip install Cython packaging
git clone https://github.com/NVIDIA/NeMo
cd NeMo
./reinstall.sh
pip install nvidia-pyindex
pip install nemo2riva
I'm trying to convert the stt_en_conformer_ctc_small model from https://catalog.ngc.nvidia.com/orgs/nvidia/teams/nemo/models/stt_en_conformer_ctc_small/files to the Riva file format for use with NVIDIA Riva. When I run nemo2riva --out /data_hdd_16t/vuhuy/ASR/models/stt_en_conformer_ctc_small.riva /data_hdd_16t/vuhuy/ASR/models/stt_en_conformer_ctc_small.nemo, it fails with this error:
RuntimeError: The size of tensor a (25000) must match the size of tensor b (9999) at non-singleton dimension 3
Here is the full log:
INFO: PyTorch version 2.4.0 available.
[NeMo W 2024-08-22 15:16:20 nemo_logging:349] /data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/collections/tts/modules/common.py:206: FutureWarning: 'torch.cuda.amp.autocast(args...) ' is deprecated. Please use 'torch.amp.autocast('cuda', args...) ' instead.
@amp.autocast(False)
INFO: IPATokenizer not found in NeMo, disabling support
[NeMo I 2024-08-22 15:16:20 nemo2riva:38] Logging level set to 20
[NeMo I 2024-08-22 15:16:20 convert:36] Restoring NeMo model from '/data_hdd_16t/vuhuy/ASR/models/stt_en_conformer_ctc_small_1.0.0.nemo'
[NeMo W 2024-08-22 15:16:20 nemo_logging:349] /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/lightning_fabric/plugins/environments/slurm.py:204: The 'srun ' command is available on your system but is not used. HINT: If your intention is to run Lightning on SLURM, prepend your python command with 'srun ' like so: srun python /data_hdd_16t/miniconda3/envs/nvidia-nemo/bin/nemo2r ...
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
'Trainer(limit_train_batches=1.0) ' was configured so 100% of the batches per epoch will be used..
'Trainer(limit_val_batches=1.0) ' was configured so 100% of the batches will be used..
'Trainer(limit_test_batches=1.0) ' was configured so 100% of the batches will be used..
'Trainer(limit_predict_batches=1.0) ' was configured so 100% of the batches will be used..
'Trainer(val_check_interval=1.0) ' was configured so validation will run at the end of the training epoch..
[NeMo I 2024-08-22 15:16:20 mixins:173] Tokenizer SentencePieceTokenizer initialized with 128 tokens
[NeMo W 2024-08-22 15:16:20 modelPT:176] If you intend to do training or fine-tuning, please call the ModelPT.setup_training_data() method and provide a valid configuration file to setup the train data loader.
Train config :
manifest_filepath: /data/nemo_asr_set/asr_set_1.4/train_no_appen/tarred_audio_manifest.json
sample_rate: 16000
batch_size: 32
shuffle: true
num_workers: 8
pin_memory: true
use_start_end_token: false
trim_silence: false
max_duration: 20.0
min_duration: 0.1
shuffle_n: 2048
is_tarred: true
tarred_audio_filepaths: /data/nemo_asr_set/asr_set_1.4/train_no_appen/audio__OP_0..2047_CL_.tar
[NeMo W 2024-08-22 15:16:20 modelPT:183] If you intend to do validation, please call the ModelPT.setup_validation_data() or ModelPT.setup_multiple_validation_data() method and provide a valid configuration file to setup the validation data loader(s).
Validation config :
manifest_filepath:
- /manifests/librispeech/librivox-dev-other.json
- /manifests/librispeech/librivox-dev-clean.json
- /manifests/librispeech/librivox-test-other.json
- /manifests/librispeech/librivox-test-clean.json
sample_rate: 16000
batch_size: 32
shuffle: false
num_workers: 8
pin_memory: true
use_start_end_token: false
is_tarred: false
tarred_audio_filepaths: ''
[NeMo W 2024-08-22 15:16:20 modelPT:189] Please call the ModelPT.setup_test_data() or ModelPT.setup_multiple_test_data() method and provide a valid configuration file to setup the test data loader(s).
Test config :
manifest_filepath:
- /manifests/librispeech/librivox-dev-other.json
- /manifests/librispeech/librivox-dev-clean.json
- /manifests/librispeech/librivox-test-other.json
- /manifests/librispeech/librivox-test-clean.json
sample_rate: 16000
batch_size: 32
shuffle: false
num_workers: 8
pin_memory: true
use_start_end_token: false
is_tarred: false
tarred_audio_filepaths: ''
[NeMo I 2024-08-22 15:16:20 features:305] PADDING: 0
[NeMo W 2024-08-22 15:16:21 nemo_logging:349] /data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/core/connectors/save_restore_connector.py:682: FutureWarning: You are using 'torch.load ' with 'weights_only=False ' (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for 'weights_only ' will be flipped to 'True '. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via 'torch.serialization.add_safe_globals '. We recommend you start setting 'weights_only=True ' for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
return torch.load(model_weights, map_location='cpu')
[NeMo I 2024-08-22 15:16:21 save_restore_connector:275] Model EncDecCTCModelBPE was successfully restored from /data_hdd_16t/vuhuy/ASR/models/stt_en_conformer_ctc_small_1.0.0.nemo.
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/nlp-qa-exported-bert.yaml for nemo.collections.nlp.models.QAModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/asr-stt-exported-encdectcmodelbpe.yaml for nemo.collections.asr.models.EncDecCTCModelBPE
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/nlp-tc-exported-bert.yaml for nemo.collections.nlp.models.TextClassificationModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/nlp-isc-exported-bert.yaml for nemo.collections.nlp.models.IntentSlotClassificationModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/nlp-pc-exported-bert.yaml for nemo.collections.nlp.models.PunctuationCapitalizationModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/nlp-mt-exported-megatronnmtmodel.yaml for nemo.collections.nlp.models.MegatronNMTModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/nlp-mt-exported-encdecmtmodel.yaml for nemo.collections.nlp.models.MTEncDecModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/asr-stt-exported-encdecctcmodel.yaml for nemo.collections.asr.models.EncDecCTCModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/tts-exported-fastpitchmodel.yaml for nemo.collections.tts.models.FastPitchModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/tts-exported-radttsmodel.yaml for nemo.collections.tts.models.RadTTSModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/asr-scr-exported-encdecclsmodel.yaml for nemo.collections.asr.models.classification_models.EncDecClassificationModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/nlp-tkc-exported-bert.yaml for nemo.collections.nlp.models.TokenClassificationModel
[NeMo I 2024-08-22 15:16:21 schema:161] Loaded schema file /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/tts-exported-hifiganmodel.yaml for nemo.collections.tts.models.HifiGanModel
[NeMo I 2024-08-22 15:16:21 schema:200] Found validation schema for nemo.collections.asr.models.EncDecCTCModelBPE at /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/validation_schemas/asr-stt-exported-encdectcmodelbpe.yaml
[NeMo I 2024-08-22 15:16:21 schema:229] Checking installed NeMo version ... 2.0.0rc2 OK (>=1.1)
[NeMo I 2024-08-22 15:16:21 artifacts:59] Found model at ./model_weights.ckpt
INFO: Checking Nemo version for ConformerEncoder ...
[NeMo I 2024-08-22 15:16:21 schema:229] Checking installed NeMo version ... 2.0.0rc2 OK (>=1.7.0rc0)
[NeMo I 2024-08-22 15:16:21 artifacts:136] Retrieved artifacts: dict_keys(['model_config.yaml', 'tokenizer.model', 'tokenizer.vocab', 'vocab.txt'])
[NeMo W 2024-08-22 15:16:21 nemo_logging:349] /data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/cookbook.py:76: FutureWarning: 'torch.cuda.amp.autocast(args...) ' is deprecated. Please use 'torch.amp.autocast('cuda', args...) ' instead.
autocast = torch.cuda.amp.autocast(enabled=True, cache_enabled=False, dtype=torch.float16) if cfg.autocast else nullcontext()
[NeMo I 2024-08-22 15:16:21 cookbook:78] Exporting model EncDecCTCModelBPE with config=ExportConfig(export_subnet=None, export_format='ONNX', export_file='model_graph.onnx', encryption=None, autocast=True, max_dim=100000, export_args={})
[NeMo W 2024-08-22 15:16:22 nemo_logging:349] /data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/utils/cast_utils.py:43: FutureWarning: 'torch.cuda.amp.autocast(args...) ' is deprecated. Please use 'torch.amp.autocast('cuda', args...) ' instead.
return torch.cuda.amp.autocast(dtype=torch.bfloat16)
[NeMo E 2024-08-22 15:16:22 cookbook:129] ERROR: Export failed. Please make sure your NeMo model class (<class 'nemo.collections.asr.models.ctc_bpe_models.EncDecCTCModelBPE'>) has working export() and that you have the latest NeMo package installed with [all] dependencies.
Traceback (most recent call last):
File "/data_hdd_16t/miniconda3/envs/nvidia-nemo/bin/nemo2riva", line 8, in <module>
sys.exit(nemo2riva())
File "/data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/cli/nemo2riva.py", line 49, in nemo2riva
Nemo2Riva(args)
File "/data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/convert.py", line 87, in Nemo2Riva
export_model(
File "/data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/cookbook.py", line 130, in export_model
raise e
File "/data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/nemo2riva/cookbook.py", line 88, in export_model
_, descriptions = model.export(
File "/data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/core/classes/exportable.py", line 117, in export
out, descr, out_example = model._export(
File "/data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/core/classes/exportable.py", line 197, in _export
output_example = self.forward(*input_list, **input_dict)
File "/data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/collections/asr/models/asr_model.py", line 288, in forward_for_export
encoder_output = enc_fun(audio_signal=audio_signal, length=length)
File "/data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/collections/asr/modules/conformer_encoder.py", line 461, in forward_for_export
rets = self.forward_internal(
File "/data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/collections/asr/modules/conformer_encoder.py", line 583, in forward_internal
audio_signal = layer(
File "/data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/collections/asr/parts/submodules/conformer_modules.py", line 171, in forward
x = self.self_attn(query=x, key=x, value=x, mask=att_mask, pos_emb=pos_emb, cache=cache_last_channel)
File "/data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/data_hdd_16t/miniconda3/envs/nvidia-nemo/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
return forward_call(*args, **kwargs)
File "/data_hdd_16t/nghiavm/NLP/MachineTranslation/NVIDIA_NeMo/NeMo-main/nemo/collections/asr/parts/submodules/multi_head_attention.py", line 252, in forward
scores = (matrix_ac + matrix_bd) / self.s_d_k # (batch, head, time1, time2)
RuntimeError: The size of tensor a (25000) must match the size of tensor b (9999) at non-singleton dimension 3
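For what it's worth, the two sizes in the error are consistent with one possible reading of the export config shown above: with max_dim=100000 (from the ExportConfig line in the log) and a 4x subsampling factor, the encoder would produce 25000 frames, while a relative-positional-embedding table built for a maximum of 5000 frames would have 2*5000-1 = 9999 relative positions. The subsampling factor and the 5000-frame cap are my assumptions, not values read from the log:

```python
# Back-of-the-envelope check of the shape mismatch in the error above.
# ASSUMPTIONS (not confirmed by the log): the Conformer encoder subsamples
# by 4x, and its relative-positional-embedding table is capped at 5000 frames.

SUBSAMPLING = 4          # assumed Conformer subsampling factor
EXPORT_MAX_DIM = 100000  # from the log: ExportConfig(..., max_dim=100000, ...)
POS_EMB_MAX_LEN = 5000   # assumed cap on the rel-pos embedding table

# Encoder frames after subsampling -- the "time2" axis of matrix_ac
time2 = EXPORT_MAX_DIM // SUBSAMPLING

# Size of the relative-position axis of matrix_bd
rel_positions = 2 * POS_EMB_MAX_LEN - 1

print(time2, rel_positions)  # prints: 25000 9999 -- the two sizes in the error
```

If this reading is correct, the mismatch would come from exporting with a max_dim larger than the positional-embedding table supports, but I haven't verified that.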