Python Forum
[Abandoned] Get the right type to pass to a transcriber model
#11
I tested with this:
predicted_text = asr_model.transcribe([Path(audio_source.name)])
That's not it either:
TypeError: Object of type PosixPath is not JSON serializable
Traceback:
File "/home/ild/.local/lib/python3.12/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 542, in _run_script
    exec(code, module.__dict__)
File "/home/ild/trans-nemo.py", line 28, in <module>
    predicted_text = asr_model.transcribe([Path(audio_source.name)])
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ild/.local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
File "/home/ild/miniconda3/lib/python3.12/site-packages/nemo/collections/asr/models/ctc_models.py", line 187, in transcribe
    fp.write(json.dumps(entry) + '\n')
             ^^^^^^^^^^^^^^^^^
File "/home/ild/miniconda3/lib/python3.12/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ild/miniconda3/lib/python3.12/json/encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ild/miniconda3/lib/python3.12/json/encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
           ^^^^^^^^^^^^^^^^^
File "/home/ild/miniconda3/lib/python3.12/json/encoder.py", line 180, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
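Looking at the traceback, NeMo's `transcribe` writes each path into a temporary JSON manifest, and `json.dumps` can't serialize a `PosixPath` object. Passing the path as a plain string should avoid the error; here is a minimal sketch (the file path is a hypothetical example):

```python
import json
from pathlib import Path

audio_path = Path("/tmp/recording.wav")  # hypothetical file

# json.dumps rejects Path objects outright...
try:
    json.dumps({"audio_filepath": audio_path})
except TypeError:
    pass  # "Object of type PosixPath is not JSON serializable"

# ...but accepts the same path once converted to a string:
entry = json.dumps({"audio_filepath": str(audio_path)})

# so the call would presumably become:
# predicted_text = asr_model.transcribe([str(Path(audio_source.name))])
```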
#12
We can close the topic: I won't be able to go further due to hardware constraints.

Here is my code; CUDA runs out of memory before it can transcribe the file:
## Imports ##
import torch
import streamlit as st
from pathlib import Path
from tempfile import NamedTemporaryFile
from transformers import AutoModelForCTC, Wav2Vec2ProcessorWithLM
import nemo.collections.asr as nemo_asr
import torchaudio

## Initialisation ##
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
asr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("nvidia/stt_fr_conformer_ctc_large")

## Display ##
st.title("Facilitateur de compte-rendus")
col1, col2 = st.columns(2)
audio_source=st.sidebar.file_uploader(label="Choisir votre fichier", type=["wav","m4a","mp3","wma"])

## Variables ##
suffix = ""
predicted_sentence = ""

## Processing ##
#col1.subheader("Modèle utilisé : nvidia/stt_fr_conformer_ctc_large")
if audio_source is not None:
    suffix = Path(audio_source.name).suffix
    col1.write("Démarrage de la transcription")
#    predicted_text = asr_model.transcribe([Path(audio_source.name)])
    with NamedTemporaryFile(suffix=suffix) as temp_file:
        temp_file.write(audio_source.getvalue())
        temp_file.seek(0)
        col2.write(temp_file.name)
        predicted_text = asr_model.transcribe([temp_file.name])
    col1.write("Fichier transcrit :point_right:")
    col2.write(predicted_text)
    st.sidebar.download_button(label="Télécharger la transcription", data="\n".join(predicted_text), file_name="transcript.txt", mime="text/plain")
If anyone has a GPU with 6+ GB of VRAM, or a decent CPU with enough RAM and time to spare, feel free to test it.
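One common workaround for GPU out-of-memory on long recordings is to split the audio into shorter segments and transcribe them one at a time. The chunking arithmetic can be sketched independently of NeMo; the 30-second segment length and the file duration below are assumptions, not anything from my setup:

```python
def chunk_bounds(n_samples, sample_rate, chunk_seconds=30):
    """Return (start, end) sample indices for fixed-length chunks."""
    step = int(sample_rate * chunk_seconds)
    return [(s, min(s + step, n_samples)) for s in range(0, n_samples, step)]

# e.g. a 65 s file at 16 kHz splits into 30 s + 30 s + 5 s:
bounds = chunk_bounds(65 * 16000, 16000, chunk_seconds=30)
# each chunk could then be written to its own temporary .wav file and
# passed to asr_model.transcribe([...]) separately, joining the results.
```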