pommedeterresautee OP t1_j7uml76 wrote on February 9, 2023 at 3:07 PM

Reply to comment by zzzthelastuser in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

lol unfortunately no, minutes :(

zzzthelastuser t1_j7ulu8h wrote on February 9, 2023 at 3:02 PM

Reply to [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

> CUDA graphs require us to capture a graph per input tensor shape, there is a non-negligible warmup time. We measure around 10mn on 2 different machines / GPUs (down from 50mn in our previous Kernl version). One user reported with the new version a bit more than 20mn of warmup time. We are aware of obvious ways to decrease it significantly.

Dumb question, but what's mn? millineconds?

schludy t1_j7ula73 wrote on February 9, 2023 at 2:58 PM

Reply to [P] Creating an embedding from a CNN by zanzagaes2

How do you plot the embeddings in 2D exactly? What is the size of the embeddings that you're trying to visualize?

pommedeterresautee OP t1_j7uk761 wrote on February 9, 2023 at 2:50 PM

Reply to comment by uzibart in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

I just discovered the project https://github.com/ggerganov/whisper.cpp

As written in another comment, there is no way for (recent) CPU (even ARM ones) to be as fast as (recent) GPU on such big model (the list no GPU support in limitations).

https://www.reddit.com/r/MachineLearning/comments/10xp54e/comment/j7tk4fx/?utm_source=share&utm_medium=web2x&context=3

That being said, the project looks super cool, tks for the pointer (I ordered a M2 Max, lots of fun to come :-) )

anders987 t1_j7ujo5v wrote on February 9, 2023 at 2:47 PM

Reply to comment by SnooHesitations8849 in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

There already exists a C++ CPU only implementation of Whisper.

uzibart t1_j7uiryq wrote on February 9, 2023 at 2:40 PM

Reply to [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

whisper-cpp comparison?

Locomule t1_j7uintq wrote on February 9, 2023 at 2:39 PM

Reply to [D] Are there any AI model that I can use to improve very bad quality sound recording? Removing noise and improving overall quality by CeFurkan

Greets again, you've been helping me with SD. I Followed your account last night and noticed this post. I'm a recording musician and though I might be able to help you out with this issue.

Here is my result after using the audio editor Reaper to apply EQ and Compression.

imaginethezmell t1_j7uhi86 wrote on February 9, 2023 at 2:31 PM

Reply to comment by 7366241494 in [N] "I got access to Google LaMDA, the Chatbot that was so realistic that one Google engineer thought it was conscious. First impressions" by That_Violinist_18

Google lost 8% of value after their terrible demo lol

Available_Lion_652 OP t1_j7ue7pj wrote on February 9, 2023 at 2:06 PM

Reply to comment by IntelArtiGen in [D] RTX 3090 with i7 7700k, training bottleneck by Available_Lion_652

My motherboard is quite old and the best CPU that I can attach yo it is a i7 7700k. From what I have read, if I will process the dataset before training, than it should not bottleneck. But what I was think was that the preprocessed dataset is held in 32 GB of RAM. The CPU transfers data from RAM to GPU memory. It has only 8 threads. Let s say I want to train from scratch a GPT2. I do not know exactly how much the CPU/RAM frequency will bottleneck the training process. I fon t want to change my whole hardware. If 3090 RTX is to performant and the bottleneck is to high, I was wondering if I can buy a 3060/3080

blackkettle t1_j7ud34i wrote on February 9, 2023 at 1:57 PM

Reply to comment by whata_wonderful_day in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

Are you talking about this paper:

- https://cdn.openai.com/papers/whisper.pdf

maybe I missed it but I can't find any place in that paper where they talk about the trade-offs with respect to real time factor and decoding strategies. RTF vs acc curves for CPU vs GPU for STT typically vary not in terms of absolute performance but in terms of where along the RTF curve you achieve a particular accuracy. That impacts what kinds of tasks you can expect to use the model for, and how you can expect to scale it to real world applications. So far this has been the weakest point for all the Whisper related work (still better off with espnet, k2, speechbrain, etc). This information would be interesting to see if they have it.

IntelArtiGen t1_j7uce7z wrote on February 9, 2023 at 1:52 PM

Reply to [D] RTX 3090 with i7 7700k, training bottleneck by Available_Lion_652

The CPU bottleneck depends on the model and the training process. If you remove all /most of the preprocessing done on CPU it could be fine. I think transformers don't usually bottleneck on CPU but i7 7700k is quite old.

whata_wonderful_day t1_j7ubutx wrote on February 9, 2023 at 1:47 PM

Reply to comment by blackkettle in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

His point is that it's identical. They didn't use quantization or anything that would hurt performance. The whisper paper has a lot of the details you're asking for

leventov t1_j7ubimw wrote on February 9, 2023 at 1:45 PM

Reply to [Discussion] Cognitive science inspired AI research by theanswerisnt42

Top AI researchers (Yoshua Bengio, Yann LeCun) are essentially cognitive scientists. By "cognitive science", I mean here general theories of cognition, not human cognition. If you watch any recent talk by Bengio (example), you recognise that it's a talk about cognitive science at least as much as it is about AI. From his talks, you could also roughly sense the type of problems these researchers are solving when they move to the level of thinking about cognitive science.

Theories of cognitive science and ML/DL form an "abstraction-grounding" stack:
general theories of cognition (intelligence, agency) ->
general theories of DNN working in runtime ->
interpretability theories for a concrete DNN architecture.

pommedeterresautee OP t1_j7ub975 wrote on February 9, 2023 at 1:43 PM

Reply to comment by lpatks in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

Nvidia doc obviously (for sass it’s light), and also some old blog posts very detailed (for shuffle instructions etc).

Tober447 t1_j7u90qp wrote on February 9, 2023 at 1:24 PM

Reply to [P] Creating an embedding from a CNN by zanzagaes2

You could try an autoencoder with CNN layers and a bottleneck of 2 or 3 neurons to be able to visualize these embeddings. The autoencoder can be interpreted as non-linear PCA.

Also, similarity in this embedding space should correlate with similarity of the real images/whatever your CNN extracts from the real images.

netw0rkf10w t1_j7u86nu wrote on February 9, 2023 at 1:16 PM

Reply to [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

The work is amazing and the post is very informative. Thanks!

Wrandraall t1_j7u76m1 wrote on February 9, 2023 at 1:07 PM

Reply to comment by clauwen in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

Training =/= inference anyway. Whatever can be reached with CPU inference time, training still benefit by using GPUs from parallelization and caching

jeanfeydy t1_j7u6o4b wrote on February 9, 2023 at 1:03 PM

Reply to comment by CeFurkan in [D] Are there any AI model that I can use to improve very bad quality sound recording? Removing noise and improving overall quality by CeFurkan

I see - best of luck with the other solutions then!

isjuao t1_j7u5xjv wrote on February 9, 2023 at 12:56 PM

Reply to What are the best resources to stay up to date with latest news ? [D] by [deleted]

Hacker News

Fit_Schedule5951 t1_j7u56d7 wrote on February 9, 2023 at 12:49 PM

Reply to comment by CeFurkan in [D] Are there any AI model that I can use to improve very bad quality sound recording? Removing noise and improving overall quality by CeFurkan

Not sure, it's been a while since I last used it

lpatks t1_j7u536l wrote on February 9, 2023 at 12:48 PM

Reply to [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

Interesting. Do you have any good resources for learning about PTX/SASS instructions? I've played around with Triton a bit, but it isn't even clear to me where I would see this output.

Reasonable_Page6193 t1_j7u3djh wrote on February 9, 2023 at 12:31 PM

Reply to [D] Are there any AI model that I can use to improve very bad quality sound recording? Removing noise and improving overall quality by CeFurkan

Try https://podcast.adobe.com/enhance (formerly Adobe Shasta). This gave best results for me.

londons_explorer t1_j7u38tk wrote on February 9, 2023 at 12:30 PM

Reply to [D] Are there emergent abilities of image models? by These-Assignment-936

Shadows and the way light interacts/reflects/refracts seem to be emergent behaviour of diffusion image models.

Ask for "A koala next to a glistening wine glass", and you'll probably get cool optical effects on the koala that the model has never seen before.

blackkettle t1_j7u2kd0 wrote on February 9, 2023 at 12:23 PM

Reply to comment by pommedeterresautee in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

Probably my question was not well-formulated. I'm just curious about what the RTF vs Accuracy tradeoff looks like. I'm not questioning whether it works, I'm just curious what the actual performance looks like.

You report on memory usage and beam sizes, as well as relative speedup, but it would be interesting to also see WER performance, as well as the actual absolute RTFs.

pommedeterresautee OP t1_j7u0p8z wrote on February 9, 2023 at 12:03 PM

Reply to comment by blackkettle in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee

Using CG doesn't affect the output quality.

What works with Whisper will still work with CG+Whisper.

Recent comments in /f/MachineLearning