Recent comments in /f/MachineLearning

farmingvillein t1_j6iwb5v wrote

I like the big idea, and it is almost certainly indicative of one of the key tools to improve automated programming.

That said, I wish they had avoided the urge to build an intermediate programming language. This is likely unnecessary and is the type of semi-convoluted solution that you only come up with in an academic research lab (or out of true, deep product need--but I think that is highly unlikely to be the case here).

My guess is that the same basic result in the paper could have been shown by using Python or Rust or similar as the root language, with a little work (time that could have been freed up by skipping the Harry Potter language development).

They do note:

> We generate 16 Python implementations per high-level plan on 100 randomly sampled problems and find that the performance drops to 6%.

But it isn't well discussed (unless I skimmed too quickly) why a separate language is truly needed. They discuss the advantages of Parsel, but there doesn't appear to be a deep ablation on why it is really necessary, where its supposed performance benefits come from, or how those could be enforced in other languages.

There is a bunch of discussion in the appendix, but IMO none of it is very convincing. E.g., Parsel enforces certain conventions around testing and validation...great, let's do that in Python or Rust or similar. Or--leveraging the value of LLMs--through a more natural language interface.
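To make that concrete, here's a rough sketch (entirely hypothetical, not from the paper) of what "Parsel-style conventions, but in plain Python" could look like: the high-level plan lives in docstrings/stubs and the validation lives in ordinary asserts.

```python
# Hypothetical example: plan-as-docstring plus assert-based validation,
# i.e. the kind of convention you could enforce without a new language.
def collatz_length(n: int) -> int:
    """Plan: repeatedly apply the Collatz step until reaching 1,
    counting the number of steps taken."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

# Validation convention: every generated function ships with unit tests,
# and a candidate implementation is only accepted if all of them pass.
assert collatz_length(1) == 0
assert collatz_length(6) == 8
```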

Yes, there is benefit to bridging these gaps in a "universal" manner...but, as per https://xkcd.com/927/, a new programming language is rarely the right solution.

20

8-Bit_Soul t1_j6iv8g8 wrote

Ballpark conceptual number: how long does training take for AI tasks using medical volumetric data (for example, something along the lines of training automated segmentation of an organ using 100 CT studies)? Are we talking hours? Days? Weeks?

I'm new to ML and I will need a better GPU (and a PSU and maybe a bigger case), and the amount I'm willing to invest depends on how much of a difference it would make in practice. I figure I can get a used RTX 3090 installed for about $1000 or a new RTX 4090 for about $2000, and if training time scales with AI benchmark results, then a task that takes 1 day on an A100 would take about 1.1 days on an RTX 4090 and 1.7 days on an RTX 3090. If the extra $1k reduces the time by days or weeks, then it should eventually be worth the cost. If it only saves hours or minutes, then it's probably not worth it.
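For context, here's the back-of-the-envelope math behind that comparison (the numbers are just my rough guesses from published benchmarks, not measurements):

```python
# Back-of-the-envelope using the relative speeds quoted above; all numbers
# are rough benchmark-based guesses.
a100_days = 1.0
rtx4090_days = a100_days * 1.1   # ~10% slower than an A100
rtx3090_days = a100_days * 1.7   # ~70% slower than an A100

extra_cost = 2000 - 1000                              # new 4090 vs used 3090
hours_saved = (rtx3090_days - rtx4090_days) * 24      # per 1-A100-day job
print(f"{hours_saved:.1f} h saved per job")                       # ~14.4 h
print(f"${extra_cost / hours_saved:.0f} per hour saved per job")  # ~$69
```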

Thanks!

1

blackkettle t1_j6itdxo wrote

How familiar are you with the existing frameworks out there for this topic space? There's a lot of active work here; I'm curious about what you are focusing on, and how that reflects against the shortcomings of existing frameworks:

- https://github.com/kaldi-asr/kaldi

- https://github.com/k2-fsa

- https://github.com/espnet/espnet

- https://github.com/speechbrain/speechbrain

- https://github.com/NVIDIA/NeMo

- https://github.com/microsoft/UniSpeech

- https://github.com/topics/wav2vec2 [bajillions of similar]

- https://github.com/BUTSpeechFIT/VBx

This list is of course incomplete, but there is a _lot_ of active work in this space and a lot of open source. Recently you've also got larger and larger public datasets becoming available. The SOTA is really getting close to commoditization as well.

What sort of OSS intersection or area are you focusing on, and why?

19

fakesoicansayshit t1_j6itb00 wrote

Reply to comment by gunshoes in [D] Remote PhD by TheRealMrMatt

I finished half of a PhD online over a decade ago.

Half your PhD is taking advanced classes, the other half is working as a research assistant.

Not sure a PhD means anything nowadays.

−4

qalis t1_j6ir4fh wrote

Yes, you can. Variables in tabular learning are (in general) independent in terms of preprocessing. In fact, in most cases you will apply different preprocessing to different variables, e.g. one-hot + SVD for high-cardinality categorical variables, binary encoding for simple binary choices, and integer encoding for ordinal variables.
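A minimal sketch of what that looks like in practice, assuming scikit-learn (the column names and component counts are made up for illustration):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Different preprocessing per column, all in one transformer.
preprocess = ColumnTransformer([
    # high-cardinality categorical: one-hot, then compress with SVD
    ("high_card", Pipeline([
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
        ("svd", TruncatedSVD(n_components=2)),
    ]), ["merchant_id"]),
    # simple binary choice: plain binary/integer encoding
    ("binary", OrdinalEncoder(), ["is_weekend"]),
    # ordinal variable: integer encoding that preserves the order
    ("ordinal", OrdinalEncoder(categories=[["low", "medium", "high"]]),
     ["risk_level"]),
])

X = pd.DataFrame({
    "merchant_id": ["m1", "m2", "m3", "m1"],
    "is_weekend": ["no", "yes", "no", "yes"],
    "risk_level": ["low", "high", "medium", "low"],
})
print(preprocess.fit_transform(X))
```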

2

qalis t1_j6iqvql wrote

Somewhat more limited than your question, but I know two such papers: "Tunability: Importance of Hyperparameters of Machine Learning Algorithms" P. Probst et al., and "Hyperparameters and tuning strategies for random forest" P. Probst et al.

Both are on arXiv. The first one concerns the tunability of multiple ML algorithms, i.e. how sensitive they are in general to hyperparameter choice. The second delves deeper into the same area, specifically for random forests, gathering results from many other works. Using those ideas, I was able to dramatically reduce the computational resources spent on tuning by designing better hyperparameter grids.
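As a rough illustration (scikit-learn assumed; the specific ranges are mine, not from the papers), "designing better grids" mostly means searching only the hyperparameters the tunability results say matter and fixing the rest:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Narrow grid: focus on the few knobs reported as most impactful,
# keep everything else at defaults.
param_grid = {
    "max_features": [0.2, 0.4, 0.6],   # typically the biggest lever
    "min_samples_leaf": [1, 5, 10],
    # n_estimators is largely "more is better"; fix it instead of searching it
}

search = GridSearchCV(
    RandomForestClassifier(n_estimators=500, random_state=0),
    param_grid,
    cv=3,
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_)
```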

1

gunshoes t1_j6im7go wrote

Atm, my hard drive failed and the SSD doesn't come until Tuesday.

In actuality, I work in the space and the main limitation is hardware. Most small problems still require a ton of storage space, and Google Colab ain't giving me a terabyte for audio until I start paying for tiers.

2

duck_mopsi t1_j6iksqy wrote

Welp, it seems like the problem was that the inputs need to be defined as 2-dimensional, with the sequence length as the first dimension. I thought one would give the RNN only a 1-dimensional latent noise vector and get the sequence by repeatedly feeding it back through the RNN.
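For anyone hitting the same thing, a minimal sketch of the shape in question (assuming PyTorch; the dimensions are arbitrary):

```python
import torch
import torch.nn as nn

seq_len, batch_size, latent_dim, hidden_dim = 10, 4, 16, 32
rnn = nn.GRU(input_size=latent_dim, hidden_size=hidden_dim)

# The RNN expects the whole sequence of per-step inputs at once:
# shape (seq_len, batch_size, latent_dim) with batch_first=False (default),
# not a single 1-D noise vector re-fed in a loop.
z = torch.randn(seq_len, batch_size, latent_dim)
output, h_n = rnn(z)
print(output.shape)  # torch.Size([10, 4, 32])
```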

1

jiamengial t1_j6iiux3 wrote

Where do you plan to put the machine? If it's anywhere near where you (or anyone else) work, I'd recommend getting it liquid-cooled if you want to save your hearing.

The A6000s don't have active cooling of their own and are definitely meant to last a whole lot longer than the 4090s, so they will be the better choice if you plan to use the machine for quite a while or want to retain resale value down the line.

1

RedYican t1_j6ih1ui wrote

Does it make sense to combine a Tsetlin Machine with NNs (for language understanding) via triplets?

If we had some statements S_n about entity X and then some other statement S_{n+1} as a training example, could one use a TM to discover which of the other statements matter for S_{n+1}?

EDIT: found your other paper - https://arxiv.org/pdf/2102.10952.pdf

1