Recent comments in /f/MachineLearning

TheCoconutTree t1_j6jjb43 wrote

Discrete features as training data:

Say I am using SQL table rows as training data input for a deep neural net classifier. One of the columns contains a number from 1-5 representing a discrete value, say type of computer connection. It could be wifi, mobile-data, LAN, etc. What would be the best way to represent this as input features? Right now I'm thinking of splitting it into a five-dimensional vector, one dimension per possible value, and passing 0 or 1 depending on whether that value is present. I'm worried that feeding the 1-5 range in as a single scalar would lead to messed up learning, since one discrete value doesn't have any meaningful closeness to its nearest discrete neighbor.
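What's described here is standard one-hot encoding. A minimal sketch (the category names and the 1-based codes are assumptions for illustration, not from any real schema):

```python
import numpy as np

NUM_TYPES = 5  # e.g. wifi, mobile-data, LAN, ... (hypothetical categories)

def one_hot(value: int, num_classes: int = NUM_TYPES) -> np.ndarray:
    """Map a 1-based category code to a one-hot vector."""
    vec = np.zeros(num_classes, dtype=np.float32)
    vec[value - 1] = 1.0
    return vec

rows = [1, 3, 5]  # raw column values pulled from the SQL table
features = np.stack([one_hot(v) for v in rows])
# each row has exactly one 1.0, in the slot for its category
```

Because each category gets its own dimension, no two categories are "closer" to each other than any other pair, which is exactly the property the scalar encoding lacks.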

1

psma t1_j6jhwdl wrote

Not sure how. If I have, e.g., a PyTorch model, how do I deploy it for streaming data without having to rewrite it in another framework (e.g. stateful convolutions, the ability to receive an arbitrary number of samples as input, etc.)? It's doable, but it mostly amounts to rewriting your model. This should be automated.
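To make "stateful convolutions" concrete, here is a minimal framework-agnostic sketch (a numpy stand-in, not any particular toolkit's API): a causal convolution that carries its left context between calls, so chunks of arbitrary size reproduce the offline result.

```python
import numpy as np

class StreamingConv1d:
    """Causal 1-D convolution that keeps its left context between calls,
    so feeding arbitrary-sized chunks matches a single offline pass."""

    def __init__(self, kernel: np.ndarray):
        self.kernel = kernel
        # left-context buffer, initialised as zero padding
        self.buffer = np.zeros(len(kernel) - 1)

    def __call__(self, chunk: np.ndarray) -> np.ndarray:
        x = np.concatenate([self.buffer, chunk])
        self.buffer = x[-(len(self.kernel) - 1):]  # save context for next call
        # "valid" convolution over the padded chunk -> one output per input sample
        return np.convolve(x, self.kernel, mode="valid")

kernel = np.array([0.25, 0.5, 0.25])
signal = np.random.default_rng(0).normal(size=10)

conv = StreamingConv1d(kernel)
chunked = np.concatenate([conv(signal[:3]), conv(signal[3:7]), conv(signal[7:])])
offline = np.convolve(np.concatenate([np.zeros(2), signal]), kernel, mode="valid")
print(np.allclose(chunked, offline))  # True
```

The point of the complaint above is that for a real model this bookkeeping has to be done per layer (convolutions, normalisation statistics, attention caches), which is why it amounts to a rewrite.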

3

farmingvillein t1_j6jgv48 wrote

And this isn't a good thing; it is a necessary thing. We do it because someone bundled some logic together and you need to interact with it.

None of this addresses whether or why something like Parsel is necessary as an intermediate step. The authors do very little to justify the necessity of an intermediate representation; there is no meaningful analysis of why it apparently performs better, nor an ablation analysis to try to close the gaps.

The key benefits--like enforced test cases--could, hypothetically, very easily be enforced in something like Python, or many other languages.

And given the massive volumes of training data we have for these other languages, there are a lot of good reasons to think that we should be able to see equal or better behavior than with a wholly manufactured pseudocode (effectively) language.

The paper would have been much more convincing and interesting if, e.g., they had started with something like Python and progressively added the restrictions that apparently help Parsel produce higher-quality results.

0

farmingvillein t1_j6jdazy wrote

This is, at best, a distinction without a difference.

The authors literally describe it as "language".

It gets "compiled".

It generates a "Parsel program".

It holds a distinct learning curve such that a user can be an "expert".

The point here is that it is a unique specification that needs to be separately learned--it asks the user to learn, in essence, a domain-specific language. Or, if you prefer, a domain-specific specification; the point stands either way.

4

currentscurrents t1_j6jbokk wrote

I think hallucination occurs because of the next-word-prediction task on which these models were trained. No matter how good a model is, it can never predict away the irreducible entropy of the sentence - the ~1.5 bits per word or whatever that carries the actual information content. The best it can do is guess.

This is exactly what hallucination looks like; all the sentence structure is right, but the information is wrong. Unfortunately, this is also the most important part of the sentence.
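A toy version of that entropy argument (all numbers and names made up for illustration): if the true continuation is one of several equally likely options, even the optimal predictive distribution still has to guess.

```python
import math
import random

# Suppose the corpus completes "the paper was written by ___" with four
# author names, each equally often (hypothetical example).
names = ["smith", "jones", "garcia", "chen"]
p = {n: 0.25 for n in names}  # the *optimal* next-word distribution

# Irreducible entropy of the blank: log2(4) = 2 bits.
entropy_bits = -sum(q * math.log2(q) for q in p.values())

# Sampling from the optimal model is still wrong 3 times out of 4.
random.seed(0)
truth = "garcia"
hits = sum(random.choice(names) == truth for _ in range(10_000))
print(entropy_bits, hits / 10_000)  # ~2.0 bits, accuracy near 0.25
```

When the guess lands on the wrong name, the sentence structure is perfect but the fact is wrong - which is the hallucination pattern described above.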

4

jiamengial OP t1_j6j8c8c wrote

To go into your question further, one area that might be really interesting is open standards or formats for speech data - like the MLF formats in HTK and Kaldi but, like, modern - so that (to the point some others made here w.r.t. data storage costs) datasets can be hosted more centrally and people don't have to reformat them into their own data storage structures (which, let's face it, are basically someone's folder structure)

1

EmmyNoetherRing t1_j6j7zq4 wrote

I wouldn’t mind being one of those folks. But you make a good point that the old rubrics may not be capturing it.

If you want to nail down what users are observing when they compare it to human performance, practically speaking you may need to shift to diagnostics that were designed to evaluate human performance - with the added challenge of avoiding tests whose answer sheets would already be in its training data.

1

jiamengial OP t1_j6j6ruc wrote

If anything this is what's motivating me; getting Kaldi (or any of these other repos) to compile and run on your own data is usually painful enough that it puts off anyone who isn't already knowledgeable in the area, and wrappers such as pykaldi and Montreal Forced Aligner try to resolve a lot of those problems but only really add to them.

I've personally had great experiences with repos like NeMo, though that was mainly by nailing myself to a specific commit on the main branch and heavily wrapping the various classes I needed to use (I still have no idea what a manifest file format should look like)

The field is still incredibly recipe-heavy in terms of setting up systems and running them; if you're someone testing the waters with speech processing (especially if you want to go beyond STT or vanilla TTS), there's little to nothing that compares to the likes of HuggingFace on the text side

3

seanrescs OP t1_j6j2fef wrote

It can be stored in an active lab environment or elsewhere if noise is an issue; it's more about which will give the most utility for the longest time. It seems the A6000 is the better choice per Tim Dettmers, so I will probably go with that one if I can get it quoted at a good price

2

HermanCainsGhost t1_j6iy9fn wrote

Sounds like an issue you should talk to your psychologist about. I certainly feel no physical sensation when looking at AI art (or any art) beyond "oh this looks good" or "this looks ugly" (if those even count as physical sensations).

It's very weird to have such a visceral feeling of disgust just based on looking at art.

> the composites between different images to produce the final result

Lol, that's not how AI art works. Are you sure you're in the right place? See that's the problem being in a space like this - you are very likely talking to someone who actually knows how things work.

AI art works by denoising, it isn't a "composite". It isn't "mixing images". It doesn't have images to mix.

Stable Diffusion, for example, was trained on 240 terabytes of data - 2.3 billion 512x512 images - and the models are between 2 and 8 gigabytes. That works out to the equivalent of about 1-4 bytes of data per image (with a 512x512 image being a bit bigger than 250 kilobytes in total size).

Suffice it to say, you cannot compress 250,000 bytes of data into 1-4 bytes (mathematically, it is impossible). If that level of compression were possible, it would be a bigger story than AI art, because data transmission would just have gotten a wholllllllllleeeeeee lot faster, by orders of magnitude.
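The arithmetic is easy to check (using the figures quoted above):

```python
# Back-of-envelope check of the bytes-per-image claim.
num_images = 2_300_000_000             # ~2.3 billion training images
bytes_low = 2e9 / num_images           # 2 GB model checkpoint
bytes_high = 8e9 / num_images          # 8 GB model checkpoint
print(f"{bytes_low:.2f} to {bytes_high:.2f} bytes per image")
# ~0.87 to ~3.48 bytes retained per image, vs ~250,000 bytes for the image itself
```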

So yeah, get out of here with that "composite" nonsense. There's no composite. It's literally mathematically impossible for there to be a composite.

1