Recent comments in /f/MachineLearning

deephugs t1_jbtqk9c wrote

The devil is in the details. Getting robots to work reliably in the gritty, dirty environments of agtech is incredibly difficult. Manipulation, even with modern ML and CV, remains very hard. Let's just say there is a reason there aren't a ton of robotics companies selling a product like the one you suggested.

2

quitenominal t1_jbtptri wrote

An embedding is a numerical representation of some data. In this case the data is text.

These representations (read: lists of numbers) can be learned with some goal in mind. Usually you want the embeddings of similar data to be close to one another, and the embeddings of disparate data to be far apart.

Often these lists of numbers representing the data are very long - I think the ones from the model above are 768 numbers. So each piece of text is transformed into a list of 768 numbers, and similar text will get similar lists of numbers.

What's being visualized above is a 2-number summary of those 768. This is referred to as a projection, like how a 3D wireframe casts a 2D shadow. It lets us visualize the embeddings and gives a qualitative assessment of their 'goodness' - a.k.a. are they grouping things as I expect? (Similar texts are close, disparate texts are far.)
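A minimal sketch of that pipeline, assuming the sentence-transformers library and a simple PCA projection (the model name and example texts are illustrative, and real visualizations often use UMAP or t-SNE instead):

```python
# Sketch: embed a few texts, then project the 768-dim vectors down to 2D.
# Assumes sentence-transformers + scikit-learn; names below are illustrative.
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA

texts = [
    "The cat sat on the mat.",
    "A kitten was resting on the rug.",
    "Stock markets fell sharply on Tuesday.",
]
model = SentenceTransformer("all-mpnet-base-v2")
embeddings = model.encode(texts)                  # shape (3, 768)
points_2d = PCA(n_components=2).fit_transform(embeddings)  # 2-number summaries
print(points_2d)  # the first two rows should land near each other
```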

4

Simusid OP t1_jbtp8wr wrote

Given three sentences:

  • Tom went to the bank to make a payment on his mortgage.
  • Yesterday my wife went to the credit union and withdrew $500.
  • My friend was fishing along the river bank, slipped and fell in the water.

Reading those, you immediately know that the first two are related because they are both about banks/money/finance. You also know that they are unrelated to the third sentence, even though the first and third share the word "bank". If we naively used a strictly word-based encoding, it might incorrectly associate the first and third sentences.

What we want is a model that can represent the "semantic content", or idea behind a sentence, in a way that lets us make valid mathematical comparisons. We want to create a "metric space". In that space, each sentence will be represented by a vector. Then we use standard math operations to compute the distances between the vectors. In other words, the first two sentences will have vectors that point in basically the same direction, and the third vector will point in a very different direction.

The job of the language models (BERT, RoBERTa, all-mpnet-v2, etc.) is to do the best job possible turning sentences into vectors. The outputs of these models are very high-dimensional, 768 dimensions and higher. We cannot visualize that, so we use tools like UMAP, t-SNE, PCA, and eigendecomposition to find the 2 or 3 most important components and then display them as pretty 2D or 3D point clouds.

In short, the embedding is the vector that represents the sentence in a (hopefully) valid metric space.
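A small sketch of that comparison, assuming the sentence-transformers library (all-mpnet-base-v2 stands in for the models named above):

```python
# Sketch: cosine similarity compares vector directions in the metric space.
from sentence_transformers import SentenceTransformer, util

sentences = [
    "Tom went to the bank to make a payment on his mortgage.",
    "Yesterday my wife went to the credit union and withdrew $500.",
    "My friend was fishing along the river bank, slipped and fell in the water.",
]
model = SentenceTransformer("all-mpnet-base-v2")
embeddings = model.encode(sentences)
sims = util.cos_sim(embeddings, embeddings)
# Expect sims[0][1] to be clearly higher than sims[0][2], even though
# sentences 0 and 2 share the word "bank".
print(sims)
```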

19

wikipedia_answer_bot t1_jbtl62p wrote

**In mathematics, an embedding (or imbedding) is one instance of some mathematical structure contained within another instance, such as a group that is a subgroup. When some object X is said to be embedded in another object Y, the embedding is given by some injective and structure-preserving map f : X → Y.**

More details here: <https://en.wikipedia.org/wiki/Embedding>

This comment was left automatically (by a bot). If I don't get this right, don't get mad at me, I'm still learning!


−1

koolaidman123 t1_jbtkuif wrote

there are some fairly annoying things with pytorch lightning, and some things are definitely harder to do in lightning due to how it's structured. but overall i find for practical purposes i've been liking lightning a lot more than pytorch + accelerate, especially now that you can basically use colossal ai with lightning over deepspeed

4

onebigcat OP t1_jbtjqc4 wrote

I guess it’s a matter of how you define unsupervised, but isn’t SSL closer to supervised learning because there’s a ground truth to compare the prediction to? Whereas if you’re just clustering some high-dimensional data, you might not know what the “true” or most accurate way of clustering that information would be, especially in something like genomics, where there’s a lot of information with an unknown purpose.
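A toy illustration of the distinction (purely a sketch; the data and models are stand-ins): a self-supervised task derives its labels from the data itself, so there is a score to check, while clustering produces assignments with no ground truth to compare against.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression

series = np.sin(np.linspace(0, 20, 200))

# "Self-supervised": inputs are sliding windows, the target is the next
# value, so a ground-truth label exists (derived from the data itself).
X = np.stack([series[i:i + 5] for i in range(len(series) - 5)])
y = series[5:]
print("SSL fit score:", LinearRegression().fit(X, y).score(X, y))

# Unsupervised: the cluster assignments come with no "true" labels to
# compare against, only internal criteria like inertia.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("k-means inertia:", km.inertia_)
```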

−1

rshah4 t1_jbtfzig wrote

Two quick tips for finding the best embedding models:

  • The Sentence Transformers documentation compares models: https://www.sbert.net/docs/pretrained_models.html
  • The Massive Text Embedding Benchmark (MTEB) leaderboard has 47 different models: https://huggingface.co/spaces/mteb/leaderboard

These will help you compare different models across a lot of benchmark datasets so you can figure out the best one for your use case.
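Whichever model you land on, loading it is usually one line with sentence-transformers (the model name here is just an example from the sbert list):

```python
from sentence_transformers import SentenceTransformer

# all-MiniLM-L6-v2 is one of the models compared in the sbert docs
model = SentenceTransformer("all-MiniLM-L6-v2")
print(model.encode(["hello world"]).shape)  # (1, 384) for this model
```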

50

montcarl t1_jbtexjk wrote

This is an important point. The performance similarities indicate that the sentence lengths of the 20k dataset were mostly within the SentenceTransformer max length cutoff. It would be nice to confirm this and also run another test with longer examples. This new test should result in a larger performance gap.
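One way to confirm it, sketched with sentence-transformers (the model name is assumed, and `texts` stands in for the 20k dataset): count how many examples exceed the model's token cutoff.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-mpnet-base-v2")
limit = model.max_seq_length  # tokens beyond this are silently truncated

texts = ["a short example", "a much longer document " * 200]  # stand-ins
n_over = sum(len(model.tokenizer(t)["input_ids"]) > limit for t in texts)
print(f"{n_over}/{len(texts)} examples exceed the {limit}-token cutoff")
```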

10

Real_Revenue_4741 t1_jbteqca wrote

YOLO is not enough to create these robots. The difficult part of robotics is being able to actuate from visual feedback. The method you are mentioning is called "visual servoing," and it will not be robust enough to actually work. Also, the under-$3K price point is quite a bit lower than what you would expect for these projects.
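For reference, a toy sketch of the image-based visual servoing idea (all names are illustrative; a real system also needs calibrated camera/robot kinematics):

```python
import numpy as np

def servo_step(feature_px, target_px, gain=0.5):
    """Proportional control: command a velocity that shrinks the pixel error."""
    error = np.asarray(target_px, float) - np.asarray(feature_px, float)
    return gain * error

# e.g. a detector (like YOLO) reports the fruit at (400, 260) in a 640x480
# frame; we want it centered at (320, 240)
print(servo_step((400, 260), (320, 240)))  # -> [-40. -10.]
```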

2

Simusid OP t1_jbt91tb wrote

My main goal was just to visualize the embeddings to see if they are grossly different. They are not. That is just a qualitative view. My second goal was to use the embeddings with a trivial supervised classifier. The dataset is labeled with four labels, so I made a generic network to see if there was any consistency in the training. And regardless of hyperparameters, the OpenAI embeddings seemed to always outperform the SentenceTransformer embeddings, slightly but consistently.

This was not meant to be rigorous. I did this to get a general feel of the quality of the embeddings, plus to get a little experience with the OpenAI API.
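The gist of the comparison, as a sketch (not the actual code: a scikit-learn logistic regression stands in for the generic network, and random arrays stand in for the real embeddings and labels):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=2000)  # the four labels
embeddings = {
    "openai": rng.normal(size=(2000, 1536)),  # stand-in for API embeddings
    "sbert": rng.normal(size=(2000, 768)),    # stand-in for SentenceTransformer
}
for name, X in embeddings.items():
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")
```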

30