Recent comments in /f/MachineLearning

marcelomedre t1_j5xxv7t wrote

Hi, I have a question about k-means. I have a data frame with 100 variables after removing low-variance and highly correlated ones. I know the data must be normalized for k-means, especially to remove the range dependency, but I'm facing a problem: if I normalize my data, the algorithm does not separate the clusters properly. I have 3 variable ranges in my data:

  • 0 to 10^4;
  • -10^3 to 10^3;
  • 0 to 10^3.

I have at least 5 very specific clusters that I can characterize when I don't scale the data, but I'm not comfortable with this procedure.

I couldn't find a reasonable explanation for why the algorithm performs better on the non-scaled data than on the scaled data.
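One way to see what normalization changes: with ranges like 0 to 10^4 next to -10^3 to 10^3, the widest feature dominates the Euclidean distances k-means minimizes, so unscaled clusters are effectively clusters in that one feature. A toy sketch with hypothetical uniform data (z-scoring each column is what a standard scaler would do):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data mimicking the ranges above:
# feature 0 spans ~0 to 10^4, feature 1 spans ~-10^3 to 10^3
x = np.column_stack([
    rng.uniform(0, 1e4, 50),
    rng.uniform(-1e3, 1e3, 50),
])

def distance_share(data):
    """Share of total squared Euclidean distance (over all point
    pairs) contributed by each feature."""
    diffs = data[:, None, :] - data[None, :, :]
    per_feature = (diffs ** 2).sum(axis=(0, 1))
    return per_feature / per_feature.sum()

raw = distance_share(x)

# z-score each column (mean 0, std 1)
z = (x - x.mean(axis=0)) / x.std(axis=0)
scaled = distance_share(z)

print("raw contributions:   ", raw)     # feature 0 dominates
print("scaled contributions:", scaled)  # equal shares
```

If the unscaled clusters look "better", it may be because the meaningful separation really does live in the wide-range feature, and scaling dilutes it with noise from the others; that is worth checking per feature rather than trusting the raw scale.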

1

Gershel t1_j5xxpny wrote

1 WR (2) and 2 WA (4). Don't think it gets in unless I get to change the WR with a good rebuttal. But he/she basically didn't get the motivation and didn't understand the method...

1

olegranmo OP t1_j5xpnj2 wrote

Great question! Rudin et al.'s approach elegantly builds an optimal decision tree through search. TM learns online, processing one example at a time, like a neural network. Also, like logistic regression, TM adds up evidence from different features; however, it builds non-linear logical rules instead of operating on single features. TM also supports convolution for image processing and time series. It can also learn from penalties and rewards, addressing the contextual bandit problem. Finally, TMs allow self-supervised learning by means of an auto-encoder. So, quite different from decision trees.
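To make "adds up evidence from non-linear logical rules" concrete, here is a toy sketch with hand-picked clauses (not a trained Tsetlin Machine — the clauses and their polarities are hypothetical, just to show the voting scheme):

```python
def clause(literals):
    """A clause is a conjunction of literals; each literal is a pair
    (feature_index, expected_binary_value)."""
    def evaluate(x):
        return all(x[i] == v for i, v in literals)
    return evaluate

# Two clauses vote for the class, one votes against it.
positive = [clause([(0, 1), (1, 0)]), clause([(2, 1)])]
negative = [clause([(0, 0), (2, 0)])]

def class_sum(x):
    # Evidence = matching positive clauses minus matching negative clauses;
    # the class with the largest sum wins.
    return sum(c(x) for c in positive) - sum(c(x) for c in negative)

x = [1, 0, 1]        # binary input
print(class_sum(x))  # both positive clauses fire, the negative one doesn't -> 2
```

Each clause is a non-linear function of several features at once, which is the contrast with logistic regression's per-feature weights.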

4

Kacper-Lukawski t1_j5xp10a wrote

Each vector may have a payload object: https://qdrant.tech/documentation/payload/ Payload attributes can be used to put additional constraints on the search results: https://qdrant.tech/documentation/filtering/ The unique feature is that the filtering is built into the vector search phase itself, so there is no need to pre- or post-filter the results.
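To illustrate why built-in filtering matters (a toy pure-Python/numpy sketch of the concept, not the Qdrant API — see the linked docs for the real client calls): post-filtering a top-k result can return fewer than k hits, while applying the payload condition during the search always fills k when enough points match.

```python
import numpy as np

rng = np.random.default_rng(1)
vectors = rng.normal(size=(100, 8))
# Hypothetical payloads: a "city" attribute per vector
payloads = [{"city": "Berlin" if i % 2 == 0 else "London"} for i in range(100)]

query = rng.normal(size=8)
condition = lambda p: p["city"] == "Berlin"

def post_filter(query, k):
    """Search first, filter afterwards: may return fewer than k hits."""
    order = np.argsort(np.linalg.norm(vectors - query, axis=1))
    return [i for i in order[:k] if condition(payloads[i])]

def in_search_filter(query, k):
    """Apply the condition during the search: k hits if enough match."""
    eligible = [i for i in range(len(vectors)) if condition(payloads[i])]
    dists = np.linalg.norm(vectors[eligible] - query, axis=1)
    return [eligible[j] for j in np.argsort(dists)[:k]]

print(len(post_filter(query, 5)), "hits after post-filtering (can be < 5)")
print(len(in_search_filter(query, 5)), "hits with built-in filtering (always 5)")
```

Pre-filtering avoids the short-result problem but forces a brute-force scan over the eligible subset; doing the filter inside an indexed search gives you both correctness and speed.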

1

w2ex t1_j5xo3bp wrote

There are of course pre-trained models for CNNs (most of the time pre-trained on ImageNet in a supervised manner). If you're asking about self-supervised pre-training for CNNs specifically, have a look at the recent papers ConvNeXt V2 or SparK (BERT-style pre-training for convnets).

6

veb101 t1_j5xjuy0 wrote

I'm also starting a similar project, but it just involves writing DDPM from scratch. In the past few days I've seen some papers on diffusion in the medical domain; maybe you can skim through those and see how diffusion models are used there.
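For a from-scratch DDPM, the first piece is usually the closed-form forward (noising) process. A minimal numpy sketch, using the linear beta schedule from the DDPM paper (the 4x4 "image" is just placeholder data):

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)  # linear schedule from Ho et al.
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)      # \bar{alpha}_t = prod_{s<=t} alpha_s

def q_sample(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps, eps ~ N(0, I)."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 4))        # placeholder "image"
xt, eps = q_sample(x0, t=999, rng=rng)
print(alpha_bar[999])               # tiny: by t = T-1, x_t is nearly pure noise
```

The training loop then just samples t uniformly, calls something like `q_sample`, and regresses a network's output against `eps`.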

0

dineNshine t1_j5xeqvi wrote

Embedding watermarks into images directly is one thing. OP suggested changing model parameters such that the model produces watermarked images, which is different. Editing model parameters in a functionally meaningful way would be hard without affecting performance. It seems like you are referring to a postprocessing approach, which is along the lines of what I recommended in general for curating model outputs. In this instance, this kind of solution wouldn't perform the function OP intended, which is preventing users from generating images without the watermark (since postprocessing is not an integral part of the model and is easy to remove from the generation process).

It is conceivable that the parameters could be edited in an otherwise non-disruptive way, although unlikely imo. I don't like this kind of approach in general though. The community seems to channel a lot of energy into making these models worse to "protect people from themselves". I despise this kind of intellectual condescension.

1

catndante t1_j5x6k5e wrote

Hi, I have a simple question about the DDPM model. I'm not sure, but I think I read a post saying that when T=1000, using 1,000 separate models would perform better but is computationally too redundant, so DDPM uses the same model for every step t. Is this argument correct? If labs with huge compute budgets did this, would the performance be better?
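For context on how one shared network handles all steps: DDPM conditions the network on t via a sinusoidal timestep embedding (Transformer-style positional encoding), so different steps get different conditioning vectors. A minimal numpy sketch (the embedding dimension here is an arbitrary small choice for illustration):

```python
import numpy as np

def timestep_embedding(t, dim=8):
    """Sinusoidal embedding of a step index, so one shared network can
    be conditioned on t (same scheme as Transformer positional encoding)."""
    half = dim // 2
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    angles = t * freqs
    return np.concatenate([np.sin(angles), np.cos(angles)])

# Different steps map to clearly different conditioning vectors
e1 = timestep_embedding(1)
e999 = timestep_embedding(999)
print(np.allclose(e1, e999))  # False: the network can tell the steps apart
```

So the single-model choice is less crippling than it sounds: it amounts to weight sharing across steps, with t injected as an input, rather than the network being blind to the step.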

2