Recent comments in /f/MachineLearning

TrevorIRL t1_j6xhuvh wrote

You're right, it was just some quick napkin math; I'm not saying it's guaranteed.

I would, however, say that even if only 10% of users paid, you are still at $20 000 000.

5% is still $10 000 000.

Imagine having a product better than Google, being able to improve productivity and save hours in your business, and not having to fear that too many people are using it when you need it most.

I guarantee we see more than 5% of users willing to shell out $20/mo for this.

Edit: This is also a product that’s going to continue to get better over time!
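The napkin math above is easy to check. Note the 10M-user base below is an assumption back-derived from the comment's own figures (10% of users at $20/mo yielding $20 000 000), not a published number:

```python
# Napkin math from the comment above. The 10M user base is an assumed
# figure implied by "10% of users -> $20,000,000" at $20/mo; revenue
# scales linearly with the paying fraction.

def monthly_revenue(users: int, conversion: float, price: float = 20.0) -> float:
    """Monthly subscription revenue for a given paying fraction."""
    return users * conversion * price

for rate in (0.10, 0.05):
    print(f"{rate:.0%} paying: ${monthly_revenue(10_000_000, rate):,.0f}/mo")
```

At 10% conversion that prints $20,000,000/mo, and at 5% it prints $10,000,000/mo, matching the comment's figures.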

2

Acceptable-Cress-374 t1_j6xgzoa wrote

> going to be taken for a ride by other bots

So.. bots are a thing? :)

What I'm trying to say is this: if being close to GTO is better than humans, your bot doesn't need to play perfectly at all times to avoid detection. And if you say there's no GTO yet, that means there's no standard to measure against yet.

To revisit the chess analogy: in chess they compare each player's moves against top engines and come up with a score, either centipawn loss or whatever else they use (chessdotcom doesn't comment on their measures, understandably so). What tools would a poker TO employ? Do such tools even exist? And would your own bot even resemble what they model?
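The centipawn-loss idea can be sketched roughly like this; the function and the numbers are hypothetical, and real sites layer far more sophisticated (and undisclosed) measures on top:

```python
# Rough sketch of the centipawn-loss comparison described above: given
# the engine's evaluation (in centipawns, from the mover's point of
# view) before and after each of a player's moves, the average drop
# measures how far the player strays from engine-preferred play.

def average_centipawn_loss(evals: list[tuple[int, int]]) -> float:
    """evals: (eval_before_move, eval_after_move) pairs in centipawns,
    from the mover's point of view. Per-move loss is clamped at 0,
    since you can't outperform the reference engine's own line."""
    losses = [max(0, before - after) for before, after in evals]
    return sum(losses) / len(losses)

# Two accurate moves and one 25-centipawn mistake average out to 9.0:
print(average_centipawn_loss([(30, 28), (28, 28), (35, 10)]))  # 9.0
```

A poker analogue would need a solver's EV numbers in place of engine evaluations, which is exactly where the question about tooling bites.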

I'm still not convinced this is as easy as you said...

edit:

> It would also be super stupid as a human to try and play only GTO if you know you play against other humans. While GTO guarantees that you - on average - don't lose, it is by FAR inferior to looking for exploitative spots. Trying to play GTO-ish is the baseline you go back to when you don't know what to do - not the default strat as a player

Well ...

> Pluribus, a new AI bot we developed in collaboration with Carnegie Mellon University, has overcome this challenge and defeated elite human professional players in the most popular and widely played poker format in the world: six-player no-limit Texas Hold'em poker. Pluribus defeated pro players in both a “five AIs + one human player” format and a “one AI + five human players” format. If each chip was worth a dollar, Pluribus would have won an average of about $5 per hand and would have made about $1,000/hour playing against five human players. These results are considered a decisive margin of victory by poker professionals.

I don't have a quote handy, but I remember listening to a podcast with the creator of Pluribus, and they didn't specifically code an "exploitative" strategy, AFAIK. Whatever their bot did seemed to work, though... So not that stupid? :)

1

thevillagersid t1_j6xgoi5 wrote

Look into getting set up with a cloud solution, like Amazon SageMaker or something similar. The potential gains from moving to a more powerful machine will depend strongly, however, on what is causing your code to take so long to execute. If it's just slow because you're searching over a massive, high-dimensional grid, moving to a better machine might offer limited improvement, and you might need to look into splitting the workload across a cluster of machines.
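Before paying for a bigger machine, it's worth checking whether the search parallelizes across local cores first. A minimal sketch with the standard library (the objective function and hyperparameter names here are stand-ins, not the OP's actual code):

```python
# Minimal sketch of splitting a grid search across local CPU cores with
# the standard library, before reaching for a cluster. `score` is a
# stand-in for whatever expensive model fit/evaluation is being run.
from concurrent.futures import ProcessPoolExecutor
from itertools import product

def score(params):
    """Toy objective; replace with the real (expensive) evaluation."""
    lr, reg = params  # hypothetical hyperparameters
    return -((lr - 0.1) ** 2 + (reg - 1.0) ** 2)

if __name__ == "__main__":
    grid = list(product([0.01, 0.1, 1.0], [0.1, 1.0, 10.0]))
    with ProcessPoolExecutor() as pool:  # one worker per core by default
        scores = list(pool.map(score, grid))
    best_score, best_params = max(zip(scores, grid))
    print(best_params)  # (0.1, 1.0)
```

If each evaluation is itself multi-threaded (e.g. BLAS-backed fits), process-level parallelism may buy little, which is the same caveat as moving to a bigger single machine.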

1

vwvwvvwwvvvwvwwv t1_j6xgaqc wrote

I've had success with normalizing flows in problems where both directions of the transformation were important (although presumably an autoencoder might work just as well).

This was published yesterday: Flow Matching for Generative Modeling

TL;DR: We introduce a new simulation-free approach for training Continuous Normalizing Flows, generalizing the probability paths induced by simple diffusion processes. We obtain state-of-the-art on ImageNet in both NLL and FID among competing methods.

Abstract: We introduce a new paradigm for generative modeling built on Continuous Normalizing Flows (CNFs), allowing us to train CNFs at unprecedented scale. Specifically, we present the notion of Flow Matching (FM), a simulation-free approach for training CNFs based on regressing vector fields of fixed conditional probability paths. Flow Matching is compatible with a general family of Gaussian probability paths for transforming between noise and data samples---which subsumes existing diffusion paths as specific instances. Interestingly, we find that employing FM with diffusion paths results in a more robust and stable alternative for training diffusion models. Furthermore, Flow Matching opens the door to training CNFs with other, non-diffusion probability paths. An instance of particular interest is using Optimal Transport (OT) displacement interpolation to define the conditional probability paths. These paths are more efficient than diffusion paths, provide faster training and sampling, and result in better generalization. Training CNFs using Flow Matching on ImageNet leads to state-of-the-art performance in terms of both likelihood and sample quality, and allows fast and reliable sample generation using off-the-shelf numerical ODE solvers.
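The core objective the abstract describes is a simple regression. In the paper's notation (reproduced here as a sketch, not an authoritative statement), the conditional Flow Matching loss regresses the learned vector field onto the conditional target field along sampled probability paths:

```latex
% Conditional Flow Matching objective (sketched from the abstract):
% regress the learned vector field v_theta onto the conditional
% target field u_t along sampled conditional probability paths.
\mathcal{L}_{\mathrm{CFM}}(\theta)
  = \mathbb{E}_{\,t \sim \mathcal{U}[0,1],\; x_1 \sim q(x_1),\; x \sim p_t(x \mid x_1)}
    \big\lVert v_\theta(t, x) - u_t(x \mid x_1) \big\rVert^2
```

The paper's key point is that this conditional objective has the same gradients as the intractable marginal one, and choosing OT displacement paths for $p_t$ gives the faster training and sampling mentioned above.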

4

iqisoverrated t1_j6xg46i wrote

>Do you do this against a median of other players, against GTO, or what?

Against GTO. Against a median of other players would make no sense.

> I've seen streamers playing 3-4 tables at once and playing pretty close to GTO

Since GTO doesn't even exist yet for many-handed play... press 'x' to doubt. Human players are still pretty far from GTO. There have already been challenges pitting the best heads-up players against GTO bots, and the humans lost (mirror matches, so it wasn't due to variance in the hands dealt). Someone playing 4 tables at the same time? No. Nowhere close to GTO. Maybe preflop with charts, but that's as good as it gets.

(It would also be super stupid as a human to try and play only GTO if you know you play against other humans. While GTO guarantees that you - on average - don't lose, it is by FAR inferior to looking for exploitative spots. Trying to play GTO-ish is the baseline you go back to when you don't know what to do - not the default strat as a player)


>What about making your own version of "spin the wheel" strategy where, depending on where you're at in the tournament ICM wise, you switch between strategies, adjust your opening hands, raising spots, etc. Sure you'd get away from Nash equilibrium, but you'd probably still rake in money.

Well then you have a bot that is going to be taken for a ride by other bots ;-)

If someone fields a bot, he has to be aware that bots are a thing... implementing a strategy that loses to another scammer's bot is probably not something he'd put this much effort into.

1

IWantAGrapeInMyMouth t1_j6xfc23 wrote

Hope this finds you well,

Machine learning can facilitate the use of managerial buzzwords by enabling natural language processing algorithms to identify and categorize key phrases and terminology commonly used in management and corporate settings. This can facilitate the generation of buzzword-rich language in real-time, empowering individuals to communicate more effectively and authentically within a business context. Additionally, machine learning can also be leveraged to analyze large datasets, identifying emerging buzzwords and trends in management speak, thus allowing individuals to stay ahead of the curve and stay relevant in the constantly evolving corporate landscape.

Best,

[YOUR NAME]

(I'd say it's pretty much got it nailed)

77

DigThatData t1_j6xexyf wrote

> That models that memorize better generalize better has been observed in large language models

I think this is an incorrect reading here. increasing model capacity is a reliable strategy for increasing generalization (Kaplan et al 2020, Scaling Laws), and larger capacity models have a higher propensity to memorize (your citations). The correlations discussed in both of those links are to capacity specifically, not generalization ability broadly. scaling law research has recently been demonstrating that there is probably a lot of wasted capacity in certain architectures, which suggests that the generalization potential of those models could be achieved with a much lower potential for memorization. see for example Tirumala et al 2022, Chinchilla.

which is to say: you're not wrong that a lot of recently trained models that generalize well have also been observed to memorize. but I don't think it's accurate to suggest that the reason these models generalize well is linked to a propensity/ability to memorize. it's possible this is the case, but I don't think anything suggesting this has been demonstrated. it seems more likely that generalization and memorization are correlated through the confounder of capacity, and contemporary research is actively attacking the problem of excess capacity in part to address the memorization question specifically.

EDIT: Also... I have some mixed feelings about that last paper. It's new to me and I just woke up so I'll have to take another look after I've had some coffee, but although their approach feels intuitively sound from the direction of the LOO methodology, their probabilistic formulation of memorization I think is problematic. They formalize memorization using a definition that appears to me to be indistinguishable from an operational definition of generalizability. Not even OOD generalizability: perfectly reasonable in-distribution generalization to unseen data, according to these researchers, would have the same properties as memorization. That's... not helpful. Anyway, need to read this closer, but "lower posterior likelihood" to me seems fundamentally different from "memorized". Their approach appears to make no effort to distinguish between a model that had "memorized" a training datum and one that had "learned" meaningful features in the neighborhood of a datum that has high [leverage](https://en.wikipedia.org/wiki/Leverage_(statistics)). Are they detecting memorization or outlier samples? If the "outliers" are valid in distribution samples, removing them harms the diversity of the dataset and the model may have significantly less opportunity to learn features in the neighborhood of those observations (i.e. they are high leverage). My understanding is that the problem of memorization is generally more pathological in high density regions of the data, which would be undetectable by their approach.
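For reference, the leverage statistic linked above, in its linear-model form: a high-leverage point sits in an unusual region of feature space and pulls the fit strongly toward itself, which is exactly why removing it can look, to a likelihood-based test, like removing a "memorized" point.

```latex
% Leverage of observation i in a linear model y = X beta + eps:
% the i-th diagonal entry of the hat matrix H (the projection that
% maps observed y onto fitted values), bounded between 0 and 1.
H = X\,(X^\top X)^{-1} X^\top, \qquad
h_{ii} = \big[ X\,(X^\top X)^{-1} X^\top \big]_{ii},
\qquad 0 \le h_{ii} \le 1
```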

1

throwaway2676 t1_j6xerk7 wrote

I think a lot of people would pay for the initial model they first released. Since then they've been censoring the shit out of it to avoid controversy, and a fair amount of the hype died down among the average joes.

At this point I think their main target demo will be white collar workers who use it to make work easier. However, the hype will pick back up once they connect it to the internet.

3

Nhabls t1_j6xemzb wrote

GPT-3 didn't cost a billion to train

It does cost a LOT of money to run, though, which is why you're unlikely to "see better" in the short to medium term. Unless you're into paying hundreds to thousands of dollars per month for this functionality.

26

Acceptable-Cress-374 t1_j6xd7dx wrote

> You can look at extremely low frequency plays that hit exactly the right frequency where a human would use an always/never approach. If you see such plays in different spots then you can be fairly confident it's a bot

Do you do this against a median of other players, against GTO, or what?

And if you restricted your bot to ~3 bet sizes and GTO + ICM for tournaments, how would you detect that? It wouldn't necessarily be the best strategy, but it would probably get your bot in the money the majority of the time. I've seen streamers playing 3-4 tables at once, playing pretty close to GTO with preset betting buttons as well. Would you flag those as bots too?
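The frequency argument in the quote above could, in principle, be operationalized as comparing a player's observed action frequencies against a solver's mixed strategy across many independent spots. A toy sketch (all names and numbers hypothetical; a real detector would at minimum need sample-size corrections and per-spot solver output):

```python
# Toy sketch of frequency-based bot detection: solvers mix actions at
# odd frequencies (e.g. raise 23% of the time in some spot), which
# humans tend to round to always/never. Observed frequencies sitting
# suspiciously close to the solver's mix, across many independent
# spots, point toward a bot. All numbers below are made up.

def mix_distance(observed: dict[str, float], solver: dict[str, float]) -> float:
    """Total variation distance between two action-frequency profiles."""
    actions = observed.keys() | solver.keys()
    return 0.5 * sum(abs(observed.get(a, 0.0) - solver.get(a, 0.0))
                     for a in actions)

solver_spot = {"raise": 0.23, "call": 0.41, "fold": 0.36}
human_like = {"raise": 0.0, "call": 1.0, "fold": 0.0}   # always/never
bot_like = {"raise": 0.24, "call": 0.40, "fold": 0.36}  # hits the mix

print(mix_distance(human_like, solver_spot))  # far from the solver mix
print(mix_distance(bot_like, solver_spot))    # suspiciously close
```

The catch, per the discussion above, is that this presupposes an agreed-upon solver baseline for the exact format, which may not exist for multi-way play.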

What about making your own version of "spin the wheel" strategy where, depending on where you're at in the tournament ICM wise, you switch between strategies, adjust your opening hands, raising spots, etc. Sure you'd get away from Nash equilibrium, but you'd probably still rake in money.

The idea that you consider this easy to spot is pretty wild to me. I'd love to read some research in this area, if you have some sources on bot detection in online 6+ NLHE.

1

znihilist t1_j6xcp1i wrote

Good point, but the way I see it, these two things look very similar without actually being similar in the way we thought or wanted. Compression takes one input and generates one output; the object (the file, if you want) is only one thing: an episode of House. We'd argue that both versions are loosely identical and just differ in the underlying representation (their 0's and 1's are different, but they render the same object). Also, that object can't generate another episode of House (one that aired a day early), or a nonexistent episode where he takes over the world, or one where he's a Muppet. Since diffusion models don't store a copy, the comparison breaks down on that particular aspect.

I do think the infringement aspect is going to end up falling on the user and not the tool, akin to how, just because your TV can play pirated content, we assign the blame to the user and not the manufacturer of the TV. So it may end up being that creating these models is fine, but if you recreate something copyrighted, that's on you.

Either way, this is going to be one interesting Supreme Court decision (because I think it is definitely going there).

0