Recent comments in /f/MachineLearning

bartturner t1_j7lugv5 wrote

Geeze. Who do you think invented Transformers?

https://en.wikipedia.org/wiki/Transformer_(machine_learning_model)

NO!!! GANs were invented by Ian while he was working at Google. It is a pretty interesting story.

The vast majority of the major AI breakthroughs from the last decade+ came from Google.

OpenAI really does NOT do R&D. They mostly use R&D from others, chiefly Google.

−3

harharveryfunny t1_j7lu67f wrote

What underlying are you talking about? Are you even familiar with the "Attention" paper and its relevance here? Maybe you think OpenAI use Google's TensorFlow? They don't.

GANs were invented by Ian Goodfellow while he was a student at U. Montreal, before he ever joined Google.

No - TPUs are not key to deploying at scale unless you are targeting Google Cloud. Google is a distant 3rd in cloud market share behind Microsoft and Amazon. OpenAI of course deploy on Microsoft Azure, not Google.

2

bitemenow999 t1_j7lqudl wrote

If you want to be an ML scientist and build actual models, then you just need a lot of math and just enough programming skill for prototyping. Go with any language; if you can code what you want, that is great. One thing to note: in my experience, the only people I have seen in this field have a graduate education and research experience, and some of them don't code at all. They just write down the algorithms and let developers implement them, so you might want to consider that.

If you want to work in MLOps or as a data engineer, which doesn't require much math or an advanced degree, then start with books specific to those fields, since these roles have a slightly different stack.

One rule of thumb, if you are just dipping your toes in, is to start with a language that has great free resources available. For ML (learning and prototyping) that happens to be Python, but you need C++ if you actually want to deploy your model for a decent-sized industrial project.

0

god_is_my_father t1_j7lp44r wrote

Focus on Python. It has a MUCH lower barrier to entry. If you see a specific niche you'd like to focus on with C++ then sure - learn that. But I would not recommend starting with C++. I did start there 20+ years ago and wish there had been an easier way in back then. My main gripe is there are 100 standard ways to do things in C++, whereas Python code tends to be fairly uniform across enterprises / projects / etc.

1

yaosio t1_j7lnkh9 wrote

If you look at what you.com does, they cite the claims their bot makes by linking to the pages the data come from, but only sometimes. When it doesn't cite something you can be sure that it's just making it up. In the supposed Bing leak it was doing the same thing, citing its sources.

If they can force it to always provide a source, and to say nothing when it can't, that could fix it. However, there's still the problem that the model doesn't know what's true and what's false. Just because it can cite a source doesn't mean the source is correct. This is not something that the model can learn by being told. To learn by being told assumes that its data is correct, which can't be assumed. A researcher could tell the model, "all cats are ugly", which is obviously not true, but the model will say all cats are ugly because it was taught that. Models will need to have a way to determine on their own what is true and what isn't true, and explain their reasoning.
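A minimal sketch of the "no source, no statement" rule described above, in Python. The `Claim` type and the `keep_cited` and `render` helpers are hypothetical names made up for illustration; this is not how you.com or Bing work. It only drops uncited claims and does nothing to check whether a cited source is actually correct, which is the harder problem.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Claim:
    """One statement produced by a model (hypothetical structure)."""
    text: str
    source_url: Optional[str] = None  # None means the model gave no citation


def keep_cited(claims: List[Claim]) -> List[Claim]:
    """Drop every claim that has no source attached."""
    return [c for c in claims if c.source_url]


def render(claims: List[Claim]) -> str:
    """Format the surviving claims with numbered citations."""
    return "\n".join(
        f"{c.text} [{i}] {c.source_url}" for i, c in enumerate(claims, 1)
    )
```

Calling `render(keep_cited(claims))` would print only the statements that came with a link, so an uncited "all cats are ugly" is filtered out, but a cited one would still get through.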

1

Baggins95 t1_j7lm18c wrote

It most likely depends on where you want to go. Python definitely has the higher usefulness for data science and machine learning in the narrower sense. But if you want to go deeper into high performance computing or work really close to the periphery, then you will benefit much more from C++. I learned C++ first and later Python in my studies. Looking at some of my colleagues, that doesn't seem to have been the worst way to go. Of course, others are also right when they advise you not to put too much weight on the choice of a programming language. It's just that Data Science is very diverse in its manifestations these days. And in some jobs it is very much in demand that you are a passable programmer and not just able to plug Excel macros together. So it does have a certain relevance which tools you can handle.

1

scottyLogJobs t1_j7llrrl wrote

Why? Compare the top two images. It is a demonstration that they trained on Getty Images, but there’s no way anyone could argue that the nightmare fuel on the right deprives Getty of any money. Do you remember when Getty sued Google Images and won? Sure, Google is powerful and makes plenty of money, but now image search is way worse for consumers than it was a decade ago: you can’t just open the image or even a link to the image, you have to follow it back to their page and dig around for it, probably never finding it at all. Ridiculous that effectively embedding a link isn’t considered fair use; you’d still need to pay to use a Getty image 🤷‍♂️

Setting aside the fact that Getty is super hypocritical and constantly violates copyright law, and then effectively uses their litigators to push around smaller groups, if they win it’s just going to be another step that means only the big companies have access to data, making it impossible for smaller players to compete.

People fighting against technological advancement and innovation are always on the wrong side of history. There will always be a need for physical artists, digital artists, photographers, etc, because the value of art is already incredibly subjective, the value is generated by the artist, not the art, and client needs are so specific, detailed and iterative that an AI can’t achieve them.

Instead of seeing this tool as an opportunity for artists, they fight hopelessly against innovation and throw their lot in with huge bully companies like Getty Images.

4

aicharades OP t1_j7lg2qa wrote

One cool feature of the LangChain framework, https://langchain.readthedocs.io/en/latest/, is that you can easily switch the model you use. So when the ChatGPT API comes out, LangChain allows you to easily move models without upending your pipeline.

This currently uses the latest available API model, text-davinci-003.

The model choices for map reduce were really interesting. Happy to share my experiences if anyone is looking for tips.
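For anyone unfamiliar with the pattern, here is a rough sketch of map reduce summarization with a swappable model, in plain Python. It deliberately does not use LangChain's API; `Model`, `split_text`, and `map_reduce_summarize` are hypothetical names, the prompts are placeholders, and the model is assumed to be any function that takes a prompt string and returns a completion string.

```python
from typing import Callable, List

# Assumption: a "model" is any callable from prompt text to completion text.
Model = Callable[[str], str]


def split_text(text: str, chunk_size: int) -> List[str]:
    """Cut the input into fixed-size character chunks."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def map_reduce_summarize(text: str, model: Model, chunk_size: int = 2000) -> str:
    """Summarize each chunk on its own (map), then merge the results (reduce)."""
    chunks = split_text(text, chunk_size)
    partials = [model("Summarize:\n" + chunk) for chunk in chunks]
    return model("Combine these summaries:\n" + "\n".join(partials))
```

Because the pipeline only depends on the `Model` callable, moving from one model to another means passing a different function and leaving the rest untouched, which is the kind of swap described above.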

4