Recent comments in /f/MachineLearning
frequenttimetraveler t1_j6ykioh wrote
Reply to comment by ThunderySleep in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
I'm looking forward to finding out that people who write nice letters and look good on cam are just as dumb as the minions they manage.
Sirisian t1_j6yja6v wrote
Reply to comment by djc1000 in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Part of this is also about brand identity. Even if a technology isn't perfect, some companies try to get in early. This is similar to the virtual reality and mixed reality trends. The industry sees an inevitable future and wants to be the name people think of. If one assumes gradual improvements until ~2045, then this is long-term planning. (Or short-term, depending on the improvements expected. It's possible MS has insider information that skews their motives.)
LetterRip t1_j6yj4z2 wrote
Reply to comment by Nhabls in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
GPT-3 can be quantized to 4-bit with little loss, so it can run on two Nvidia 3090s/4090s (unpruned; pruned, perhaps a single 3090/4090). At $2 a day for 8 hours of electricity to run them, and 21 working days per month, that's $42 per month (plus the amortized cost of the cards and the computer that houses them).
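A back-of-envelope check of those figures. The wattage and electricity rate here are my own hypothetical assumptions, not from the comment:

```python
# Hypothetical numbers: two GPUs at ~350 W each, 8 h/day, ~$0.35/kWh.
watts = 2 * 350                      # combined draw of two 3090s/4090s
kwh_per_day = watts * 8 / 1000       # 5.6 kWh per working day
cost_per_day = kwh_per_day * 0.35    # ~$1.96, close to the quoted $2/day
monthly_cost = 2 * 21                # $2/day * 21 working days = $42/month
print(round(cost_per_day, 2), monthly_cost)  # 1.96 42
```

The actual rate varies a lot by region, so treat the $2/day as a rough midpoint rather than a firm number.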
MadScientist-1214 t1_j6yj0v6 wrote
Some models actually just use [0, 1] normalization (divide by 255). Some normalization is necessary, but [0, 1] is enough. On real-world datasets, computing a dataset-specific mean/std never gave me better results.
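A minimal NumPy sketch of the two schemes being compared; the mean/std values are the standard ImageNet statistics used by torchvision models:

```python
import numpy as np

# A random stand-in for an 8-bit RGB image
img = np.random.randint(0, 256, (224, 224, 3)).astype(np.float32)

# Simple [0, 1] normalization
x_simple = img / 255.0

# Per-channel mean/std normalization (ImageNet statistics)
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])
x_standardized = (img / 255.0 - mean) / std
```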
Acceptable-Cress-374 t1_j6yil6g wrote
Reply to comment by sgramstrup in [D]How Will Open Source Alternatives Compete With GPT3? by noellarkin
> Their resources will always be larger, and they will keep accelerating faster on the exponential curve.
Sure, they'll have more money to throw at a problem, but also more incentive to throw that money into other money-making stuff. Open-source models might not necessarily go the same path, and even if under-trained or less-optimized, they might still be a tremendous help once a community gets to play with them.
ThunderySleep t1_j6yhxqd wrote
Reply to comment by frequenttimetraveler in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Got to be honest, the biggest thing I'm not looking forward to is every vapid person with a bogus job being able to write as though they're an intelligent important person. Like how Grammarly allowed dumb people to hide the fact that they can barely read and write.
puppet_pals t1_j6ygho0 wrote
ImageNet normalization is an artifact of the era of feature engineering. In the modern era you shouldn’t use it. It’s unintuitive and overfits the research dataset.
IshKebab t1_j6ygd21 wrote
Reply to comment by djc1000 in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Doesn't seem like a waste to me. If it works (big if!) I can see it cutting out a lot of tedious tasks.
gyanster t1_j6yerid wrote
Reply to [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Clippy 2.0
Ronny_Jotten t1_j6yenlh wrote
Reply to comment by JigglyWiener in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
Adobe doesn't ship Photoshop with a button that produces an image of Mickey Mouse. They would be sued by Disney. The AI models do. They are not the same. It seems unlikely that Disney will find it "not worth chasing"; they spend millions defending their intellectual property.
bokonator t1_j6yecgt wrote
Reply to comment by Nhabls in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Microsoft recently paid $10B to get full access to the model, give OpenAI full access to Azure GPUs, and take 49% ownership.
AristosTotalis t1_j6ye5hn wrote
Reply to comment by Nhabls in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
yep. $1B in cash, but they have to use Azure as their exclusive cloud compute provider, which Microsoft probably sells to OAI at ~cost
I think it's safe to assume that 2/3 of that will go towards training & inference, and if you also assume Microsoft neither makes nor loses money selling compute (and in fact gets to strengthen Azure as a cloud infra player), they really only paid ~$300M to invest in OAI at what seems like a great price in hindsight
Franck_Dernoncourt t1_j6ydkiu wrote
> I was surprised at how much better GPT3 davinci 003 performed compared to AI21's 178B model. AI21's Jurassic 178B seems to be comparable to GPT3 davinci 001.
on which tasks?
> Of course, I didn't expect the smaller models to be on par with GPT-3
You could read Tianyi Zhang, Faisal Ladhak, Esin Durmus, Percy Liang, Kathleen McKeown, Tatsunori B. Hashimoto, "Benchmarking Large Language Models for News Summarization" (arXiv:2301.13848):
> we find instruction tuning, and not model size, is the key to the LLM’s zero-shot summarization capability
plocco-tocco t1_j6yc79c wrote
Reply to [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
I would also like to know, from anyone who might have a clue: can RLHF offer any significant boost to machine translation quality?
wintermute93 t1_j6yc6ia wrote
Reply to comment by ReginaldIII in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Oh, nice, autogenerated meeting minutes and stuff is a great QOL feature. I, uh, probably should have read the article, oops
bigabig t1_j6yc5gr wrote
Reply to [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Is the automatic transcription done with openai whisper?
TheTerrasque t1_j6ybrk0 wrote
Reply to comment by Nhabls in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Well, you've got DeepMind's Chinchilla model, and Google's CALM approach that can increase inference speed by maybe 3x, in addition to other tricks.
ReginaldIII t1_j6ybiju wrote
Reply to comment by wintermute93 in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
This isn't being used for autocomplete or any user text generation purposes though.
They're using it to summarize and make todo lists from the Whisper extracted transcripts of video meetings. Users aren't getting a frontend to run arbitrary stuff through the model. Seems like a pretty legitimate use case.
[deleted] t1_j6yawf0 wrote
Reply to comment by frequenttimetraveler in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
[removed]
[deleted] t1_j6yatgd wrote
Reply to comment by [deleted] in [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
Hmm, I don't know that one.
LeanderKu t1_j6y7cge wrote
Reply to [N] Microsoft integrates GPT 3.5 into Teams by bikeskata
I actually find automatically generating notes to be a smart and useful application. I often have 1-on-1 remote meetings, and I find it difficult to present and discuss my work while also taking notes. It often happens that I focus on something and forget to take notes, which I only notice a week later when I've forgotten half of the tasks. If it works reliably, I can imagine it being a very useful addition.
I have never used teams though, everything's on zoom.
sgramstrup t1_j6y7any wrote
Reply to comment by Single_Blueberry in [D]How Will Open Source Alternatives Compete With GPT3? by noellarkin
In an exponential future, models from big corp will feel like they are light-years beyond open models. Their resources will always be larger, and they will keep accelerating faster on the exponential curve.
crt09 t1_j6y5x4t wrote
Reply to comment by koolaidman123 in [D] Why do LLMs like InstructGPT and LLM use RL to instead of supervised learning to learn from the user-ranked examples? by alpha-meta
This paper seems very relevant: https://arxiv.org/abs/2205.13636. I haven't read it closely enough to give strong opinions with confidence, but it seems to beat PPO with a token-level loss that works similarly to the Upside-Down Reinforcement Learning paper: you give a target reward rank between 1 and 5 as an input token before the prompt, and train the model with the standard LM loss to output a response of corresponding quality, using existing target outputs labeled with their 1-5 reward ranks. Then during inference you just prepend the token for rank 1 to the prompt and it outputs a high-quality response.
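The reward-conditioning scheme described above can be sketched as a data-formatting step. The `<rank_N>` token names here are my own illustration, not the paper's actual vocabulary:

```python
def format_train_example(prompt: str, response: str, rank: int) -> str:
    """Prefix a training pair with the token for its 1-5 quality rank;
    the model is then trained with the ordinary LM loss on this string."""
    return f"<rank_{rank}> {prompt} {response}"

def format_inference_prompt(prompt: str, best_rank: int = 1) -> str:
    """At inference, condition on the best rank to request high quality."""
    return f"<rank_{best_rank}> {prompt}"

print(format_inference_prompt("Summarize the meeting."))
# <rank_1> Summarize the meeting.
```

The rank tokens would need to be added to the tokenizer's vocabulary so each one maps to a single token.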
butter14 t1_j6y4rnd wrote
Reply to comment by Monoranos in [N] OpenAI starts selling subscriptions to its ChatGPT bot by bikeskata
It essentially operates the same way as humans: ingesting content, then outputting new content based on what it has digested.
BasilLimade t1_j6ykjaz wrote
Reply to [p] I built an open source platform to deploy computationally intensive Python functions as serverless jobs, with no timeouts by seattleite849
I'm looking at making a docker image to host on AWS ECR, to contain some python code and dependencies (over 250MB of dependencies, so I can't just zip up my modules as a lambda "layer"). How does this compare to making my own docker lambda image?