Recent comments in /f/MachineLearning
lifesthateasy t1_jalerz5 wrote
Reply to comment by currentscurrents in [D] Blake Lemoine: I Worked on Google's AI. My Fears Are Coming True. by blabboy
Who's talking about intelligence? Of course artificial intelligence is intelligence. It's in the name. I'm saying it's not sentient.
iTrooz_ t1_jale5ca wrote
Reply to [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
I hope the API doesn't have the same restrictions as https://chat.openai.com
polipopa t1_jald898 wrote
Reply to [D] Podcasts about ML research? by Tight-Vacation-9410
Check out robot brains by Pieter Abbeel
fmai t1_jalcs0x wrote
Reply to comment by lucidraisin in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
AFAIK, FlashAttention is just a very efficient implementation of attention, so it's still quadratic in the sequence length. Can that be a sustainable solution when context windows grow to hundreds of thousands of tokens?
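To make the quadratic cost concrete, here's a back-of-the-envelope sketch (function name and numbers are my own illustration): the attention score matrix has one entry per query-key pair, so its size grows with the square of the context length even when the kernel computing it is fast.

```python
# Illustrative only: the QK^T score matrix for one head is
# seq_len x seq_len, so memory/compute grows quadratically.
def attention_score_entries(seq_len: int) -> int:
    """Number of entries in the attention score matrix for one head."""
    return seq_len * seq_len

# Doubling the context quadruples the score-matrix size.
print(attention_score_entries(2048))   # 4194304
print(attention_score_entries(4096))   # 16777216
```

FlashAttention avoids materializing this full matrix at once, but the total work is still O(n^2).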
currentscurrents t1_jalcgvw wrote
Reply to comment by lifesthateasy in [D] Blake Lemoine: I Worked on Google's AI. My Fears Are Coming True. by blabboy
Who says intelligence has to work exactly like our brain?
A Boeing 747 is very different from a bird, even though they fly on the same principles.
th1nk2much t1_jalc88z wrote
Reply to [D] Are Genetic Algorithms Dead? by TobusFire
I recently used a genetic algorithm in a supply chain application. Not the fastest algorithm, but we made it work for our purpose.
[deleted] t1_jalc0j9 wrote
Reply to [D] Are Genetic Algorithms Dead? by TobusFire
[deleted]
Lychee7 t1_jalbr7l wrote
Reply to [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
What are the criteria for tokens? Is it that the longer the prompt, the more tokens it'll use?
ID4gotten t1_jalb9vx wrote
Reply to [P] ChatRWKV v2 (can run RWKV 14B with 3G VRAM), RWKV pip package, and finetuning to ctx16K by bo_peng
It's not the best at Q&A or chat (yet), but kudos for all the work behind this super interesting approach. Maybe with time it will continue to improve, and I like seeing non-transformer methods showing some potential.
bernhard-lehner t1_jalb613 wrote
I don't think he actually "worked on Google's AI", as in being involved in the research and development part.
Dendriform1491 t1_jalb2vb wrote
Reply to [D] Are Genetic Algorithms Dead? by TobusFire
Genetic algorithms require you to create a population to which the genetic operators (mutation, crossover, and selection) are applied.
Creating a population of neural networks means keeping multiple slightly different copies of the network being optimized (i.e. the population).
This can be more computationally expensive than other techniques that do all the learning "in-place".
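A minimal sketch of the idea above (all names and numbers are illustrative): each individual in the population is a full copy of the parameter vector being optimized, which is where the extra memory and compute cost comes from.

```python
import random

random.seed(0)  # reproducible toy run

def evolve(fitness, n_params=10, pop_size=20, generations=50, mut_rate=0.1):
    # The population: pop_size full copies of the parameter vector.
    pop = [[random.uniform(-1, 1) for _ in range(n_params)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)       # selection: keep fittest half
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, n_params)   # single-point crossover
            child = [g + random.gauss(0, 0.1) if random.random() < mut_rate else g
                     for g in a[:cut] + b[cut:]]  # per-gene mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy objective: maximize -sum(x^2), optimum at the all-zero vector.
best = evolve(lambda xs: -sum(x * x for x in xs))
```

Note that every generation evaluates `fitness` on the whole population, versus a single forward/backward pass per step for gradient-based "in-place" learning.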
MysteryInc152 t1_jalau7e wrote
Reply to comment by currentscurrents in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
Sorry, I meant the really large-scale models. Nobody has gotten a GPT-3/Chinchilla-scale model to actually distill properly.
currentscurrents t1_jalajj3 wrote
Reply to comment by MysteryInc152 in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
DistilBERT worked though?
Disastrous_Elk_6375 t1_jala1ee wrote
Wasn't this settled once and for all when they had the exact same model he worked on claim (very convincingly) to be a frog doing frog things, or something like that?
ninjasaid13 t1_jal8x3x wrote
Reply to comment by currentscurrents in [D] What are the most known architectures of Text To Image models ? by AImSamy
>You can only run what fits on your available hardware.
Precisely.
MysteryInc152 t1_jal7d3p wrote
Reply to comment by currentscurrents in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
Distillation doesn't work for token predicting language models for some reason.
LetterRip t1_jal4y8i wrote
Reply to comment by bjergerk1ng in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
Certainly that is also a possibility. Or they might have done teacher-student distillation.
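For reference, a minimal sketch of the teacher-student distillation loss (illustrative only, function names are mine): the student is trained to match the teacher's temperature-softened output distribution, here via KL divergence.

```python
import math

def softmax(logits, temperature=1.0):
    # Higher temperature flattens the distribution, exposing "dark knowledge".
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Identical logits give zero loss; the further the student's distribution drifts from the teacher's, the larger the penalty.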
LetterRip t1_jal4vgs wrote
Reply to comment by cv4u in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
Yep, or a mix between the two.
GLM-130B quantizes to int4; OPT and BLOOM to int8:
https://arxiv.org/pdf/2210.02414.pdf
Often you'll want to keep the first and last layers in int8 and can do everything else in int4. You can also quantize based on each layer's sensitivity, etc. I (vaguely) recall a mix of 8 bits for weights and 4 bits for biases (or vice versa?).
Here is a survey on quantization methods; for mixed int8/int4 see Section IV, "Advanced Concepts: Quantization Below 8 Bits":
https://arxiv.org/pdf/2103.13630.pdf
Here is a talk on auto48 (automatic mixed int4/int8 quantization):
https://www.nvidia.com/en-us/on-demand/session/gtcspring22-s41611/
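A toy sketch of the layer-wise mixed-precision idea (my own illustration; the function names and bit plan are made up, not any library's API): symmetric quantization at a chosen bit width, with the first and last layers kept at int8 and the middle layers at int4.

```python
def quantize(weights, bits):
    # Symmetric quantization: map [-max|w|, +max|w|] onto signed integers.
    qmax = 2 ** (bits - 1) - 1                    # 127 for int8, 7 for int4
    scale = max(abs(w) for w in weights) / qmax or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    return [v * scale for v in q]

layers = [[0.5, -1.0, 0.25]] * 4                  # pretend 4 layers of weights
bit_plan = [8] + [4] * (len(layers) - 2) + [8]    # int8 ends, int4 middle
quantized = [quantize(w, b) for w, b in zip(layers, bit_plan)]
```

Real schemes add per-channel scales, calibration, and sensitivity analysis, but the trade-off is the same: int4 halves storage at the cost of coarser reconstruction.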
WikiSummarizerBot t1_jal4gkz wrote
Reply to comment by deviantkindle in [D] Are Genetic Algorithms Dead? by TobusFire
>In radio communications, an evolved antenna is an antenna designed fully or substantially by an automatic computer design program that uses an evolutionary algorithm that mimics Darwinian evolution. This procedure has been used in recent years to design a few antennas for mission-critical applications involving stringent, conflicting, or unusual design requirements, such as unusual radiation patterns, for which none of the many existing antenna types are adequate.
deviantkindle t1_jal4ff1 wrote
Reply to comment by discord-ian in [D] Are Genetic Algorithms Dead? by TobusFire
My fave has always been the radio antenna designed by a GA
SaltyStackSmasher OP t1_jal0r2l wrote
Reply to comment by CMUOresama in [D] backprop through beam sampling ? by SaltyStackSmasher
thanks a lot for this. will definitely take a look
cv4u t1_jakzhqj wrote
Reply to comment by LetterRip in [D] OpenAI introduces ChatGPT and Whisper APIs (ChatGPT API is 1/10th the cost of GPT-3 API) by minimaxir
Can LLMs be quantized to 8-bit or 4-bit?
ab3rratic t1_jakzbv6 wrote
Reply to [D] Are Genetic Algorithms Dead? by TobusFire
GAs are not great for expensive-to-evaluate functions. And those have become kind of relevant lately.
bjergerk1ng t1_jakt9hi wrote
Reply to comment by currentscurrents in [D] What are the most known architectures of Text To Image models ? by AImSamy
Source about Google using ViT?
currentscurrents t1_jalfj60 wrote
Reply to comment by lifesthateasy in [D] Blake Lemoine: I Worked on Google's AI. My Fears Are Coming True. by blabboy
How could we even tell if it was? You can't even prove to me that you're sentient.
We don't have tools to study consciousness, or an understanding of the principles it operates on.