Recent comments in /f/MachineLearning
danielbln t1_j7ovvql wrote
Reply to comment by astrange in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
What we all want is for Alexa/Siri/Home to have modern LLM conversational features, in addition to reliably turning our lights on/off or giving us the weather. Ever since ChatGPT came out, interacting with a home assistant feels even more like pulling teeth than it used to.
ramv0001 OP t1_j7ovtm5 wrote
Reply to comment by mikljohansson in [Discussion] Best practices for taking deep learning models to bare metal MCUs by ramv0001
Something like the ultra-low-power ARC EM series.
Parzival_007 t1_j7otlv7 wrote
Good work! I'll give it a read and share any feedback!
nmfisher t1_j7osgdc wrote
Reply to comment by gunshoes in [D] Which is the fastest and lightweight ultra realistic TTS for real-time voice cloning? by akshaysri0001
FS2 is fine for training a TTS model from scratch, but I haven't come across a good FS2 model for cloning (which is basically zero-shot TTS).
mjaltthrowaway t1_j7oqeo1 wrote
Reply to [D] What techniques can I use to tell if a problem is likely enough to be solved by ML so as to justify compiling the dataset? by SnuggleWuggleSleep
I suppose the first question that comes to mind for me is: what problems exist in the vaccine world (besides poor customer sentiment) that ML/AI could potentially solve or enhance? Personalization?
Maybe OP can use a similar method of analysis.
Large_Ordinary_8151 t1_j7opk1s wrote
Reply to [D] What do you think about this 16 week curriculum for existing software engineers who want to pursue AI and ML? by Imaginary-General687
If each of these weeks includes a practical project with real-life data, then it sounds very interesting. I did an MSc in applied math, so this course would suit me well. Not sure about CS people with little or no math/statistics background.
UnusualClimberBear t1_j7opc2r wrote
Reply to comment by UnusualClimberBear in Model/paper ideas: reinforcement learning with a deterministic environment [D] by EmbarrassedFuel
Also, if your world is deterministic but you cannot build a good model of it, you may be close to the situation of games such as Go, and Monte Carlo Tree Search algorithms are an option to consider (variants of UCT with or without function approximation).
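For concreteness, the heart of UCT is just the UCB1 score used to pick which child node to descend into. A minimal sketch (the function name, variable names, and exploration constant are placeholders, not tied to any particular implementation):

```python
import math

def ucb1_score(child_value_sum, child_visits, parent_visits, c=math.sqrt(2)):
    # UCT child selection: average observed value (exploitation) plus an
    # exploration bonus that shrinks as the child accumulates visits
    if child_visits == 0:
        return float("inf")  # always try unvisited children first
    exploitation = child_value_sum / child_visits
    exploration = c * math.sqrt(math.log(parent_visits) / child_visits)
    return exploitation + exploration
```

In a deterministic environment the same action from the same state always leads to the same node, so the search tree can be reused aggressively between decisions.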
serge_cell t1_j7op8l0 wrote
Reply to [D] What do you think about this 16 week curriculum for existing software engineers who want to pursue AI and ML? by Imaginary-General687
In my experience, many software engineers have forgotten most of their linear algebra and calculus, if they knew them in the first place. Some have also forgotten probability/statistics. If there are no prerequisites for participants, the course should start by refreshing those areas.
Narabedla t1_j7onh4s wrote
Reply to [D] What do you think about this 16 week curriculum for existing software engineers who want to pursue AI and ML? by Imaginary-General687
For the future, I'd advise either not forcing black text on a transparent background (just use a white background then) or not forcing a font color at all. Sadly I can't really read it, as I'm using night mode.
Hopefully some light-mode users can help you :)
TaXxER t1_j7omop6 wrote
Reply to comment by JustOneAvailableName in [N] Getty Images sues AI art generator Stable Diffusion in the US for copyright infringement by Wiskkey
Generative models do redistribute though, often outputting near copies:
https://arxiv.org/pdf/2203.07618.pdf
Copyright does not only cover republishing; it also covers derivative works. I think it is a very reasonable position to consider any generative model output o on which some training set image Xi had a particularly large influence to be a derivative work of Xi.
A similar story holds for code generation models and software licensing: Copilot was trained on lots of software repos whose licenses require all derived work to be licensed under an at least equally permissive license. Copilot may very well output a specific code snippet based largely on what it has seen in a particular repo, thereby potentially subjecting the user to the licensing constraints that come with deriving work from that repo.
I’m an applied industry ML researcher myself, and am very enthusiastic about the technology and state of ML. But I also think that, as a field, we have unfortunately been careless about ethical and legal aspects.
_Arsenie_Boca_ OP t1_j7ommq8 wrote
Reply to comment by PassingTumbleweed in [D] Papers that inject embeddings into LMs by _Arsenie_Boca_
Yes, seamless joint training is definitely one of the perks. I'll look further to see if I can find anything about the effectiveness of different injection/fusion mechanisms.
JustOneAvailableName t1_j7oknmi wrote
Reply to comment by TaXxER in [N] Getty Images sues AI art generator Stable Diffusion in the US for copyright infringement by Wiskkey
Copyright is about redistribution, and we're talking about publicly available data. I don't want/need to give consent to specific people/companies to allow them to read this comment. Nor do I think it should now be up to reddit to decide what is and isn't allowed.
mikljohansson t1_j7ok819 wrote
What kind of MCU are you targeting? It depends a lot on the capabilities of the MCU: how fast it is, how much memory it has, whether it has a dedicated NPU/TPU, vector instructions, ...
Brian-Hose225 t1_j7ojw66 wrote
Reply to comment by Costinesti in [P] We've built ChatGPT for your pdf files. by [deleted]
Thanks
Brian-Hose225 t1_j7ojvsb wrote
Reply to comment by wkmowgli in [P] We've built ChatGPT for your pdf files. by [deleted]
Me too
crazymonezyy t1_j7ojv39 wrote
Reply to comment by new_name_who_dis_ in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
> But yes, anyone reading please don't use ChatGPT instead of google search unless you don't care about the responses being made up.
The general public is not reading this sub, and ChatGPT is being sold to them by marketing and sales hacks without this disclaimer. We're way past the point of PSAs.
TaXxER t1_j7ojt22 wrote
Reply to comment by JustOneAvailableName in [N] Getty Images sues AI art generator Stable Diffusion in the US for copyright infringement by Wiskkey
As much as I like ML, it’s hard to argue that training ML models on data without consent, let alone on copyrighted data, would somehow be OK.
mikljohansson t1_j7ojjjm wrote
I have been building a PyTorch > ONNX > TFLite > TFMicro toolchain for a project to get a vision model running on an ESP32-CAM with PlatformIO and the Arduino framework. Perhaps it could be of use as a reference:
https://github.com/mikljohansson/mbot-vision
Some caveats to consider when embarking on this kind of project:
- PyTorch/ONNX uses channels-first memory format, while TensorFlow is channels-last. Converting the model with onnx-tf inserts lots of Transpose ops into the graph, which decreases performance (by 3x for my model) and increases memory usage. I'm using the onnx2tf module instead, which also converts operators to channels-last (there's a rough sketch of the full conversion pipeline after this list).
- You may want to fully quantize the model to int8, since fp16/fp32 is really slow on smaller MCUs, especially those lacking FPUs and vector instructions. Watch out for Quantize/Dequantize ops in the converted graph: they mean some op didn't support quantization and had to be wrapped and executed (slowly) in fp16/fp32 mode.
- There may be lots of performance to gain from hardware-optimized kernels, but it depends on the MCU and on which operators your model uses. E.g. for ESP32 there's ESP-NN, which greatly sped up inference times for my project (2x):
https://github.com/espressif/esp-nn https://github.com/espressif/tflite-micro-esp-examples
And for really tiny MCUs there's this library, which could perhaps be useful; it doesn't support that many operators, but it did work in my testing for simple networks:
https://github.com/sipeed/TinyMaix
- How to figure out memory needs and performance: this is a bit trickier. I've simply been using, for example, the torchinfo module, plus the graph output and statistics that onnx2tf displays, to see how many muls the model uses and the approximate parameter and tensor memory usage. Then I've had an improvement cycle: "train" the model for one step, deploy it to the hardware to measure the FPS, then adjust the hyperparameters and model architecture until the FPS is acceptable. Then train it fully to see if that model config can do the job. And then iterate...
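To make the above concrete, here's a rough sketch of the export/convert/quantize steps in Python. It's not the exact code from my repo; `model`, the input shape, and `rep_images` are placeholder assumptions:

```python
# Rough sketch of the PyTorch -> ONNX -> TFLite(int8) pipeline.
# `model` and `rep_images` are placeholders: a trained torch.nn.Module
# and an iterable of typical float32 NHWC inputs, respectively.
import torch
import torchinfo
import onnx2tf
import tensorflow as tf

# 0. Estimate parameter count and mult-adds before converting
torchinfo.summary(model, input_size=(1, 3, 96, 96))

# 1. Export to ONNX (channels-first / NCHW)
model.eval()
dummy = torch.randn(1, 3, 96, 96)
torch.onnx.export(model, dummy, "model.onnx", opset_version=13)

# 2. Convert to a TensorFlow SavedModel; onnx2tf also transposes the
#    graph to channels-last, avoiding the Transpose ops onnx-tf inserts
onnx2tf.convert(input_onnx_file_path="model.onnx",
                output_folder_path="saved_model")

# 3. Fully quantize to int8 for TFLite Micro; the representative dataset
#    is used to calibrate activation ranges
def representative_dataset():
    for img in rep_images:
        yield [img]

converter = tf.lite.TFLiteConverter.from_saved_model("saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
with open("model.tflite", "wb") as f:
    f.write(converter.convert())
```

The representative dataset only needs a few dozen typical inputs; without it the converter can't calibrate ranges and will fall back to float ops.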
MLRecipes OP t1_j7oec5y wrote
Reply to comment by thiru_2718 in [N] New Book on Synthetic Data: Version 3.0 Just Released by MLRecipes
No, it does encompass GLMs, but the technique also works when there is no response variable (you then need to put constraints on the parameters), or with truly nonlinear models, with time series examples in the book. Or for particular clustering cases. I like to call it unsupervised regression, but a particular case with an appropriate constraint on the parameters corresponds to classic regression. More about it here. As for shape classification, see here.
astrange t1_j7oduw3 wrote
Reply to comment by MysteryInc152 in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
This is wishful thinking. ChatGPT, being a computer program, doesn't have features it's not designed to have, and it's not designed to have this one.
(By designed, I mean has engineering and regression testing so you can trust it'll work tomorrow when they redo the model.)
I agree a fine tuned LLM can be a large part of it, but virtual assistants already have LMs and obviously don't always work that well.
Fit-Meet1359 t1_j7oa9yf wrote
Reply to comment by infinity in [N] Microsoft announces new "next-generation" LLM, will be integrated with Bing and Edge by currentscurrents
You will be able to expand the sidebar thing, or go directly to the Chat tab, to talk to it in full screen just like ChatGPT. The search page sidebar is only there to make the new experience more visible. See https://medium.com/@owenyin/scoop-oh-the-things-youll-do-with-bing-s-chatgpt-62b42d8d7198
[deleted] t1_j7o8z4x wrote
[deleted]
thiru_2718 t1_j7o82dn wrote
Nice work! There are some intriguing sections here that I definitely want to take a look at.
Quick question, with regards to this quote in the preface: "For instance, regression techniques ... are presented as a single method, without using advanced linear algebra."
Are you referring to Generalized Linear Models? I don't see any references to GLMs in my brief skim, but I can't think of how else regression could be presented as a single method.
Also, is there any place where we can get a preview of "Shape Classification and Synthetization via Explainable AI" section?
infinity t1_j7o28ve wrote
Reply to [N] Microsoft announces new "next-generation" LLM, will be integrated with Bing and Edge by currentscurrents
Is it just me who finds the clunky UX of Bing underwhelming? Ditto for you.com, which fails to generate anything for me 50% of the time. I wish these companies spent some time thinking about the chat UX as they integrate it with search. ChatGPT has a really great, simple UX and works really well for some use cases I like.
ramv0001 OP t1_j7ovy25 wrote
Reply to comment by mikljohansson in [Discussion] Best practices for taking deep learning models to bare metal MCUs by ramv0001
Yes, completely agree on the onnx2tf.
Have you tried using emulators instead of actual hardware?