Recent comments in /f/MachineLearning
tripple13 t1_j7dv4j7 wrote
Reply to comment by AdFew4357 in Are PhDs in statistics useful for ML research? [D] by AdFew4357
You're hired!
candidhorse4 OP t1_j7dscul wrote
Reply to comment by suflaj in What text to speech does this guy use? [R] by candidhorse4
which azure voices are the most realistic?
[deleted] t1_j7dm6bs wrote
AngelKitty47 t1_j7dluel wrote
I think our brains don't think in language but use language to describe our thoughts, so how do you teach a machine to think thoughts?
danielgafni t1_j7dkl6x wrote
English is a pretty simple language compared to other popular languages. Not sure why you think it's more complex than Chinese…
Competitive-Rub-1958 t1_j7dhy93 wrote
Reply to comment by Freed4ever in [N] "I got access to Google LaMDA, the Chatbot that was so realistic that one Google engineer thought it was conscious. First impressions" by That_Violinist_18
Google is a leader in DL research. That's a fact. They chose to keep most of their research internal because, as the commenters above said, they don't have much to gain from it - marketing and hype last only so long.
> It's about the UX
What UX? It's just a normal frontend, mate.
> scalability
You do realize Google was serving LLMs before OAI was even conceived? Or that they have TPUs, which are far more scalable and cost-efficient and could already rip major players apart?
> liability
OAI hasn't fought anything liability- or legality-wise. They just remain in a gray area and hope no one focuses on them (bad luck - they got caught up in the AI art lawsuits too).
uotsca t1_j7dhgjb wrote
MadScientist-1214 t1_j7dh2rw wrote
From a linguistic perspective, no language is more efficient than another. Switching to an Asian language like Chinese would not necessarily give the neural network a better representation than English. Mandarin Chinese is a highly analytic language with little inflectional morphology, but it is no less complex. For example, it has a large number of modal particles that have no equivalent in English.
In linguistics, there are also attempts to convert languages into other forms of representation. The natural semantic metalanguage (NSM), for example, reduces words to a set of semantic primitives.
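To make the NSM idea concrete, here is a toy sketch (my own made-up decompositions, not canonical NSM explications): each word maps to a bundle of semantic primitives, and words can then be compared through the primitives they share.

```python
# Toy illustration of reductive paraphrase: words represented as
# bundles of semantic primitives. The primitive sets below are
# invented for illustration, not real NSM explications.
EXPLICATIONS = {
    "kill": {"do", "something", "because", "die"},
    "die": {"happen", "something", "bad", "not", "live"},
    "murder": {"do", "something", "because", "die", "want", "bad"},
}

def shared_primitives(a, b):
    """Return the primitives two words have in common."""
    return EXPLICATIONS[a] & EXPLICATIONS[b]

# In this toy model, "murder" is "kill" plus intent:
# its primitive set contains all of kill's primitives.
print(EXPLICATIONS["kill"] <= EXPLICATIONS["murder"])  # True
```

The point is only that such a representation makes semantic overlap computable; whether it helps a neural model is exactly the open question the comment raises.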
I am a bit more skeptical from what I have seen both in linguistics and in NLP.
gunshoes t1_j7dg38g wrote
Depends on your problem space. If you're talking about NLP/speech applications, English is the most popular simply because it's the language with the most resources available and has a larger market application.
Even then, most models only show good performance on prestige dialects. Minority dialects such as AAVE notoriously suffer with modern models.
noobgolang t1_j7dff9f wrote
What?
Zophike1 t1_j7de8p2 wrote
Eliu plz?
SoulflareRCC t1_j7d86om wrote
Reply to 15 years old and bad at math [D] by Daniel_C_____
Just start learning college math now instead of waiting.
Mechanical_Number t1_j7d4mlo wrote
Reply to comment by AdFew4357 in Are PhDs in statistics useful for ML research? [D] by AdFew4357
I think the degree of "classical"/"foundational" relates to your exact thesis topic, so it is hard to judge. And even then, you can always put a spin on it. For example, a PhD thesis topic on "Reformulations of James-Stein estimators in the context of left-censored data" would indeed be quite classical, but then again, if you want to focus on bio-themed ML applications, a strong theoretical background in working with censored data is a real asset. More generally, standard guidelines apply:
- Collaborate with people outside your domain. It doesn't matter much if those are from the biosciences, the medical school, or the languages school. Show that while you are specialised, you can apply your specialism (see point 3 too). This can also help get a foot in the door for conference papers (see point 2 too).
- Publish multiple (non-junk) papers. That one awesome final-year paper in the Annals of Statistics might be the clutch 3-pointer for that junior faculty position, but a steady research output stream, even if less impactful, shows you can deliver continuously. Early on, it is actually quite hard, particularly for non-specialists, to evaluate the significance of a publication.
- Code reasonably well. I am not talking C++ template metaprogramming here, but be able to show that you can create an R or a Python package with some reasonable structure and quality. Extra points if you can use "ML tools" like JAX or PyTorch Lightning. You are not going to be a lone gunman; you will be part of a team. (Relates to point 1 a bit.)
- Know your ML fundamentals well. That's not that hard; you are a Stats PhD. Being able to adequately explain GBMs or NN backprop or GMMs or Lagrangians means that someone from ML can talk shop with you. Yeah, you won't know the particulars of AutoDiff or PPO; who cares? They probably won't know them either unless they are actively working on the matter.
- Network. There is some network connectivity involved in hiring, as well as in information diffusion. In addition, references matter. I remember our head of recruitment at my first job telling me that if I knew a good candidate I should let them know, because for each position they would go through internal referrals first (it makes sense: someone has already done "some screening").
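On the fundamentals point: "explain backprop" really just means being able to do the chain-rule bookkeeping by hand. As a minimal sketch (my own toy example, not from the thread), here is hand-derived gradient descent for a one-feature logistic regression:

```python
import math

def train_logistic(xs, ys, lr=0.5, epochs=200):
    """Fit a one-feature logistic regression with hand-derived gradients."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        dw = db = 0.0
        for x, y in zip(xs, ys):
            # Forward pass: sigmoid of the logit.
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            # Backward pass: for binary cross-entropy, the gradient of
            # the loss w.r.t. the logit is (p - y); the chain rule then
            # gives the gradients w.r.t. w and b.
            dw += (p - y) * x
            db += (p - y)
        # Average the gradients and take one descent step.
        w -= lr * dw / len(xs)
        b -= lr * db / len(xs)
    return w, b

# Toy separable data: label is 1 exactly when x > 0.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
```

This is the whole idea behind NN backprop, just with one layer; AutoDiff frameworks mechanize the same chain-rule pass over deeper graphs.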
rerroblasser t1_j7d4a7l wrote
Reply to [N] GitHub CEO on why open source developers should be exempt from the EU’s AI Act by EmbarrassedHelp
So all they need to do is support repos that are geo blocked for the EU. Everyone else can move on with their lives. It's the GDPR all over again.
-UltraAverageJoe- t1_j7d18mw wrote
Reply to comment by 7366241494 in [N] "I got access to Google LaMDA, the Chatbot that was so realistic that one Google engineer thought it was conscious. First impressions" by That_Violinist_18
I saw spoken examples of the full version during a Google presentation in college a few years ago and it was scary good. I wouldn’t have known it wasn’t a person if I hadn’t been told in advance.
[deleted] t1_j7cyayu wrote
Reply to comment by DigThatData in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips
[removed]
Freed4ever t1_j7cr91y wrote
Reply to comment by geeky_username in [N] "I got access to Google LaMDA, the Chatbot that was so realistic that one Google engineer thought it was conscious. First impressions" by That_Violinist_18
I don't work at Google, but I can see there are truths in it. Look at Waymo: they were the leader, but now what? Their science might still be the best, but without taking risks and iterating (the engineering part), they will fall behind. ChatGPT might be the wake-up call they need. How they react in the next couple of years will define Google as a company.
AdFew4357 OP t1_j7cq0v7 wrote
Reply to comment by tripple13 in Are PhDs in statistics useful for ML research? [D] by AdFew4357
Sorry, I only speak Bayesian.
tripple13 t1_j7cpqqm wrote
Sure, could very well be.
Just have to leave all your p-values at the door.
suflaj t1_j7cpf1d wrote
Reply to comment by candidhorse4 in What text to speech does this guy use? [R] by candidhorse4
Azure
This is due to 2 issues both of these have and Azure mitigates to an extent:
- they both lack humanity, i.e. they can at most be convincing as human prompt readers, but not anything else
- those without a good ear and headphones probably don't notice a certain ring those two have, which a human voice cannot replicate. This effect may be added to make the voices sharper, but ultimately it lets people like me, as well as robovoice detectors, more easily identify them as TTS
candidhorse4 OP t1_j7cnaci wrote
Reply to comment by suflaj in What text to speech does this guy use? [R] by candidhorse4
I don't think they have. So what do you think then, overall - which one is best at replicating the human voice with all its nuances?
I_will_delete_myself t1_j7cn00u wrote
Reply to comment by Freed4ever in [N] "I got access to Google LaMDA, the Chatbot that was so realistic that one Google engineer thought it was conscious. First impressions" by That_Violinist_18
They benefit from releasing the paper because it gives other researchers inspiration and allows Google to get free R&D. The researcher then releases another paper and Google gets to benefit from that.
AdFew4357 OP t1_j7ckwcf wrote
Reply to comment by Mechanical_Number in Are PhDs in statistics useful for ML research? [D] by AdFew4357
How would you recommend a student like me, who's a PhD student in statistics, make himself marketable for industry ML research? I'm worried that during my time as a statistics PhD student, my work will be too "classical" and "foundational" and lie more in the statistics domain than in ML, and won't be attractive to recruiters in the ML research space. How would you advise me to come off as more of an ML researcher than a pure theoretical statistician? Just focus on more ML-related applications in my research?
suflaj t1_j7ckp0d wrote
Reply to comment by candidhorse4 in What text to speech does this guy use? [R] by candidhorse4
Yes. Although impressive in the number of languages and voices, it does not match Azure's more expressive prosody. I have listened to far too many robocalls, so that kind of magic is gone for me.
Someone else might consider it more humanlike, as it's all subjective. Have they published benchmark scores yet?
[deleted] t1_j7e5rw0 wrote
Reply to [D] Is English the optimal language to train NLP models on? by MrOfficialCandy
[deleted]