[deleted] t1_j7e5rw0 wrote on February 6, 2023 at 3:12 AM

Google is a leader in DL research. That's a fact. They chose to keep most of their research internal because as above commenters said, they don't have much to gain through it - marketing and hype lasts only so long.

> It's about the UX

what UX? its just a normal frontend mate

> scalability

You do realize Google were serving LLMs before OAI was even hypothesized? Or that they have TPUs which are far more scalable and cost efficient, which could already rip major players apart.

> liability

OAI have fought nothing liability or legality-wise. They just remain in a gray area and hopes no one focuses on them (bad luck, they got caught in the AI art lawsuits too)

uotsca t1_j7dhgjb wrote on February 6, 2023 at 12:02 AM

Reply to [D] Is English the optimal language to train NLP models on? by MrOfficialCandy

No

MadScientist-1214 t1_j7dh2rw wrote on February 5, 2023 at 11:59 PM

Reply to [D] Is English the optimal language to train NLP models on? by MrOfficialCandy

From a linguistic perspective, no language is more efficient than another language. Switching to an Asian language like Chinese would not necessarily be a better representation for the neural network than English. Mandarin Chinese is a very analytical language with a low inflectional morphology, but it is no less complex. For example, it has a large number of modal particles that have no equivalent in English.

In linguistics, there are also attempts to convert languages into other forms of representation. The natural semantic metalanguage (NSM), for example, reduces words to a set of semantic primitives.

I am a bit more skeptical from what I have seen both in linguistics and in NLP.

gunshoes t1_j7dg38g wrote on February 5, 2023 at 11:51 PM

Reply to [D] Is English the optimal language to train NLP models on? by MrOfficialCandy

Depends on your problem space. If you're talking about NLP/Speech applications, English is the most popular simply because it's the most resources language available and has a larger market application.

Even then, most models only show good performance with prestige dialects. Minority dialects such as AAVE notorious suffer with modern models.

noobgolang t1_j7dff9f wrote on February 5, 2023 at 11:46 PM

Reply to [D] Is English the optimal language to train NLP models on? by MrOfficialCandy

What?

Zophike1 t1_j7de8p2 wrote on February 5, 2023 at 11:38 PM

Reply to [R] On the Expressive Power of Geometric Graph Neural Networks by chaitjo

Eliu plz ?

SoulflareRCC t1_j7d86om wrote on February 5, 2023 at 10:53 PM

Reply to 15 years old and bad at math [D] by Daniel_C_____

Just start learning college math now instead of waiting ..

Mechanical_Number t1_j7d4mlo wrote on February 5, 2023 at 10:27 PM

Reply to comment by AdFew4357 in Are PhDs in statistics useful for ML research? [D] by AdFew4357

I think the degree of "classical"/"foundational" relates to your exact thesis topic so it is hard to judge. And even then you can always put spins on it. For example a PhD thesis topic on"Reformulations of James-Stein estimators in the context of letf-censored data" would be indeed quite classical but then again if you want to focus on bio-themed ML applications, having a strong theoretical background on working with censored data. More directly, standard guidelines apply:

Collaborate with people outside your domain. It doesn't matter much if those are from the Biosciences, the Medical School or the Languages school. Show that while you are specialised, you can apply your specialism. (see point 3 too) This can also help to get a foot through the door for conference papers. (see point 2 too)
Publish multiple (non-junk) papers. That one, final year, awesome paper of Annal of Statistics might be the clutch 3-pointer for that junior faculty position but a steady research output stream even if less impactful shows you can deliver continuously. Early on it is actually quite hard particularly for non-specialists to evaluate the significance of a publication.
Code reasonably well. I am not talking C++ template meta-programming here, but be able to show that you can create an R or a Python package with some reasonable structure and quality. Extra point if you can use "ML tools" like JAX or PyTorch Lighting. You are not going to be lone gunman, you will be part of team. (Relates to point 1 a bit)
Know your ML fundamentals well. That's not that hard; you are a Stats PhD, being able to adequately explain GBMs or NNs Backprop or GMMs or Langangians means that someone from ML can talk shop with you. Yeah, you won't know particulars of AutoDiff or PPO; who cares? They won't probably know them either unless they are actively working on the matter.
Network. There is some network connectivity involved in hiring as well as in information diffusion. In addition, references matter. I remember out head of recruitment at my first job telling me that if I knew a good candidate I should let them know because for each position they would weed out internal referrals first (it makes sense, someone has already done "some screening").

rerroblasser t1_j7d4a7l wrote on February 5, 2023 at 10:25 PM

Reply to [N] GitHub CEO on why open source developers should be exempt from the EU’s AI Act by EmbarrassedHelp

So all they need to do is support repos that are geo blocked for the EU. Everyone else can move on with their lives. It's the GDPR all over again.

-UltraAverageJoe- t1_j7d18mw wrote on February 5, 2023 at 10:03 PM

Reply to comment by 7366241494 in [N] "I got access to Google LaMDA, the Chatbot that was so realistic that one Google engineer thought it was conscious. First impressions" by That_Violinist_18

I saw spoken examples of the full version during a Google presentation in college a few years ago and it was scary good. I wouldn’t have known it wasn’t a person if I hadn’t been told in advance.

[deleted] t1_j7cyayu wrote on February 5, 2023 at 9:43 PM

Reply to comment by DigThatData in [R] Extracting Training Data from Diffusion Models by pm_me_your_pay_slips

[removed]

Freed4ever t1_j7cr91y wrote on February 5, 2023 at 8:54 PM

Reply to comment by geeky_username in [N] "I got access to Google LaMDA, the Chatbot that was so realistic that one Google engineer thought it was conscious. First impressions" by That_Violinist_18

I don't work at Google, but I can see there are truths in it. Look at Waymo, they were the leader but now what? Their science might still be the best, but without taking the risk, and iterating (the engineering part), they will fall behind. ChatGPT might be the wake up call that they need. How they re-act in the next couple of years will define Google as a company.

AdFew4357 OP t1_j7cq0v7 wrote on February 5, 2023 at 8:46 PM

Reply to comment by tripple13 in Are PhDs in statistics useful for ML research? [D] by AdFew4357

Sorry, I only speak Bayesian.

tripple13 t1_j7cpqqm wrote on February 5, 2023 at 8:44 PM

Reply to Are PhDs in statistics useful for ML research? [D] by AdFew4357

Sure, could very well be.

Just have to leave all your p-values at the door.

suflaj t1_j7cpf1d wrote on February 5, 2023 at 8:42 PM

Reply to comment by candidhorse4 in What text to speech does this guy use? [R] by candidhorse4

Azure

This is due to 2 issues both of these have and Azure mitigates to an extent:

they both lack humanity, i.e. they can at most be convincing as human prompt readers, but not anything else
those without a better ear and headphones probably do not notice a certain ring those two have, which a human voice cannot replicate - it might be that this effect is added to make the voices sharper, but ultimately it will make people like me, as well as robovoice detectors be able to more easily distinguish them as TTS

candidhorse4 OP t1_j7cnaci wrote on February 5, 2023 at 8:28 PM

Reply to comment by suflaj in What text to speech does this guy use? [R] by candidhorse4

i dont think they have, so what do you think then as a whole, which one is the best in replicating the human voice with all its nuances?

I_will_delete_myself t1_j7cn00u wrote on February 5, 2023 at 8:26 PM

Reply to comment by Freed4ever in [N] "I got access to Google LaMDA, the Chatbot that was so realistic that one Google engineer thought it was conscious. First impressions" by That_Violinist_18

They benefit from releasing the paper because it gives other researchers inspiration and allows Google to get free R&D. The researcher then releases another paper and Google gets to benefit from that.

AdFew4357 OP t1_j7ckwcf wrote on February 5, 2023 at 8:12 PM

Reply to comment by Mechanical_Number in Are PhDs in statistics useful for ML research? [D] by AdFew4357

How would you recommend a student like me, whose a phd student in statistics, to be marketable for industry related ML research? I’m worried that in my time as a phd statistics student, my work will be too “classical” and “foundational” and lie more in the statistics domain rather than ML, and not be attractive for recruiters in the ML research space. How would you advise myself to be come off as more of a ML researcher than a pure theoretical statistician? Just focus on more ML related applications in my research?

suflaj t1_j7ckp0d wrote on February 5, 2023 at 8:11 PM

Reply to comment by candidhorse4 in What text to speech does this guy use? [R] by candidhorse4

Yes. Although impressive in the number of languages and voices, it does not match Azure's more expressive prosody. I have listened to far too many robocalls, so that kind of magic is gone for me.

Someone else might consider it more humanlike, as it's all subjective. Have they published benchmark scores yet?

Recent comments in /f/MachineLearning