Recent comments in /f/MachineLearning
[deleted] t1_j7kuh80 wrote
Reply to comment by dreternal in [D] A ML-powered music description/tag generator? (a reverse-MusicML)? by dreternal
[deleted]
AccidentBackground72 t1_j7kub42 wrote
Reply to comment by aicharades in [P] ChatGPT without size limits: upload any pdf and apply any prompt to it by aicharades
That's an incredibly helpful overview! For the kind of work I do this is a really awesome tool.
dfcHeadChair t1_j7ku6j3 wrote
Reply to comment by Feeling_Card_4162 in [D] What techniques can I use to tell if a problem is likely enough to be solved by ML so as to justify compiling the dataset? by SnuggleWuggleSleep
Yep I agree. If you learn a transferable skill that should be taken into account.
I was framing the problem in the same terms OP did.
WokeAssBaller t1_j7ktznh wrote
Reply to comment by emerging-tech-reader in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
https://arxiv.org/pdf/1706.03762.pdf the paper that made all this possible.
Google has also been leading research around transformers and NLP for some time. Not that the two don't share with each other in some ways.
KleinByte t1_j7ktb3x wrote
Reply to comment by memberjan6 in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
Competitive gaming would be ruined if this happened.
emerging-tech-reader t1_j7ksup6 wrote
Reply to comment by WokeAssBaller in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
> OpenAI is built on google research
To my knowledge that is not remotely true. Can you cite where you got that claim?
OpenAI does take funding from and share research with a number of AI-related companies. Don't know if Google is on that list.
marr75 t1_j7ksi6o wrote
Reply to comment by st8ic in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
They should be. I think LLMs will totally upset how content is indexed and accessed. It's one of the easiest and lowest-stakes use cases for them, really.
Unfortunately, Google has such a huge incumbent advantage that they could produce the 5th or 6th best search-specialized LLM and still be the #1 search provider.
[deleted] t1_j7kri0a wrote
Reply to comment by chief167 in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
[removed]
dreternal OP t1_j7krh8s wrote
Reply to comment by [deleted] in [D] A ML-powered music description/tag generator? (a reverse-MusicML)? by dreternal
No. Just a composer with a very large catalog, 85-90% of which I haven't had time to properly tag and describe for sales (using just song names or file names isn't enough; you need very, very specific tags related to the mood, tempo, audience, beats per minute, instrument list, and on and on for every song in order to have proper exposure in the various music libraries online). I've been too busy over the last 30 years writing stuff to bother with adding all this data. So with the advent of these machine learning tools, I'm hoping they can help.
[deleted] t1_j7kqwkw wrote
Reply to comment by dreternal in [D] A ML-powered music description/tag generator? (a reverse-MusicML)? by dreternal
I just don't understand what this is about selling a music catalog. Are you like a record label or something?
WokeAssBaller t1_j7kqhgl wrote
Reply to comment by emerging-tech-reader in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
Yeah right, OpenAI is built on Google research. And cool, you worked a half-functioning chatbot into the worst messaging and search app, congrats.
aicharades OP t1_j7kpzzn wrote
Reply to comment by AccidentBackground72 in [P] ChatGPT without size limits: upload any pdf and apply any prompt to it by aicharades
Of course! Step 1 breaks up your document and runs the prompt on each section. Try it with the Map section vs. Map Reduce (the main page).
Here's an example flow for Map:
1. Input a Book PDF
2. Convert the PDF to Text
3. Split the Book into Chunks: Book[pg1,pg2,pg3] -> pg1, pg2, pg3
4. Run the Prompt on Each Chunk: pg1, pg2, pg3 -> prompt(pg1), prompt(pg2), etc.
5. Output the Summarized Chunks
Here's a prompt you could use (lots of room for improvement!):
The words in <<*>> are comments; please remove them from the final prompt.
Goal: I'm trying to perform a content analysis of a document with 7 chapters and identify 10-15 core themes in each chapter.
Sample Map Prompt:
'INSTRUCTIONS': You are a writer <<BEST ROLE FIT??>> performing a content analysis of a document <<DOCUMENT TYPE??>>. You have been given a section of a larger document. You will identify up to 10-15 core themes in each chapter and output them.
'INPUT': {text}
'OUTPUT':
Sample Reduce Prompt:
'INSTRUCTIONS': You are a copyeditor. You will need to merge a list of summaries. Please combine the inputs and merge any duplicate core themes. Please maintain the context of the document.
'INPUT': {text}
'OUTPUT':
Sample Input: document
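The Map flow and the two sample prompts above could be wired together roughly like this. It's only a sketch under some assumptions: the PDF has already been converted to plain text, chunks are cut by character count rather than by page, and `run_prompt` is a hypothetical callable standing in for whatever model API you use (`fake_model` below just reports the prompt size so the script runs offline). The prompt strings are lightly adapted from the samples with the <<...>> notes removed.

```python
from typing import Callable, List

# Sample prompts from the comment, lightly adapted, with the <<...>> notes removed.
MAP_PROMPT = (
    "'INSTRUCTIONS': You are a writer performing a content analysis of a document. "
    "You have been given a section of a larger document. "
    "Identify up to 10-15 core themes in this section and output them.\n"
    "'INPUT': {text}\n"
    "'OUTPUT':"
)
REDUCE_PROMPT = (
    "'INSTRUCTIONS': You are a copyeditor. Merge the list of summaries, "
    "combining any duplicate core themes and keeping the context of the document.\n"
    "'INPUT': {text}\n"
    "'OUTPUT':"
)


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text into consecutive chunks of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def map_step(chunks: List[str], template: str,
             run_prompt: Callable[[str], str]) -> List[str]:
    """Run the prompt once per chunk (the Map step)."""
    return [run_prompt(template.format(text=chunk)) for chunk in chunks]


def reduce_step(outputs: List[str], template: str,
                run_prompt: Callable[[str], str]) -> str:
    """Combine the per-chunk outputs with one more prompt (the Reduce step)."""
    return run_prompt(template.format(text="\n".join(outputs)))


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model call: reports the prompt size."""
    return f"<model output for a {len(prompt)}-character prompt>"


if __name__ == "__main__":
    book_text = "page one text. " * 50  # placeholder for the extracted PDF text
    chunks = split_into_chunks(book_text, max_chars=200)
    per_chunk = map_step(chunks, MAP_PROMPT, fake_model)
    print(reduce_step(per_chunk, REDUCE_PROMPT, fake_model))
```

Map alone stops after `map_step` and returns one output per chunk. Map Reduce adds the final `reduce_step`, which only works if the combined per-chunk outputs still fit in a single prompt.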
emerging-tech-reader t1_j7kptn9 wrote
Reply to comment by WokeAssBaller in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
I got a demo of some of the stuff happening.
The most impressive one has GPT watching a meeting, taking minutes, and even drafting action items, emails, etc., all ready for you when you leave the meeting.
It will also suggest things to follow up on while the meeting is ongoing.
Google has become the AltaVista.
marvinv1 t1_j7kpqoj wrote
Reply to comment by harharveryfunny in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
Yup, OpenAI expects to generate $200 million in revenue for 2023 and $1 billion for next year.
Acceptable-Fudge-816 t1_j7kpbkm wrote
Reply to Does the high dimensionality of AI systems that model the real world tell us something about the abstract space of ideas? [D] by Frumpagumpus
The real world also can have thousands of dimensions. Time, color, hatred, tension in the room, air current, and anything you can possibly attribute to a position/thing.
At the end of the day it's just words, and their meaning depends on agreements. When we speak of the 3 dimensions, we mean the 3 dimensions of the physical world that we decided to define with 3 coordinates that help us know the position of something. Might as well have used complex numbers and kept it to 2 coordinates, or decided time should be included as part of the concept of position. So when you talk about "dimensions" in general, it may as well mean anything.
Feeling_Card_4162 t1_j7kp7rq wrote
Reply to comment by dfcHeadChair in [D] What techniques can I use to tell if a problem is likely enough to be solved by ML so as to justify compiling the dataset? by SnuggleWuggleSleep
This is a good way to get an idea of the financial benefit but it’s also important to think about the knowledge you’ll gain and how much other people would benefit from it when deciding whether to continue or not. There is more to determining if something is worth your time than just money.
MrEloi t1_j7kolg7 wrote
Reply to Wouldn’t it be a good idea to bring a more energy efficient language into the ML world to reduce the insane costs a bit?[D] by thedarklord176
The 'busy' core stuff will be written in a low-level, high-efficiency language.
dfcHeadChair t1_j7ko3pf wrote
Reply to [D] What techniques can I use to tell if a problem is likely enough to be solved by ML so as to justify compiling the dataset? by SnuggleWuggleSleep
- What is your best guess at how much money you'll make?
- What is your best guess at the amount of time, money, and effort it will take you to compile the dataset?
- Do the division and ask yourself if it's worth it.
The hard math is going to get you your answer. You may be able to do some fancy correlation mapping depending on the models you think will solve the problem and what data you will need. The trouble with the "shortcut" route is two-fold:
- It may take you longer than doing the three steps above.
- You might not get an accurate answer.
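As a rough worked version of the three steps, here's a sketch. It assumes you put time and effort into money terms via an hourly rate so the division has one unit; the function name and all the numbers are made up for illustration.

```python
def payoff_ratio(expected_revenue: float, hours: float,
                 hourly_rate: float, cash_costs: float = 0.0) -> float:
    """Expected revenue divided by the estimated cost of compiling the dataset.

    Time and effort are converted to money with an assumed hourly rate.
    """
    total_cost = hours * hourly_rate + cash_costs
    if total_cost <= 0:
        raise ValueError("estimated cost must be positive")
    return expected_revenue / total_cost


if __name__ == "__main__":
    # Hypothetical guesses: $5,000 expected revenue, 80 hours of work valued
    # at $40/hour, plus $300 in out-of-pocket costs.
    ratio = payoff_ratio(expected_revenue=5000, hours=80,
                         hourly_rate=40, cash_costs=300)
    print(f"payoff ratio: {ratio:.2f}")
```

A ratio above 1 means the guessed payoff exceeds the guessed cost. Whether the margin is worth it is still the judgment call from step three.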
harharveryfunny t1_j7knqfa wrote
Reply to comment by bartturner in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
OpenAI trained GPT on Microsoft Azure - it has zero to do with Google's TPU. While the "Attention Is All You Need" paper did come out of Google, it just built on models/concepts that came before. OpenAI have proven themselves plenty capable of innovating.
harharveryfunny t1_j7kmbzr wrote
Reply to comment by wood_orange443 in [N] Google: An Important Next Step On Our AI Journey by EducationalCicada
OpenAI just got a second round $10B investment from Microsoft, so that goes a ways ... They are selling API access to GPT for other companies to use however they like, and Microsoft has integrated Copilot (also GPT-based, fine-tuned for code generation) into their dev tools, and Microsoft is also integrating OpenAI's LLM tech into Bing. While OpenAI are also selling access to ChatGPT to end users, I doubt that's going to really be a focus for them or major source of revenue.
mostlyhydrogen OP t1_j7km5j2 wrote
Reply to comment by YOLOBOT666 in [D] Querying with multiple vectors during embedding nearest neighbor search? by mostlyhydrogen
Thanks for the offer! This is a work project, though. I'm working with images. I can't give too many details due to confidentiality, but we're at sub-billion-image scale.
Usability is determined by trained annotators. If they find an object of interest and want to harvest more training data, they do a reverse image search across the whole training data and tag true matches.
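For anyone following the thread, one simple way to query with several vectors at once is to score each indexed item by its best cosine similarity to any of the query vectors. This is a brute-force sketch of that idea, not our actual system: the function names and the tiny in-memory index are hypothetical, and at real scale you would likely want an approximate nearest neighbor index rather than a full scan.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def multi_query_search(queries: List[Sequence[float]],
                       index: Dict[str, Sequence[float]],
                       k: int) -> List[Tuple[str, float]]:
    """Return the k items whose best similarity to any query vector is highest."""
    if not queries:
        raise ValueError("need at least one query vector")
    scored = [
        (item_id, max(cosine(q, vec) for q in queries))
        for item_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for image embeddings.
    index = {"img_a": [1.0, 0.0], "img_b": [0.0, 1.0], "img_c": [-1.0, 0.0]}
    for item_id, score in multi_query_search([[1.0, 0.1], [0.1, 1.0]], index, k=2):
        print(item_id, round(score, 3))
```

Taking the max treats the query vectors as alternatives ("matches any of these"). Averaging the query vectors first is another common choice and behaves more like "matches all of these at once".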
dreternal OP t1_j7klu14 wrote
Reply to comment by [deleted] in [D] A ML-powered music description/tag generator? (a reverse-MusicML)? by dreternal
I do not know everything, nor have I listened to every artist or genre. My categorization and genre choices would be limited by my own experience. Having an assistant with that breadth would be a great timesaver and help me get hits I would otherwise miss.
DingWrong t1_j7klrp2 wrote
Reply to [D] Should I focus on python or C++? by NoSleep19
Focus on the basics. Math, algos, structures. The exact language is just a way to express these basics. Now, if you want to start coding right away, python is more common in ML atm.
EmbarrassedHelp t1_j7klnqy wrote
Reply to comment by _poisonedrationality in [N] Getty Images sues AI art generator Stable Diffusion in the US for copyright infringement by Wiskkey
If Getty Images wins, then AI generation tools are going to become further concentrated in a handful of companies while also becoming less open.
[deleted] t1_j7kulp5 wrote
Reply to comment by dreternal in [D] A ML-powered music description/tag generator? (a reverse-MusicML)? by dreternal
Yeah well good luck with that