Recent comments in /f/MachineLearning
Alert_Ad2 t1_ja3m9od wrote
If you are asking this, it means your lab has not already answered this question for you.
Which means you are in a bad lab.
Which means your paper will most likely be rejected from ICML.
sanman t1_ja3kkkp wrote
Reply to comment by activatedgeek in [D] "Deep learning is the only thing that currently works at scale" by GraciousReformer
The first 2 links are the same - do you have the one for the inductive bias of CNNs?
whata_wonderful_day t1_ja3kh4d wrote
Reply to comment by CKtalon in [P] What are the latest "out of the box solutions" for deploying the very large LLMs as API endpoints? by johnhopiler
Yeah, this is what the big bois use. It'll give you max performance, but isn't exactly user-friendly.
MysteryInc152 OP t1_ja3hozj wrote
Reply to [R] Large language models generate functional protein sequences across diverse families by MysteryInc152
>Deep-learning language models have shown promise in various biotechnological applications, including protein design and engineering. Here we describe ProGen, a language model that can generate protein sequences with a predictable function across large protein families, akin to generating grammatically and semantically correct natural language sentences on diverse topics. The model was trained on 280 million protein sequences from >19,000 families and is augmented with control tags specifying protein properties. ProGen can be further fine-tuned to curated sequences and tags to improve controllable generation performance of proteins from families with sufficient homologous samples. Artificial proteins fine-tuned to five distinct lysozyme families showed similar catalytic efficiencies as natural lysozymes, with sequence identity to natural proteins as low as 31.4%. ProGen is readily adapted to diverse protein families, as we demonstrate with chorismate mutase and malate dehydrogenase.
AlbertoUEDev t1_ja39mi1 wrote
Reply to comment by AlbertoUEDev in [R] [N] VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion. by radi-cho
Are you in discord Nvidia devs?
AlbertoUEDev t1_ja39jhy wrote
Reply to comment by radi-cho in [R] [N] VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion. by radi-cho
Hello! I've been looking for a couple of weeks into adding segmentation to Unreal Engine! Let's see.
avocadoughnut t1_ja35pg6 wrote
Reply to comment by visarga in [P] [N] Democratizing the chatGPT technology through a Q&A game by coconautico
There's a risk of breaking the OpenAI TOS by training on their models' outputs. It's a hard no for this project, to ensure legal safety.
davidmezzetti OP t1_ja34bm6 wrote
Reply to comment by SatoshiNotMe in [P] Introducing txtchat, next-generation conversational search and workflows by davidmezzetti
Thanks, appreciate it. Not much I can do with down votes unless someone provides their rationale, which no one ever does.
davidmezzetti OP t1_ja345mn wrote
Reply to comment by dancingnightly in [P] Introducing txtchat, next-generation conversational search and workflows by davidmezzetti
Thank you.
This application is RAG with a local vector index combined with an LLM from the FLAN-T5 series of models.
The whole solution can be locally hosted with no remote runtime API dependencies.
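A toy sketch of the retrieval step in such a pipeline (word overlap stands in for the real vector index, and the function and document names are invented for illustration - this is not the actual txtchat code):

```python
# Toy retrieval step of a RAG pipeline. A real setup would embed documents
# with a sentence encoder, index the vectors, and pass the top hits to a
# FLAN-T5 model as context; word overlap stands in for vector similarity here.
def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    q = set(question.lower().split())
    # Score each document by how many query words it shares
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return scored[:k]

docs = [
    "txtchat pairs a local vector index with a FLAN-T5 language model",
    "The moon orbits the earth once every 27 days",
]
context = retrieve("which language model does txtchat use", docs)[0]
prompt = f"Answer using this context: {context}\nQuestion: which language model does txtchat use"
print(prompt)
```

The retrieved context is prepended to the question, so the LLM answers from local documents rather than from its parametric memory - which is what keeps the whole solution hostable without remote APIs.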
tradegreek t1_ja336tf wrote
Reply to [D] Simple Questions Thread by AutoModerator
I have just been testing out some machine learning as I am new to it. I have a simple dataset, currently 500k rows, where the target value is literally the sum of each row. I was using:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(4, input_dim=4, activation='elu'))
model.add(Dense(1, activation='linear'))

# Compile and train the model
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=4000, verbose=1, batch_size=120)

I then fed the model some unseen data to see if it could make the new calculations, again literally just summing up the values.

22000 - 11000 - 6000 - 1500 should equal 3500, but instead I got 3499.9915. The results for the other unseen data were all similar. I was wondering how I can fix this. I know AI models need a lot of data, but surely for something so trivial I would have expected it to get the values perfectly correct. My long-term goal is to build data validation through calculations, which is why I am practicing on such a basic model.
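As a sanity check (numpy-only sketch; the data here is randomly generated, not the poster's 500k rows): the sum target is exactly a linear map with unit weights and zero bias, and a closed-form least-squares fit recovers it to machine precision, which suggests the residual error comes from approximate gradient training in float32, not from the task needing more data:

```python
import numpy as np

# The target is literally the sum of the 4 inputs, so the optimal linear map
# has weights [1, 1, 1, 1] and bias 0. A closed-form least-squares fit
# recovers it essentially exactly; an SGD-trained net only approximates it.
rng = np.random.default_rng(0)
X = rng.uniform(-11000, 22000, size=(500, 4))
y = X.sum(axis=1)

A = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # solve min ||A @ coef - y||
w, b = coef[:4], coef[4]

pred = np.array([22000.0, -11000.0, -6000.0, -1500.0]) @ w + b
print(w.round(6), round(float(pred), 4))      # weights ~1.0, prediction ~3500.0
```

For exact arithmetic on data, a plain computation or a fitted linear model is the right tool; a neural net trained with MSE will always leave a small residual like the 0.0085 seen above.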
SatoshiNotMe t1_ja2swra wrote
Reply to [P] Introducing txtchat, next-generation conversational search and workflows by davidmezzetti
This is very interesting. Ignore the down-voters. Thank you for sharing 🙏
visarga t1_ja2r2fe wrote
Wouldn't it be better if people could donate their interactions with chatGPT, BingChat and other models? Make a scraping extension that collects chat logs and anonymises them. Then you'd get a diverse distribution of real-life tasks.
I suspect this is the reason OpenAI and Bing offered their models for free to the public - to find the real distribution of tasks people want to solve with AI bots.
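A minimal sketch of the anonymisation step such an extension would need (the regexes here are illustrative and far from complete - real PII scrubbing would also need to handle names, addresses, URLs, and much more):

```python
import re

# Illustrative PII scrub for donated chat logs: mask email addresses and
# long digit runs (phone numbers, account IDs) with placeholders.
def anonymise(text: str) -> str:
    text = re.sub(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "[EMAIL]", text)
    text = re.sub(r"\b\d{7,}\b", "[NUMBER]", text)
    return text

print(anonymise("Reach me at jane.doe@example.com or 5551234567"))
# → Reach me at [EMAIL] or [NUMBER]
```

In practice the scrubbing would have to happen client-side, before the log leaves the donor's browser, for the privacy promise to hold.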
LazerStallion t1_ja2r26r wrote
Reply to comment by mc-powzinho in [D] Simple Questions Thread by AutoModerator
Not necessarily - could just be a huge data file. Maybe pandas can read in parts of it at a time? I'm not sure, but it could be worth looking into.
rhineroceraptor t1_ja2nojn wrote
Reply to comment by savage_slurpie in [D] Looking for someone to do a small coding job by Brunt__
I don’t know man, I think I could do it for $120,000
cvnh t1_ja2n6lj wrote
Reply to comment by radi-cho in [R] [N] VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion. by radi-cho
That's awesome, I was just getting started in doing something similar but starting with even simpler geometries. Fantastic work.
londons_explorer t1_ja2m7hi wrote
Reply to [R] [N] VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion. by radi-cho
If I'm understanding this paper correctly... This technique doesn't work if there are any moving objects in any of the camera scenes?
dancingnightly t1_ja2lfup wrote
Reply to comment by davidmezzetti in [P] Introducing txtchat, next-generation conversational search and workflows by davidmezzetti
Is this current version mostly RAG + WebGPT semantic search to GPT answer, then?
Big fan of your recent work.
[deleted] t1_ja2izup wrote
alterframe t1_ja2f7xu wrote
Part of the answer is probably that DL is not a single algorithm or a class of algorithms, but rather a framework or a paradigm for building such algorithms.
Sure, you can take a SOTA model for ImageNet and apply it to similar image classification problems, by tuning some hyperparameters and maybe replacing certain layers. However, if you want to apply it to a completely different task, you need to build a different neural network.
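The "reuse the trunk, swap the head" pattern behind that kind of adaptation can be sketched as follows (numpy-only toy; the tiny random "trunk" stands in for a pretrained ImageNet backbone, and all shapes are invented for illustration):

```python
import numpy as np

# Transfer-learning sketch: keep a frozen feature extractor ("trunk") and
# replace only the task-specific classification head.
rng = np.random.default_rng(0)
W_trunk = rng.standard_normal((64, 16))            # frozen pretrained features
W_head_imagenet = rng.standard_normal((16, 1000))  # original 1000-class head
W_head_new = rng.standard_normal((16, 10))         # new head for a 10-class task

def forward(x: np.ndarray, head: np.ndarray) -> np.ndarray:
    features = np.maximum(x @ W_trunk, 0.0)        # shared trunk + ReLU
    return features @ head                         # task-specific head

x = rng.standard_normal((2, 64))
print(forward(x, W_head_imagenet).shape, forward(x, W_head_new).shape)
# (2, 1000) (2, 10)
```

Only the new head needs training on the new task, which is why "similar" tasks transfer cheaply - whereas a genuinely different task (segmentation, sequence modelling) forces a different architecture, not just a different head.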
WarAndGeese t1_ja2duq1 wrote
Great stuff
D4rkthorn t1_ja2d441 wrote
Expect to get rejected. Don't take the conference too seriously; they are usually filled with people who feel very important, so it can be necessary to drink your brains out to get through it.
If you get accepted remember to make a poster.
mc-powzinho t1_ja28i97 wrote
Reply to comment by LazerStallion in [D] Simple Questions Thread by AutoModerator
Now I know I have a terrible machine, I guess.
radi-cho OP t1_ja27qnn wrote
Reply to [R] [N] VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion. by radi-cho
Paper: https://arxiv.org/pdf/2302.12251.pdf GitHub: https://github.com/nvlabs/voxformer
Abstract: Humans can easily imagine the complete 3D geometry of occluded objects and scenes. This appealing ability is vital for recognition and understanding. To enable such capability in AI systems, we propose VoxFormer, a Transformer-based semantic scene completion framework that can output complete 3D volumetric semantics from only 2D images. Our framework adopts a two-stage design where we start from a sparse set of visible and occupied voxel queries from depth estimation, followed by a densification stage that generates dense 3D voxels from the sparse ones. A key idea of this design is that the visual features on 2D images correspond only to the visible scene structures rather than the occluded or empty spaces. Therefore, starting with the featurization and prediction of the visible structures is more reliable. Once we obtain the set of sparse queries, we apply a masked autoencoder design to propagate the information to all the voxels by self-attention. Experiments on SemanticKITTI show that VoxFormer outperforms the state of the art with a relative improvement of 20.0% in geometry and 18.1% in semantics and reduces GPU memory during training by ~45% to less than 16GB.
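A toy numpy illustration of the second-stage idea described in the abstract - mask tokens standing in for unobserved voxels get updated by attending over all tokens (dimensions, the identity projections, and the single attention step are all invented for illustration; nothing here is the actual VoxFormer code):

```python
import numpy as np

def self_attention(tokens: np.ndarray) -> np.ndarray:
    # Single-head attention with identity Q/K/V projections, for illustration
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)  # softmax: rows sum to 1
    return weights @ tokens

rng = np.random.default_rng(0)
visible = rng.standard_normal((5, 8))       # sparse visible/occupied voxel queries
mask_token = rng.standard_normal((1, 8))    # shared mask token (learnable in the paper)
hidden = np.repeat(mask_token, 20, axis=0)  # occluded/empty voxel positions
tokens = np.vstack([visible, hidden])       # 25 voxels total

out = self_attention(tokens)                # visible features propagate to all voxels
print(out.shape)  # (25, 8)
```

After the attention step every voxel position, including the initially masked ones, carries a mixture of the visible features - the densification idea the paper builds its second stage on.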
stevevaius t1_ja27q2v wrote
Reply to comment by pommedeterresautee in [P] Get 2x Faster Transcriptions with OpenAI Whisper Large on Kernl by pommedeterresautee
Thanks. For simply uploading a wav file and transcribing it, is there any implementation on Colab? Sorry to bother you. I am working with whisper.cpp, but the large model is not fast at streaming. Looking to solve this issue with faster methods.
coconautico OP t1_ja3nvs7 wrote
Reply to comment by visarga in [P] [N] Democratizing the chatGPT technology through a Q&A game by coconautico
I have manually copy-pasted a few interesting questions (i.e., my input) that I had previously asked chatGPT, ones that encouraged lateral thinking or required specialized knowledge.
However, I'm not so sure it would be a good idea to load thousands of questions indiscriminately, because just as we wouldn't phrase a question on Reddit the same way we would in person, when we ask a question of chatGPT (or Google) we slightly modify the way we talk to account for the weaknesses of the system. And given that we are looking for a high-quality dataset of natural conversations, I don't think this would be a very good strategy in the short term.
Moreover, we also have to consider that the project prioritizes quality above all else, and unless the number of volunteers ranking questions/replies increases considerably, the ratio of trees ready to be exported wouldn't increase much either.