Recent comments in /f/MachineLearning

xt-89 t1_jbt5yyd wrote

1

Avelina9X t1_jbt4o8y wrote

So the attention mechanism has O(N^2) space and time complexity relative to sequence length. However, if you are memory constrained, it is possible to get the memory requirement down to O(N) by computing only one token at a time and caching the previous keys and values. This is only really possible at inference time, and it requires that the architecture be implemented with caching in mind.
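
A minimal sketch of that KV-caching idea, assuming single-head attention; the weight matrices and function name here are illustrative, not any particular library's API:

```python
import torch

def cached_attention_step(x_t, W_q, W_k, W_v, cache):
    """Attend for one new token, reusing cached keys and values.

    x_t:   (d_model,) embedding of the current token
    cache: dict with "k" and "v" tensors of shape (t, d_head)
    """
    q, k, v = x_t @ W_q, x_t @ W_k, x_t @ W_v  # each (d_head,)

    # Append this token's key/value; the cache grows to O(N) memory total.
    cache["k"] = torch.cat([cache["k"], k[None, :]], dim=0)
    cache["v"] = torch.cat([cache["v"], v[None, :]], dim=0)

    # The new token attends over all cached positions: O(N) work per step,
    # instead of rebuilding the full O(N^2) attention matrix.
    scores = (cache["k"] @ q) / cache["k"].shape[-1] ** 0.5
    weights = torch.softmax(scores, dim=0)
    return weights @ cache["v"]  # (d_head,)
```

Start with `cache = {"k": torch.empty(0, d_head), "v": torch.empty(0, d_head)}` and call this once per generated token.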

1

jsonathan t1_jbt3hqq wrote

This is really fascinating, thanks for sharing. I'm also working on generating natural language representations of Python packages. My approach is:

  1. Extract a call graph from the package, where each node is a function and two nodes are connected if one contains a call to the other.
  2. Generate natural language summaries of each function by convolving over the graph. This involves generating summaries of the terminal nodes (i.e. functions with no dependencies), then passing those summaries to their dependents to generate their summaries, and so on. Very similar to how message passing works in a GNN. The idea here is that summarizing what a function does isn't possible without summaries of what its dependencies do (see the sketch after this list).
  3. Summaries of each function within a file are chained to generate a summary of that file.
  4. Summaries of each file within a directory are chained to generate a summary of that directory, and so on until the root directory is reached.
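
Here's a rough sketch of step 2's bottom-up pass, assuming an acyclic call graph whose edges point from caller to callee; `summarize` is a placeholder for whatever LLM call does the actual summarization:

```python
import networkx as nx

def summarize(source: str, dep_summaries: list[str]) -> str:
    """Placeholder for an LLM call that summarizes `source`,
    given summaries of the functions it depends on."""
    ...

def summarize_package(call_graph: nx.DiGraph) -> dict[str, str]:
    # Nodes are assumed to carry a "source" attribute with the function body.
    summaries: dict[str, str] = {}
    # Reverse topological order: terminal nodes (no outgoing calls) come
    # first, so every function sees its dependencies' summaries before
    # its own turn.
    for fn in reversed(list(nx.topological_sort(call_graph))):
        deps = [summaries[callee] for callee in call_graph.successors(fn)]
        summaries[fn] = summarize(call_graph.nodes[fn]["source"], deps)
    return summaries
```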

I'd love to learn more about the differences/advantages of your approach compared to something like this. Thanks again for your contribution, this is insanely cool!

1

hak8or t1_jbt1wja wrote

/u/NovelspaceOnly Can you verify this?

As for /u/Main_Mathematician77, you are effectively a software developer with the ability to dabble in machine learning. Are you located in the States or elsewhere? It's puzzling how you could be broke with that skill set.

> Don’t @ me saying this is a waste of compute, I know what I’m doing and idgaf.

That is unnecessarily antagonistic/combative.

4

Simusid OP t1_jbsyp5n wrote

Yesterday I set up a paid account at OpenAI. I have been using the free sentence-transformers library and models for many months with good results. I compared the performance of the two by encoding 20K vectors from this repo: https://github.com/mhjabreel/CharCnn_Keras. I did no preprocessing or cleanup of the input text. The OpenAI model is text-embedding-ada-002 and the SentenceTransformer model is all-mpnet-base-v2. The plots are simple UMAP(), with all defaults.

I also built a very generic model with 3 dense layers, nothing fancy. I ran each model ten times for the two embeddings, fitting with EarlyStopping and evaluating with held-out data. The average results were HF 89% and OpenAI 91.1%. This is not rigorous or conclusive, but for my purposes I'm happy sticking with SentenceTransformers. If I need to chase decimal points of performance, I will use OpenAI.
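
For reference, a hedged sketch of that comparison; the OpenAI call uses the pre-1.0 `openai` client, and the placeholder names are made up:

```python
import numpy as np
import openai
import umap
from sentence_transformers import SentenceTransformer

openai.api_key = "sk-..."  # your key
texts = [...]              # the ~20K samples from the repo above, no cleanup

# Free local model
st_emb = SentenceTransformer("all-mpnet-base-v2").encode(texts)

# Paid OpenAI model (the real API caps batch size, so chunk 20K texts
# in practice)
resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
oa_emb = np.array([d["embedding"] for d in resp["data"]])

# Simple UMAP with all defaults, as in the plots
st_2d = umap.UMAP().fit_transform(st_emb)
oa_2d = umap.UMAP().fit_transform(oa_emb)
```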

Edit - The second graph should be titled "SentenceTransformer" not HuggingFace.

79

rainbow3 t1_jbsvc41 wrote

It is a bit like the Daleks: created by a mastermind who forgot about stairs. A tall mower with cameras will get stuck under low-hanging fruit trees. Ultrasound is fine for walls, but less so for flower beds.

There are a lot of unknowns even for one application.

Another example: I had one with a rain sensor so it avoided cutting wet grass. Sounds good, but if there are weeks of rain, the grass does not get cut until it is too long for the mower to handle.

1

NovelspaceOnly OP t1_jbsdr55 wrote

I have some preliminary generation scripts for SMILES chemical graphs, Feynman diagrams, storytelling with interleaved images, and testing compilation rates. Sorry for switching accounts; this one is logged in on my laptop lol.

1

science-raven OP t1_jbsbbyz wrote

An Nvidia Jetson Nano or Raspberry Pi can run AI object detection at about 2 FPS using YOLO code. A YOLO model can differentiate 80 different object classes, and you could run 20-50 different YOLO models to detect a few thousand different objects.

Traditional programming then copies the AI-detected objects onto a 3D map of the zone.
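
A hedged sketch of the detection side, using the ultralytics package; the model file and image name are assumptions:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")    # nano model, plausible for a Jetson/Pi
results = model("frame.jpg")  # or a camera stream, e.g. model(0)

for r in results:
    for box in r.boxes:
        label = model.names[int(box.cls)]
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        # These 2D detections are what would get projected
        # onto the 3D zone map.
        print(label, (x1, y1, x2, y2), float(box.conf))
```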

2

[deleted] t1_jbsafxc wrote

Thank you! Yes, I thought the topic tree would be a great complement to the commit tree. It would be great for stale repos with little to no documentation.

There's also the option to mix in multiple repositories and message-pass between them to help with brainstorming new features, or to message-pass between your repo and its dependencies.

1

xt-89 t1_jbsaabf wrote

I also plan on applying the basic idea of a GNN with prompting to the thought loop of a cognitive entity (basically Open Assistant). I believe that if you take the tree you're outputting for code and use it to aid CoT reasoning, that could be pretty powerful.

3