Recent comments in /f/MachineLearning

[deleted] t1_j85flmh wrote

The problem I found with ChatGPT and other AI tools is the length limit. I believe it is roughly 4,000 tokens max, and that includes the summary as well.

If anyone knows a fix, please let me know. In the meantime, I use an AI tool called Scholarcy, but it is limited in how much material you can feed it. I study a subject that is *very* reading-heavy, so I can't simply rely on the abstract, and 100 pages per week per course is usually too much to handle while working part-time.

8

LetterRip t1_j85b07d wrote

Why not int4? Why not pruning? Why not various model compression tricks? int4 roughly halves latency compared to int8. At minimum they would do mixed int4/int8.

https://arxiv.org/abs/2206.01861
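For anyone unfamiliar, the core idea behind these int8/int4 schemes is just storing weights as small integers plus a scale factor. A minimal sketch of symmetric int8 quantization (pure Python, no framework; the example weights are made up):

```python
# Symmetric per-tensor int8 quantization: floats are mapped to the
# integer range [-127, 127] with a single shared scale factor.

def quantize_int8(weights):
    """Map float weights to int8 values plus a scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    return [round(w / scale) for w in weights], scale

def dequantize(q_weights, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in q_weights]

weights = [0.31, -1.27, 0.05, 0.88, -0.42]  # toy example values
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)

# int8 storage is 4x smaller than float32, and the rounding error
# per weight is bounded by scale / 2.
max_err = max(abs(a - w) for a, w in zip(approx, weights))
print(q)
print(max_err)
```

Real deployments (as in the ZeroQuant paper above) use finer-grained scales and quantize activations too, but the storage/bandwidth savings come from exactly this trick.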

Why not distillation?

https://transformer.huggingface.co/model/distil-gpt2
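The distillation idea, roughly: train a small student to match the teacher's softened output distribution rather than just the hard labels. A toy sketch of the soft-target loss (logits here are invented for illustration; DistilGPT-2's actual recipe also includes other loss terms):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with a temperature that softens the distribution."""
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened predictions against the
    teacher's softened targets (the soft-label part of distillation)."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student soft predictions
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

teacher = [3.0, 1.0, 0.2]  # made-up teacher logits
student = [2.5, 1.2, 0.4]  # made-up student logits
print(distillation_loss(teacher, student))  # shrinks as student mimics teacher
```

By Gibbs' inequality the loss is minimized when the student distribution matches the teacher's, which is what pushes the small model toward the big model's behavior.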

NVIDIA, using FasterTransformer and the Triton Inference Server, reports a 32x speedup over baseline GPT-J:

https://developer.nvidia.com/blog/deploying-gpt-j-and-t5-with-fastertransformer-and-triton-inference-server/

I think their assumptions are pessimistic by at least an order of magnitude.

As someone else notes, the vast majority of queries can be cached. There would also likely be a mixture-of-experts setup: no need for the heavy-duty model when a trivial model can answer the question.

5

endless_sea_of_stars t1_j858dvn wrote

> the abstract is often a bit clickbaity.

Had a vision of a nightmare future where papers are written in clickbait fashion.

Top Ten Shocking Properties of Positive Solutions of Higher Order Differential Equations and Their Astounding Applications in Oscillation Theory. You won't believe number 7!

78