Recent comments in /f/MachineLearning

PM_ME_ENFP_MEMES t1_jb73ky6 wrote

Well yeah, it's like dealing with any tradesman: some will be as reliable as a used car salesman, some will be superstars who'll make you a billionaire within a decade, and the rest lie somewhere between those two extremes. Shop around, speak to 5-10, and choose the one whose vision and ethos align with you and your goals. The good ones aren't cheap, but if they're bringing in business, their fee is simply a cost of doing business. How they bring in clients for you is their business; you literally don't need to worry about it, like using an appliance. If they don't bring in paying clients, dump them and try another. It's such a competitive field that you're bound to find suitable people to deal with.

As far as word of mouth referrals go, they'll be the best of course, but if you already have more work than you can handle that way, then you obviously won't need to advertise :)

Regardless of whether it's part time or full time, if you want to grow this business long term, I suggest studying up on entrepreneurship, advertising, sales, and related topics. Even free courses online can be helpful here, and your local government almost certainly runs courses for people in your position. Definitely worth looking into.

3

True_Toe_8953 t1_jb72i4c wrote

> Is this why some checkpoints / safetensors make for better results than stable diffusion's 1.5 and 2.1 weights?

I think this is because of a tradeoff between stylistic range and quality. A model is only so big, so the more styles it covers, the fewer parameters are available for each.

The base SD model is capable of a very wide range of styles, including a lot of abstract styles that no one ever uses. Most fine-tuned models only support a handful of popular styles (usually anime, digital paintings, or photographs), while other styles get merged into the dominant style and lost.

MidJourney has a wider range than most fine-tuned SD models but appears to be making the same tradeoff.

2

Mediocre-Bullfrog686 t1_jb71qkj wrote

Pixels labeled with ignore_index are meant to be ignored (e.g., pixels in the ground-truth image that the annotators are not sure about). It does not mean they belong to a "negative class". It is correct to exclude those pixels from the IoU computation.
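A minimal sketch of what that looks like in practice, assuming numpy and integer label maps; the function name and the ignore value of 255 are just illustrative conventions, not from any particular library:

```python
import numpy as np

def iou_per_class(pred, target, num_classes, ignore_index=255):
    # Mask out pixels whose ground-truth label is ignore_index:
    # they are uncertain annotations, not a negative class.
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return ious
```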

4

doctorjuice OP t1_jb6yzmv wrote

Yes, I think this is a good idea, and I have thought some about advertising. Will it give relevant enough, high-quality leads, though? In my experience, random connections, networking, forum conversations, etc. have led to some of the best, highest-quality leads.

I worry that advertising will either:

1. lead to too small a conversion rate, or
2. bring in leads that are too irrelevant, low quality, low paying, etc.

3

alterframe t1_jb6ye5w wrote

Has anyone noticed this with weight decay too?

For example here: GIST

It's as if a larger weight decay provides regularization that slows training, as we would expect, but a small weight decay makes training even faster than no decay at all. I wonder if it may be related.
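A rough sketch of the kind of comparison I mean, on a toy regression task with PyTorch's AdamW; the model, data, learning rate, and weight-decay values are placeholders, not the setup from the gist:

```python
import torch

torch.manual_seed(0)
X = torch.randn(512, 32)
y = X @ torch.randn(32, 1)

for wd in (1e-1, 1e-4, 0.0):  # large decay, small decay, no decay
    torch.manual_seed(1)  # identical init so only weight_decay differs
    model = torch.nn.Linear(32, 1)
    opt = torch.optim.AdamW(model.parameters(), lr=1e-2, weight_decay=wd)
    for _ in range(200):
        loss = torch.nn.functional.mse_loss(model(X), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"weight_decay={wd}: final training loss {loss.item():.5f}")
```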

1

pancomputationalist t1_jb6uc96 wrote

That's a solvable problem. Same discussion as with autopilots in cars.

With the human staff in hospitals getting thinner by the day, some people would rather trust an inexpensive machine than wait ages to talk to a human doctor who might not even be smarter than the machine. So I assume AI will grow in popularity in general.

9

thiru_2718 t1_jb6njez wrote

>supervised learning can teach a model to complete a human-defined task. But reinforcement learning can teach a model to choose its own tasks to complete arbitrary goals.

Isn't this contradicted by LLMs demonstrating emergent abilities (like meta-learning strategies or in-context learning) that allow them to tackle complex sequential tasks adaptively? There is research (e.g., https://innermonologue.github.io/) where LLMs are successfully applied to a traditional RL domain: planning and interaction for robots. While there is RLHF involved in models like ChatGPT, the bulk of the model's reasoning comes from supervised learning.
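For concreteness, in-context learning means the task is specified entirely in the prompt, with no weight updates; a toy sketch (the `complete` call is a placeholder for any LLM completion API, not a real function):

```python
# The model infers the translation task from the examples alone;
# no gradient steps are taken.
few_shot_prompt = (
    "Translate English to French:\n"
    "sea otter -> loutre de mer\n"
    "cheese -> fromage\n"
    "house ->"
)
# completion = complete(few_shot_prompt)  # a capable LLM returns " maison"
```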

As far as I can tell, the unexpected, emergent abilities of LLMs have somewhat rewritten our assumptions about what is possible through supervised learning, and those revised assumptions should be extended into the RL domain as well.

−1

InterlocutorX t1_jb6iw7y wrote

>The duplicates aren't perfect duplicates and are added to create more robust model results

This is incorrect, as anyone who looks at the LAION-5B aesthetic set can tell pretty easily. It contains plainly visible identical copies of images.

https://imgur.com/a/Mg2xZcT

And the noisy Stallone was an SD image, not an image from the dataset.

I looked at the images it has for Henry Cavill, and 6 out of 24 are the exact same Witcher promo shot, which is a quarter of the images it has of him.

Feel free to look for yourself:

https://laion-aesthetic.datasette.io/laion-aesthetic-6pls/
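If you want to verify the exact duplicates yourself, here's a quick sketch that flags byte-identical files by content hash, assuming you've downloaded a sample of the images into a local directory (the `laion_samples` path is hypothetical):

```python
import hashlib
from collections import defaultdict
from pathlib import Path

hashes = defaultdict(list)
for path in Path("laion_samples").glob("*.jpg"):  # hypothetical local dump
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    hashes[digest].append(path.name)

for digest, names in hashes.items():
    if len(names) > 1:  # identical bytes, i.e. exact duplicate images
        print(f"{len(names)} copies: {sorted(names)}")
```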

13

DataDrivenOrgasm t1_jb6fe6v wrote

I develop ML for medical devices. The integrated AI systems you are imagining are unlikely to be adopted for the foreseeable future.

First, the software in healthcare cannot be centralized. Every point of care has a LIMS (Laboratory Information Management System) for digitally managing lab results. Installing a modern diagnostic instrument involves communication with the LIMS. The problem is that virtually every clinic's LIMS is a bespoke creation of its IT staff. There are almost no standards for the form of data in these systems, so performing a LIMS integration at one site does not make the process any easier for the next site. Thus an integrated AI solution for a clinic would need to be tailored to that site, and very few sites would generate enough data on their own to train a modern ML solution.

Similarly, the number and types of diagnostic tests performed are very different between sites. Further, there are often dozens of commercial options for any given test. So two identical patients at different sites will have different lab tests performed, and those tests may have slightly different results/coverage based on the technology adopted by that lab.

While this may seem messy, it actually makes sense for the field. Healthcare needs vary widely among different geographic contexts. Hospital-acquired infections tend to be unique to specific sites. Common injuries/illnesses/etc also tend to vary with urban vs rural environments, and the local weather patterns and ecology.

For some types of healthcare where geography is not so important, specialized centers will meet much of the demand. There will be trauma centers and cancer centers that treat similar ailments for a large geographic area. Those centers will be the best places to develop integrated AI solutions, but those solutions will only work for other similar large centers.

Additionally, the regulatory and IP environment in healthcare is not conducive to integrated solutions. Diagnostic IP is fragmented across thousands of companies, and none of them will voluntarily cooperate to help develop standards for integration. Some large companies are marketing integrated solutions, but these function as wholesale replacements for specific lab workflows. Very few clinics have the funding required to replace their existing workflow all at once, and even these integrated workflows require extensive customization tailored to each site's needs. In the US, an integrated solution must go through the same regulatory process as the standalone tests, even if those tests are already approved by the FDA. This effectively doubles the cost of development.

COPAN is one company that has done great work on AI-assisted workflows through its integrated microbiology solutions. Despite this, fewer than 1,000 sites deploy its solutions, because it relies on older methods and tests for integration. The newer, faster technologies are owned by other companies, requiring a partnership for integration.

Currently, AI in diagnostics is limited to what one company can accomplish, and even then the algorithms must be frozen. Updating a model based on new data requires another round of clinical trials for FDA approval. Data acquired at clinical sites cannot be included in these updates due to privacy laws. Even user telemetry data is nearly impossible to extract from a field instrument due to IT security practices.

7