Recent comments in /f/MachineLearning

marixer t1_j9b0x65 wrote

The step you're missing there is finding the camera positions and angles with something like COLMAP, which estimates them by extracting features from the images, matching those features across image pairs, and triangulating. That data is then used alongside the RGB images to train the NeRF.
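To make the triangulation step concrete, here is a minimal sketch of the idea, not COLMAP's actual implementation. It assumes two cameras whose positions are already known and one viewing ray from each toward the same matched feature. The function name `triangulate_midpoint` and the midpoint method itself are illustrative choices.

```python
def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def scale(a, s):
    return [x * s for x in a]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(o1, d1, o2, d2):
    """Estimate a 3D point from two rays o + t*d (hypothetical helper).

    Finds the closest point on each ray to the other ray and returns
    the midpoint between those two points.
    """
    w = sub(o1, o2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; cannot triangulate")
    t1 = (b * e - c * d) / denom  # parameter along ray 1
    t2 = (a * e - b * d) / denom  # parameter along ray 2
    p1 = add(o1, scale(d1, t1))
    p2 = add(o2, scale(d2, t2))
    return scale(add(p1, p2), 0.5)

# Two cameras one unit apart, both looking at the same feature.
point = triangulate_midpoint([0, 0, 0], [0.5, 0, 2], [1, 0, 0], [-0.5, 0, 2])
print(point)
```

A real structure-from-motion pipeline solves the harder problem of estimating the camera poses and the points jointly; this only shows the geometry once the poses are known.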

36

Pyramid_Jumper t1_j9ayed5 wrote

Been a while since I’ve read the paper but I don’t think you’re missing anything - apart from data in the correct format that is. You’ll need the aforementioned 5D vectors to be able to train/use this model.
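For reference, a rough sketch of what one of those 5D inputs could look like: a 3D position plus a viewing direction expressed as two angles. The angle convention (polar angle from +z, azimuth in the x-y plane) and the helper name `to_5d` are assumptions for illustration, not taken from the paper's code.

```python
import math

def to_5d(position, direction):
    """Pack a 3D position and a view direction into (x, y, z, theta, phi).

    Hypothetical helper; the angle convention is an assumption.
    """
    x, y, z = position
    dx, dy, dz = direction
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    if norm == 0:
        raise ValueError("direction must be non-zero")
    dx, dy, dz = dx / norm, dy / norm, dz / norm
    # Clamp to guard against rounding slightly outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, dz)))  # polar angle from +z
    phi = math.atan2(dy, dx)                    # azimuth in the x-y plane
    return (x, y, z, theta, phi)

print(to_5d((1.0, 2.0, 3.0), (0.0, 1.0, 0.0)))
```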

If you can’t get that data, then I’d suggest you look at other works that cite NeRF and that may use data in a format similar to what you have.

−2

harharveryfunny t1_j9aydo9 wrote

Here's the key, thanks to ChatGPT:

Data preparation: First, the training data is preprocessed to convert the 2D images and camera poses into a set of 3D points and corresponding colors. Each 2D image is projected onto a 3D point cloud using the corresponding camera pose, resulting in a set of 3D points with associated colors.

−8

go2carter t1_j9ay3ix wrote

This is very doable. What domain is your data? e.g. tabular, images, videos?

If it's vision, are you able to share a bit on what quality control metrics you have and whether you need to detect or classify anything in the images / video?

1

squidward2022 t1_j9au3bg wrote

Yup! If you look at the graph of tanh you will see relu(tanh) will smush the left half of the graph to 0. The right half of the graph, on (0, ∞), ranges in value between 0 and 1, and you can see saturation towards 1 starts to occur around x = 2 to 2.5. Since relu leaves this half unchanged, you’ll be able to approach 1 very effectively with reasonable finite values.
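A quick way to check this numerically, as a small sketch using only the standard library (the helper name `relu_tanh` is made up here):

```python
import math

def relu(x):
    return max(0.0, x)

def relu_tanh(x):
    """relu applied to tanh: 0 for x <= 0, tanh(x) otherwise."""
    return relu(math.tanh(x))

# Negative inputs are clamped to 0; positive inputs approach 1 quickly.
for x in (-2.0, -0.5, 0.0, 0.5, 1.0, 2.0, 2.5, 4.0):
    print(f"x={x:+.1f}  relu(tanh(x))={relu_tanh(x):.4f}")
```

At x = 2 the output is already about 0.964, and at x = 2.5 about 0.987.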

2

AtomicNixon t1_j9ar9hw wrote

There's not enough here for a network to latch onto. I've trained nets on a variety of geometric patterns of differing styles so I know the minimum needed. I think banme's suggestion is the way to go. Figure out what your personal algo is and go with that.

1

wywywywy t1_j9ar2tk wrote

I did test larger models but they didn't run. I can't remember which ones, probably GPT-J. I recently got a 3090 so I can load larger models now.

As for quality, my use case is simple and nothing sophisticated (writing prompts to help with stories & articles), and the models worked well, until ChatGPT came along. I use ChatGPT instead now.

6

banmeyoucoward t1_j9apt54 wrote

What tool did you use to make the art on your website?

Your style relies heavily on recursion and similarities between scales, which conv nets are not good at, but programmatic descriptions of images like LOGO are very good at. My strategy would be to manually write simple LOGO or Python programs (or use whatever tool you initially used) that generate each of the images on your site, and then prompt ChatGPT with “write a program that generates an image combining ideas from <Program A> and <Program B>”.
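As a rough idea of how short such a program can be, here is a sketch of a recursive, self-similar pattern in plain Python. It draws a Sierpinski-carpet-style grid as text. It is a stand-in example only, not a reproduction of any image on your site, and the function name `carpet` is made up.

```python
def carpet(depth):
    """Return a self-similar carpet pattern as a list of strings.

    Each level tiles the previous level 3x3 and leaves the centre blank,
    so the same motif repeats at every scale.
    """
    if depth == 0:
        return ["#"]
    smaller = carpet(depth - 1)
    blank = " " * len(smaller[0])
    top = [row * 3 for row in smaller]
    middle = [row + blank + row for row in smaller]
    return top + middle + top

for row in carpet(2):
    print(row)
```

A handful of programs like this, one per image, would give the language model concrete recursion rules to recombine.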

2