Recent comments in /f/MachineLearning

Avelina9X OP t1_j55kcjo wrote

Yeah, that's really weird. We're documenting Google Chrome silently adding upscaling. I think it's a really worthwhile discussion for the community: figuring out what model it's using, as well as how they're implementing it in a cross-platform, GPU-agnostic way that is buttery smooth and doesn't use a ton of resources.

14

Avelina9X OP t1_j55h54h wrote

I think it's client-side, which is why I mentioned it's perhaps using a GLSL-based CNN. That's absolutely possible in WebGL2, and I've been experimenting with that sort of tech myself (not for upscaling, just as a proof-of-concept CNN in WebGL).
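
For a sense of what I mean, here's the per-texel operation such a shader computes, sketched in NumPy (a WebGL2 fragment shader would evaluate this once per output pixel; the kernel is just an illustrative example, not whatever Chrome might use):

import numpy as np

# Per-pixel 3x3 convolution: each output texel is a weighted sum of its
# 3x3 neighborhood, which is exactly what a fragment shader would compute.
def conv3x3(image, kernel):
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for y in range(h - 2):
        for x in range(w - 2):
            out[y, x] = np.sum(image[y:y + 3, x:x + 3] * kernel)
    return out

texture = np.random.rand(8, 8)                 # stand-in for a video frame
sharpen = np.array([[0, -1, 0],
                    [-1, 5, -1],
                    [0, -1, 0]], dtype=float)  # example sharpening kernel
print(conv3x3(texture, sharpen).shape)         # (6, 6)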

15

IntelArtiGen t1_j55at5j wrote

I don't really see how or why they would do it. What's the video? You can check the codec they used with right click > "stats for nerds"; the codec should tell you which algorithm was used to encode/decode the video. Using CNNs client-side for this task would probably be quite CPU/GPU intensive, and I doubt they would do it (except perhaps as an experiment). And using CNNs server-side wouldn't make sense if it increases the download size.

It does look like CNN artifacts.

46

arararagi_vamp t1_j557ewd wrote

Using PyTorch, I have built a simple CNN that detects circles on a noisy white background.

Now I want to extend the network to return the center of each circle as coordinates. The problem is that each image contains a variable number of circles, meaning I would need a variable number of labels per image. In a CNN, however, the number of outputs is fixed.
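
To make the constraint concrete, here's a minimal sketch of the kind of network I mean (layer sizes are illustrative, not my actual architecture); note the fixed-size output head, which is exactly what clashes with a variable circle count:

import torch
import torch.nn as nn

class CircleNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # The head emits a fixed number of values, so it can't directly
        # return coordinates for a variable number of circles.
        self.head = nn.Linear(32 * 16 * 16, 1)

    def forward(self, x):  # x: (batch, 1, 64, 64) noisy white-background image
        feats = self.features(x).flatten(1)
        return torch.sigmoid(self.head(feats))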

How do I work around this problem?

1

Omnes_mundum_facimus t1_j555aa2 wrote

The short but mostly true conversation we had with legal:

  • engineer: so this model was actually developed by our biggest competitor
  • lawyer: wtf?????
  • engineer: And we used a pretrained checkpoint, again from even bigger competitor
  • lawyer: wtf??
  • engineer: All cool, it was trained on this imagenet thing
  • lawyer: And who owns this imagenet thing?
  • engineer: ????
  • lawyer: And did everybody in this imagenet thing consent to his or her picture being used?
  • engineer: ????
  • lawyer: what the actual f ?????
  • engineer: So I guess we are using our own model trained from our own data then.

8

jfacowns t1_j550f70 wrote

XGBoost Question around One-Hot Encoding & Get_Dummies in Python

I am working on building a model for NHL (hockey) games and have a spreadsheet with a ton of advanced stats from teams, dates they played and so on.

All of my data in this spreadsheet is stored as floats. I am trying to add a few columns of categorical data, as I feel it could help the model.

The categorical columns indicate whether the home team or the away team is playing on back-to-back days.

What I am trying to determine is whether one-hot encoding is the right approach here, or if I'm misunderstanding how it works altogether.

Here is some code:

import pandas as pd

# Load the spreadsheet of team stats
NHLData = pd.read_excel('C:\\Temp\\NHL_ModelBuilder.xlsx')

# Drop the non-feature columns
NHLData.drop(['HomeTeam', 'AwayTeam', 'Result'], axis=1, inplace=True)

# One-hot encode the back-to-back indicator columns
NHLData = pd.get_dummies(NHLData, columns=['B2B_Home', 'B2B_Away'])
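
For reference, the next step I have in mind is roughly this ('Target' is a placeholder for whatever label column I end up predicting, e.g. a home-team win flag):

import xgboost as xgb

# 'Target' is hypothetical; substitute the actual label column.
X = NHLData.drop(columns=['Target'])
y = NHLData['Target']

model = xgb.XGBClassifier(n_estimators=100)
model.fit(X, y)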

Does this make sense? Am I on the right track here?

If I do NHLData.head() I can see the one-hot encoded columns, but when I look at NHLData.dtypes I see this:

B2B_Home_0              uint8
B2B_Home_1              uint8
B2B_Away_0              uint8
B2B_Away_1              uint8

Should these not be objects?

1

Spico197 OP t1_j54pdgj wrote

Thanks very much for your reply.

I didn't evaluate the query time. This tool doesn't download the whole arXiv dataset; it just calls the official API, so the time depends on your network connection. A query shouldn't take long to execute, though.
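
For anyone curious, a single query against the public arXiv API looks roughly like this (endpoint and parameters per the public API docs; the search term is just an example, not how this tool builds its queries internally):

import requests

resp = requests.get(
    'http://export.arxiv.org/api/query',
    params={
        'search_query': 'all:transformer',  # example term only
        'start': 0,
        'max_results': 10,
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.text[:500])  # Atom XML feed; parse with feedparser or ElementTree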

Yes, absolutely! There are some other things to do before making an online demo, e.g. merging the current two-stage search into a single step. I'm working on it. Thanks again for the advice!

4

unsteadytrauma t1_j54nqh3 wrote

Is it possible to run a model like GPT-2 or GPT-J on my own computer and use it to rewrite/rephrase and summarize text? Or would that require too many resources for a personal computer? I'm a noob.
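
For context, the kind of minimal setup I'm picturing (assuming the Hugging Face transformers library; I don't know if this is the right approach):

from transformers import pipeline

# Downloads the GPT-2 small weights (~500 MB) on first run,
# then generates text entirely locally, even on CPU.
generator = pipeline('text-generation', model='gpt2')
print(generator('Rewrite this in plain English:', max_new_tokens=50))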

1