
Multi-Label NLP

 
Hi all.

I am trying to train a simple CNN on this dataset, which is multi-class and natural language:
https://www.kaggle.com/badalgupta/stack-overflow-tag-prediction/data

I am using word embeddings from FastText.
I have converted the words to index numbers in my FastText vocab, then used a non-trainable TensorFlow Embedding layer to map those index numbers to the pre-trained FastText word vectors.
The labels are multi-hot encoded (there are 100 labels).
The output activation is sigmoid and the loss is binary crossentropy, as that is what many websites recommend.
I have just split the train/validation/test sets randomly for now, so they do not take the label distribution into account.
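For concreteness, here is a minimal sketch of the setup described above, in Keras. All the sizes (`vocab_size`, `embed_dim`, `max_len`) and the random `embedding_matrix` are stand-ins; in the real code the matrix would be filled with the FastText vectors, one row per vocab index:

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes -- the real values come from the FastText vocab
# and the Kaggle dataset, not from this post.
vocab_size, embed_dim, max_len, num_labels = 5000, 300, 200, 100

# embedding_matrix[i] would hold the pre-trained FastText vector
# for vocab index i; random values here just as a placeholder.
embedding_matrix = np.random.rand(vocab_size, embed_dim).astype("float32")

model = tf.keras.Sequential([
    # Non-trainable embedding layer initialised from the pre-trained vectors.
    tf.keras.layers.Embedding(
        vocab_size, embed_dim,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=False),
    tf.keras.layers.Conv1D(128, 5, activation="relu"),
    tf.keras.layers.GlobalMaxPooling1D(),
    # One independent sigmoid per label, for multi-hot targets.
    tf.keras.layers.Dense(num_labels, activation="sigmoid"),
])

# Binary crossentropy treats each of the 100 labels as its own
# yes/no decision, which is what the multi-hot encoding needs.
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["binary_accuracy"])
```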

When I train the CNN, the "accuracy" gets to 0.99 very quickly and the loss is low.
At the end of each epoch the precision, recall and F1 scores gradually improve, then plateau at around 0.35.

The predictions are poor: the maximum probability from the sigmoid output is often as low as 6%, so the network does not seem to be properly trained. And with the accuracy already high and the loss already low, any further training will be very slow anyway.
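One thing worth checking here: with 100 labels and only a few positives per example, element-wise "accuracy" is dominated by the easy zeros, so 0.99 can coexist with useless predictions. A quick numpy illustration (the ~3%-positive label sparsity is made up, not from the dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
num_examples, num_labels = 1000, 100

# Assume each example has roughly 3 positive labels out of 100.
y_true = (rng.random((num_examples, num_labels)) < 0.03).astype(int)

# A useless model that predicts "no label" for everything...
y_pred = np.zeros_like(y_true)

# ...still scores ~0.97 element-wise accuracy, with zero recall.
accuracy = (y_true == y_pred).mean()
recall = ((y_true == 1) & (y_pred == 1)).sum() / max(y_true.sum(), 1)
print(accuracy, recall)  # accuracy is about 0.97, recall is 0.0
```

This is why precision/recall/F1 plateauing at 0.35 is the number to trust, not the 0.99 accuracy.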

As there is a fairly large skew in the number of times each label has been allocated, I have used the class_weight parameter when fitting, to try to assist the training.
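For reference, a sketch of one way to build that `class_weight` dictionary from the multi-hot label matrix, using the same inverse-frequency heuristic as scikit-learn's "balanced" weighting. The `y_train` data here is made up; only the weight formula is the point:

```python
import numpy as np

# Stand-in for the real multi-hot training labels, shape (examples, 100),
# with each label given a different (skewed) positive rate.
rng = np.random.default_rng(1)
rates = rng.uniform(0.005, 0.1, size=100)
y_train = (rng.random((1000, 100)) < rates).astype(int)

# Inverse-frequency weight per label index: rare labels get larger
# weights. This mirrors n_samples / (n_classes * count).
counts = y_train.sum(axis=0)
class_weight = {i: len(y_train) / (100 * max(int(c), 1))
                for i, c in enumerate(counts)}

# Passed to training as: model.fit(..., class_weight=class_weight)
```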

Does anyone have the experience to point to where I should start my investigation? There are so many things to twiddle!
Perhaps the transfer-learning expert will have some ideas.


Thanks
Don.
 
Don Horrell
Just to clarify, this is a multi-LABEL problem, not multi-class.
Apologies for my mistake.
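For anyone finding this later, the difference in terms of the target vectors (toy vectors, not from the dataset):

```python
# Multi-CLASS: exactly one label per example -- one-hot target,
# softmax output, categorical crossentropy.
multi_class_target = [0, 0, 1, 0, 0]

# Multi-LABEL: any number of labels per example -- multi-hot target,
# one sigmoid per label, binary crossentropy (as in the post above).
multi_label_target = [1, 0, 1, 0, 1]
```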


Don.
 