Don Horrell

since Oct 29, 2004

Recent posts by Don Horrell

Just to clarify, this is a multi-LABEL problem, not multi-class.
Apologies for my mistake.


Don.
My other ML interest is topic modelling, using document vectors.
There do not seem to be pre-trained sets of document vectors available yet, but when there are, how could we use transfer learning to take a pre-trained set of document vectors and adapt it to domain-specific documents e.g. medical documents, documents about programming, patents etc?
Thanks for your reply, Paul. I'm a little confused though.
The pre-trained FastText word embeddings I have downloaded map words to vectors, so in my case (using TensorFlow to do some NLP classification), I can only train my classifier on the words in the embedding list.
That is the crux of my original question - how can I add domain-specific vocabulary to pre-trained word embeddings. Will your book cover this?


Thanks
Don.
Hi all.

I am trying to train a simple CNN on this dataset, which is multi-class and natural language:
https://www.kaggle.com/badalgupta/stack-overflow-tag-prediction/data

I am using word embeddings from FastText.
I have converted the words to index numbers in my vocab (from FastText), then used a (non-trainable) TensorFlow Embedding layer to convert the index numbers to word vectors using the pre-trained FastText embeddings.
The labels are multi-hot encoded (there are 100 labels).
The output activation is sigmoid and the loss is binary crossentropy, as that is what many websites recommend.
I have just split the train/validation/test sets randomly for now, so the split does not take the label distribution into account.

When I train the CNN, the "accuracy" gets to 0.99 very quickly and the loss is low.
At the end of each epoch the precision, recall and F1 scores gradually improve, then plateau at around 0.35.

The predictions are poor, with the maximum probability from the sigmoid output often as low as 6%, so the network does not seem to be properly trained. With the accuracy already high and the loss already low, any further training will be very slow anyway.
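A possible reading of the numbers above, as a minimal sketch rather than a diagnosis: if the reported "accuracy" is computed per label slot (an assumption about the metric in use), then with 100 labels and only a few positives per sample, a model that predicts nothing at all already scores very highly. The class name, the helper names and the toy data (exactly one positive label per sample) are hypothetical.

```java
/** Sketch: element-wise accuracy vs. micro-averaged F1 on sparse multi-hot labels. */
final class MultiLabelMetrics {

    /** Fraction of individual label slots that are predicted correctly. */
    static double elementwiseAccuracy(int[][] yTrue, int[][] yPred) {
        long correct = 0;
        long total = 0;
        for (int i = 0; i < yTrue.length; i++) {
            for (int j = 0; j < yTrue[i].length; j++) {
                if (yTrue[i][j] == yPred[i][j]) {
                    correct++;
                }
                total++;
            }
        }
        return (double) correct / total;
    }

    /** Micro-averaged F1 over all label slots; returns 0.0 when there are no true positives. */
    static double microF1(int[][] yTrue, int[][] yPred) {
        long tp = 0;
        long fp = 0;
        long fn = 0;
        for (int i = 0; i < yTrue.length; i++) {
            for (int j = 0; j < yTrue[i].length; j++) {
                if (yPred[i][j] == 1 && yTrue[i][j] == 1) {
                    tp++;
                } else if (yPred[i][j] == 1) {
                    fp++;
                } else if (yTrue[i][j] == 1) {
                    fn++;
                }
            }
        }
        if (tp == 0) {
            return 0.0;
        }
        return 2.0 * tp / (2.0 * tp + fp + fn);
    }

    public static void main(String[] args) {
        int samples = 1000;
        int labels = 100;
        // Toy data (assumption): exactly one positive label per sample.
        int[][] yTrue = new int[samples][labels];
        for (int i = 0; i < samples; i++) {
            yTrue[i][i % labels] = 1;
        }
        // A "model" that never predicts any label.
        int[][] allZeros = new int[samples][labels];
        System.out.printf("accuracy=%.2f microF1=%.2f%n",
                elementwiseAccuracy(yTrue, allZeros), microF1(yTrue, allZeros));
    }
}
```

On this toy data the all-zeros predictor gets 99 of every 100 label slots right while finding no labels at all, which is why precision, recall and F1 tend to be more informative than accuracy for this kind of problem.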

As there is a fairly large skew in the number of times each label has been allocated, I have used the class_weight parameter when fitting, to try to assist the training.
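To make the skew concrete, here is a rough sketch of one common weighting heuristic: the ratio of negative to positive examples for each label. The class and method names are hypothetical, the fallback weight of 1.0 for labels with no positives is an assumption, and how such weights are passed to a particular training framework is not shown.

```java
/** Sketch: per-label positive weights derived from multi-hot training labels. */
final class LabelWeights {

    /**
     * For each label, returns (number of negatives) / (number of positives).
     * Labels that never occur get a weight of 1.0 (an arbitrary fallback).
     */
    static double[] positiveWeights(int[][] yTrue) {
        int samples = yTrue.length;
        int labels = yTrue[0].length;
        double[] weights = new double[labels];
        for (int j = 0; j < labels; j++) {
            int positives = 0;
            for (int i = 0; i < samples; i++) {
                positives += yTrue[i][j];
            }
            weights[j] = positives == 0 ? 1.0 : (double) (samples - positives) / positives;
        }
        return weights;
    }

    public static void main(String[] args) {
        // Four samples, two labels: label 0 is rare, label 1 is balanced.
        int[][] yTrue = { {1, 1}, {0, 1}, {0, 0}, {0, 0} };
        double[] w = positiveWeights(yTrue);
        System.out.println(w[0] + " " + w[1]);
    }
}
```

A rare label ends up with a large weight, so a missed positive for that label costs more than a missed positive for a common one.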

Does anyone have the experience to point to where I should start my investigation? There are so many things to twiddle!
Perhaps the transfer-learning expert will have some ideas.


Thanks
Don.
What are the strengths and weaknesses of Gensim and TensorFlow for NLP?
Which is better suited to which types of project?
Hi Paul Azunre.
I am trying to do multi-label classification on some text. The number of times each label has been assigned to the training text shows a large skew.
Is there anything that Transfer Learning can do to help?

Thanks
Don.
Hi Paul Azunre.
There are several pre-trained word embeddings available, but they generally cover the most common words.
Can I do something similar to Transfer Learning - start with a pre-trained set of word embeddings, then add my own domain-specific words somehow?
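For illustration only, one simple way to picture "adding words" is to append new rows to the pre-trained embedding matrix and then fine-tune them on domain text. The sketch below covers just the bookkeeping step. Initialising each new row to the mean of the existing vectors is an assumption (random initialisation is another option), and the class and method names are hypothetical. Where FastText's full binary model is available, its subword (character n-gram) vectors for unseen words may be a better starting point than the mean.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch: extend a pre-trained embedding table with rows for new domain words. */
final class EmbeddingExtender {

    /**
     * Returns a new matrix with one extra row per word not already in vocab.
     * The vocab map (word -> row index) is updated in place.
     * New rows start as the mean of the existing vectors (an assumption).
     */
    static float[][] extend(Map<String, Integer> vocab, float[][] vectors, List<String> newWords) {
        int dim = vectors[0].length;
        float[] mean = new float[dim];
        for (float[] row : vectors) {
            for (int d = 0; d < dim; d++) {
                mean[d] += row[d] / vectors.length;
            }
        }
        List<float[]> rows = new ArrayList<>(Arrays.asList(vectors));
        for (String word : newWords) {
            if (vocab.containsKey(word)) {
                continue; // already covered by the pre-trained embeddings
            }
            vocab.put(word, rows.size());
            rows.add(mean.clone());
        }
        return rows.toArray(new float[0][]);
    }

    public static void main(String[] args) {
        Map<String, Integer> vocab = new LinkedHashMap<>();
        vocab.put("the", 0);
        vocab.put("code", 1);
        float[][] vectors = { {1f, 3f}, {3f, 5f} };
        float[][] extended = extend(vocab, vectors, Arrays.asList("kubernetes", "code"));
        System.out.println(vocab.get("kubernetes") + " " + Arrays.toString(extended[2]));
    }
}
```

The new rows carry no domain meaning until they are trained, so the embedding layer (or at least the added rows) would need to be trainable afterwards.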

Cheers
Don.
I have not used either yet, but I'm looking at HDIV (www.hdiv.org) and OWASP (http://www.owasp.org/index.php/CSRF_Guard). HDIV looks more efficient, as the OWASP CSRFGuard parses the HTML produced by the Web App.
10 years ago
I'm trying to write a custom JSF tag that takes a database table name as a parameter.
The tag needs to work out the structure of the database table and display the data in the table as a set of rows containing the fields in the database.

So, I've created DatabaseGridTag, which extends UIComponentELTag and is configured to create a UIInputDatabaseGrid (which extends UIData).
DatabaseGridTag dynamically works out the structure of the database table and creates UIOutput objects for each column of the table.
The DatabaseGridTag works fine at displaying all the rows and columns, but when I try to add a UICommand to display an "Update" button, the button is displayed, but the command does not call the handler. There's no error, but it does not call my CompanyGroupHandler.update().


protected void addFields(UIInputDatabaseGrid grid, CompanyGroupHandler handler)
{
    FacesContext context = getFacesContext();
    boolean updateable = true;

    try
    {
        // Setup the columns etc.
        List cmds = new JdbcHelper().getMetaData(tableName);
        Iterator it = cmds.iterator();
        while(it.hasNext())
        {
            // Create a column & add to the grid.
            ColumnMetaData cmd = (ColumnMetaData)(it.next());
            UIColumn column = new UIColumn();
            UIOutput header1 = new UIOutput();
            String columnLabel = getColumnLabel(cmd);
            header1.setValue(columnLabel);
            column.setHeader(header1);
            grid.getChildren().add(column);

            // Create an input & add to the column. Value binding gets the value from the map - key is columnLabel.
            UIInput input = new UIInput();
            ValueBinding vb = context.getApplication().createValueBinding("#{" + var + "." + columnLabel + "}");
            input.setValueBinding("value", vb);
            column.getChildren().add(input);
        }

        // Add update button, if req.
        if(updateable)
        {
            // ??? This bit displays the button, but clicking it does not work!!!
            UIColumn column = new UIColumn();
            grid.getChildren().add(column);
            UICommand command = (UICommand)(context.getApplication().createComponent("javax.faces.Command"));
            column.getChildren().add(command);
            command.setRendererType("javax.faces.Button");
            MethodBinding mb = context.getApplication().createMethodBinding("#{companyGroupHandler.update}", null);
            if(mb == null)
            {
                LOGGER.error("DatabaseGridTag.addFields() : Null methodBinding.");
            }
            command.setAction(mb);
            command.setValue("Update");
        }
    }
    catch(DAOException e)
    {
        LOGGER.error("DatabaseGridTag.addFields() : DAOE " + e);
    }
}



So, how do I dynamically create the "Update" button and bind it to the CompanyGroupHandler.update() method?
Where should I call my addFields() method from - the Tag's setProperties() method, doStartTag()...?


I've tried lots of different ways of creating the MethodBinding, but none of them work.


Any help appreciated.
13 years ago
JSF
I found this article very interesting, as I use plugins very frequently. As Ulf has used some deprecated methods in the SecurityManager, I had a look to see how it could work without using deprecated methods.

My aims are slightly different to Ulf's. I started with an interface called ThirdPartyPlugin and wanted to put tight security round any class that implements this plugin, whether loaded from the classpath or via a custom loader. This prevents anyone slipping a rogue plugin onto the classpath. I also want to allow the plugin to instantiate any existing or new classes from any package - so the developer can spread his or her code across several new classes.

It turned out that I did not need a custom classloader, but it's still nice to use one to keep third-party code separate. However, my SecurityManager code is a bit clumsy without using the deprecated methods. Basically the SecurityManager has to navigate up the callstack and find out if any classes implement the ThirdPartyPlugin interface. If so, that's when it needs to throw SecurityExceptions.
One further quirk - when the third-party plugin needs to instantiate a class that is not in the standard Java packages, SecurityManager.checkRead(String file) gets called, so here the SecurityManager needs to allow the call to succeed if a ClassLoader is in the callstack before the ThirdPartyPlugin.
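The two call-stack checks described above can be sketched roughly as follows. This is not the code from the post: the helper class, the stand-in ThirdPartyPlugin interface and SamplePlugin are hypothetical, and the sketch assumes the stack arrives as an array of classes with the most recent frame at index 0, which is the order documented for the protected SecurityManager.getClassContext() method.

```java
/** Sketch: inspect a call stack (most recent frame first) for plugin classes. */
final class PluginStackCheck {

    /** Stand-in for the ThirdPartyPlugin interface mentioned in the post. */
    interface ThirdPartyPlugin {
    }

    /** Hypothetical plugin implementation, used only to exercise the checks. */
    static final class SamplePlugin implements ThirdPartyPlugin {
    }

    /** True if any class on the stack implements the plugin interface. */
    static boolean pluginOnStack(Class<?>[] callStack) {
        for (Class<?> frame : callStack) {
            if (ThirdPartyPlugin.class.isAssignableFrom(frame)) {
                return true;
            }
        }
        return false;
    }

    /**
     * True if a ClassLoader frame is reached before the first plugin frame when
     * walking from the most recent call backwards, i.e. the request comes from
     * class loading triggered by the plugin rather than from the plugin itself.
     */
    static boolean classLoaderBeforePlugin(Class<?>[] callStack) {
        for (Class<?> frame : callStack) {
            if (ClassLoader.class.isAssignableFrom(frame)) {
                return true;
            }
            if (ThirdPartyPlugin.class.isAssignableFrom(frame)) {
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Class<?>[] loading = { String.class, java.net.URLClassLoader.class, SamplePlugin.class };
        Class<?>[] direct = { java.io.File.class, SamplePlugin.class };
        System.out.println(pluginOnStack(loading) + " " + classLoaderBeforePlugin(loading));
        System.out.println(pluginOnStack(direct) + " " + classLoaderBeforePlugin(direct));
    }
}
```

In a SecurityManager subclass, a check method such as checkRead(String file) could throw a SecurityException when pluginOnStack is true and classLoaderBeforePlugin is false, passing in the result of getClassContext().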

If there's a more elegant solution, I'd be interested to know about it.

Here's my SecurityManager code...

13 years ago
Does anyone know how to make a Servlet running under Tomcat write to a serial port?

I've downloaded and installed the Java Communications API (version 2) and
Tomcat 5.0.25. The whole lot uses Java 1.4.0, running on Windows XP Professional.

Tomcat works fine and the comms stuff works fine.

My servlet is below, but when I invoke it, I get a log and an exception:
"Serial port is already in use."
javax.servlet.ServletException
comms.Send.init(Send.java:55)
javax.servlet.GenericServlet.init(GenericServlet.java:211)
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
org.apache.coyote.tomcat5.CoyoteAdapter.service(CoyoteAdapter.java:160)
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:793)
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.processConnection(Http11Protocol.java:702)
org.apache.tomcat.util.net.TcpWorkerThread.runIt(PoolTcpEndpoint.java:571)
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:644)
java.lang.Thread.run(Thread.java:536)

At some point, I'm also going to need to find a way of limiting the pool size for this servlet to 1.


Cheers
Don.

--------------------------------
Source:

package comms;

/**
 * <p>Title: </p>
 * <p>Description: </p>
 * <p>Copyright: Copyright (c) Donald Horrell 2004</p>
 * <p>Company: </p>
 * @author Donald Horrell
 * @version 1.0
 */

import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintWriter;
import java.util.Enumeration;

import javax.comm.*;
import javax.servlet.ServletException;
import javax.servlet.SingleThreadModel;
import javax.servlet.http.*;


public class Send extends HttpServlet implements SingleThreadModel
{
    /** The output stream which writes to the Com port. */
    PrintWriter out = null;

    public void init() throws ServletException
    {
        // Open the serial port.
        Enumeration portList = CommPortIdentifier.getPortIdentifiers();

        while (portList.hasMoreElements())
        {
            CommPortIdentifier portId = (CommPortIdentifier) portList.nextElement();
            if (portId.getPortType() == CommPortIdentifier.PORT_SERIAL)
            {
                //if (portId.getName().equals("/dev/term/a")) {
                if (portId.getName().equals("COM1"))
                {
                    SerialPort serialPort = null;
                    try
                    {
                        serialPort = (SerialPort)(portId.open("SendApplet", 2000));
                        out = new PrintWriter(serialPort.getOutputStream());
                        serialPort.setSerialPortParams(9600,
                                                       SerialPort.DATABITS_8,
                                                       SerialPort.STOPBITS_1,
                                                       SerialPort.PARITY_NONE);
                    }
                    catch (PortInUseException e)
                    {
                        System.out.println("Serial port is already in use.");
                        throw(new ServletException());
                    }
                    catch (IOException e)
                    {
                        System.out.println("Failed to connect to output stream.");
                        throw(new ServletException());
                    }
                    catch (UnsupportedCommOperationException e)
                    {
                        System.out.println("Failed to set comms params.");
                        throw(new ServletException());
                    }
                }
            }
        }
    }

    public void destroy()
    {
        // Release the serial port.
    }

    public void doGet(HttpServletRequest req, HttpServletResponse resp)
    {
        System.out.println("doGet.");
        doWork(req, resp);
    }

    public void doPost(HttpServletRequest req, HttpServletResponse resp)
    {
        System.out.println("doPost.");
        doWork(req, resp);
    }

    public void doWork(HttpServletRequest req, HttpServletResponse resp)
    {
        // Take the msg param and send it to COM1.
        String msg = req.getParameter("msg");
        System.out.println("Sending["+msg+"].");
        if (msg != null)
        {
            out.println(msg);
        }
    }
}
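
One possible cause, not confirmed by anything in the post: with SingleThreadModel a container is allowed to create several instances of the servlet, and each instance runs init(), so a second instance would find COM1 already open. A rough sketch of the alternative is a single shared owner for the port that is opened once and serialises writes. The class below is hypothetical and uses a plain java.io.Writer in place of the serial port's output stream so that it stands alone. A real opener would also have to wrap the checked exceptions thrown when opening the port.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.util.function.Supplier;

/** Sketch: one shared, synchronized writer instead of one port handle per servlet instance. */
final class SharedPortWriter {

    private static SharedPortWriter instance;

    private final PrintWriter out;

    private SharedPortWriter(Writer target) {
        this.out = new PrintWriter(target);
    }

    /** Opens the underlying writer on first use only; later calls reuse it. */
    static synchronized SharedPortWriter acquire(Supplier<Writer> opener) {
        if (instance == null) {
            instance = new SharedPortWriter(opener.get());
        }
        return instance;
    }

    /** Writes one message followed by a line separator; only one thread at a time. */
    synchronized void send(String msg) {
        out.println(msg);
        out.flush();
    }

    /** Closes the underlying writer so that it can be reopened later. */
    static synchronized void release() {
        if (instance != null) {
            instance.out.close();
            instance = null;
        }
    }

    public static void main(String[] args) {
        StringWriter fakePort = new StringWriter(); // stands in for the serial port
        SharedPortWriter first = SharedPortWriter.acquire(() -> fakePort);
        SharedPortWriter second = SharedPortWriter.acquire(() -> fakePort);
        first.send("hello");
        System.out.println((first == second) + " " + fakePort.toString().trim());
        SharedPortWriter.release();
    }
}
```

In the servlet, init() or doWork() could call acquire(...) and destroy() could call release(), which would also give the empty destroy() method something to do.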
15 years ago