public static Formula lhs(Term lhs)
Factory method. The predictors will be all the columns not otherwise in the formula in the context of a data frame.
lhs - the left-hand side of formula, i.e. dependent variable.
In the simplest case, the terms (both of LHS and of RHS) are column names. But they can be functions (e.g. log) and transformations (e.g. interaction and factor crossing) too. The functions/transformations are symbolic and thus lazy.
Mike Simmons wrote: What I meant was, do you have a complete list of all possible values? Are all values either s or t? Do they range from a-z? Or is there some other list, like C, M, T, S, X?
If all the values are s or t, then make all s = 0, and all t = 1. That's easy.
If it's a range a-z, then you can convert from the char to an int using math:
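As a minimal sketch of that char arithmetic (the class and method names here are made up for illustration, and it assumes the values are lowercase ASCII letters), subtracting 'a' gives each letter's zero-based position:

```java
public class LetterCodes {
    // Hypothetical helper: maps 'a' -> 0, 'b' -> 1, ..., 'z' -> 25.
    // Assumes lowercase ASCII input; anything else is rejected rather than guessed.
    static int toIndex(char c) {
        if (c < 'a' || c > 'z') {
            throw new IllegalArgumentException("Not a lowercase letter: " + c);
        }
        // chars are numeric in Java, so subtraction yields the offset from 'a'
        return c - 'a';
    }

    public static void main(String[] args) {
        System.out.println(toIndex('a')); // prints 0
        System.out.println(toIndex('z')); // prints 25
    }
}
```

If the data can contain uppercase letters, normalizing with Character.toLowerCase before the subtraction would probably be the simplest adjustment.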
If it's a more random list like c, m, t, s, x or something, then go back to that labelIndices code I showed to make a more flexible way of mapping each unique label to an int.
Mike Simmons wrote: Apparently, Sensitivity applies only to a binary classification, which means there are only two classes. In your example they are all 13 or 15, which is good. However, it looks like Smile enforces "binary" by saying the classes must be either 0 or 1. So you could replace all 13 with 0, and all 15 with 1. Or the other way around. It doesn't matter which you choose, as long as you remember what 0 means, and what 1 means, based on how you have converted the 13 and 15 to 0 or 1.
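To make the remapping concrete, here is a rough sketch in plain Java. It does not call Smile at all: the class name, both method names, and the hand-rolled sensitivity calculation are assumptions for illustration, with 15 arbitrarily chosen as the positive class (1).

```java
public class BinaryRemap {
    // Hypothetical helper: rewrites two arbitrary class values as 0 and 1.
    static int[] toBinary(int[] labels, int negative, int positive) {
        int[] out = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] == negative) {
                out[i] = 0;
            } else if (labels[i] == positive) {
                out[i] = 1;
            } else {
                throw new IllegalArgumentException("Unexpected class: " + labels[i]);
            }
        }
        return out;
    }

    // Sensitivity (recall) = TP / (TP + FN), treating 1 as the positive class.
    // Returns NaN when the truth array contains no positives.
    static double sensitivity(int[] truth, int[] prediction) {
        int tp = 0;
        int fn = 0;
        for (int i = 0; i < truth.length; i++) {
            if (truth[i] == 1) {
                if (prediction[i] == 1) tp++; else fn++;
            }
        }
        return (double) tp / (tp + fn);
    }

    public static void main(String[] args) {
        int[] truth = toBinary(new int[] {13, 15, 15, 13, 15}, 13, 15);
        int[] predicted = toBinary(new int[] {13, 15, 13, 13, 15}, 13, 15);
        // Two of the three actual 15s were predicted as 15.
        System.out.println(sensitivity(truth, predicted));
    }
}
```

Swapping the two arguments (15 as negative, 13 as positive) changes which class the sensitivity describes, which is why it matters to remember what 0 and 1 stand for.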
Piet Souris wrote: A few topics ago you had this method (with a lambda!)
Piet Souris wrote: Did you not like that method?
Mike Simmons wrote: Regarding the last question, it doesn't look like you have a column named "height". How can you drop the height column if there is no height column?
I assume you have an array of labels, as Strings, where each String occurs exactly once. (If not, you need to get something like that, either an array or List.)
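Given such an array of unique labels, one way to map each label to an int might look like the sketch below. This is not the labelIndices code from earlier in the thread; the class and method names are hypothetical, and the index of each label is simply its position in the array.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LabelIndex {
    // Hypothetical helper: assigns 0, 1, 2, ... to labels in array order.
    // Assumes each label occurs exactly once and fails loudly if that is violated.
    static Map<String, Integer> build(String[] uniqueLabels) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String label : uniqueLabels) {
            if (index.putIfAbsent(label, index.size()) != null) {
                throw new IllegalArgumentException("Duplicate label: " + label);
            }
        }
        return index;
    }

    // Converts a column of label values to ints using the map built above.
    static int[] encode(String[] values, Map<String, Integer> index) {
        int[] out = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            Integer code = index.get(values[i]);
            if (code == null) {
                throw new IllegalArgumentException("Unknown label: " + values[i]);
            }
            out[i] = code;
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> index = build(new String[] {"c", "m", "t", "s", "x"});
        int[] codes = encode(new String[] {"x", "c", "m"}, index);
        System.out.println(java.util.Arrays.toString(codes)); // prints [4, 0, 1]
    }
}
```

A LinkedHashMap is used here only so that iterating over the map lists the labels in their original order; a plain HashMap would give the same lookups.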