XGBoostClassifier multiclass objective #9
Currently the "objective" parameter for XGBoostClassifier is limited to "reg:squarederror", "reg:tweedie", "reg:linear", "reg:logistic", "binary:logistic", and "binary:logitraw", and these values are even enforced by validation code. This code was clearly added when XGBoost already supported multiclass classification. So why can't I use an objective like "multi:softmax"? Or maybe there is some workaround for multiclass classification?
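For illustration, a minimal sketch of the kind of whitelist validation described above. This is hypothetical code, not the actual getml source; the real check may look different:

```python
# Hypothetical sketch of an objective whitelist, as described in the issue.
ALLOWED_OBJECTIVES = [
    "reg:squarederror", "reg:tweedie", "reg:linear",
    "reg:logistic", "binary:logistic", "binary:logitraw",
]

def validate_objective(objective: str) -> None:
    """Reject any objective outside the whitelist, e.g. 'multi:softmax'."""
    if objective not in ALLOWED_OBJECTIVES:
        raise ValueError(
            f"'objective' must be one of {ALLOWED_OBJECTIVES}, "
            f"got '{objective}'."
        )
```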
Hi @paulbir, the issue is not so much XGBoost itself, but the feature learning algorithms. It can be very tricky to build features for very high-dimensional targets. You can use getml.data.make_target_columns(…), as exemplified in the CORA notebook: https://nbviewer.org/github/getml/getml-demo/blob/master/cora.ipynb
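A minimal sketch of that approach, assuming a getml DataFrame named `paper` with a categorical column `label` that holds the class labels (the file and column names are illustrative, loosely following the CORA example; treat exact signatures as assumptions):

```python
import getml

getml.engine.launch()
getml.engine.set_project("cora")

# Illustrative: load a table whose categorical column "label" holds the
# class labels. make_target_columns splits it into one binary
# (one-vs-all) target column per class.
paper = getml.data.DataFrame.from_csv("paper.csv", name="paper")
population = getml.data.make_target_columns(paper, "label")
```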
Hi @liuzicheng1987, thanks for your reply. I have a target with 3 classes. This is a multiclass problem, so the objective should be multi:softmax or multi:softprob, but only binary targets are allowed.
Thanks for your question, @paulbir. You can just materialize your features using […]
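The code reference in this comment was lost in extraction. A minimal sketch of how features can be materialized from a fitted getml pipeline; the use of `transform` here is my assumption, not necessarily what the commenter wrote:

```python
# Assumption: pipe is a fitted getml.pipeline.Pipeline and container is a
# getml.data.Container with a "train" subset. transform() returns the
# learned features without running any predictor.
features = pipe.transform(container.train)
```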
Hi @srnnkls. My goal is to create new features using the Relboost method. In all the notebook examples I can see that when creating the pipeline, a predictor is always set, and the docs do not explicitly state whether the predictor is necessary for the feature engineering itself. So I usually set it with an XGBoostClassifier. But what I thought now is: do I really need to set the predictor at all?
@paulbir, no, if all you are interested in are the features, you don't really need predictors. It's nice to have predictors for things such as calculating feature importances, but they are not necessary for the feature engineering.
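A minimal sketch of such a feature-only pipeline, assuming a data model `dm` and container `container` built elsewhere; the class names follow the getml 1.x API as I understand it, so treat the exact signatures as assumptions:

```python
import getml

# Assumption: dm is a getml.data.DataModel describing the relational schema.
pipe = getml.pipeline.Pipeline(
    data_model=dm,
    feature_learners=[
        # CrossEntropyLoss is the usual choice for classification targets.
        getml.feature_learning.Relboost(
            loss_function=getml.feature_learning.loss_functions.CrossEntropyLoss
        )
    ],
    predictors=[],  # no predictor: the pipeline only learns features
)

pipe.fit(container.train)
features = pipe.transform(container.train)
```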
@liuzicheng1987 thanks. So I have no issues with predictors anymore then. |
Let us define the number of class labels as L. @paulbir, just to clarify: you can do multiclass classification using getml and L binary one-vs-all targets. You can also follow @srnnkls's approach: use the getml pipeline for constructing the features and then train an external predictor on them. An example can be found here: https://github.com/Jogala/cora under scripts/ml_all.py. Note that using the L times 1-vs-all approach I achieved slightly better results on that example. Overall, we outperform the best predictor of that ranking here: https://paperswithcode.com/sota/node-classification-on-cora. Using the split of the leading paper (accuracy = 90.16%), we reach 91%. If you have further questions, you can also drop me an email.
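For completeness, a minimal sketch of the second route: feeding getml-built features to XGBoost with a true multiclass objective. `X_train`, `X_test`, and `y_train` are assumed to be the materialized feature matrices and integer class labels (0..L-1, here L = 3); the names are illustrative:

```python
import numpy as np
import xgboost as xgb

# Assumption: X_train/X_test are feature matrices materialized from the
# getml pipeline (see pipe.transform above); y_train holds integer class
# labels 0..L-1.
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test)

params = {
    "objective": "multi:softprob",  # per-class probabilities
    "num_class": 3,
    "eval_metric": "mlogloss",
}
booster = xgb.train(params, dtrain, num_boost_round=100)

probs = booster.predict(dtest)          # shape: (n_samples, 3)
pred_labels = np.argmax(probs, axis=1)  # hard class predictions
```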
Currently "objective" parameter for the
XGBoostClassifier
is limited to "reg:squarederror", "reg:tweedie", "reg:linear", "reg:logistic", "binary:logistic", "binary:logitraw". And these values are even forced with validation code:This code was clearly added when XGBoost already supported multiclass classification. So why can't I use an objective like "multi:softmax"? Or maybe there is some workaround for the multiclass classification?