
Research articles

ScienceAsia 49S (2023): 68-77 |doi: 10.2306/scienceasia1513-1874.2023.s003

A modification of logistic regression with imbalanced data: F-measure-oriented Lasso-logistic regression

Bui T. T. My a,b,*, Bao Q. Ta c

ABSTRACT:     Logistic regression (LR) is one of the most popular classifiers, but it does not perform effectively on imbalanced data. Two approaches are commonly used to handle imbalanced data with LR: resampling techniques and modifications to the log-likelihood function. These approaches improve the performance measures of LR in some cases, but their effectiveness is not robust in general. In this paper, we propose a classifier called F-measure-oriented Lasso-Logistic Regression (F-LLR) to deal with imbalanced data. The base learner of F-LLR is Lasso-Logistic regression (LLR), which imposes a prior on the magnitude of the parameters through a hyper-parameter λ. The optimal λ is determined by an adjusted cross-validation procedure that targets the highest F-measure instead of the highest accuracy. F-LLR addresses imbalanced data by selectively combining Under-sampling and the Synthetic Minority Over-sampling Technique (SMOTE), based on the scores of the training data. The empirical study shows that F-LLR increases the F-measure and KS compared with LLR and with traditional balancing methods, namely resampling techniques (Random Under-sampling, Random Over-sampling, and SMOTE) and modifications to the log-likelihood function (Ridge and Weighted likelihood estimation).
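
The abstract's reason for tuning on the F-measure rather than on accuracy can be illustrated with a small sketch. It is not the authors' procedure: the toy labels, the two candidate prediction vectors, and the helper names `confusion_counts`, `accuracy`, and `f_measure` are assumptions made for illustration only. On a sample with 5% positives, selecting by accuracy prefers a classifier that never predicts the minority class, while selecting by F-measure prefers the one that actually finds it.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the minority class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + fn + tn)


def f_measure(y_true, y_pred, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    if tp == 0:
        return 0.0  # no true positives: define the score as zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical imbalanced sample: 5 positives followed by 95 negatives.
y_true = [1] * 5 + [0] * 95

# Two hypothetical candidates, standing in for models fitted with different
# hyper-parameter values.
candidates = {
    "all_negative": [0] * 100,                    # never predicts the minority class
    "useful": [1, 1, 1, 1, 0] + [1] * 6 + [0] * 89,  # 4 hits, 1 miss, 6 false alarms
}

best_by_accuracy = max(candidates, key=lambda k: accuracy(y_true, candidates[k]))
best_by_f = max(candidates, key=lambda k: f_measure(y_true, candidates[k]))

for name, y_pred in candidates.items():
    print(name, round(accuracy(y_true, y_pred), 4), round(f_measure(y_true, y_pred), 4))
print("selected by accuracy:", best_by_accuracy)
print("selected by F-measure:", best_by_f)
```

In a cross-validation setting, the same comparison would be made per fold and averaged for each candidate λ; only the selection criterion changes.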


a Department of Mathematical Economics, Ho Chi Minh University of Banking, Ho Chi Minh City, 7000 Vietnam
b Faculty of Mathematics and Statistics, College of Technology and Design, UEH University, Ho Chi Minh City, 7000 Vietnam
c Department of Mathematics, International University, Vietnam National University, Ho Chi Minh City, 7000 Vietnam

* Corresponding author, E-mail: mybtt@hub.edu.vn

Received 15 Jan 2023, Accepted 31 Aug 2023