Data reconstruction of two actuarial metrics by stacking machine learning models
DOI:
https://doi.org/10.59681/2175-4411.v15.iEspecial.2023.1095
Keywords:
Artificial Intelligence, Process Optimization, Supplementary Health
Abstract
Objective: A large part of Brazil's health care is financed by health insurance plans, whose readjustments have been challenged in court. The data from these court cases tend not to be readily available. To reconstruct them, we developed a metric that uses Deep Learning techniques to estimate the missing data. Method: After analyzing the data obtained from the Regulatory Agency, we trained three different supervised learning algorithms and recovered the information by solving an optimization problem. We used the Augmented Lagrangian method to include the constraints in the cost function and Simulated Annealing to minimize it. Results: As expected, the stacked model outperformed the base learners. Conclusions: The results made it possible to recover the retroactive average cost per claim and frequency information, fetched from the "health plan's past".
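The optimization step named in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: the toy objective, the single equality constraint, the function names, and all hyperparameters (penalty weight, cooling rate, step size) are assumptions chosen only to show how an Augmented Lagrangian folds a constraint into the cost function and how Simulated Annealing then minimizes that cost.

```python
import math
import random


def objective(x):
    # Toy objective (an assumption, not taken from the paper): squared norm.
    return x[0] ** 2 + x[1] ** 2


def constraint(x):
    # Toy equality constraint h(x) = 0, here x0 + x1 = 1.
    return x[0] + x[1] - 1.0


def augmented_lagrangian(f, h, x, lam, mu):
    # L_A(x) = f(x) + lam * h(x) + (mu / 2) * h(x)^2
    c = h(x)
    return f(x) + lam * c + 0.5 * mu * c * c


def simulated_annealing(cost, x0, rng, steps=4000, t0=1.0,
                        cooling=0.998, step_size=0.2):
    # Metropolis acceptance with geometric cooling; the proposal width
    # shrinks with the temperature so late moves refine the solution.
    x, fx = list(x0), cost(x0)
    best, fbest = list(x), fx
    t = t0
    for _ in range(steps):
        sigma = step_size * math.sqrt(t)
        cand = [xi + rng.gauss(0.0, sigma) for xi in x]
        fc = cost(cand)
        # Always accept improvements; accept worse moves with prob exp(-delta/t).
        if fc < fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = list(x), fx
        t *= cooling
    return best


def solve(f, h, x0, seed=0, outer=8, mu=10.0):
    # Outer loop: minimize L_A for a fixed multiplier, then update the multiplier.
    rng = random.Random(seed)
    x, lam = list(x0), 0.0
    for _ in range(outer):
        x = simulated_annealing(
            lambda z: augmented_lagrangian(f, h, z, lam, mu), x, rng)
        lam += mu * h(x)  # first-order multiplier update
    return x, lam
```

Calling `solve(objective, constraint, [0.0, 0.0])` should return a point close to the analytic optimum (0.5, 0.5) with a multiplier near -1. Because Simulated Annealing is stochastic, the result is approximate and depends on the seed.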
License
Copyright (c) 2023 Amaury de Souza Amaral, Jardel Marques Monti, Segundo Parra Milián
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.