Data reconstruction of two actuarial metrics by stacking machine learning models

Amaury de Souza Amaral; Jardel Marques Monti; Segundo Parra Milián

doi:10.59681/2175-4411.v15.iEspecial.2023.1095

Authors

Amaury de Souza Amaral, Pontifícia Universidade Católica de São Paulo
Jardel Marques Monti, Pontifícia Universidade Católica de São Paulo
Segundo Parra Milián, Universidade Estadual Paulista

Keywords:

Artificial Intelligence, Process Optimization, Supplementary Health

Abstract

Objective: A large part of Brazil's health care is financed by health insurance plans, whose readjustments have been challenged in the courts. The data involved in these court cases tend not to be readily available. To reconstruct the data, we developed a metric that uses Deep Learning techniques to estimate the missing values. Method: After analyzing data obtained from the Regulatory Agency, we trained three different supervised learning algorithms and recovered the missing information by solving an optimization problem. We used the Augmented Lagrangian method to incorporate the constraints into the cost function and Simulated Annealing to minimize it. Results: As expected, the stacked model outperformed the base learners. Conclusions: The results made it possible to recover the retroactive average cost per claim and frequency information from the "health plan's past".
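The optimization step named in the Method can be illustrated with a minimal sketch. This is not the authors' implementation: the objective, the equality constraint, the penalty weight `mu`, and the cooling schedule below are all hypothetical choices made only to show how an Augmented Lagrangian folds a constraint into the cost function and how Simulated Annealing can then minimize that cost.

```python
import math
import random


def objective(x):
    # Hypothetical cost: squared distance from the reference point (2, 1).
    return (x[0] - 2.0) ** 2 + (x[1] - 1.0) ** 2


def constraint(x):
    # Hypothetical equality constraint h(x) = 0, here x0 + x1 = 2.
    return x[0] + x[1] - 2.0


def augmented_lagrangian(x, lam, mu):
    # f(x) + lam * h(x) + (mu / 2) * h(x)^2
    h = constraint(x)
    return objective(x) + lam * h + 0.5 * mu * h * h


def simulated_annealing(cost, x0, rng, n_iter=5000, t0=1.0, cooling=0.997):
    # Gaussian proposals whose width shrinks with the temperature;
    # Metropolis acceptance; the best point seen so far is returned.
    x, fx = list(x0), cost(x0)
    best, fbest = list(x), fx
    t = t0
    for _ in range(n_iter):
        step = 0.5 * math.sqrt(t)
        cand = [xi + rng.gauss(0.0, step) for xi in x]
        fc = cost(cand)
        if fc < fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = list(x), fx
        t *= cooling
    return best


def solve(n_outer=8, mu=10.0, seed=0):
    # Outer loop: minimize the augmented Lagrangian with the multiplier
    # held fixed, then update the multiplier from the constraint residual.
    rng = random.Random(seed)
    x, lam = [0.0, 0.0], 0.0
    for _ in range(n_outer):
        x = simulated_annealing(
            lambda z: augmented_lagrangian(z, lam, mu), x, rng
        )
        lam += mu * constraint(x)
    return x, lam


if __name__ == "__main__":
    x, lam = solve()
    print("x =", x, "lambda =", lam, "h(x) =", constraint(x))
```

For this toy problem the constrained minimizer is known in closed form, (1.5, 0.5), so the sketch can be checked against it. In the setting described by the abstract, the cost and constraints would instead be built from the trained models and the actuarial relations between the reconstructed quantities.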

Author Biography

Segundo Parra Milián, Universidade Estadual Paulista

Instituto de Física Teórica – IFT - Universidade Estadual Paulista – São Paulo

References

Chen T, Guestrin C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016: 785–794.

Cortes C, Vapnik V. Support-vector networks. Machine Learning. 1995; 20 (3): 273–297.

Ganaie M, Hu M, et al. Ensemble deep learning: A review. arXiv preprint arXiv:2104.02395. 2021.

Wolpert DH. Stacked generalization. Neural Networks. 1992; 5 (2): 241–259.

Breiman L. Stacked regressions. Machine Learning. 1996; 24 (1): 49–64.

Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, et al. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467. 2016.

Harris CR, Millman KJ, van der Walt SJ, Gommers R, Virtanen P, Cournapeau D, Wieser E, Taylor J, Berg S, Smith NJ, Kern R, Picus M, Hoyer S, van Kerkwijk MH, Brett M, Haldane A, Del Río JF, Wiebe M, Peterson P, Gérard-Marchant P, Sheppard K, Reddy T, Weckesser W, Abbasi H, Gohlke C, Oliphant TE. Array programming with NumPy. Nature. 2020; 585 (7825): 357–362.

Guo C, Berkhahn F. Entity embeddings of categorical variables. arXiv preprint arXiv:1604.06737. 2016.

Kingma DP, Ba J. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980. 2014.

Hestenes MR. Multiplier and gradient methods. Journal of Optimization Theory and Applications. 1969; 4 (5): 303–320.

Powell MJ. A method for nonlinear constraints in minimization problems. Optimization. 1969: 283–298.

Bohachevsky IO, Johnson ME, Stein ML. Generalized simulated annealing for function optimization. Technometrics. 1986; 28 (3): 209–217.

Romeo F, Sangiovanni-Vincentelli A. A theoretical framework for simulated annealing. Algorithmica. 1991; 6 (1): 302–345.
