{"id":20857,"date":"2024-02-20T20:22:31","date_gmt":"2024-02-20T19:22:31","guid":{"rendered":"https:\/\/complex-systems-ai.com\/?page_id=20857"},"modified":"2024-02-20T21:36:14","modified_gmt":"2024-02-20T20:36:14","slug":"cart-regression","status":"publish","type":"page","link":"https:\/\/complex-systems-ai.com\/en\/learning-supervises\/cart-regression\/","title":{"rendered":"CART regression and classification"},"content":{"rendered":"<div data-elementor-type=\"wp-page\" data-elementor-id=\"20857\" class=\"elementor elementor-20857\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-bd719d0 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"bd719d0\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-33 elementor-top-column elementor-element elementor-element-6998e3a\" data-id=\"6998e3a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-1804e92 elementor-align-justify elementor-widget elementor-widget-button\" data-id=\"1804e92\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"button.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-button-wrapper\">\n\t\t\t\t\t<a class=\"elementor-button elementor-button-link elementor-size-sm\" href=\"https:\/\/complex-systems-ai.com\/en\/learning-supervises\/\">\n\t\t\t\t\t\t<span class=\"elementor-button-content-wrapper\">\n\t\t\t\t\t\t\t\t\t<span class=\"elementor-button-text\">Supervised learning<\/span>\n\t\t\t\t\t<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t<div class=\"elementor-column 
elementor-col-33 elementor-top-column elementor-element elementor-element-20a2960\" data-id=\"20a2960\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-15b41b4 elementor-align-justify elementor-widget elementor-widget-button\" data-id=\"15b41b4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"button.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-button-wrapper\">\n\t\t\t\t\t<a class=\"elementor-button elementor-button-link elementor-size-sm\" href=\"https:\/\/complex-systems-ai.com\/en\/\">\n\t\t\t\t\t\t<span class=\"elementor-button-content-wrapper\">\n\t\t\t\t\t\t\t\t\t<span class=\"elementor-button-text\">Home page<\/span>\n\t\t\t\t\t<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t<div class=\"elementor-column elementor-col-33 elementor-top-column elementor-element elementor-element-2d8d247\" data-id=\"2d8d247\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3966268 elementor-align-justify elementor-widget elementor-widget-button\" data-id=\"3966268\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"button.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-button-wrapper\">\n\t\t\t\t\t<a class=\"elementor-button elementor-button-link elementor-size-sm\" href=\"https:\/\/www.jair.org\/index.php\/jair\/article\/view\/12228\" target=\"_blank\" rel=\"noopener\">\n\t\t\t\t\t\t<span class=\"elementor-button-content-wrapper\">\n\t\t\t\t\t\t\t\t\t<span 
class=\"elementor-button-text\">Wiki<\/span>\n\t\t\t\t\t<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-0b9775e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"0b9775e\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-c5fa8db\" data-id=\"c5fa8db\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b2bd1ac elementor-widget elementor-widget-heading\" data-id=\"b2bd1ac\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" 
fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewbox=\"0 0 24 24\" version=\"1.2\" baseprofile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/complex-systems-ai.com\/en\/learning-supervises\/cart-regression\/#CART-regression-et-classification-utilisation-de-larbre-de-decision\" >CART regression and classification: use of the decision tree<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/complex-systems-ai.com\/en\/learning-supervises\/cart-regression\/#CART-et-Moindres-carres-pour-la-regression\" >CART and Least Squares for Regression<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/complex-systems-ai.com\/en\/learning-supervises\/cart-regression\/#Decoupage-et-optimisation\" >Slicing and optimization<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/complex-systems-ai.com\/en\/learning-supervises\/cart-regression\/#Avec-de-multiples-colonnes\" >With multiple columns<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/complex-systems-ai.com\/en\/learning-supervises\/cart-regression\/#CART-pour-la-classification\" >CART for classification<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" 
href=\"https:\/\/complex-systems-ai.com\/en\/learning-supervises\/cart-regression\/#Recap-classification\" >Classification recap<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"CART-regression-et-classification-utilisation-de-larbre-de-decision\"><\/span>CART regression and classification: use of the decision tree <span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-dc76f9e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"dc76f9e\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-0f5ce49\" data-id=\"0f5ce49\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-373fdfc elementor-widget elementor-widget-text-editor\" data-id=\"373fdfc\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Here is a tutorial for implementing CART <a href=\"https:\/\/complex-systems-ai.com\/en\/correlation-and-regressions\/\">Regression<\/a> and CART Classification.<\/p><p>As explained previously, decision trees are a non-parametric supervised learning approach. In addition to classification, where the target is discrete, we often encounter problems with a continuous target, which are called regression. The simplest way to solve a regression problem may be to use linear regression. 
This time, we will solve the regression case with a decision <a href=\"https:\/\/complex-systems-ai.com\/en\/graph-theory-2\/trees-and-trees\/\">tree<\/a>.<\/p><p><img decoding=\"async\" class=\"aligncenter wp-image-11096 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2020\/09\/cropped-Capture.png\" alt=\"CART Regression\" width=\"97\" height=\"97\" title=\"\"><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-73b5611 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"73b5611\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-ec42fb5\" data-id=\"ec42fb5\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-0ce79d3 elementor-widget elementor-widget-heading\" data-id=\"0ce79d3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"CART-et-Moindres-carres-pour-la-regression\"><\/span>CART and Least Squares for Regression<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-4bff5f1 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"4bff5f1\" 
data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-1cf2233\" data-id=\"1cf2233\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-234b00d elementor-widget elementor-widget-text-editor\" data-id=\"234b00d\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>For regression trees, two common impurity measures are:<\/p><p>Least squares. This method is similar to least-squares <a href=\"https:\/\/complex-systems-ai.com\/en\/language-theory\/minimization-dun-afd\/\">minimization<\/a> in a <a href=\"https:\/\/complex-systems-ai.com\/en\/help-with-the-decision\/linear-modeling\/\">linear model<\/a>. Splits are chosen to minimize the residual sum of squares (RSS) between the observations and the mean in each node.<\/p><p>Least absolute deviations. This method minimizes the mean absolute deviation from the median within a node. Its advantage over least squares is that it is less sensitive to outliers and provides a more robust model. Its disadvantage is insensitivity when dealing with data sets containing a large proportion of zeros.<\/p><p>For classification, CART uses Gini impurity to split the dataset into a decision tree. For regression, CART uses least squares: splits are chosen to minimize the residual sum of squares between the observations and the mean in each node. 
To find the \u201cbest\u201d split, we must minimize the RSS:<\/p><p><img decoding=\"async\" class=\"alignnone size-medium wp-image-20863\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART1-300x114.webp\" alt=\"CART regression\" width=\"300\" height=\"114\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART1-300x114.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART1-18x7.webp 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART1.webp 360w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-789327f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"789327f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-9887026\" data-id=\"9887026\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b641ee0 elementor-widget elementor-widget-heading\" data-id=\"b641ee0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Decoupage-et-optimisation\"><\/span>Slicing and optimization<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section 
class=\"elementor-section elementor-top-section elementor-element elementor-element-c1a70f9 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c1a70f9\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-e13af53\" data-id=\"e13af53\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-da742a8 elementor-widget elementor-widget-text-editor\" data-id=\"da742a8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>This simulation uses the following \u201cdummy\u201d dataset:<\/p><p><img fetchpriority=\"high\" decoding=\"async\" class=\"alignnone wp-image-20864 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART2.webp\" alt=\"CART regression\" width=\"1001\" height=\"611\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART2.webp 1001w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART2-300x183.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART2-768x469.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART2-18x12.webp 18w\" sizes=\"(max-width: 1001px) 100vw, 1001px\" \/><\/p><p>As mentioned before, to find the \u201cbest\u201d split, we need to minimize the RSS. 
First, we calculate the RSS after dividing the data into two regions, starting with index 0:<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20866 size-large\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART4-1024x366.webp\" alt=\"CART regression\" width=\"1024\" height=\"366\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART4-1024x366.webp 1024w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART4-300x107.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART4-768x275.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART4-18x6.webp 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART4.webp 1316w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p><p>Since the data is now divided into two regions, we sum the squared residuals of the data points in each region. We then calculate the RSS of each node using equation 2.0:<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20867 size-large\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART5-1024x678.webp\" alt=\"CART regression\" width=\"1024\" height=\"678\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART5-1024x678.webp 1024w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART5-300x198.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART5-768x508.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART5-1536x1016.webp 1536w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART5-18x12.webp 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART5.webp 1664w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p><p>This process continues until the RSS has been calculated for the last index:<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20868 
size-large\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART6-1024x343.webp\" alt=\"CART regression\" width=\"1024\" height=\"343\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART6-1024x343.webp 1024w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART6-300x101.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART6-768x257.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART6-18x6.webp 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART6.webp 1286w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p><p>Splitting price at threshold 19 gives the smallest RSS. Region R1 (price &lt; 19) contains 10 data points, so we will split the data in R1 further. To avoid overfitting, we require each region to contain at least 6 data points (&gt;= 6). If a region contains fewer than 6 data points, the splitting process in that region stops.<\/p><p>Split data with threshold 19:<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20869 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART7.webp\" alt=\"CART regression\" width=\"899\" height=\"597\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART7.webp 899w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART7-300x199.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART7-768x510.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART7-18x12.webp 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART7-700x465.webp 700w\" sizes=\"(max-width: 899px) 100vw, 899px\" \/><\/p><p>We then calculate the RSS in R1; the process is the same as before, performed only for R1. 
You can create as many branches as you wish, but be careful to keep a sufficient number of points in each region!<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20870 size-large\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART8-1024x299.webp\" alt=\"CART regression\" width=\"1024\" height=\"299\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART8-1024x299.webp 1024w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART8-300x88.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART8-768x224.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART8-18x5.webp 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART8.webp 1477w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p><p>This gives the following regression tree after 2 branches:<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20865 size-large\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART3-1024x264.webp\" alt=\"CART regression\" width=\"1024\" height=\"264\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART3-1024x264.webp 1024w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART3-300x77.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART3-768x198.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART3-18x5.webp 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/CART3.webp 1362w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-a0c90dc elementor-section-boxed elementor-section-height-default elementor-section-height-default\" 
data-id=\"a0c90dc\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-43380d7\" data-id=\"43380d7\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c10092c elementor-widget elementor-widget-heading\" data-id=\"c10092c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Avec-de-multiples-colonnes\"><\/span>With multiple columns<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-0d287e0 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"0d287e0\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-8367008\" data-id=\"8367008\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-a1cde6e elementor-widget elementor-widget-text-editor\" data-id=\"a1cde6e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Simply calculate the RSS for each predictor and use as 
root\/branch the one with the smallest RSS (be careful to use preprocessing and an adequate metric in case of unbalanced data!)<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-04813cb elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"04813cb\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-f257578\" data-id=\"f257578\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3a6f2d3 elementor-widget elementor-widget-heading\" data-id=\"3a6f2d3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"CART-pour-la-classification\"><\/span>CART for classification<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-e321b87 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"e321b87\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-1236ae2\" data-id=\"1236ae2\" data-element_type=\"column\" 
data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-5d63603 elementor-widget elementor-widget-text-editor\" data-id=\"5d63603\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>For classification, we use the Gini <a href=\"https:\/\/complex-systems-ai.com\/en\/data-analysis\/gini-entropy-and-error\/\">impurity<\/a>. As an example, we take a heart disease dataset with 303 rows and 13 attributes. The target contains 138 values of 0 and 165 values of 1.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20718 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre7.webp\" alt=\"gini impurity\" width=\"505\" height=\"117\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre7.webp 505w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre7-300x70.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre7-18x4.webp 18w\" sizes=\"(max-width: 505px) 100vw, 505px\" \/><\/p><p>Let&#039;s calculate the Gini impurity for the Sex column:<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20721 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre9.webp\" alt=\"gini impurity\" width=\"523\" height=\"301\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre9.webp 523w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre9-300x173.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre9-18x10.webp 18w\" sizes=\"(max-width: 523px) 100vw, 523px\" \/><\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20722 size-full\" 
src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre10.webp\" alt=\"gini impurity\" width=\"496\" height=\"252\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre10.webp 496w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre10-300x152.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/arbre10-18x9.webp 18w\" sizes=\"(max-width: 496px) 100vw, 496px\" \/><\/p><p>For Fbs we have 0.360. For Exang 0.381. Fbs has the smallest Gini impurity, so this will be our root.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20874 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction1.webp\" alt=\"CART classification\" width=\"654\" height=\"591\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction1.webp 654w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction1-300x271.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction1-13x12.webp 13w\" sizes=\"(max-width: 654px) 100vw, 654px\" \/><\/p><p>Next, we need to determine how well Sex and Exang separate the patients in the left node of Fbs:<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20875 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction2.webp\" alt=\"CART classification\" width=\"921\" height=\"701\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction2.webp 921w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction2-300x228.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction2-768x585.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction2-16x12.webp 16w\" sizes=\"(max-width: 921px) 100vw, 921px\" \/><\/p><p>Exang (exercise-induced angina) has the lowest 
Gini impurity, so we will use it at this node to separate patients.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-medium wp-image-20876\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction3-300x234.webp\" alt=\"classification\" width=\"300\" height=\"234\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction3-300x234.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction3-15x12.webp 15w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction3.webp 366w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/p><p>In the left node of Exang, how well does it separate these 49 patients (24 with heart disease and 25 without heart disease)? Since only the sex attribute remains, we put the sex attribute in the left node of Exang.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20877 size-large\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction4-1024x415.webp\" alt=\"CART classification\" width=\"1024\" height=\"415\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction4-1024x415.webp 1024w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction4-300x122.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction4-768x311.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction4-18x7.webp 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction4.webp 1197w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p><p>As we can see, we have final leaf nodes on this branch, but why is the circled leaf node also a final node?<\/p><p>Note: in the circled leaf node, 89% do not have heart disease.<\/p><p>Do these new leaves separate patients better than before?<\/p><p>In order to answer these questions, we need to compare the Gini 
impurity using the sex attribute and the Gini impurity before using the sex attribute to separate patients.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20878 size-large\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction5-1024x477.webp\" alt=\"CART classification\" width=\"1024\" height=\"477\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction5-1024x477.webp 1024w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction5-300x140.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction5-768x358.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction5-18x8.webp 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction5.webp 1242w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p><p>The Gini impurity before using sex to separate patients is the lowest, so we do not separate this node using sex. It becomes the final leaf node on this branch of the tree. 
We do the same for the right branch of the root, which gives the following decision tree:<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-20879 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction6.webp\" alt=\"classification\" width=\"830\" height=\"407\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction6.webp 830w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction6-300x147.webp 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction6-768x377.webp 768w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/prediction6-18x9.webp 18w\" sizes=\"(max-width: 830px) 100vw, 830px\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-1058b17 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1058b17\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-96775cb\" data-id=\"96775cb\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-cd55e76 elementor-widget elementor-widget-heading\" data-id=\"cd55e76\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Recap-classification\"><\/span>Classification recap<span 
class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-00a7eb1 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"00a7eb1\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-7565b86\" data-id=\"7565b86\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-b7332b0 elementor-widget elementor-widget-text-editor\" data-id=\"b7332b0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Main points of the data set splitting process:<\/p><p>1. Calculate the Gini impurity score for every candidate attribute.<\/p><p>2. Compare the Gini impurity score before and after using a new attribute to separate the data. If the node itself has the lowest score, there is no point in separating the data.<\/p><p>3. 
If data separation results in improvement, choose the separation with the lowest impurity score.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>","protected":false},"excerpt":{"rendered":"<p>Supervised learning Home page Wiki CART regression and classification: using the decision tree Here is a tutorial for implementing CART Regression and\u2026 <\/p>","protected":false},"author":1,"featured_media":0,"parent":20741,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-20857","page","type-page","status-publish","hentry"],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages\/20857","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/comments?post=20857"}],"version-history":[{"count":4,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages\/20857\/revisions"}],"predecessor-version":[{"id":20882,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages\/20857\/revisions\/20882"}],"up":[{"embeddable":true,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages\/20741"}],"wp:attachment":[{"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/media?parent=20857"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}