{"id":15886,"date":"2022-04-24T11:39:30","date_gmt":"2022-04-24T10:39:30","guid":{"rendered":"https:\/\/complex-systems-ai.com\/?page_id=15886"},"modified":"2022-11-27T21:19:37","modified_gmt":"2022-11-27T20:19:37","slug":"transformation-des-donnees-et-regression","status":"publish","type":"page","link":"https:\/\/complex-systems-ai.com\/es\/correlacion-y-regresiones\/transformacion-de-datos-y-regresion\/","title":{"rendered":"Data transformation and regression"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-page\" data-elementor-id=\"15886\" class=\"elementor elementor-15886\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-ef6701f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"ef6701f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-368a24a\" data-id=\"368a24a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-d8ae180 elementor-widget elementor-widget-text-editor\" data-id=\"d8ae180\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>This pipeline covers data exploration, data transformation for normalization, and regression (with a performance analysis).<\/p><p><img decoding=\"async\" class=\"aligncenter wp-image-11096 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2020\/09\/cropped-Capture.png\" alt=\"regression\" width=\"97\" height=\"97\" title=\"\"><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-bd0b5eb elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"bd0b5eb\" data-element_type=\"section\" 
data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-392c688\" data-id=\"392c688\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-e778eac elementor-widget elementor-widget-heading\" data-id=\"e778eac\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Contents<\/p>\n<\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1 ' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/complex-systems-ai.com\/es\/correlacion-y-regresiones\/transformacion-de-datos-y-regresion\/#Analyse-transformation-et-regression\" >Analysis, transformation and regression<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/complex-systems-ai.com\/es\/correlacion-y-regresiones\/transformacion-de-datos-y-regresion\/#Exploration-et-preprocessing\" >Exploration and preprocessing<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/complex-systems-ai.com\/es\/correlacion-y-regresiones\/transformacion-de-datos-y-regresion\/#Regressions-et-comparaisons\" >Regressions and comparisons<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/complex-systems-ai.com\/es\/correlacion-y-regresiones\/transformacion-de-datos-y-regresion\/#Evaluation-des-modeles\" >Model evaluation<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Analyse-transformation-et-regression\"><\/span>Analysis, transformation and regression<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-39039b7 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"39039b7\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div 
class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-bee3c12\" data-id=\"bee3c12\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-867dee0 elementor-widget elementor-widget-text-editor\" data-id=\"867dee0\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Let us now turn to the other category of supervised learning: regression, where the output variable is continuous and numeric. There are four common types of regression model: linear, lasso, ridge and polynomial.<\/p><p><img fetchpriority=\"high\" decoding=\"async\" class=\"aligncenter wp-image-15892 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132055.png\" alt=\"\" width=\"756\" height=\"545\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132055.png 756w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132055-300x216.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132055-18x12.png 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132055-600x433.png 600w\" sizes=\"(max-width: 756px) 100vw, 756px\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-e42d879 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"e42d879\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div 
class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-e68b680\" data-id=\"e68b680\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3b26da7 elementor-widget elementor-widget-heading\" data-id=\"3b26da7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Exploration-et-preprocessing\"><\/span>Exploration and preprocessing<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-8ed4ec2 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"8ed4ec2\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fba2cfa\" data-id=\"fba2cfa\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-06298ee elementor-widget elementor-widget-text-editor\" data-id=\"06298ee\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>This project uses regression models to predict countries&rsquo; happiness scores from 
six other factors: &ldquo;GDP per capita&rdquo;, &ldquo;Social support&rdquo;, &ldquo;Healthy life expectancy&rdquo;, &ldquo;Freedom to make life choices&rdquo;, &ldquo;Generosity&rdquo; and &ldquo;Perceptions of corruption&rdquo;.<\/p><p>I used the &ldquo;World Happiness Report&rdquo; dataset on Kaggle, which contains 156 entries and 9 features.<\/p><p>Plot a histogram of each feature to understand its distribution. As shown below, &ldquo;Social support&rdquo; appears strongly left-skewed, while &ldquo;Generosity&rdquo; and &ldquo;Perceptions of corruption&rdquo; are right-skewed, which guides the choice of feature-engineering transformations.<\/p><p><img decoding=\"async\" class=\"aligncenter wp-image-15893 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132351.png\" alt=\"\" width=\"690\" height=\"578\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132351.png 690w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132351-300x251.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132351-14x12.png 14w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132351-600x503.png 600w\" sizes=\"(max-width: 690px) 100vw, 690px\" \/><\/p><p>We can also combine the histogram with the skewness measure below to quantify how strongly each feature is skewed to the left or to the right.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15894 size-full\" 
src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132443.png\" alt=\"\" width=\"706\" height=\"496\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132443.png 706w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132443-300x211.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132443-18x12.png 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132443-120x85.png 120w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132443-600x422.png 600w\" sizes=\"(max-width: 706px) 100vw, 706px\" \/><\/p><p>np.sqrt is applied to the right-skewed features, &ldquo;Generosity&rdquo; and &ldquo;Perceptions of corruption&rdquo;. As a result, both features become closer to normally distributed.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15895 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132741.png\" alt=\"\" width=\"536\" height=\"646\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132741.png 536w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132741-249x300.png 249w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132741-10x12.png 10w\" sizes=\"(max-width: 536px) 100vw, 536px\" \/><\/p><p>np.log(2 - df['Social support']) is applied to the left-skewed feature: reflecting the values (2 - x) turns the left skew into a right skew, which the logarithm then compresses. 
Its skewness drops markedly, from 1.13 to 0.39.<\/p><p>sns.pairplot(df) can be used to visualize the <a href=\"https:\/\/complex-systems-ai.com\/es\/correlacion-y-regresiones\/\">correlation<\/a> between features after the transformation. The scatter plots suggest that &ldquo;GDP per capita&rdquo;, &ldquo;Social support&rdquo; and &ldquo;Healthy life expectancy&rdquo; are correlated with the target variable &ldquo;Score&rdquo;, and may therefore receive larger coefficients. A later section checks whether this is the case.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium wp-image-15896\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132945-300x300.png\" alt=\"\" width=\"300\" height=\"300\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132945-300x300.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132945-150x150.png 150w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132945-12x12.png 12w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132945-600x598.png 600w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132945-100x100.png 100w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-132945.png 742w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/p><p>Because regularization techniques penalize the size of the coefficients, model performance becomes sensitive to the scale of the features. The features must therefore be transformed to a common scale. 
I experimented with three scalers: StandardScaler, MinMaxScaler and RobustScaler.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15897 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133105.png\" alt=\"\" width=\"753\" height=\"667\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133105.png 753w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133105-300x266.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133105-14x12.png 14w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133105-600x531.png 600w\" sizes=\"(max-width: 753px) 100vw, 753px\" \/><\/p><p>As you can see, the scalers do not change the shape of the distribution, only the range of the data.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15898 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133209.png\" alt=\"\" width=\"759\" height=\"519\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133209.png 759w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133209-300x205.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133209-18x12.png 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-133209-600x410.png 600w\" sizes=\"(max-width: 759px) 100vw, 759px\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-ba3bf45 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"ba3bf45\" 
data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-f4e874d\" data-id=\"f4e874d\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-15d7c2b elementor-widget elementor-widget-heading\" data-id=\"15d7c2b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Regressions-et-comparaisons\"><\/span>Regressions and comparisons<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-4b2c282 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"4b2c282\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-04b6d20\" data-id=\"04b6d20\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-aad903c elementor-widget elementor-widget-text-editor\" data-id=\"aad903c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Let us now compare the three linear 
regression models below: linear, ridge and lasso.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15899 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134035.png\" alt=\"\" width=\"477\" height=\"116\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134035.png 477w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134035-300x73.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134035-18x4.png 18w\" sizes=\"(max-width: 477px) 100vw, 477px\" \/><\/p><p>The next step is to examine how different values of lambda (alpha in scikit-learn) affect the models and, more importantly, how feature importance and coefficient values change as alpha goes from 0.0001 to 1.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15900 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134135.png\" alt=\"\" width=\"602\" height=\"397\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134135.png 602w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134135-300x198.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134135-18x12.png 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134135-600x396.png 600w\" sizes=\"(max-width: 602px) 100vw, 602px\" \/><\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15901 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134244.png\" alt=\"\" width=\"682\" height=\"724\" title=\"\" 
srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134244.png 682w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134244-283x300.png 283w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134244-11x12.png 11w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134244-600x637.png 600w\" sizes=\"(max-width: 682px) 100vw, 682px\" \/><\/p><p>Based on the coefficients produced by the Lasso and Ridge models, &ldquo;GDP per capita&rdquo;, &ldquo;Social support&rdquo; and &ldquo;Healthy life expectancy&rdquo; appear to be the three most important features. This agrees with the earlier scatter plots and suggests that they are the main drivers of a country&rsquo;s happiness score.<\/p><p>The side-by-side comparison also shows that increasing alpha affects Lasso and Ridge to different degrees: the Lasso coefficients are shrunk much more strongly. 
This is why Lasso, which can shrink coefficients to exactly zero, is often chosen for feature selection.<\/p><p id=\"viewer-8mti7\" class=\"mm8Nw _1j-51 iWv3d _1FoOD _3M0Fe aujbK iWv3d public-DraftStyleDefault-block-depth0 fixed-tab-size public-DraftStyleDefault-text-ltr\">In addition, polynomial features are introduced to enrich the basic linear regression, which increases the number of features from 6 to 27.<\/p><pre id=\"viewer-fvq43\" class=\"_3M8UJ _3Dd1B md9lk _1FoOD _3M0Fe aujbK iWv3d public-DraftStyleDefault-block-depth0 fixed-tab-size public-DraftStyleDefault-text-ltr\"><span class=\"_2PHJq public-DraftStyleDefault-ltr\"># expand the features with degree-2 polynomial terms\n<span class=\"_3tKjk\">from<\/span> sklearn<span class=\"_1zYnF\">.<\/span>preprocessing <span class=\"_3tKjk\">import<\/span> PolynomialFeatures\npf <span class=\"_3TgEz\">=<\/span> <span class=\"\">PolynomialFeatures<\/span><span class=\"_1zYnF\">(<\/span>degree <span class=\"_3TgEz\">=<\/span> <span class=\"_2nB1v\">2<\/span><span class=\"_1zYnF\">,<\/span> include_bias <span class=\"_3TgEz\">=<\/span> False<span class=\"_1zYnF\">)<\/span>\n# fit on the training set only, then reuse the fitted transformer on the test set\nX_train_poly <span class=\"_3TgEz\">=<\/span> pf<span class=\"_1zYnF\">.<\/span><span class=\"\">fit_transform<\/span><span class=\"_1zYnF\">(<\/span>X_train<span class=\"_1zYnF\">)<\/span>\nX_test_poly <span class=\"_3TgEz\">=<\/span> pf<span class=\"_1zYnF\">.<\/span><span class=\"\">transform<\/span><span class=\"_1zYnF\">(<\/span>X_test<span class=\"_1zYnF\">)<\/span><\/span><\/pre><p id=\"viewer-vgv8\" class=\"mm8Nw _1j-51 iWv3d _1FoOD _3M0Fe aujbK iWv3d public-DraftStyleDefault-block-depth0 fixed-tab-size public-DraftStyleDefault-text-ltr\">Look at their distribution after the polynomial transformation.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15902 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134509.png\" alt=\"\" width=\"578\" 
height=\"797\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134509.png 578w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134509-218x300.png 218w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134509-9x12.png 9w\" sizes=\"(max-width: 578px) 100vw, 578px\" \/><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-045d754 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"045d754\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-08c368c\" data-id=\"08c368c\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-739400f elementor-widget elementor-widget-heading\" data-id=\"739400f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Evaluation-des-modeles\"><\/span>Model evaluation<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-b120a41 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"b120a41\" data-element_type=\"section\" 
data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-97986d3\" data-id=\"97986d3\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-3d2bb5f elementor-widget elementor-widget-text-editor\" data-id=\"3d2bb5f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>As a last step, evaluate and compare the performance of the Lasso and Ridge models, before and after adding polynomial features. In the code below, I created four models:<\/p><ul><li>l2: ridge without polynomial features<\/li><li>l2_poly: ridge with polynomial features<\/li><li>l1: lasso without polynomial features<\/li><li>l1_poly: lasso with polynomial features<\/li><\/ul><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15903 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134634.png\" alt=\"\" width=\"630\" height=\"734\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134634.png 630w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134634-257x300.png 257w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134634-10x12.png 10w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134634-600x699.png 600w\" sizes=\"(max-width: 630px) 100vw, 630px\" \/><\/p><p>Commonly used evaluation metrics for regression models 
are MAE, MSE, RMSE and R-squared; see my article &ldquo;A Practical Guide to Linear Regression&rdquo; for a detailed explanation. Here, I used MSE (mean squared error) to evaluate model performance.<\/p><p>1) Plotting Ridge and Lasso on the same chart shows that they have similar accuracy when alpha is small, but that Lasso deteriorates considerably as alpha approaches 1.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15904 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134747.png\" alt=\"\" width=\"743\" height=\"521\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134747.png 743w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134747-300x210.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134747-18x12.png 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134747-120x85.png 120w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134747-600x421.png 600w\" sizes=\"(max-width: 743px) 100vw, 743px\" \/><\/p><p>2) Comparing the models with and without polynomial features on the same chart shows that polynomial features generally lower the MSE and therefore improve model performance. 
Cet effet est plus significatif dans la r\u00e9gression Ridge lorsque alpha augmente \u00e0 1, et plus significatif dans la r\u00e9gression Lasso lorsque alpha est plus proche de 0,0001.<\/p><p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-15905 size-full\" src=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134847.png\" alt=\"\" width=\"745\" height=\"520\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134847.png 745w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134847-300x209.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134847-18x12.png 18w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134847-120x85.png 120w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2022\/04\/2022-04-24-134847-600x419.png 600w\" sizes=\"(max-width: 745px) 100vw, 745px\" \/><\/p><p>Cependant, m\u00eame si la transformation polynomiale am\u00e9liore les performances des mod\u00e8les de r\u00e9gression, elle rend l&rsquo;interpr\u00e9tabilit\u00e9 du mod\u00e8le plus difficile &#8211; il est difficile de distinguer les principaux moteurs du mod\u00e8le d&rsquo;une r\u00e9gression polynomiale. Moins d&rsquo;erreur ne garantit pas toujours un meilleur mod\u00e8le, et il s&rsquo;agit de trouver le bon \u00e9quilibre entre pr\u00e9visibilit\u00e9 et interpr\u00e9tabilit\u00e9 en fonction des objectifs du projet.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>P\u00e1gina de inicio de Wiki de correlaci\u00f3n y regresi\u00f3n Aqu\u00ed hay una canalizaci\u00f3n que explica la extracci\u00f3n de datos, la transformaci\u00f3n para la normalizaci\u00f3n y la regresi\u00f3n (con an\u00e1lisis de rendimiento). 
Analizar, \u2026 <\/p>","protected":false},"author":1,"featured_media":0,"parent":15517,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-15886","page","type-page","status-publish","hentry"],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/complex-systems-ai.com\/es\/wp-json\/wp\/v2\/pages\/15886","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/complex-systems-ai.com\/es\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/complex-systems-ai.com\/es\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/complex-systems-ai.com\/es\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/complex-systems-ai.com\/es\/wp-json\/wp\/v2\/comments?post=15886"}],"version-history":[{"count":4,"href":"https:\/\/complex-systems-ai.com\/es\/wp-json\/wp\/v2\/pages\/15886\/revisions"}],"predecessor-version":[{"id":17890,"href":"https:\/\/complex-systems-ai.com\/es\/wp-json\/wp\/v2\/pages\/15886\/revisions\/17890"}],"up":[{"embeddable":true,"href":"https:\/\/complex-systems-ai.com\/es\/wp-json\/wp\/v2\/pages\/15517"}],"wp:attachment":[{"href":"https:\/\/complex-systems-ai.com\/es\/wp-json\/wp\/v2\/media?parent=15886"}],"curies":[{"name":"gracias","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}