{"id":20097,"date":"2024-02-07T07:56:27","date_gmt":"2024-02-07T06:56:27","guid":{"rendered":"https:\/\/complex-systems-ai.com\/?page_id=20097"},"modified":"2024-02-13T07:16:23","modified_gmt":"2024-02-13T06:16:23","slug":"operateurs-relationnels","status":"publish","type":"page","link":"https:\/\/complex-systems-ai.com\/en\/software-analysis\/relational-operators\/","title":{"rendered":"2 Corrected Exercises Relational Operators"},"content":{"rendered":"<div data-elementor-type=\"wp-page\" data-elementor-id=\"20097\" class=\"elementor elementor-20097\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-97ce9b1 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"97ce9b1\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-33 elementor-top-column elementor-element elementor-element-eb58237\" data-id=\"eb58237\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-7f7c1d1 elementor-align-justify elementor-widget elementor-widget-button\" data-id=\"7f7c1d1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"button.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-button-wrapper\">\n\t\t\t\t\t<a class=\"elementor-button elementor-button-link elementor-size-sm\" href=\"https:\/\/complex-systems-ai.com\/en\/software-analysis\/\">\n\t\t\t\t\t\t<span class=\"elementor-button-content-wrapper\">\n\t\t\t\t\t\t\t\t\t<span class=\"elementor-button-text\">Software Analysis<\/span>\n\t\t\t\t\t<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t<div class=\"elementor-column elementor-col-33 elementor-top-column elementor-element elementor-element-6f38c66\" data-id=\"6f38c66\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-a9f7f8f elementor-align-justify elementor-widget elementor-widget-button\" data-id=\"a9f7f8f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"button.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-button-wrapper\">\n\t\t\t\t\t<a class=\"elementor-button elementor-button-link elementor-size-sm\" href=\"https:\/\/complex-systems-ai.com\/en\/\">\n\t\t\t\t\t\t<span class=\"elementor-button-content-wrapper\">\n\t\t\t\t\t\t\t\t\t<span class=\"elementor-button-text\">Home page<\/span>\n\t\t\t\t\t<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t<div class=\"elementor-column elementor-col-33 elementor-top-column elementor-element elementor-element-000b2f9\" data-id=\"000b2f9\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-6a1791b elementor-align-justify elementor-widget elementor-widget-button\" data-id=\"6a1791b\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"button.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<div class=\"elementor-button-wrapper\">\n\t\t\t\t\t<a class=\"elementor-button elementor-button-link elementor-size-sm\" href=\"https:\/\/fr.wikipedia.org\/wiki\/UML_(informatique)\" target=\"_blank\" rel=\"noopener\">\n\t\t\t\t\t\t<span class=\"elementor-button-content-wrapper\">\n\t\t\t\t\t\t\t\t\t<span class=\"elementor-button-text\">Wiki<\/span>\n\t\t\t\t\t<\/span>\n\t\t\t\t\t<\/a>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-3edd86e elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"3edd86e\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-0006b03\" data-id=\"0006b03\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-2d120e1 elementor-widget elementor-widget-heading\" data-id=\"2d120e1\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_82_2 counter-hierarchy ez-toc-counter ez-toc-grey ez-toc-container-direction\">\n<div class=\"ez-toc-title-container\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Contents<\/p>\n<span class=\"ez-toc-title-toggle\"><a href=\"#\" class=\"ez-toc-pull-right ez-toc-btn ez-toc-btn-xs ez-toc-btn-default ez-toc-toggle\" aria-label=\"Toggle Table of Content\"><span class=\"ez-toc-js-icon-con\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #999;color:#999\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewbox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #999;color:#999\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewbox=\"0 0 24 24\" version=\"1.2\" baseprofile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/span><\/a><\/span><\/div>\n<nav><ul class='ez-toc-list ez-toc-list-level-1' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/complex-systems-ai.com\/en\/software-analysis\/relational-operators\/#Exercices-corriges-sur-les-operateurs-relationnels\" >Corrected exercises on relational operators<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/complex-systems-ai.com\/en\/software-analysis\/relational-operators\/#Rappel-de-cours\" >Course reminder<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/complex-systems-ai.com\/en\/software-analysis\/relational-operators\/#Exercice-1\" >Exercise 1<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/complex-systems-ai.com\/en\/software-analysis\/relational-operators\/#Exercice-2\" >Exercise 2<\/a><\/li><\/ul><\/nav><\/div>\n<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Exercices-corriges-sur-les-operateurs-relationnels\"><\/span>Corrected exercises on relational operators<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-dfe1dd2 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"dfe1dd2\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4c106be\" data-id=\"4c106be\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-87fd4a3 elementor-widget elementor-widget-text-editor\" data-id=\"87fd4a3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>This page presents use cases for relational operators via corrected exercises.<\/p><p><img decoding=\"async\" class=\"aligncenter wp-image-11096 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2020\/09\/cropped-Capture.png\" alt=\"relational operators\" width=\"97\" height=\"97\" title=\"\"><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-debdc4f elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"debdc4f\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-48e1693\" data-id=\"48e1693\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-dfb44e3 elementor-widget elementor-widget-heading\" data-id=\"dfb44e3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Rappel-de-cours\"><\/span>Course reminder<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-346e10d elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"346e10d\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-69af9fb\" data-id=\"69af9fb\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-598fd55 elementor-widget elementor-widget-text-editor\" data-id=\"598fd55\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>We want to join two relations R and S (unsorted). R is smaller than S<\/p><p>|R| = Number of pages on disk of R, ||R|| = Number of tuples of R, M = Number of memory pages available (we assume that 3 memory pages are allocated for I\/O<\/p><p>Sort-merge join (SMJ) algorithm<\/p><table><tbody><tr><td width=\"205\"><p><strong>Condition on memory<\/strong><\/p><\/td><td width=\"205\"><p><strong>I\/O cost<\/strong><\/p><\/td><\/tr><tr><td width=\"205\"><p>M &gt; |R|+|S|<\/p><\/td><td width=\"205\"><p>Cost = |R|+|S|<\/p><\/td><\/tr><tr><td width=\"205\"><p>|R|+|S| &gt; M &gt; sqr(|R|+|S|)<\/p><\/td><td width=\"205\"><p>Cost = 3|R|+3|S|<\/p><\/td><\/tr><\/tbody><\/table><p>Grace hash join (GHJ) algorithm<\/p><table><tbody><tr><td width=\"205\"><p><strong>Condition on memory<\/strong><\/p><\/td><td width=\"205\"><p><strong>I\/O cost<\/strong><\/p><\/td><\/tr><tr><td width=\"205\"><p>M &gt; |R|<\/p><\/td><td width=\"205\"><p>Cost = |R|+|S|<\/p><\/td><\/tr><tr><td width=\"205\"><p>M &gt; sqr(|R|)<\/p><\/td><td width=\"205\"><p>Cost = 3|R|+3|S|<\/p><\/td><\/tr><\/tbody><\/table>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-86d3bc1 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"86d3bc1\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-11b5850\" data-id=\"11b5850\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-fad34c4 elementor-widget-divider--view-line elementor-widget elementor-widget-divider\" data-id=\"fad34c4\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"divider.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-divider\">\n\t\t\t<span class=\"elementor-divider-separator\">\n\t\t\t\t\t\t<\/span>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-1f99c9d elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"1f99c9d\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-0cf4697\" data-id=\"0cf4697\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-9a8743f elementor-widget elementor-widget-heading\" data-id=\"9a8743f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Exercice-1\"><\/span>Exercise 1<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-df7e282 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"df7e282\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-04f72dd\" data-id=\"04f72dd\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-ac19626 elementor-widget elementor-widget-text-editor\" data-id=\"ac19626\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p><strong>Question 1<\/strong><\/p><p>Based on the algorithms seen in class, find the cost formulas as well as the limit conditions on the memory (assuming that M does not include 2 or 3 buffers for I\/O).<\/p><p><strong>Question 2<\/strong><\/p><p>Propose an execution strategy for SMJ and GHJ when the memory is smaller than sqr(|R|+|S|) (SMJ) or sqr(|R|) (GHJ)<\/p><p><strong>Question 3<\/strong><\/p><p>Assuming that to read n consecutive pages from disk takes 10ms + n*0.2ms, compare Block Nested Loop Join (BNLJ), GHJ and SMJ. For the SMJ and the GHJ, we place ourselves in the case where we have a memory smaller than |R|+|S| and that |R| respectively.<\/p><p><strong>Question 4<\/strong><\/p><p>Based on the Hybrid Hash Join algorithm seen in class, propose an adaptive hashing execution strategy allowing you to take advantage of the maximum available memory but <strong>without knowing a priori the size of R <\/strong>(for simplicity, we will assume that M &gt; sqr(|R|) and M&lt;|R|). Evaluate the I\/O cost for M = R\/2, M = R\/3.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-9ac751d elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"9ac751d\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-b7e897e\" data-id=\"b7e897e\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-54ce3e6 elementor-widget elementor-widget-toggle\" data-id=\"54ce3e6\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"toggle.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-toggle\">\n\t\t\t\t\t\t\t<div class=\"elementor-toggle-item\">\n\t\t\t\t\t<div id=\"elementor-tab-title-8891\" class=\"elementor-tab-title\" data-tab=\"1\" role=\"button\" aria-controls=\"elementor-tab-content-8891\" aria-expanded=\"false\">\n\t\t\t\t\t\t\t\t\t\t\t\t<span class=\"elementor-toggle-icon elementor-toggle-icon-left\" aria-hidden=\"true\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<span class=\"elementor-toggle-icon-closed\"><i class=\"fas fa-caret-right\"><\/i><\/span>\n\t\t\t\t\t\t\t\t<span class=\"elementor-toggle-icon-opened\"><i class=\"elementor-toggle-icon-opened fas fa-caret-up\"><\/i><\/span>\n\t\t\t\t\t\t\t\t\t\t\t\t\t<\/span>\n\t\t\t\t\t\t\t\t\t\t\t\t<a class=\"elementor-toggle-title\" tabindex=\"0\">Solution<\/a>\n\t\t\t\t\t<\/div>\n\n\t\t\t\t\t<div id=\"elementor-tab-content-8891\" class=\"elementor-tab-content elementor-clearfix\" data-tab=\"1\" role=\"region\" aria-labelledby=\"elementor-tab-title-8891\"><p><strong>Question 1\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/strong>\u00a0\u00a0\u00a0<\/p><p>Based on the algorithms seen in class, find the cost formulas as well as the limit conditions on the memory (assuming that M does not include 2 or 3 buffers for I\/O).<\/p><p>SMJ<\/p><ul><li>M&gt; |R|+|S| : Reading R, sorting R in memory on the join attribute, same for S in memory, merge sorted pages in memory. We therefore have |R|+|S| input output discs to read the 2 relationships. (NB: the writing of the result is never included in the operators&#039; costs).<\/li><li>|R|+|S| &gt; M &gt; sqr(|R|+|S|): The two relations do not hold in memory. The algorithm is then as follows. Read R into memory in packets of M pages and sort the packet in memory on the join attribute. This packet is written on disk (called a bucket in English). The process is repeated |R|\/M times until all R has been consumed and sorted.\u00a0<\/li><\/ul><p><span style=\"color: var( --e-global-color-text ); font-family: var( --e-global-typography-text-font-family ), Sans-serif; font-weight: var( --e-global-typography-text-font-weight ); font-size: 1.125rem;\">We obtain |R|\/M packets of sorted tuples of R. The number of inputs outputs is therefore here 2|R|, to read R and write the sorted packets of R. The same process is applied to S, corresponding to 2|S| entries exits. We then obtain |R|\/M+|S|\/M packets of size M. We must then join the tuples. To do this, we carry out the merge phase. We need at least one memory page per packet. The first page of each packet is then recalled in memory. Joining tuples are produced in the result.\u00a0<\/span><\/p><p><span style=\"color: var( --e-global-color-text ); font-family: var( --e-global-typography-text-font-family ), Sans-serif; font-weight: var( --e-global-typography-text-font-weight ); font-size: 1.125rem;\">When we know that a page of a package contains tuples that can no longer join, we replace it with the next page of the package from which it comes. Thus, each page of each packet is read, generating |R|+|S| entries exits. The number of buckets cannot exceed M if we want to be able to merge. The memory required to perform this processing is at least one page per packet. So it is necessary that M &gt; |R|\/M+|S|\/M hence the result of the table.<\/span><\/p><p><img fetchpriority=\"high\" decoding=\"async\" class=\"alignnone wp-image-20103 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/bdd52.png\" alt=\"relational operators\" width=\"630\" height=\"383\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/bdd52.png 630w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/bdd52-300x182.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/bdd52-18x12.png 18w\" sizes=\"(max-width: 630px) 100vw, 630px\" \/><\/p><p>GHJ:<\/p><ul><li>M&gt; |R|+|S| : Reading R into memory, hashing R, reading and hashing S into memory, joining corresponding hash packets into memory. We therefore have |R|+|S| input outputs.<\/li><li>M&gt; sqr(|R|): The relation R does not hold in memory. The algorithm is then as follows. Reading R into memory in packets of M and manufacturing N packets of tuples of R hashed on the join attribute (N is the number of hash values produced by the hash function), until having consumed all of R .<\/li><\/ul><p>The number of buckets of R cannot exceed M to have at least one memory page for each entry in the hash table. The number of inputs and outputs is therefore here 2|R|. The same process is applied to S, generating 2|S| entries exits. To join the tuples, we must read an entire hashed packet of R into memory, and pass the corresponding hashed packet of S page by page. To hold in memory, the hashed packet of R must be less than M. We have at more M packets, and on average |R|\/M pages per packet. The memory required to carry out this processing is therefore |R|\/M &lt; M hence the result of the table.<\/p><p><strong>Question 2\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/strong>\u00a0\u00a0\u00a0\u00a0<\/p><p>Propose an execution strategy for SMJ and GHJ when the memory is smaller than sqr(|R|+|S|) (SMJ) or sqr(|R|) (GHJ)<\/p><p>SMJ: we can do several levels of runs. We then increase the size of the packets and reduce their number. This can be done by merging several packets of R (we make larger sorted packets with the tuples of the initial packets) and of S by sorting. We then obtain 3|R|+3|S| I\/O (1 run level), 5R|+5|S| (2 levels of runs), 7|R|+7|S| (3 levels of runs).<\/p><p><img decoding=\"async\" class=\"alignnone wp-image-20104 size-full\" src=\"http:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/bdd53.png\" alt=\"relational operators\" width=\"520\" height=\"357\" title=\"\" srcset=\"https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/bdd53.png 520w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/bdd53-300x206.png 300w, https:\/\/complex-systems-ai.com\/wp-content\/uploads\/2024\/02\/bdd53-18x12.png 18w\" sizes=\"(max-width: 520px) 100vw, 520px\" \/><\/p><p>GHJ: We hash more finely (another hashing function) the R packets whose size is greater than that of the memory. (Pb: should we also rehash S?)<\/p><p><strong>Question 3\u00a0\u00a0<\/strong>\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0<\/p><p>Assuming that to read n consecutive pages from disk takes 10ms + n*0.2ms, compare Block Nested Loop Join (BNLJ), GHJ and SMJ. For the SMJ and the GHJ, we place ourselves in the case where we have a memory smaller than |R|+|S| and that |R| respectively.<\/p><p>BNLJ: Reads a block of R, then passes all S page by page. Do this R\/M times. We therefore obtain: |R|\/M (10 + M x 0.2ms) + |R|\/M (10ms + |S| x 0.2ms)<\/p><p>SMJ: We obtain 2 x |R|\/M (10ms + M x 0.2ms) + 2 x |S|\/M (10ms + M x 0.2ms) + (|R|+|S|) (10ms + 0.2ms) = sequential read\/write by blocks of R and S + random read of pages of R and S<\/p><p>GHJ: We obtain 2 x (|R|+|S|) (10ms + 0.2ms) + |R|\/M (10ms + M x 0.2ms) + |R|\/M (10ms + |S| * M \/ |R| x 0.2ms) = random read\/write of pages of R and S + sequential block reading of R and S<\/p><p>Ref: \u201cSeeking the truth about adhoc joins\u201d<\/p><p><strong>Question 4<\/strong><\/p><p>Based on the Hybrid Hash Join algorithm seen in class, propose an adaptive hashing execution strategy allowing you to take advantage of the maximum available memory but <strong>without knowing a priori the size of R <\/strong>(for simplicity, we will assume that M &gt; sqr(|R|) and M&lt;|R|). Evaluate the I\/O cost for M = R\/2, M = R\/3.<\/p><p>We can decide to start hashing R in memory as if everything was going to go well (M &gt; |R|), then as soon as we run out of memory, we put certain hash packets on disk (starting from the end) . These hash values will now be found in the same packet. Then we read S and join with the packets of R found in memory. Tuples of S that join with a packet of R on disk are written to disk as a hash packet. Finally we reread the R and S packets on disk to produce the missing results.<\/p><p>With M=|R|\/2, we write half of |R|. At the end, half of the hash packets remain in memory, and on disks the other packets containing half of the tuples of R (|R|\/2 inputs outputs). We pass S against the hash table page by page (|S| disk inputs and outputs). We produce the results corresponding to the tuples of S joining with the packets of R in memory and we write the tuples of S to disk (in the form of hash packets).<span style=\"color: var( --e-global-color-text ); font-family: var( --e-global-typography-text-font-family ), Sans-serif; font-weight: var( --e-global-typography-text-font-weight ); font-size: 1.125rem;\"> corresponding to R packages on disk.<\/span><\/p><p><span style=\"color: var( --e-global-color-text ); font-family: var( --e-global-typography-text-font-family ), Sans-serif; font-weight: var( --e-global-typography-text-font-weight ); font-size: 1.125rem;\">This generates |S|\/2 inputs outputs. Finally, we reread the packets of R and S on disks, which generates |R|\/2+|S|\/2 inputs and outputs. In total, there are therefore 2 (|R|+|S|) inputs and outputs.<\/span><\/p><p>With M=|R|\/3, we obtain |R| + 2\/3 |R| + |S| + 2\/3|S| + 2\/3 |R| + 2\/3|S| = 7\/3 (|R|+|S|) inputs outputs.<\/p><\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-35a9dfa elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"35a9dfa\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-525afc6\" data-id=\"525afc6\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-868c60f elementor-widget-divider--view-line elementor-widget elementor-widget-divider\" data-id=\"868c60f\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"divider.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-divider\">\n\t\t\t<span class=\"elementor-divider-separator\">\n\t\t\t\t\t\t<\/span>\n\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-c29a84b elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c29a84b\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-d9a2400\" data-id=\"d9a2400\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-57e059e elementor-widget elementor-widget-heading\" data-id=\"57e059e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"heading.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t<h2 class=\"elementor-heading-title elementor-size-default\"><span class=\"ez-toc-section\" id=\"Exercice-2\"><\/span>Exercise 2<span class=\"ez-toc-section-end\"><\/span><\/h2>\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-c466e7d elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"c466e7d\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-d3a1c2d\" data-id=\"d3a1c2d\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-29caca7 elementor-widget elementor-widget-text-editor\" data-id=\"29caca7\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>We are now in an extreme case where the available memory is only a few bytes, on the other hand, the data is stored on an electronic memory, the reading cost is therefore similar to reading from memory.<\/p><p><strong>Question 5<\/strong><\/p><p>Find algorithms and express them with the iterator model formalism to perform the following operations: Selection, projection, join.<\/p><p>Specify the minimum memory required.<\/p><p><strong>Question 6<\/strong><\/p><p>Can we carry out sorting or aggregate calculations? If yes, how ?<\/p><p><strong>Question 7<\/strong><\/p><p>You now have a few KB of RAM. How to optimize joins?<\/p><p><strong>Question 8<\/strong><\/p><p>And sorting or calculating aggregates?<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-834d026 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"834d026\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-49ca56a\" data-id=\"49ca56a\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-48667b3 elementor-widget elementor-widget-toggle\" data-id=\"48667b3\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"toggle.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t<div class=\"elementor-toggle\">\n\t\t\t\t\t\t\t<div class=\"elementor-toggle-item\">\n\t\t\t\t\t<div id=\"elementor-tab-title-7591\" class=\"elementor-tab-title\" data-tab=\"1\" role=\"button\" aria-controls=\"elementor-tab-content-7591\" aria-expanded=\"false\">\n\t\t\t\t\t\t\t\t\t\t\t\t<span class=\"elementor-toggle-icon elementor-toggle-icon-left\" aria-hidden=\"true\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<span class=\"elementor-toggle-icon-closed\"><i class=\"fas fa-caret-right\"><\/i><\/span>\n\t\t\t\t\t\t\t\t<span class=\"elementor-toggle-icon-opened\"><i class=\"elementor-toggle-icon-opened fas fa-caret-up\"><\/i><\/span>\n\t\t\t\t\t\t\t\t\t\t\t\t\t<\/span>\n\t\t\t\t\t\t\t\t\t\t\t\t<a class=\"elementor-toggle-title\" tabindex=\"0\">Solution<\/a>\n\t\t\t\t\t<\/div>\n\n\t\t\t\t\t<div id=\"elementor-tab-content-7591\" class=\"elementor-tab-content elementor-clearfix\" data-tab=\"1\" role=\"region\" aria-labelledby=\"elementor-tab-title-7591\"><p>We are now in an extreme case where the available memory is only a few bytes, on the other hand, the data is stored on an electronic memory, the reading cost is therefore similar to reading from memory.<\/p><p><strong>Question 5<\/strong><\/p><p>Find algorithms and express them with the iterator model formalism to perform the following operations: Selection, projection, join.<\/p><p>Specify the minimum memory required.<\/p><p>The selection and projection operators work tuple by tuple, so they do not consume RAM. The join can be done by nested loop, there is no memory consumption (or 1 pointer per relation).<\/p><p><strong>Question 6<\/strong><\/p><p>Can we carry out sorting or aggregate calculations? If yes, how ?<\/p><p>We can for example completely scan the operator&#039;s input to find the tuple sharing the smallest sort key, then redo a complete scan of the input to deliver all the tuples sharing this key, and so on until having produced all the tuples. We can therefore sort tuples in this way.<\/p><p>To calculate aggregates, we proceed in the same way by calculating the aggregate for the tuples sharing the key.<\/p><p><strong>Question 7<\/strong><\/p><p>You now have a few KB of RAM. How to optimize joins?<\/p><p>We make a nested loop block\u2026<\/p><p><strong>Question 8<\/strong><\/p><p>And sorting or calculating aggregates?<\/p><p>We buffer several sorting or aggregate values on each pass on the data. Each pass thus makes it possible to process several sort or aggregate keys.<\/p><\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>","protected":false},"excerpt":{"rendered":"<p>Software Analysis Home page Wiki Corrected exercises on relational operators This page presents use cases for relational operators via corrected exercises. Reminder \u2026 <\/p>","protected":false},"author":1,"featured_media":0,"parent":4609,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-20097","page","type-page","status-publish","hentry"],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages\/20097","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/comments?post=20097"}],"version-history":[{"count":7,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages\/20097\/revisions"}],"predecessor-version":[{"id":20507,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages\/20097\/revisions\/20507"}],"up":[{"embeddable":true,"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/pages\/4609"}],"wp:attachment":[{"href":"https:\/\/complex-systems-ai.com\/en\/wp-json\/wp\/v2\/media?parent=20097"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}