## Data analysis

Data analysis is the process of inspecting, cleaning, transforming, and modeling data with the goal of uncovering useful information, informing conclusions, and supporting decision-making. It has many facets and approaches, encompasses various techniques under a variety of names, and is used in different fields of business, science, and social science. In today's business world, data analysis helps organizations base decisions on evidence and operate more efficiently.
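The four activities named above (inspecting, cleaning, transforming, modeling) can be sketched on a toy dataset. This is a minimal illustration using only the Python standard library; the records, the variable names, and the choice to drop incomplete rows are all assumptions made for the example, not a prescribed method.

```python
import statistics

# Hypothetical raw records: (advertising spend, sales); None marks a missing value.
raw = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, None), (5.0, 9.8)]

# Inspect: count records with at least one missing field.
n_missing = sum(1 for x, y in raw if x is None or y is None)

# Clean: keep only complete records (dropping rows is one option among several).
clean = [(x, y) for x, y in raw if x is not None and y is not None]

# Transform: standardize the predictor to mean 0 and standard deviation 1.
xs = [x for x, _ in clean]
ys = [y for _, y in clean]
x_mean, x_std = statistics.mean(xs), statistics.stdev(xs)
zs = [(x - x_mean) / x_std for x in xs]

# Model: ordinary least-squares line y = intercept + slope * z.
z_mean, y_mean = statistics.mean(zs), statistics.mean(ys)
slope = (
    sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys))
    / sum((z - z_mean) ** 2 for z in zs)
)
intercept = y_mean - slope * z_mean

print(f"missing rows: {n_missing}, kept rows: {len(clean)}")
print(f"fitted line: y = {intercept:.2f} + {slope:.2f} * z")
```

Because the predictor is standardized, the intercept equals the mean of the observed `ys`, and the slope reads as the expected change in sales per one standard deviation of spend.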

In statistical applications, data analysis can be divided into descriptive statistics, exploratory data analysis (EDA), and confirmatory data analysis (CDA). EDA focuses on discovering new features in the data, while CDA focuses on confirming or falsifying existing hypotheses. Predictive analytics focuses on the application of statistical models for prediction or predictive classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a kind of unstructured data. All of the above are varieties of data analysis.
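The difference between describing, exploring, and confirming can be made concrete with a short sketch. The data below are hypothetical, the function names are illustrative, and the permutation test is only one of many possible confirmatory procedures, chosen here because it needs nothing beyond the Python standard library.

```python
import random
import statistics

# Hypothetical data: task completion times (seconds) for two page designs.
group_a = [12.1, 11.8, 13.0, 12.5, 11.9, 12.7, 12.3, 12.9]
group_b = [11.2, 11.5, 10.9, 11.8, 11.1, 11.6, 11.0, 11.4]


def describe(sample):
    """Descriptive statistics: summarize the sample that was observed."""
    return {
        "n": len(sample),
        "mean": statistics.mean(sample),
        "median": statistics.median(sample),
        "stdev": statistics.stdev(sample),
    }


def explore(a, b):
    """Exploratory step: look for a pattern worth testing (here, a gap in means)."""
    return statistics.mean(a) - statistics.mean(b)


def permutation_p_value(a, b, n_resamples=2000, seed=0):
    """Confirmatory step: two-sided permutation test of 'no difference in means'."""
    rng = random.Random(seed)
    observed = abs(explore(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(
            statistics.mean(pooled[: len(a)]) - statistics.mean(pooled[len(a):])
        )
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


print(describe(group_a))
print(describe(group_b))
print("difference in means:", explore(group_a, group_b))
print("permutation p-value:", permutation_p_value(group_a, group_b))
```

In practice the hypothesis tested in the confirmatory step should be checked on data other than the data that suggested it; the same sample is reused here only to keep the example short.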

Data integration is a precursor to data analysis, and data analysis is closely related to data visualization and data dissemination.

Analysis is the breaking of a whole into its separate components for individual examination. Data analysis is a process of obtaining raw data and then converting it into useful information for decision making by users. Data is collected and analyzed to answer questions, test hypotheses or disprove theories.

Statistician John Tukey defined data analysis in 1961 as:

“Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data.”

Several phases can be distinguished. The phases are iterative, in that feedback from later phases may lead to additional work in earlier phases.