- Principal component analysis
- Factorial correspondence analysis
- Multiple Correspondence Analysis
- Canonical analysis
- Multidimensional scaling
- Hierarchical Multiple Factor Analysis
- Generalized Procrustes Analysis
- Multiple Dual Factor Analysis
- Factor Analysis of Mixed Data
- Iconography of correlations
- Jaccard index
- Dice Index
- Concordance index
- Tanimoto index
- Data cleaning and understanding
- Selection of columns
- Normalize / Standardize / Rescale your Data
- How to handle missing data
- From data normalization to regression
- Pipeline for the classification problem
- Classification and Ensemble Learning
- Classification Performance Indices (Confusion Matrix) Tutorial
- Dimension reduction tutorial
- t-SNE Tutorial
- The right tools for machine learning debugging
Data analysis is a process of inspecting, cleaning, transforming and modeling data with the aim of discovering useful information, informing conclusions and supporting decision making. Data analysis has many facets and approaches, encompassing various techniques under a variety of names, and is used in different fields of business, science, and social science. In today's business world, data analysis plays a role in making decisions more scientific and helping businesses operate more efficiently.
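The four activities named above (inspecting, cleaning, transforming and modeling) can be sketched on a tiny example. This is only a minimal illustration using the Python standard library: the `raw` records, the column meanings (advertising spend and sales) and the helper names `inspect`, `clean`, `transform` and `model` are all hypothetical, and a straight-line fit stands in for whatever model a real analysis would need.

```python
from statistics import mean

# Hypothetical raw records: (advertising spend, sales); None marks a missing value.
raw = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, 7.8), (4.0, 7.8)]

def inspect(rows):
    """Inspecting: count rows, rows with missing values, and exact duplicates."""
    missing = sum(1 for r in rows if None in r)
    duplicates = len(rows) - len(set(rows))
    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}

def clean(rows):
    """Cleaning: drop incomplete rows and exact duplicates, keeping order."""
    seen, out = set(), []
    for r in rows:
        if None in r or r in seen:
            continue
        seen.add(r)
        out.append(r)
    return out

def transform(rows):
    """Transforming: centre both columns on their means."""
    mx = mean(x for x, _ in rows)
    my = mean(y for _, y in rows)
    return [(x - mx, y - my) for x, y in rows], (mx, my)

def model(centred):
    """Modeling: least-squares slope of a line through the centred points."""
    sxy = sum(x * y for x, y in centred)
    sxx = sum(x * x for x, _ in centred)
    return sxy / sxx

report = inspect(raw)
rows = clean(raw)
centred, (mx, my) = transform(rows)
slope = model(centred)
intercept = my - slope * mx

print(report)
print(f"sales ~ {intercept:.2f} + {slope:.2f} * spend")
```

Keeping each activity in its own function makes it easy to return to an earlier step when a later one reveals a problem in the data.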
In statistical applications, data analysis can be divided into descriptive statistics, exploratory data analysis, and confirmatory data analysis. Exploratory data analysis focuses on discovering new features in the data, while confirmatory data analysis focuses on confirming or falsifying existing hypotheses. Predictive analytics focuses on the application of statistical models for forecasting or predictive classification, while text analytics applies statistical, linguistic, and structural techniques to extract and classify information from textual sources, a kind of unstructured data. All of the above are varieties of data analysis.
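The difference between describing, exploring and confirming can be made concrete with a small sketch. The measurements below are made up, and the exact permutation test is just one possible confirmatory procedure, chosen here because it needs only the standard library.

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical measurements for two groups (made-up numbers).
control = [5.1, 4.9, 5.0, 5.2]
treated = [5.8, 6.1, 5.9, 6.0]

# Descriptive statistics: summarise each sample.
for name, group in (("control", control), ("treated", treated)):
    print(f"{name}: mean={mean(group):.2f} sd={stdev(group):.2f}")

# Exploratory step: the observed gap between the groups suggests a hypothesis.
observed = mean(treated) - mean(control)

# Confirmatory step: exact permutation test of the null hypothesis
# "group labels do not matter", enumerating every possible relabelling.
pooled = control + treated
n = len(treated)
total = sum(pooled)
count = extreme = 0
for idx in combinations(range(len(pooled)), n):
    s = sum(pooled[i] for i in idx)
    diff = s / n - (total - s) / (len(pooled) - n)
    count += 1
    # Small tolerance guards against floating-point rounding.
    if abs(diff) >= abs(observed) - 1e-12:
        extreme += 1

p_value = extreme / count
print(f"observed difference={observed:.2f}, two-sided p={p_value:.4f}")
```

The summary statistics only describe the samples; the permutation test asks how often a gap at least as large as the observed one would arise if the labels were assigned at random.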
Data integration is a precursor to data analysis, and data analysis is closely related to data visualization and data dissemination.
Analysis is the breaking of a whole into its separate components for individual examination. Data analysis is a process of obtaining raw data and then converting it into useful information for decision making by users. Data is collected and analyzed to answer questions, test hypotheses or disprove theories.
Statistician John Tukey defined data analysis in 1961 as:
“Procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data.”
Several phases can be distinguished. The phases are iterative, in that feedback from later phases may lead to additional work in earlier phases.