PredictEasy!

Classification


Last updated 1 year ago


Definition: Classification is a type of predictive modeling that assigns each observation to one of a predefined set of classes or categories. A classification model is trained on historical data whose classes are already known and is then used to predict the class of new, unseen observations.

Examples: Email spam detection, sentiment analysis, and disease diagnosis (e.g., classifying whether or not a patient has a particular disease based on symptoms).

Steps:

1. Select Independent Columns (X):

  • Identify and choose the independent columns in your dataset.

  • These columns, often referred to as features or predictors, are the variables used to predict the dependent variable (Y).

2. Select Dependent Column (Y):

  • Identify the dependent variable or target variable (Y) that you aim to predict.

  • This column represents the output or the variable to be predicted based on the other independent variables (X).

3. Cross-Validation:

  • Choose the number of folds for cross-validation. Cross-validation is a resampling technique that estimates how well a predictive model will generalize to data it was not trained on.

  • The most common method is k-fold cross-validation: the dataset is divided into k subsets (folds), the model is trained on k-1 folds and tested on the remaining fold, and this is repeated k times so that each fold serves as the test set exactly once. The k scores are then typically averaged.
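
The three steps above can be sketched in plain Python. This is only an illustration of the general technique, not PredictEasy's internal implementation: the function names (`split_features_target`, `k_fold_indices`, `cross_validate`), the toy spam dataset, and the 1-nearest-neighbour classifier are all assumptions made up for this example.

```python
from statistics import mean


def split_features_target(rows, feature_cols, target_col):
    """Steps 1 and 2: pick the independent columns (X) and the dependent column (Y)."""
    X = [[row[c] for c in feature_cols] for row in rows]
    y = [row[target_col] for row in rows]
    return X, y


def k_fold_indices(n_samples, k):
    """Step 3: yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def predict_1nn(X_train, y_train, x):
    """Toy classifier (an assumption): label of the closest training point."""
    dists = [sum((a - b) ** 2 for a, b in zip(xt, x)) for xt in X_train]
    return y_train[dists.index(min(dists))]


def cross_validate(X, y, k):
    """Train on k-1 folds, test on the held-out fold, and repeat k times."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        correct = sum(predict_1nn(X_train, y_train, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


# Hypothetical data: number of links and exclamation marks per email.
rows = [
    {"links": 0, "exclaims": 0, "label": "ham"},
    {"links": 5, "exclaims": 7, "label": "spam"},
    {"links": 1, "exclaims": 0, "label": "ham"},
    {"links": 6, "exclaims": 8, "label": "spam"},
    {"links": 0, "exclaims": 1, "label": "ham"},
    {"links": 7, "exclaims": 6, "label": "spam"},
]
X, y = split_features_target(rows, ["links", "exclaims"], "label")
fold_scores = cross_validate(X, y, k=3)
print("accuracy per fold:", fold_scores)
print("mean accuracy:", mean(fold_scores))
```

The folds here are taken in row order to keep the sketch short. On real data, rows are usually shuffled (and often stratified by class) before splitting so that every fold contains a representative mix of classes.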

Reports

Summary Page

Simulator Overview:

Actionable Insights:
