# Classification

**Definition:** Classification is a type of predictive modeling that assigns each observation to one of a predefined set of classes or categories. A model trained on historical, labeled data is used to predict the class of new observations.

**Examples:** Email spam detection, sentiment analysis, and disease diagnosis (for example, classifying whether a patient has a particular disease based on symptoms).

**Steps:**

**1. Select Independent Columns (X):**

* Identify and choose the independent columns in your dataset.
* These columns, often referred to as features or predictors, are the variables that will be used to predict the dependent variable (Y).

**2. Select Dependent Column (Y):**

* Identify the dependent variable, also called the target variable (Y), that you aim to predict.
* This column holds the class label that the model predicts from the independent variables (X).
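Steps 1 and 2 amount to separating a dataset into a feature matrix and a target column. The following is a minimal sketch in plain Python, not the product's own implementation; the toy rows, the column names (`word_count`, `has_link`, `label`), and the helper `split_features_target` are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy dataset: each dict is one observation (e.g., one email).
rows = [
    {"word_count": 120, "has_link": 1, "label": "spam"},
    {"word_count": 45, "has_link": 0, "label": "not_spam"},
    {"word_count": 300, "has_link": 1, "label": "spam"},
]


def split_features_target(rows, feature_columns, target_column):
    """Return (X, y): X is a list of feature rows, y is the list of class labels."""
    X = [[row[col] for col in feature_columns] for row in rows]
    y = [row[target_column] for row in rows]
    return X, y


# Independent columns (X) and dependent column (Y) are chosen by name.
X, y = split_features_target(rows, ["word_count", "has_link"], "label")
print(X)
print(y)
```

Keeping the target column out of the feature list matters: if Y is also included in X, the model can trivially "predict" the label it was given.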

**3. Cross-Validation:**

* Choose the number of folds for cross-validation. Cross-validation is a resampling technique used to assess how well the results of a predictive model will generalize to an independent dataset.
* A common method is k-fold cross-validation, where the dataset is divided into k subsets, or folds. The model is trained on k-1 folds and tested on the remaining fold, and this is repeated k times so that each fold serves as the test set exactly once.
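The k-fold procedure described above can be sketched as follows. This is an illustrative plain-Python version under simplifying assumptions, not the product's internal logic: the helper name `k_fold_indices` is hypothetical, and rows are split in their original order, whereas in practice the data is often shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]                    # held-out fold
        train = indices[:start] + indices[start + size:]      # remaining k-1 folds
        yield train, test
        start += size


# Example: 10 observations split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), len(test))
```

Each observation appears in exactly one test fold, so averaging a metric such as accuracy over the k test folds uses every row for evaluation once.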

### **Reports**

**Summary Page**

<figure><img src="/files/iH2hfoNs2jYZWPsJdVNB" alt=""><figcaption></figcaption></figure>

**Simulator Overview:**

<figure><img src="/files/6iDXFiRAiK2sSdv9WcHy" alt=""><figcaption></figcaption></figure>

**Actionable Insights:**

<figure><img src="/files/YJW3ZVIjghN4t5euIInH" alt=""><figcaption></figcaption></figure>

