Understanding SKU (Stock Keeping Unit) Clustering/Segmentation


What Is SKU Clustering/Segmentation?

An SKU (stock keeping unit) is a unique alphanumeric code that allows products to be tracked. Retailers use SKUs to differentiate products and manage inventory levels.

The number of SKUs in a company can range from around 100 to millions, depending on the service provided. As the number of SKUs increases, managing them becomes very complex, and strategic decisions about products end up being made on the basis of the overall average. A decision based on the overall average always misses some important outcome. This trade-off can be expressed mathematically: how much detail and cost must be incurred to seize the missed opportunity?
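A small sketch of how an overall average can hide what is happening at SKU level. The SKU names, sales figures, and the threshold of 100 units are all made up for illustration; they do not come from a real dataset.

```python
from statistics import mean

# Hypothetical weekly unit sales per SKU (illustrative values only).
weekly_sales = {
    "SKU-A": 500, "SKU-B": 480, "SKU-C": 520,   # fast movers
    "SKU-D": 12, "SKU-E": 8, "SKU-F": 10,       # slow movers
}

overall_average = mean(weekly_sales.values())

# A hand-picked threshold splits the SKUs into two groups for this sketch.
fast = [v for v in weekly_sales.values() if v >= 100]
slow = [v for v in weekly_sales.values() if v < 100]

print(f"overall average: {overall_average:.0f}")
print(f"fast movers:     {mean(fast):.0f}")
print(f"slow movers:     {mean(slow):.0f}")
```

No SKU in this toy data sells anywhere near the overall average, so a stock decision based on that single number would fit neither group.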

The chart below shows the cost function against the level of detail in parameter management for SKUs. With an SKU clustering/segmentation solution, you can manage strategically at the optimum level of grouping and maximize your profitability. Sectors that especially need SKU clustering/segmentation solutions include retail, gaming, insurance, e-commerce, and banking.


How To Use SKU Segmentation/Clustering Results?

Clustering/segmentation is done to reduce complexity in process management. The expected outputs of this reduced complexity are a cost advantage and increased profit.

The areas where SKU clustering/segmentation results can be used directly across different sectors can be summarized as follows.

  • Inventory management
  • Forecasting / Ideal Stock Level Calculation
  • Advertising and Marketing
  • Campaign Management


The following image shows a simple grouping of results for customers and products.


What Is The Working Principle Of Clustering/Segmentation?

Clustering models follow different approaches and methods, and these approaches may perform differently on different datasets. Therefore, a dataset should not be tied to a single method; several methods should be tried and compared.

In general, there are four types of clustering algorithms:

  • Centroid-Based Algorithms
  • Density-Based Clustering
  • Distribution-Based Clustering
  • Hierarchical Clustering
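To make the first family concrete, here is a minimal sketch of a centroid-based method: plain k-means (Lloyd's algorithm), written with the Python standard library only. The function name `k_means`, the two features, and the six sample SKUs are assumptions made for illustration, not the method used by any particular product.

```python
import random
from math import dist

def k_means(points, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # start from k randomly chosen points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:                    # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members)
                                     for axis in zip(*members))
    return labels, centroids

# Hypothetical SKUs described by (weekly units sold, unit price).
skus = [(500, 2.0), (480, 2.5), (520, 1.8), (12, 40.0), (8, 45.0), (10, 42.0)]
labels, centroids = k_means(skus, k=2)
print(labels)
print(centroids)
```

The features here are left unscaled, so the units-sold column dominates the distance. On real data, features are usually standardized first, and a library implementation is generally a better choice than a hand-written loop.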


An example of centroid-based clustering  


An example of density-based clustering


Gaussian distribution graph


An example of distribution-based clustering


How Should Clustering/Segmentation Results Be Interpreted?

Clustering/segmentation is an unsupervised learning technique. In this technique, the labels and implications of the data are not known in advance. It aims to group data points by their meaningful common aspects. Because there are no true labels, the quality of the result is assessed with metrics that measure how meaningful the resulting groups are. These metrics can be summarized as follows:

  • Silhouette Score
  • Calinski-Harabasz Score
  • Davies-Bouldin Score
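As an illustration of the first metric, the sketch below computes the mean silhouette coefficient from its definition using only the standard library. For each point, `a` is the mean distance to the other members of its own cluster, `b` is the mean distance to the nearest other cluster, and the score is `(b - a) / max(a, b)`. The helper name `silhouette` and the sample data are assumptions for illustration; the sketch expects at least two clusters and points that are not all identical.

```python
from math import dist
from statistics import mean

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)

    scores = []
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)   # convention: a singleton cluster scores 0
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's distance to itself is 0, so it adds nothing to the sum)
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(mean(dist(p, q) for q in members)
                for other, members in clusters.items() if other != l)
        scores.append((b - a) / max(a, b))
    return mean(scores)

# Hypothetical SKUs described by (weekly units sold, unit price).
points = [(500, 2.0), (480, 2.5), (520, 1.8), (12, 40.0), (8, 45.0), (10, 42.0)]
good = [0, 0, 0, 1, 1, 1]   # fast movers vs. slow movers
bad = [0, 1, 0, 1, 0, 1]    # groups that mix the two

print(f"well-separated labels: {silhouette(points, good):.3f}")
print(f"mixed-up labels:       {silhouette(points, bad):.3f}")
```

The silhouette score lies between -1 and 1, and higher is better. For the other two metrics, a higher Calinski-Harabasz score and a lower Davies-Bouldin score indicate better-separated clusters. In practice, scikit-learn provides all three in `sklearn.metrics`.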


Knowing which variables make the created clusters/segments stand out is as important as the modeling itself. What do the clusters/segments tell us about the processes? How can segment-specific actions be taken? To answer these questions efficiently, it is necessary to see what the clusters are telling us. Therefore, visualizing, summarizing, and explaining the results in an interpretable way becomes the most important issue.

The following methods can be preferred for the interpretation and understanding of segmentation results:

  • Dimensionality reduction
  • Graphic visualization
      - Histogram distribution
      - Pie chart
  • Box Plot
  • Cluster Profiling Table
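The last item, a cluster profiling table, can be sketched as one row per cluster showing its size and the mean of each variable. The records, the column names, and the cluster labels below are hypothetical and only meant to show the shape of such a table.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical SKU records: (cluster label, weekly units sold, unit price).
records = [
    (0, 500, 2.0), (0, 480, 2.5), (0, 520, 1.8),
    (1, 12, 40.0), (1, 8, 45.0), (1, 10, 42.0),
]

by_cluster = defaultdict(list)
for label, units, price in records:
    by_cluster[label].append((units, price))

# One row per cluster: size plus the mean of each variable.
profile = {
    label: {
        "sku_count": len(rows),
        "mean_units": mean(u for u, _ in rows),
        "mean_price": round(mean(p for _, p in rows), 2),
    }
    for label, rows in sorted(by_cluster.items())
}

print(f"{'cluster':>7} {'sku_count':>9} {'mean_units':>10} {'mean_price':>10}")
for label, row in profile.items():
    print(f"{label:>7} {row['sku_count']:>9} "
          f"{row['mean_units']:>10.1f} {row['mean_price']:>10.2f}")
```

Read row by row, the table suggests a label for each segment, for example "high volume, low price" versus "low volume, high price", which is what makes segment-specific actions possible. With pandas, a `groupby` on the label column followed by an aggregation produces the same kind of table.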



General 2022-12-27 13:55:29
