How to find market segments using machine learning

In the previous applications of machine learning in pricing, we looked at algorithms that require labeled datasets, that is, datasets that include an output variable. But there are also situations where you want to group information without having such a variable.

MACHINE LEARNING, SEGMENTATION, ARTIFICIAL INTELLIGENCE

10/20/2025 · 2 min read

In these cases, unsupervised classification algorithms are used. One of the most common is cluster analysis, a technique that groups data into subsets or segments that share similar characteristics with each other and differ from the elements of other groups.

Clustering to identify market segments
Cluster analysis is frequently applied in pricing to identify market segments. Each segment groups customers who weight product or service attributes in a similar way to one another and in a different way from customers in other segments.

For example, a car rental company conducted a conjoint analysis with 200 potential customers. This analysis made it possible to understand the relative importance each respondent assigned to six service attributes, in addition to price.

Then, the pricing manager used a statistical application to find three different groups in the data. The algorithm delivered the average importance of each attribute for the identified segments.

In the first group, the most valued attribute was damage and theft coverage. This segment was named “Cautious.” The second group turned out to be the most price-sensitive, so it was labeled “Savers.” The third segment prioritized the company’s brand and was labeled “Trust-seekers.”
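No code is needed to follow this example, but for readers curious about what a clustering tool does behind the scenes, here is a minimal sketch. It assumes k-means, a common clustering algorithm that the case above does not name, and runs it on synthetic data: 200 made-up importance profiles, not the car rental company's survey. The attribute names attr_5 to attr_7 and the archetype numbers are placeholders invented for illustration.

```python
import random
from math import dist

# Hypothetical attribute names: the article names only price, damage and theft
# coverage, brand, and included mileage; the last three are placeholders.
ATTRIBUTES = ["price", "damage_theft_coverage", "brand", "included_mileage",
              "attr_5", "attr_6", "attr_7"]

# Synthetic archetypes (importance in percent, each row sums to 100) and the
# number of respondents generated around each one. These are invented numbers.
ARCHETYPES = [
    ([15, 35, 10, 15, 8, 8, 9], 70),   # coverage-driven
    ([40, 12, 8, 16, 8, 8, 8], 70),    # price-driven
    ([15, 12, 35, 14, 8, 8, 8], 60),   # brand-driven
]

def make_respondents(seed=0):
    """Generate 200 noisy importance profiles that each sum to 100."""
    rng = random.Random(seed)
    rows = []
    for base, count in ARCHETYPES:
        for _ in range(count):
            noisy = [max(1.0, b + rng.gauss(0, 2)) for b in base]
            total = sum(noisy)
            rows.append([100 * v / total for v in noisy])
    return rows

def kmeans(points, k, iterations=50):
    """Basic k-means; returns one label per point and the k centroids."""
    # Farthest-point start (an assumption made here for reproducibility):
    # begin with the first point, then repeatedly add the point that is
    # farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))

    def nearest(p):
        return min(range(k), key=lambda j: dist(p, centroids[j]))

    for _ in range(iterations):
        labels = [nearest(p) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # The centroid is the average importance of each attribute.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return [nearest(p) for p in points], centroids

respondents = make_respondents()
labels, centroids = kmeans(respondents, k=3)

for j, centroid in enumerate(centroids):
    top = ATTRIBUTES[centroid.index(max(centroid))]
    print(f"Segment {j + 1}: {labels.count(j)} respondents, top attribute = {top}")
    for name, value in zip(ATTRIBUTES, centroid):
        print(f"    {name:<22} {value:5.1f}")
```

Each centroid is the average importance profile of one segment, which is the kind of output the pricing manager used to name the groups. Naming the segments remains a human judgment; the algorithm only returns the numbers.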

Based on these findings, the pricing manager designed three different subscription plans. He used the two most relevant functional attributes, damage and theft coverage on the one hand and included mileage on the other, to build a price menu targeted at each segment.

How many segments should the model identify?
The number of clusters to identify is not decided by the algorithm but by the researcher. This input parameter is essential, as it directly affects the usefulness of the results.

When the algorithm is asked to find too many segments, the groups tend to resemble each other. In many cases, the extra clusters end up being intermediate points between two clearly differentiated segments. For example, a fourth group might simply be a blend of “Savers” and “Trust-seekers,” adding no new insight. These redundant segments should be discarded, for example by rerunning the analysis with one fewer cluster.
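One rough way to spot such a blend is to check whether a segment's average profile sits close to the midpoint between two other segments. The sketch below is a simple heuristic, not a standard test: the profile numbers, the "Segment 4" label, the function name blended_segments, and the 15% tolerance are all assumptions chosen for illustration.

```python
from itertools import combinations
from math import dist

# Hypothetical average importances in percent, in the order:
# price, damage and theft coverage, brand, included mileage, all other attributes.
profiles = {
    "Cautious":      [15, 35, 10, 15, 25],
    "Savers":        [40, 12, 8, 16, 24],
    "Trust-seekers": [15, 12, 35, 14, 24],
    "Segment 4":     [28, 12, 21, 15, 24],
}

def blended_segments(profiles, tolerance=0.15):
    """Return (segment, parent_a, parent_b) triples where the segment's profile
    lies within `tolerance` times the parents' distance of their midpoint."""
    flagged = []
    for name, profile in profiles.items():
        others = [n for n in profiles if n != name]
        for a, b in combinations(others, 2):
            midpoint = [(x + y) / 2 for x, y in zip(profiles[a], profiles[b])]
            if dist(profile, midpoint) <= tolerance * dist(profiles[a], profiles[b]):
                flagged.append((name, a, b))
    return flagged

flagged = blended_segments(profiles)
for name, a, b in flagged:
    print(f"{name} looks like a blend of {a} and {b}")
```

With these made-up numbers, only "Segment 4" is flagged, as a blend of "Savers" and "Trust-seekers". The check only catches roughly even blends, so it is a prompt for a closer look rather than a verdict.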

Conversely, dividing the market into only two groups may oversimplify its diversity and defeat the purpose of segmentation.

Therefore, a practical recommendation when performing cluster analysis in pricing is to look for three to four segments. This range offers a balance between simplicity and informative richness.
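A common complement to this rule of thumb is to run the clustering for several segment counts and watch how much the within-cluster spread falls each time; when an extra segment barely reduces it, the segment is probably redundant. The sketch below illustrates the idea under the same assumptions as before: k-means as the algorithm, synthetic data built around three invented archetypes, and a hypothetical helper named kmeans_inertia.

```python
import random
from math import dist

def kmeans_inertia(points, k, iterations=50):
    """Run a basic k-means and return the within-cluster sum of squared distances."""
    centroids = [points[0]]
    while len(centroids) < k:  # farthest-point start, chosen for reproducibility
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda j: dist(p, centroids[j]))].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [[sum(col) / len(g) for col in zip(*g)] if g else c
                     for g, c in zip(groups, centroids)]
    return sum(min(dist(p, c) for c in centroids) ** 2 for p in points)

# Synthetic importances for price, coverage, and brand around three invented archetypes.
rng = random.Random(1)
bases = [[40, 12, 8], [15, 35, 10], [15, 12, 35]]
points = [[b + rng.gauss(0, 2) for b in base] for base in bases for _ in range(60)]

inertia = {k: kmeans_inertia(points, k) for k in range(1, 6)}
for k in range(1, 6):
    note = "" if k == 1 else f"  (drop vs k={k - 1}: {inertia[k - 1] - inertia[k]:,.0f})"
    print(f"k={k}: within-cluster spread = {inertia[k]:,.0f}{note}")
```

Because the synthetic data contains three groups by construction, the spread falls sharply up to three segments and only marginally afterwards. Real survey data rarely shows such a clean break, so the result is best read alongside the business interpretation of each segment.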

Conclusion
Cluster analysis is a powerful tool for segmenting markets without the need for output variables. It makes it possible to identify patterns in customer behavior and to design differentiated pricing strategies for each group.

What’s most interesting is that, like the other machine learning models discussed, cluster analysis requires no programming. You only need a sufficiently large, high-quality dataset, and the tool does the rest.

However, what happens when you don’t have a robust database? Is it possible to apply artificial intelligence in pricing without reliable historical data?

This will be addressed in the next stage, where we will explore the use of expert systems as an alternative when data is not sufficient.