Clusterization

The technique of finding similarities among data points and grouping similar entities together

Hours total

1350

Technical Stack

PCA, Decision trees, Recurrent models, Probabilistic programming, VGG, Word2vec

Services Involved

Dedicated Team

Have a similar project?

OVERVIEW

Clustering is a machine learning technique for finding similarities among data points and grouping similar entities together. It is often performed during exploratory data analysis to better understand the structure of a dataset, or as a preliminary step for more complex models.

Challenge

To identify and match heterogeneous data from various sources in different formats.

Solution

To implement a highly parallel algorithm with embedded RNN, CNN and DNN architectures for various types of media. Different similarity metrics were defined based on the DTW (dynamic time warping) path and on Euclidean and cosine distances. Bloom filters were applied to obtain the final results.
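For readers unfamiliar with the building blocks named in the solution, here is a minimal sketch of the three distance measures and of a Bloom filter in plain Python. It is illustrative only: the function names, the `BloomFilter` class, its default sizes and the SHA-256 based hashing are assumptions made for this example, and the case study does not describe how the project combined these pieces or which implementations it used.

```python
import hashlib
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dtw_distance(s, t):
    """Classic dynamic time warping cost between two 1-D sequences."""
    n, m = len(s), len(t)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of s[:i] with t[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(s[i - 1] - t[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # s advances, t repeats
                cost[i][j - 1],      # t advances, s repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


class BloomFilter:
    """Hypothetical minimal Bloom filter: no false negatives, rare false positives."""

    def __init__(self, size=1024, hashes=3):
        self.size = size
        self.hashes = hashes
        self.bits = bytearray(size)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function (an assumption).
        for seed in range(self.hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))
```

DTW tolerates sequences of different lengths or speeds, which is why it is often paired with fixed-length measures such as Euclidean and cosine distance. A Bloom filter can cheaply answer "has this item possibly been seen before?", at the cost of occasional false positives.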

Results

A two-step parallel algorithm that performs fast clusterization of the given data with a very high confidence score and is capable of speeding up data processing by a factor of 10.

Recent cases

Our latest challenges where technology meets creativity

Anomaly Detection

The technique to identify unusual patterns that do not conform to expected behavior

Healthcare Web Solution

Engineering the flow of healthcare

Handbrake

A few clicks is all it takes

Contact us

If you have a question or a request, or just want to meet up for coffee, call or email us, or fill out the form and we will contact you as soon as possible.
