🔹Project name: HOPERO – Churn prediction
🔹Project period: April 2025 – June 2025
🔹Partner: PredictiveDataScience
The main task was to create a prototype of a churn prediction model for KROS services. The goal was to identify customers who are at higher risk of ending their subscription, so that retention activities can be targeted earlier and based on data rather than only on manual assumptions.
How we approached it
The prototype focuses on two core product areas. It combines historical transaction data with the available product usage information, such as document activity and system usage. Churn is defined by the service validity period: a customer is considered churned when their last paid service period has already ended. The model is trained only on information that would have been available before the prediction point, which avoids data leakage and brings the evaluation closer to a real future prediction scenario.
The challenge was not only to train a classifier, but also to prepare the data in a consistent and reliable way. The data were normalized and processed into unified time intervals. Customers with too little transaction history were filtered out, and the feature windows were designed to use recent behavior before the prediction date rather than information from after the customer had already churned.
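The labelling rule and the leakage-aware window described above can be illustrated with a minimal sketch. It is not the project code: the record layout, the function names `label_churn` and `count_events_in_window`, the 90-day window, and the sample data are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Hypothetical sample records (not real KROS data):
# (customer_id, end date of a paid service period)
service_periods = [
    ("A", date(2024, 12, 31)),
    ("A", date(2025, 12, 31)),
    ("B", date(2024, 6, 30)),
]
# (customer_id, date of a usage event, e.g. a document being created)
usage_events = [
    ("A", date(2025, 2, 10)),
    ("A", date(2025, 3, 20)),
    ("B", date(2024, 5, 1)),
    ("B", date(2024, 9, 1)),
]


def label_churn(periods, as_of):
    """Label a customer as churned if their last paid period ended before `as_of`."""
    last_end = {}
    for customer_id, period_end in periods:
        if customer_id not in last_end or period_end > last_end[customer_id]:
            last_end[customer_id] = period_end
    return {customer_id: end < as_of for customer_id, end in last_end.items()}


def count_events_in_window(events, prediction_point, window_days):
    """Count events in [prediction_point - window_days, prediction_point).

    Events on or after the prediction point are ignored, so the feature
    never sees information from the future.
    """
    window_start = prediction_point - timedelta(days=window_days)
    counts = {}
    for customer_id, event_date in events:
        if window_start <= event_date < prediction_point:
            counts[customer_id] = counts.get(customer_id, 0) + 1
    return counts


labels = label_churn(service_periods, as_of=date(2025, 4, 1))
recent_activity = count_events_in_window(usage_events, date(2024, 6, 1), window_days=90)
```

With a prediction point of 1 June 2024, customer B's event from September 2024 falls outside the window and does not contribute to the feature, which is the behavior the leakage-aware design aims for.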
The prototype was designed as a complete pipeline from database export to model evaluation and prediction output. The solution focused on several key areas:
🔹Data export and preparation: Scripts were prepared to restore the KROS database snapshot in Docker, export the required tables to CSV, and merge larger datasets from usage tables into a format suitable for modelling.
🔹Churn label creation: Customers were labelled as churned or active based on the end date of their service period, using transaction history as the source of truth.
🔹Leakage-aware feature engineering: Features were calculated only from a fixed historical window before the prediction point. They include metadata from the available instances as well as behavioral data on system usage.
🔹Model training and evaluation: A separate Random Forest classifier was trained for each product-specific dataset. The evaluation included standard classification metrics, ROC AUC, a confusion matrix, feature importance analysis, and visual performance reports.
🔹Prediction outputs: The pipeline saves trained model artifacts and generates per-customer churn probabilities, making the result usable for follow-up analysis or retention prioritization.
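To make two of the evaluation metrics named above more concrete, the following sketch computes ROC AUC and a confusion matrix in plain Python. It is an illustration under stated assumptions, not the project's evaluation code: the function names and the sample labels and scores are hypothetical, and in practice a library such as scikit-learn would typically be used instead.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the share of (churned, active) pairs ranked correctly.

    A pair counts as 1 if the churned customer has the higher score,
    0.5 if the scores are tied, and 0 otherwise.
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def confusion_matrix(y_true, y_score, threshold):
    """Count true/false positives and negatives at a given decision threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for y, s in zip(y_true, y_score):
        predicted_churn = s >= threshold
        if predicted_churn and y == 1:
            counts["tp"] += 1
        elif predicted_churn and y == 0:
            counts["fp"] += 1
        elif not predicted_churn and y == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1
    return counts


# Hypothetical labels (1 = churned) and predicted churn probabilities.
y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.6, 0.7, 0.2, 0.4, 0.1]

auc = roc_auc(y_true, y_score)                           # 7 of 9 pairs ranked correctly
matrix = confusion_matrix(y_true, y_score, threshold=0.5)
```

ROC AUC does not depend on any single threshold, while the confusion matrix does, which is why the two are usually reported together.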

“We enjoyed the collaboration, which showed how KROS transaction and usage data can be transformed into practical churn-risk indicators for future customer-retention activities.”
JAKUB KOPÁL
Research Engineer
KInIT
What we delivered
The collaboration resulted in a working churn prediction prototype for KROS customers. The delivered code prepares data from a database snapshot, trains and evaluates product-specific models, saves reusable model artifacts, and exports customer-level churn probabilities. The evaluated model achieved a ROC AUC score of 0.80. This makes the prototype a useful early-warning tool, with the decision threshold adjustable according to KROS retention priorities. The solution provides KROS with a practical foundation for customer-retention use cases and further model development.
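How an adjustable decision threshold turns churn probabilities into a retention list can be shown with a short sketch. The function name `flag_at_risk`, the customer identifiers, and the probability values are hypothetical and serve only as an illustration.

```python
def flag_at_risk(churn_probabilities, threshold):
    """Return (customer_id, probability) pairs at or above the threshold,
    ordered from highest to lowest churn risk."""
    flagged = [
        (customer_id, probability)
        for customer_id, probability in churn_probabilities.items()
        if probability >= threshold
    ]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical per-customer churn probabilities from a trained model.
probabilities = {"A": 0.82, "B": 0.35, "C": 0.61, "D": 0.12}

strict_list = flag_at_risk(probabilities, threshold=0.6)  # fewer, higher-risk customers
broad_list = flag_at_risk(probabilities, threshold=0.3)   # more customers, lower precision
```

A higher threshold yields a shorter list suited to a small, targeted campaign, while a lower threshold reaches more customers at the cost of contacting some who would not have churned.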

“Our collaboration with KInIT demonstrated how to unlock real practical value from standard transactional data. Having evaluated the first round of our pilot phase, the model successfully identified around 250 high-churn-risk customers. We then ran a targeted campaign on a sample of 24 of them, which resulted in 5 conversions into revenue. Given the nature of these customers and the fact that no other communication was planned, it is highly likely that without this predictive model, they wouldn’t have been engaged in time, leading to unavoidable churn. These results clearly highlight the practical value of a predictive approach and its potential to drive measurable business impact.”
Ing. MONIKA MOŽJEŠÍKOVÁ
Data Analyst
KROS
