Kubernetes has been everywhere lately, especially in the context of MLOps. I gave it a try by creating a web app with Python Flask.

Running Flask frontend and backend in Kubernetes

Kubernetes has been everywhere lately, especially in the context of MLOps, where it is used to manage the plethora of different tasks such as training, serving and registering models.

I found an undocumented product_id parameter in the Pipedrive API to attach products to deals. The issue has been reported to the Pipedrive dev team.

An undocumented product_id parameter in Pipedrive API to attach products to deals

I found a “bug” in the Pipedrive API documentation while exploring a customer case together with Pipedrive partner SaaShop.

Google Colab, Databricks Community Edition, Visual Studio Code and Docker are some options for creating a free data science workspace.

Free data science workspaces

I have written multiple blog posts about machine learning (ML) engineering and machine learning platforms. Those systems usually aim to productionize ML solutions, are fairly big investments and focus on managing the whole ML lifecycle.

Comparing the major machine learning platforms AWS SageMaker, Azure Machine Learning, Google Vertex AI and Databricks.

Comparison of machine learning platforms in major clouds

This blog post compares machine learning platforms from the major cloud providers Azure, AWS and Google Cloud. The Databricks platform is also included.

Machine learning in predictive maintenance. The two-part blog series provides insights for cost savings and an example script in Python.

Machine learning in predictive maintenance

Predictive maintenance aims to repair equipment before a failure actually happens. Scheduled maintenance minimizes production downtime, especially in industrial companies.

Experiences from DataCamp online training. Structured data science courses are easy to organize for yourself or a team.

DataCamp - Learn data science online

DataCamp is an online learning platform for data science. The data science course catalog contains a wide selection of Python, R, SQL and Excel videos and assignments.

The article goes through the PySpark execution logic and provides guidelines to optimize the speed and performance.

PySpark execution logic and code optimization

Last fall I wrote about the PySpark framework on my previous employer’s blog. As the name indicates, the topic is extremely technical.

Clustering time series data with SQL - Nice 3D visualization using simple logic. Python notebook example in GitHub with industrial data.

Clustering data using SQL - An example with industrial IoT data

Clustering time series data with SQL. The purpose of this experiment was to prove that doing data science doesn’t always require fancy tools.

A tutorial for parallel computation with Spark and Python. The example was run on the AWS cloud computing platform.

Spark + Python tutorial for data developers

Go to the Spark + Python tutorial in AWS Glue on Solita’s data blog. Spark and parallel computing: a shop cashier can only serve a limited number of customers at a given time.

I wrote a post on Solita's blog about text analytics with the headline "Finnish stemming and lemmatization in python". The post has code examples.

Finnish stemming and lemmatization in python

I wrote a post on Solita’s Data blog about text analytics with the headline Finnish stemming and lemmatization in python. Read the post here.

Investing when a depression is coming. Is it a bad moment to start investing in stock indexes that are usually seen as safe bets?

Should you start investing if a depression is coming? - Data analysis

This is a summary of the original Finnish blog article. Data analysis result: invest only the money you don’t need at the moment. The purpose of the stock market analysis was to answer this question: is it a good time to start regular investing if a depression is coming?

Python code to automatically list header fields of multiple CSV files. The original use case was related to data warehouse documentation.

CSV headers to list using Python

A data warehouse project required documentation for incoming CSV files. The intent was to list all header fields of tens of CSV files, grouped by file name.
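The idea can be sketched roughly as follows. This is a minimal illustration and not the code from the original post: the function name `list_csv_headers`, the folder name `incoming_csv`, the comma default delimiter and the assumption that the first row of each file is the header are all hypothetical choices made for the example.

```python
import csv
from pathlib import Path


def list_csv_headers(folder, delimiter=","):
    """Return {file name: [header fields]} for every .csv file in folder.

    Assumes the first row of each file is the header row.
    """
    headers = {}
    for path in sorted(Path(folder).glob("*.csv")):
        # utf-8-sig drops a byte order mark if the file starts with one
        with path.open(newline="", encoding="utf-8-sig") as f:
            reader = csv.reader(f, delimiter=delimiter)
            # An empty file has no first row, so fall back to an empty list
            headers[path.name] = next(reader, [])
    return headers


if __name__ == "__main__":
    # "incoming_csv" is a hypothetical folder name used for illustration
    for file_name, fields in list_csv_headers("incoming_csv").items():
        print(file_name)
        for field in fields:
            print(f"  {field}")
```

Only the first row of each file is read, so the listing stays fast even when the files themselves are large. If the files use a semicolon separator, pass `delimiter=";"`.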

The Python-based Django web framework offers a great platform for creating data-oriented web applications of any size.

Django tutorial - For data oriented web developers

Django is a web framework for the Python programming language, which in practice means a well-designed folder structure and pre-made class modules for the most common functionalities of a web service.