2023

Finland postal code data including boundary coordinates

Free data for all postal codes in Finland, enriched with area boundaries as standard coordinates.

How to copy and paste text in Datalore terminal?

Datalore is an online data science environment. The usual CTRL+C and CTRL+V shortcuts do not work in the Datalore terminal, so here is the solution.

List of business intelligence tools

List of business intelligence, reporting and data pipeline tools.

Change Python version in Vertex AI

Vertex AI is a bare-bones analytics environment in Google Cloud.

Google Colab - Easily accessible Python workspace

Google Colab is a low-barrier option for running Python scripts.

Datalore tech review

Datalore is a collaborative data science platform. The notebook experience has been taken to the next level.

Datalore pricing - Which licensing model to choose?

The online Python workspace Datalore has three main models for pricing and licensing.

Wealth management web app - Technical implementation

Here is a presentation of the wealth management app I have developed.

Go web server in Docker deployed to Google Cloud Run

I created a web server in Go and deployed it to Google Cloud Run inside a Docker container.

Visualize postal code areas of Finland in Python

Reading and visualizing Finland’s postal code data on a map in Python.

Visualizing Finland's postal codes in a Filled map in Google Looker Studio

Visualize Finland’s postal code areas on a map in Google Looker Studio.

Shape Map visualization of Finland's postal codes in Power BI Desktop

With these instructions, you can visualize Finland’s postal code areas on a map in Power BI Desktop.

Public report in Looker Studio requires login - Instructions to solve

Looker Studio is a free reporting and business intelligence tool in Google Cloud.

Types of data science platforms - Workspace, MLOps or full stack?

Data science platforms can be categorized into a few different buckets:

What kind of teams benefit from data science platforms?

Various kinds of teams from business innovation to academic research can benefit from data science platforms.

Value of data science platforms

Let’s go through the most typical use cases for a data science platform and their benefits.

List of data science platforms

Comprehensive list of data science platforms. Also known as machine learning platforms, AI platforms or DSML platforms.

ClearML - Robust MLOps platform for end-to-end solutions

Robust MLOps platform for end-to-end solutions.

cnvrg.io - Flexible Kubernetes deployments for advanced data science teams

For technically advanced teams looking for flexible Kubernetes deployments.

SmartPredict - Specific use cases with fully managed low code

Specific use cases with fully managed low code.

neptune.ai - Experiment tracking platform for MLOps

Log experiments and ML model versions from any environment.

Paperspace Gradient - ML platform with their own data centers and IPU processors

The Paperspace Gradient machine learning platform is best known for its extensive GPU support.

Saturn Cloud - Data science workspace with Dask cluster

Saturn Cloud is a great choice for data science teams who want to maximize the flexibility of their environment.

Datalore - Introduction to the advanced analytics platform

Datalore is a fairly recent online platform for advanced data analytics.

Bodo is a faster alternative to Spark for running massive ETL jobs in Python

Bodo is a platform for data processing with Python and SQL.

Vertex AI User-Managed notebooks auto shutdown

User-Managed notebooks in Vertex AI are virtual workspaces for data exploration.

30 questions for Google Cloud Professional Machine Learning Engineer exam

Around 30 questions I remember from the Google Cloud Professional Machine Learning Engineer certification exam.

I became a certified Google Cloud Professional Machine Learning Engineer!

After 4 months of intense studying I passed the Google Cloud certification for Professional Machine Learning Engineer!

MLOps in Google Cloud

Google Cloud Platform has an excellent toolset to operationalize and productionize machine learning models.

Neural networks for natural language processing

Natural Language Processing (NLP) refers to tools and methods to explore text data as well as to identify patterns and make predictions.

Neural networks for image recognition

Some notes about image recognition while preparing for Google Cloud MLE certification.

Keras for basic neural networks

Keras is one of the high-level APIs in the Tensorflow deep learning stack.

Tensorflow Extended (TFX) for MLOps

Tensorflow Extended (known as TFX) is a framework to define ML pipelines.

Tensorflow for ML Engineers

I have more experience with the Pandas and Scikit-Learn Python libraries than with Tensorflow.

Are cloud certificates beneficial?

Cloud certificates are proof of a developer’s competence. But are they beneficial in practice?

Recommendation systems in Google Cloud

Recommendation systems are useful for personalizing the user experience and finding relevant items in huge catalogs.

Introduction to neural networks for ML

I first heard about artificial neural networks around 2017. Since then I have tried to understand their behavior and explain them in a simple way.

Dataflow for ML Engineers in Google Cloud

The Dataflow product in Google Cloud is mandatory for advanced data processing pipelines in machine learning solutions.

2022

Machine learning fundamentals

Notes about fundamental ML concepts for the Google Cloud ML Engineering certification.

Vertex AI for ML Engineers in Google Cloud

Some Google materials refer to Vertex AI as fully managed Tensorflow.

BigQuery for ML Engineers in Google Cloud

BigQuery is by far the most important storage and processing service in Google Cloud from an ML perspective.

Machine learning products in Google Cloud

This is a summary of Google Cloud Platform (GCP) products relevant to the Machine Learning Engineer role.

Google Cloud ML Engineer certification - Training and preparing

I am preparing for the Google Cloud Professional Machine Learning Engineer certification.

Pirsch - Google Analytics alternative for static website

Setting up analytics tracking for a web page is simple. Just copy a couple of lines of code from the selected analytics service and you are good to go.

EmailLabs - EU based service for 9 000 free transactional emails per month

During my website migration it became evident that I would also need a new email service.

Migration from Wordpress to Hugo

Migrations are in most cases performed poorly. The reason is that companies treat their IT ecosystem as a commodity rather than an asset.

Hugo SEO

Yoast SEO is admittedly a nice plugin in Wordpress. I thought a lot about how to replace it in Hugo.

Image optimization in Hugo

The text content of all my 300+ blog posts, including the metadata, is less than 2 MB in size.

Backend for Hugo website

Hugo framework generates HTML files that load really fast. It does not have.

Hugo vs Wordpress - Performance, price and plugins

I moved my website from Wordpress to Hugo in August 2022.

My website has been renewed!

The website migration of mikaelahonen.com is complete! Here is the background on why I migrated my Wordpress site to the Hugo framework, which generates static websites.

Running Flask frontend and backend in Kubernetes

Kubernetes has been everywhere lately, especially in the context of MLOps. I gave it a try by creating a web app with Python Flask.

An undocumented product_id parameter in Pipedrive API to attach products to deals

I found an undocumented product_id parameter in the Pipedrive API to attach products to deals. The issue has been reported to the Pipedrive dev team.

My Pipedrive partner SaaShop nominated as the EMEA Partner of 2021

SaaShop nominated as the Pipedrive EMEA Partner of 2021. I announced my co-operation with SaaShop recently.

My Pipedrive business has been acquired by saashop.com

My part-time company has signed a contract with the Finnish company SaaShop to move all my existing Pipedrive customers under their business.

2021

Free data science workspaces

Google Colab, Databricks Community Edition, Visual Studio Code and Docker are some options to create a free data science workspace.

Comparison of machine learning platforms in major clouds

Comparing the major machine learning platforms AWS SageMaker, Azure Machine Learning, Google Vertex AI and Databricks.

What is a machine learning platform?

What is a machine learning platform? Introducing different components such as workbench, MLOps tools and cloud computation.

2020

Machine learning in predictive maintenance

Machine learning in predictive maintenance. The two-part blog series provides insights for cost savings and an example script in Python.

Faking your geographical location to a web service - A hobby project

How to fool a web service about your actual location? In an experiment I pretended to be in Ireland while traveling in Sweden.

Difference between data scientist and data engineer roles

In my opinion the big difference is that a data scientist focuses more on business problems while a data engineer solves technical problems.

DataCamp - Learn data science online

Experiences from DataCamp online training. Structured data science courses are easy to organize for yourself or a team.

Databox for enterprise reporting - Review and demo

Databox enables elegant reports in a SaaS interface to be shared and published both internally and outside the organization.

Mikael Ahonen becomes a Pipedrive Premier Partner

I became a Pipedrive Premier Partner. My company provides CRM consulting in English and Finnish.

PySpark execution logic and code optimization

The article goes through the PySpark execution logic and provides guidelines to optimize the speed and performance.

Pipedrive Essential vs Advanced - Comparison of features and pricing in 2020

Pipedrive Essential and Advanced are by far the most popular plans among my customers. A comparison of different subscriptions.

2019

Clustering data using SQL - An example with industrial IoT data

Clustering time series data with SQL - Nice 3D visualization using simple logic. Python notebook example in GitHub with industrial data.

Spark + Python tutorial for data developers

A tutorial for parallel computation with Spark and Python. The example was run on the AWS cloud computing platform.

Introduction to AWS Glue for big data ETL

AWS Glue service works especially well for big data batch processing. Read the full post from data.solita.fi.

Excel Power Map - Spatial data visualization as a time series

Excel Power Map is designed to visualize spatial data. Watch the demo video about visualizing annual asylum seeker data.

Finnish stemming and lemmatization in python

I wrote to Solita's blog about text analytics with the headline "Finnish stemming and lemmatization in python". The post has code examples.

Migrating CRM to Pipedrive - Experiences and tips

Experiences from a Pipedrive CRM implementation. Pipedrive is simple for salespeople, but the migration from an old CRM requires planning.

Extended free trial period for Pipedrive CRM

Simple instructions to sign up and get started with Pipedrive CRM. Redeem the extended trial period for Pipedrive.

Experiences from funding application classification by text analytics

Experiences from funding application classification by text analytics

2018

Combining machine learning and business - Practical example

I give an example of a machine learning use case in a format that should also be understandable to less technical people.

2017

Maximizing uptime in Hiab hackathon

Read about Solita team's solution in a hackathon organized by Hiab. The task was to take advantage of data to maximize machine uptime.

Unpivot columns to rows with Excel PowerQuery

Unpivoting columns to rows with Excel PowerQuery. Watch a 30-second video on how to do it without any formulas.

CSV headers to list using Python

Python code to automatically list header fields of multiple CSV files. The original use case was related to data warehouse documentation.

Parsing first name, last name and company from email in Excel - Download Excel template

Parsing first name, last name and company from email in Excel. Do you have a list of emails that you want to split by first name, last name and company domain?

2016

The Free Excel Course Online

The free Excel course. Compact video lectures with exercise materials for training. The lectures have an optimal order for learning.

Lecture 1 - Introduction to Excel

Introduction to Microsoft Excel. What are workbooks and worksheets? How to add and delete worksheets? Where to save and open Excel files?

Lecture 2 - Formatting Cells in Excel

Learn how formatting cells in Excel works. Change font color, autofit column width, merge cells and use bold text and bordered cells.

Lecture 3 - The Basics of Excel Formulas

In the Excel formula basics lecture we'll first mix numbers and text to create generic sentences and then we use SUM and IF.

Lecture 4 - Copy and paste in Excel

It's possible to paste only specific components of your data such as formulas, values or formatting. Make your workbook great again.

Lecture 5 - Number formats, dates and time in Excel

Number formats allow you to preserve the original cell value while displaying it as a percentage, currency, date or time.

Lecture 6 - Cell references and named ranges in Excel

Excel cell references are an important part of formulas. Learn the difference between absolute and relative references.

Lecture 7 - Tables, sorting and filtering in Excel

In Excel's table object you can sort, filter and summarize data effortlessly as well as select the formatting that fits your needs.

Lecture 8 - String functions and advanced formulas in Excel

Excel has a set of powerful text manipulation formulas. In this lecture you will learn to apply SUBSTITUTE(), MID(), FIND() and LEN().

Creating my first Android app - How long does it take?

Creating an Android app has been on my task list for a good while, but now I decided to take the pivotal step. How long would it take?

Visualization and clustering of earthquake dataset

The built-in quakes dataset in RStudio had 1000 records of earthquakes near Fiji.

Virus problem - A statistical puzzle

The problem: A virus is spreading across the world - it kills without treatment. Your task is to solve a statistical puzzle.

Introduction to Excel Power tools

A quick introduction to the Excel Power BI add-in family: Power Query, Power Pivot, Power Map and Power View.

Data science and business intelligence - Definitions

It's easy to spot hype terms like data science and big data on LinkedIn or on exhibition posters. I summarized the definitions.

Django tutorial - For data oriented web developers

The Python-based Django web framework offers a great platform for creating data-oriented web applications of any size.