Finland postal code data including boundary coordinates

Free data for all postal codes in Finland, enriched with area boundaries as standard coordinates.

How to copy and paste text in the Datalore terminal?

Datalore is an online data science environment. The usual CTRL+C and CTRL+V shortcuts do not work in the Datalore terminal, so here is the solution.

Change Python version in Vertex AI

Vertex AI is a bare-bones analytics environment in Google Cloud. Even a task as simple as changing the Python version requires multiple steps.

Google Colab - Easily accessible Python workspace

Google Colab is a low-barrier option for running Python scripts. Here is a brief introduction to its essential features.

Datalore tech review

Datalore is a collaborative data science platform. The notebook experience has been taken to the next level.

Datalore pricing - Which licensing model to choose?

The online Python workspace Datalore has three main models for pricing and licensing. At the same time, they provide a logical path towards building your company’s data analysis ecosystem.

Wealth management web app - Technical implementation

Here is a presentation of the wealth management app I have developed.

Go web server in Docker deployed to Google Cloud Run

I created a web server in Go and deployed it to Google Cloud Run inside a Docker container.

Visualize postal code areas of Finland in Python

Reading and visualizing Finland’s postal code data on a map in Python. Python has many great packages for working with geospatial data, such as geopandas.

cnvrg.io - Flexible Kubernetes deployments for advanced data science teams

For technically advanced teams looking for flexible Kubernetes deployments.

neptune.ai - Experiment tracking platform for MLOps

Log experiments and ML model versions from any environment.

Paperspace Gradient - ML platform with their own data centers and IPU processors

The Paperspace Gradient machine learning platform is best known for its extensive GPU support. They have recently partnered with Graphcore to provide new-generation processors.

Saturn Cloud - Data science workspace with Dask cluster

Saturn Cloud is a great choice for data science teams who want to maximize the flexibility of their environment. Integrated parallel processing with Dask differentiates it from its competitors. With open source tools, teams can design workflows that best fit their specific needs.

Datalore - Introduction to the advanced analytics platform

Datalore is a fairly recent online platform for advanced data analytics.

Bodo is a faster alternative to Spark for running massive ETL jobs in Python

Bodo is a platform for data processing with Python and SQL. It is especially suitable for large datasets thanks to its unique parallel processing technology.

Keras for basic neural networks

Keras is one of the high-level APIs in the TensorFlow deep learning stack. It is the recommended framework for getting started with neural networks if you do not have special requirements.

TensorFlow for ML Engineers

I have more experience with the Pandas and Scikit-Learn Python libraries than with TensorFlow. I was surprised by how large the TensorFlow ecosystem is with its ML engineering extensions.

EmailLabs - EU-based service for 9 000 free transactional emails per month

During my website migration it became evident that I would also need a new email service. Previously, email hosting was integrated into the cPanel of my web hosting provider.

Running Flask frontend and backend in Kubernetes

Kubernetes has been everywhere lately, especially in the context of MLOps. I gave it a try by creating a web app with Python Flask.

An undocumented product_id parameter in the Pipedrive API to attach products to deals

I found an undocumented product_id parameter in the Pipedrive API for attaching products to deals. The issue has been reported to the Pipedrive dev team.

Free data science workspaces

Google Colab, Databricks Community Edition, Visual Studio Code and Docker are some options for creating a free data science workspace.

Comparison of machine learning platforms in major clouds

Comparing the major machine learning platforms AWS SageMaker, Azure Machine Learning, Google Vertex AI and Databricks.

Machine learning in predictive maintenance

Machine learning in predictive maintenance. The two-part blog series provides insights for cost savings and an example script in Python.

DataCamp - Learn data science online

Experiences from DataCamp online training. Structured data science courses are easy to organize for yourself or a team.

PySpark execution logic and code optimization

The article goes through the PySpark execution logic and provides guidelines to optimize the speed and performance.

Clustering data using SQL - An example with industrial IoT data

Clustering time series data with SQL - Nice 3D visualization using simple logic. Python notebook example in GitHub with industrial data.

Spark + Python tutorial for data developers

A tutorial for parallel computation with Spark and Python. The example was run on the AWS cloud computing platform.

Finnish stemming and lemmatization in python

I wrote a post on Solita's blog about text analytics, titled "Finnish stemming and lemmatization in python". The post has code examples.

Should you start investing if a depression is coming? - Data analysis

Investing when a depression is coming. Is it a bad moment to start investing in stock indexes that are usually seen as safe bets?

CSV headers to list using Python

Python code to automatically list header fields of multiple CSV files. The original use case was related to data warehouse documentation.
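
As a rough illustration of the idea (not the code from the original post), the header row of each CSV file in a folder could be collected like this. The function name `list_csv_headers`, the folder argument and the comma delimiter are assumptions made for this sketch.

```python
import csv
from pathlib import Path


def list_csv_headers(folder, pattern="*.csv", delimiter=","):
    """Return {file name: [header fields]} for each matching CSV file.

    Hypothetical helper for illustration; assumes the first row of
    every file is the header row.
    """
    headers = {}
    for path in sorted(Path(folder).glob(pattern)):
        # utf-8-sig strips a byte-order mark if the file has one
        with path.open(newline="", encoding="utf-8-sig") as f:
            reader = csv.reader(f, delimiter=delimiter)
            # An empty file yields an empty header list
            headers[path.name] = next(reader, [])
    return headers
```

Calling `list_csv_headers("data")` on a folder of exports returns a dictionary that can be pasted into documentation or written out as a table.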

Django tutorial - For data-oriented web developers

The Python-based Django web framework offers a great platform for creating data-oriented web applications of any size.