Experiences from DataCamp online training. Structured data science courses are easy to organize for yourself or a team.
The article goes through the PySpark execution logic and provides guidelines to optimize the speed and performance.
Clustering time series data with SQL – Nice 3D visualization using simple logic. Python notebook example in GitHub with industrial data.
A tutorial for parallel computation with Spark and Python. The example has been ran on AWS cloud computing platform.
I wrote to Solita’s blog about text analytics with the headline “Finnish stemming and lemmatization in python”. The post has code examples.
Investing when a depression is coming. Is it bad moment to start investing to stock index that are usually seen as sage bets?
Python code to automatically list header fields of multiple CSV files. The original use case was related to data warehouse documentation.
Django is a web framework for Python programming language which in practise means well designed folder structure and pre-made class modules for most common functionalities in […]