Dataflow for ML Engineers in Google Cloud
The Dataflow product in Google Cloud is essential for advanced data processing pipelines in machine learning solutions. It handles typical data engineering tasks and lets the same code run in both batch and streaming mode.
Machine learning products in Google Cloud
This is a summary of the Google Cloud Platform (GCP) products relevant to the Machine Learning Engineer role. Google's philosophy seems to be that moving to their platform should require minimal changes to the existing solution.
Comparison of machine learning platforms in major clouds
A comparison of the major machine learning platforms: AWS SageMaker, Azure Machine Learning, Google Vertex AI and Databricks.
Spark + Python tutorial for data developers
A tutorial on parallel computation with Spark and Python. The example was run on the AWS cloud computing platform.
Introduction to AWS Glue for big data ETL
The AWS Glue service works especially well for big data batch processing. Read the full post at data.solita.fi.
Maximizing uptime in Hiab hackathon
Read about the Solita team's solution in a hackathon organized by Hiab. The task was to use data to maximize machine uptime.
Data science and business intelligence - Definitions
It's easy to spot hype terms like data science and big data on LinkedIn or exhibition posters. I have summarized the definitions.