I am preparing for the Google Cloud Professional Machine Learning Engineer certification.
The certification is mandatory in the project I am working on. You will find my feelings about it in the last chapter.
According to the original plan, this post would have been the announcement that I had passed the ML Engineer certification exam. As the exam seems to be more difficult than expected, I will instead go through why the training has taken so long and what I have learned about preparing. One motivation for publishing these thoughts early was to give advice to other members of the customer team where I work as a consultant.
The other episodes in the blog series will introduce how I structured my learning notes.
Learning materials for the ML Engineer certification
I started from the basic introduction to Google Cloud. The course is on the Cloud Skills Boost portal, which seems to cost around $29 a month. I did not need to pay for it myself.
Even though I am relatively experienced with cloud, the introduction provided some new insights and familiarized me with Google specific products. The presentation style was awesome! If you are in a hurry, though, its value for the exam is not significant.
Google provides a nice recap of ML concepts. Many of the learning path materials go through the same ML basics later. Again, if you know the basic modeling techniques and evaluation methods by name, you can prioritize the exam specific learning.
It might also be a good idea to bookmark the ML data preparation materials, and maybe the Google Cloud architecture center.
Google ML best practices is especially important as it directly gives the answers to many exam questions.
Then I moved to the Google Cloud Machine Learning Engineer Learning Path in Cloud Skills Boost. The small quizzes at the end of each section are a convenient way to check your knowledge of the topic. You can even take the quiz first and watch the videos only if you find the questions difficult to answer.
Google has done an excellent job of breaking down the most important ML concepts and teaching them in an understandable way. These are by far the best materials I have seen on these topics.
Many of the materials repeat the basics of a typical ML process. Just skip those and spend your time on the topics you are less familiar with.
Keeping notes for the certification exam topics
I started keeping notes early on about the topics that were new to me.
It is good to keep in mind that the exam will be very detailed. Write down all terms you are not familiar with and learn what they mean.
As an example, you most probably need to know what the TF_CONFIG environment variable is used for. It is not common sense.
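For readers who have not met it: TF_CONFIG is a JSON string that tells each machine in a distributed TensorFlow training job what the cluster looks like and which role that machine plays. The sketch below uses only the Python standard library to show the shape of the value. The host names, ports and the two-worker setup are made-up placeholders, not taken from any real project.

```python
import json
import os

# Hypothetical two-worker cluster; host names and ports are placeholders.
tf_config = {
    "cluster": {
        "worker": ["worker0.example.com:2222", "worker1.example.com:2222"],
    },
    # Every machine sets its own task entry; this one is worker number 0.
    "task": {"type": "worker", "index": 0},
}

# TF_CONFIG has to be a JSON string in the environment, set before the
# training code creates its distribution strategy.
os.environ["TF_CONFIG"] = json.dumps(tf_config)

# Reading it back shows what the training code can learn from it.
parsed = json.loads(os.environ["TF_CONFIG"])
num_workers = len(parsed["cluster"]["worker"])
task = parsed["task"]
print(f"{num_workers} workers, this process is {task['type']} {task['index']}")
```

In a real multi-worker job, a distribution strategy such as tf.distribute.MultiWorkerMirroredStrategy reads this variable, and managed training services typically set it for you.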
Before going to the exam, review the notes carefully.
My notes contain seemingly irrelevant details. Rather than learning material, this blog series is an example of my learning path towards the Google Cloud ML Engineer certification.
The answer is in the details
I recommend trying, for example, the Exam Topics website quite early to see what the certification exam questions look like. It is good training at the same time, even though the community votes for the correct answer are not trustworthy at all.
The questions will be typically in case format:
“You have setup A B C. You want to achieve thing X. What should you do.”
There can be a single word that reveals whether you should focus on the cheapest or perhaps the most accurate solution. It can be extremely challenging to spot the relevant part.
And sometimes no training can prepare you. After finishing all ML Engineer learning path videos in Cloud Skills Boost I tried the 20 sample questions here. I got 6 of them correct. Even after reading the detailed justifications I could not identify the correct answer with full confidence.
For some questions the correct answers felt like opinions rather than truths. Here is an example from the sample question form:
Your organization’s marketing team wants to send biweekly scheduled emails to customers that are expected to spend above a variable threshold. This is the first machine learning (ML) use case for the marketing team, and you have been tasked with the implementation. After setting up a new Google Cloud project, you use Vertex AI Workbench to develop model training and batch inference with an XGBoost model on the transactional data stored in Cloud Storage. You want to automate the end-to-end pipeline that will securely provide the predictions to the marketing team, while minimizing cost and code maintenance. What should you do?
Alternatives and explanations:
A is correct because Vertex AI Pipelines and Cloud Storage are cost-effective and secure solutions. The solution requires the least number of code interactions because the marketing team can update the pipeline and schedule parameters from the Google Cloud console.
B is not correct. Cloud Composer is not a cost-efficient solution for one pipeline because its environment is always active. In addition, using BigQuery is not the most cost-effective solution.
C is not correct because the marketing team would have to enter the Vertex AI Workbench instance to update a pipeline parameter, which does not minimize code interactions.
D is not correct. Cloud Composer is not a cost-efficient solution for one pipeline because its environment is always active. Also, using email to send personally identifiable information (PII) is not a recommended approach.
I chose the correct answer for this one, but could have chosen C as well. I would not let the marketing team touch the Google Cloud console under any circumstances. The other alternatives are arguable as well.
You can find the exam guide here. The question categories are roughly as follows:
- Which Google Cloud products to use
- What ML Engineering methods to use (model selection, loss metric, data split)
- Which TensorFlow configuration to use
- How to ingest and pre-process data
- What is Google best practice for something
- Deploying ML solutions
In the Google Cloud product related questions the easy alternatives like serverless or drag and drop are often good candidates, but certainly not always.
The exam has limited time. You need to be able to answer relatively quickly.
I knew many of the methods in the training materials from earlier projects, but I still needed extra practice to come up with the answer quickly in any situation.
There do not seem to be many trick questions. If a product was mentioned in a question, it usually exists. It just was not the right choice for the described case.
Reading the question thoroughly
Read the questions carefully. While doing practice exams I missed the negation in two questions of the form “Which answer is not correct?”
One question in the practice exam was about TensorFlow graphs, which I knew nothing about. Knowing that the option with some_variable=no goes against typical programming principles, I was able to eliminate it. Most certainly it would be
Many questions give a hint about a junior data scientist. Often this is a clue that a simpler approach should be preferred.
After the disastrous results from the sample questions I thought I needed more practical training in Google Cloud.
Some topics are difficult to memorize only by watching the videos or reading the docs. Learning Labs from Cloud Skills Boost would be the best way forward as they offer a temporary Google Cloud training environment out of the box.
In my opinion the coding tasks were not optimal for learning. Filling in the exact missing code felt difficult. They are all based on this public repo. I found it better to do it this way:
- See which notebook the Lab is referring to
- Copy the code to your own TensorFlow environment and notebook
- Play around with the code
If you are familiar with the Python Pandas library for ML solutions, that is nice. Unfortunately it is not helpful for the ML Engineer certification. Everything is about TensorFlow. The whole AI ecosystem in Google Cloud is built around it. And I had very basic TensorFlow experience…
Before going to the exam I am planning to take another practice exam from Whizlabs. You will find the link on the same page where they have the 25 sample questions.
Exam price and restrictions
The ML Engineer exam costs $200. This will be covered by my employer.
There are no prerequisites such as lower tier certifications.
The exam has 50-60 multiple choice questions to answer in 120 minutes. It can be done remotely. The room should be empty and you are monitored through a webcam. No breaks or notes. I heard from a colleague that the schedule might be delayed, so do not take it late in the evening.
He also advised skipping the questions you cannot answer and coming back to them later. Sometimes the answer can be found in the other questions.
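The figures above translate into a tight time budget per question. Here is a small sketch of the arithmetic; the function name is only illustrative.

```python
TOTAL_MINUTES = 120  # exam length stated above


def minutes_per_question(total_minutes, question_count):
    """Average time available for one question."""
    return total_minutes / question_count


# The exam has between 50 and 60 questions, so check both ends.
for question_count in (50, 60):
    budget = minutes_per_question(TOTAL_MINUTES, question_count)
    print(f"{question_count} questions: {budget:.1f} minutes each")
```

With 60 questions that is 2 minutes per question, and any time spent on a hard question is taken from the others, which is why skipping and returning later makes sense.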
The pass/fail will be announced immediately after the exam but you will not get the exact results.
Here is the retake policy for the Google Cloud certifications:
- 1st retake can be taken after 14 days
- 2nd retake 60 days after the 1st retake
- 3rd retake after 365 days
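To see what these waiting periods mean on a calendar, here is a minimal sketch. It assumes that each waiting period is counted from the previous attempt, and the first attempt date is a made-up example.

```python
from datetime import date, timedelta

# Waiting periods in days before the 1st, 2nd and 3rd retake.
# Assumption: each period starts from the previous attempt.
WAITING_DAYS = [14, 60, 365]


def earliest_attempt_dates(first_attempt):
    """Return the first attempt date followed by the earliest retake dates."""
    dates = [first_attempt]
    for wait in WAITING_DAYS:
        dates.append(dates[-1] + timedelta(days=wait))
    return dates


first_attempt = date(2024, 1, 1)  # hypothetical date
for number, attempt_date in enumerate(earliest_attempt_dates(first_attempt)):
    label = "first attempt" if number == 0 else f"retake {number}"
    print(f"{label}: {attempt_date.isoformat()}")
```

Under this assumption the second retake is possible 74 days after the first attempt at the earliest, and a third one only more than a year later.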
My advice is to be well prepared and not take the exam to test yourself. I will take my first attempt around 3 months before my deadline.
Google Cloud certificates are valid for 2 years. Of course, it will stay in your CV forever as an accomplishment.
More information about the ML certification practicalities is on this page.
Will I pass the GCP ML certification exam?
As said, I have been told that the certification is mandatory in my project.
The exam questions seem really difficult. There is absolutely no guarantee I will pass it. I can only do my best. It is overall great advice in life.
I really do not stress about it. Obviously not because it would be easy, but because there are more important things in life.
My blog has actually become a safeguard against failing. I can always learn, showcase my skills, share my experiences and encourage others in my own way, even if I fail to fulfill external expectations.