Change one variable, keep the rest the same, and measure the attribute you are observing. Munster thought he’d test Siri against Google, but he failed at the first hurdle. The questions were typed into Google but spoken to Siri. This invalidates the accuracy ratings immediately; many of Siri’s errors were speech recognition failures, not failures of its own AI.
Also, Munster misses much of Siri’s appeal. The two systems were asked knowledge questions, but much of Siri’s value lies in its role as a personal assistant. For example, ask Google to set a timer for three minutes and it’ll respond with irrelevant search results. This side of Siri was not tested at all.
The whole thing was a pointless exercise. Munster was comparing apples to oranges.