Randy Au
1 min readOct 2, 2019

--

It… depends. On a ton of things.

SQL, JOIN logic isrun before WHERE logic, so you can’t do what you propose unless you use a subquery. But for a traditional relational database, subqueries are unindexed structures, so you are very likely to get massively worse performance from losing the index on the table than any gains you get from filtering a table down outside the index.

In a hadoop-like system that works off MapReduce (or similar concept where there’s no indexing) then you may want to do the subquery because it lowers the amount of work done. You’re still limited by SQL syntax and can’t just force a WHERE to happen first in a single query.

Optimization is hard, I wrote a massive thing about it before here:

https://link.medium.com/ovJthN9Ns0

--

--

Randy Au
Randy Au

Written by Randy Au

I stress about data quality a lot. Data nerd/scientist, camera junkie. Quant UXR @Google Cloud. Formerly @bitly, @Meetup, @primarydotcom. Opinions are my own.

No responses yet