SQL

Snippets on basic & advanced functions

SQL

Use ARRAY_AGG to flatten columns to lists.

You need to convert a long table of values into a list? This is the most convenient function. πŸ“Œ Syntax: πš‚π™΄π™»π™΄π™²πšƒ π™°πšπšπ™°πšˆ_𝙰𝙢𝙢(πšŒπš˜πš•πšžπš–πš—) π™΅πšπ™Ύπ™Ό πšπšŠπš‹πš•πšŽ There you go, the column is flattened into a list separated by comma. ❌ To do so, I used to copy the entire column into a spreadsheet tool, transpose it, and use

SQL

Pivot Table (From Long to Wide)

Every data scientist, scientist, and engineer should know how to create a pivot table. π™²π™°πš‚π™΄ πš†π™·π™΄π™½ is the best way to do so in SQL. πŸš€ Here are THREE ways to pivot a table for monthly sales. 1️⃣ We find the πš‚πš„π™Ό of all salesperson each month. Here, use πš‚πš„π™Ό(π™²π™°πš‚π™΄ πš†π™·π™΄π™½). 2️⃣ We find the average

SQL

Find duplicates with RANK

Let's clean out duplicates! Here's how. 1️⃣ Check if duplicates exists. Use π™Άπšπ™Ύπš„π™Ώ π™±πšˆ + π™²π™Ύπš„π™½πšƒ(*) to find duplicates. See example 1 in image below. 2️⃣ Identify exact rows that are duplicated. Use πšπ™°π™½π™Ί π™Ύπš…π™΄πš(π™Ώπ™°πšπšƒπ™Έπšƒπ™Έπ™Ύπ™½ π™±πšˆ πšŒπš˜πš•πšœ). See example 2. What does πšπ™°π™½π™Ί π™Ύπš…π™΄πš(π™Ώπ™°πšπšƒπ™Έπšƒπ™Έπ™Ύπ™½ π™±πšˆ πšŒπš˜πš•πšœ) do? πŸ‘‰ divides the table into multiple partitions, one for each unique value

SQL

Find Year-on-Year Growth Rate

This is a common interview question that you MUST nail! The formula is (value - last_year_value) / last_year_value. You can calculate last_year_value as πŸ“Œ 𝙻𝙰𝙢(πšŸπšŠπš•πšžπšŽ, 𝟷) πš˜πšŸπšŽπš› (πš˜πš›πšπšŽπš› πš‹πš’ πšŸπšŠπš•πšžπšŽ) 𝙻𝙰𝙢(πšŸπšŠπš•πšžπšŽ,𝟷) gets the value from the previous row. 𝙻𝙰𝙢(πšŸπšŠπš•πšžπšŽ,𝟸) gets the value from the row before the previous row. STEPS: 1️⃣ Use 𝙻𝙰𝙢 to get

SQL

Use EXISTS to check if a row is present

Need to find if records from table A can also be found in table B? EXISTS is here to help. πŸ“Œ Here's the syntax. πš‚π™΄π™»π™΄π™²πšƒ * π™΅πšπ™Ύπ™Ό πšπšŠπš‹πš•πšŽ_𝚊 πš†π™·π™΄πšπ™΄ π™΄πš‡π™Έπš‚πšƒπš‚ ( πš‚π™΄π™»π™΄π™²πšƒ * π™΅πšπ™Ύπ™Ό πšπšŠπš‹πš•πšŽ_πš‹ πš†π™·π™΄πšπ™΄ πšπšŠπš‹πš•πšŽ_𝚊.πš”πšŽπš’ = πšπšŠπš‹πš•πšŽ_πš‹.πš”πšŽπš’ ) πš’πš—πš—πšŽπš›_πššπšžπšŽπš›πš’ Let's walk through this. 1️⃣ the inner query retrieves all the records from table_b that has a common key with table_a.

SQL

Use SAFE_DIVIDE to handle division by zero.

π™Žπ˜Όπ™π™€_π˜Ώπ™„π™‘π™„π˜Ώπ™€ divides two numbers while handling the case where the denominator is zero. πŸ“Œ Here's the syntax. πš‚π™΄π™»π™΄π™²πšƒ πš‚π™°π™΅π™΄_π™³π™Έπš…π™Έπ™³π™΄(πš—πšžπš–πšŽπš›πšŠπšπš˜πš›, πšπšŽπš—πš˜πš–πš’πš—πšŠπšπš˜πš›) π™΅πšπ™Ύπ™Ό πš—πšžπš–πš‹πšŽπš›πšœ If the denominator is zero for any row, it returns NULL instead of causing your query to fail. π™Žπ˜Όπ™π™€_π˜Ώπ™„π™‘π™„π˜Ώπ™€ is not part of the standard SQL language, but it is often included as a

SQL

CUBE+GROUP BY for Subtotals

We all have used GROUP BY, but do you know CUBE? For example, if I have a table of user spending, I can get the subtotal of each user AND the total spend of all users combined, all within one table. πŸ“Œ Syntax is here. πš‚π™΄π™»π™΄π™²πšƒ πšžπšœπšŽπš›_πš’πš, πš‚πš„π™Ό(πšœπš™πšŽπš—πš) πšŠπš–πš π™΅πšπ™Ύπ™Ό πšžπšœπšŽπš›πšœ π™Άπšπ™Ύπš„π™Ώ π™±πšˆ π™²πš„π™±π™΄(πšžπšœπšŽπš›_πš’πš) This saves you the need to

SQL

Get subtotals with ROLLUP+GROUP BY

π™π™Šπ™‡π™‡π™π™‹ generates extra rows that represent the subtotals. For example… To get the sum of spending of each user, you typically GROUP BY user_id and SUM spend. Want to get the sum of ALL users in the same result? Try ROLLUP. πŸ“Œ The syntax is here. πš‚π™΄π™»π™΄π™²πšƒ πš„πšœπšŽπš›_πš’πš, πš‚πš„π™Ό(πšœπš™πšŽπš—πš) π™΅πšπ™Ύπ™Ό πš„πšœπšŽπš›πšœ π™Άπšπ™Ύπš„π™Ώ π™±πšˆ πšπ™Ύπ™»π™»πš„π™Ώ(πšžπšœπšŽπš›_πš’πš) The result will contain a

SQL

CROSS JOIN for all combinations

It is the cartesian products of both tables, if you're a math nerd. πŸ“Œ Syntax: πš‚π™΄π™»π™΄π™²πšƒ * π™΅πšπ™Ύπ™Ό πšπšŠπš‹πš•πšŽ_𝟷 π™²πšπ™Ύπš‚πš‚ 𝙹𝙾𝙸𝙽 πšπšŠπš‹πš•πšŽ_𝟸 Note that there is no need for an ON clause. CROSS JOIN is useful for getting all combinations - Matching users with all dates. - Match product tables with color tables to get product