💻 Implements
Problems to implement and code
Counting Divisors
Problem Summary:
Given N <= 1e5, we get N queries of numbers <= 1e6; for each, report how many divisors that number has.
Notes:
Naive solution: for a given divisor we add contributions, so 1 contributes to 1, 2, 3, ..., X and 2 contributes to 2, 4, 6, .... Then we answer queries in O(1). It takes X log X time to build the counts (harmonic series) and N time to answer queries, so N + X log X total.

Another solution: if we know the smallest prime factor "spf" of every number, we can divide a number by its spf repeatedly and use that to compute the # of divisors. If a number is 2^3 * 3^1 * 5^4 it has 4 * 2 * 5 divisors, since each prime exponent e contributes e + 1 choices. So we divide by the spf and track the exponent; when the spf changes we multiply these together. On average a number M has about log log M distinct prime factors (Hardy-Ramanujan), so each query takes about log log X time.

To build the spf array we can use a prime sieve but record which prime did the marking. Iterate on a number and then its multiples, marking composites. We only need to mark multiples when the number itself is prime: for instance 4 is not prime, so we don't iterate on 8, 12, 16, etc., since those were already marked by 2. The full harmonic series would be N log N, but because we only sieve from primes the cost is the sum of N/p over primes p <= N, which is N log log N by Mertens' second theorem. We set the spf of a number the first time something marks it:

    for divisor in range(2, MAX_X + 1):
        # divisor is prime iff it never had an spf set
        if spf[divisor] == 0:
            for mult in range(2 * divisor, MAX_X + 1, divisor):
                # mark spf here, only if not already set

However we can make some optimizations; see the "sieves.py" guide for notes, but a few callouts:
- mult can start at divisor**2, not divisor*2, because any smaller multiple has a factor < divisor that already marked it. In C++ this requires some long long casting, like for (long long mult = 1LL * divisor * divisor; ...). Doesn't change the complexity or, I think, the runtime very much.
- We can loop divisor up to root(MAX_X), not MAX_X. Remember the goal of the outer loop is to mark larger composite numbers as not prime, but if our factor is > root(MAX_X) then any multiple of it in range has a factor less than root(MAX_X) which would already have marked it. The only downside is the outer loop won't mark spf[somePrime] = somePrime: for instance with MAX_X = 100 we never mark spf[13] = 13, it stays 0. But implicitly anything with spf = 0 is prime. Also, if we are just filling a bool array isPrime = [True] * n we aren't worried at all, because those isPrimes are already True, so stopping the loop at root(MAX_X) is totally fine.
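A minimal Python sketch of both builds, assuming MAX_X = 1e6 is the largest query value (all names here are mine, not from the notes):

```python
MAX_X = 10**6

# Approach 1: harmonic-series build, O(X log X): every d contributes
# one divisor to each of its multiples.
div_cnt = [0] * (MAX_X + 1)
for d in range(1, MAX_X + 1):
    for mult in range(d, MAX_X + 1, d):
        div_cnt[mult] += 1

# Approach 2: smallest-prime-factor sieve; spf[x] == 0 means x was
# never marked, i.e. x is prime (or x < 2).
spf = [0] * (MAX_X + 1)
for d in range(2, int(MAX_X ** 0.5) + 1):
    if spf[d] == 0:                      # d is prime
        for mult in range(d * d, MAX_X + 1, d):
            if spf[mult] == 0:           # keep the smallest marker
                spf[mult] = d

def divisor_count(x: int) -> int:
    # Repeatedly divide out the smallest prime factor, tracking the
    # exponent; the answer is the product of (exponent + 1).
    total = 1
    while x > 1:
        p = spf[x] or x                  # spf == 0 means x is prime
        exp = 0
        while x % p == 0:
            x //= p
            exp += 1
        total *= exp + 1
    return total
```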
Drill Type:
💻 Implement: Implement different versions of this:
- Harmonic series
- SPF and repeatedly factorize numbers + use combinatorics
- DP on divCount and spfPower
- Re-read notes and understand the different sieves and complexities
Solutions:
1082. Sales Analysis I
Problem Summary:
We have a SQL table of sales; select every seller_id whose total sales price is the maximum.
Notes:
I tried this at first:

    -- Write your PostgreSQL query statement below
    SELECT seller_id, SUM(price) AS price_sum
    FROM Sales
    GROUP BY seller_id;

The issue is this will show every seller <> total-sales pair, but I only want to show the top amount(s). GPT showed this:

    SELECT seller_id
    FROM Sales
    GROUP BY seller_id
    HAVING SUM(price) >= ALL (
        SELECT SUM(price)
        FROM Sales
        GROUP BY seller_id
    );

"ALL lets you compare a value against every value returned by a subquery." I asked this: "how can we do having SUM price when we didnt select price in that select though, you only selected the seller id" GPT said this: Because HAVING is allowed to use aggregate expressions that are computed from the rows in each group, even if you don't output them. In a grouped query, Postgres conceptually does:

    FROM → get rows
    GROUP BY seller_id → make buckets
    compute aggregates per bucket (SUM(price), COUNT(*), etc.)
    HAVING ... → keep/discard buckets using those aggregates
    SELECT ... → choose what columns to output for each remaining bucket

So this is totally valid:

    SELECT seller_id
    FROM Sales
    GROUP BY seller_id
    HAVING SUM(price) >= 1000;

You're filtering groups by their sum, but you're only displaying the seller_id. Also GPT gave another solution here:

    WITH s AS (
        SELECT seller_id, SUM(price) AS total
        FROM Sales
        GROUP BY seller_id
    )
    SELECT seller_id
    FROM s
    WHERE total = (SELECT MAX(total) FROM s);
Drill Type:
💻 Implement: Read through notes and re-solve
Solutions:
3816. Lexicographically Smallest String After Deleting Duplicate Characters
Problem Summary:
We have a string of letters. We can repeatedly pick two equal letters and delete one of them. Find the lexicographically minimum string.
Notes:
Hard hard hard. I am stronger tomorrow.
Drill Type:
💻 Implement: Solve it yourself
Solutions:
Digit Queries
Problem Summary:
Consider the infinite string of numbers 123456789101112... Answer q <= 1000 queries: what is the k-th digit, for k <= 1e18?
Notes:
Find how many full numbers of each width we pass, basically implementation practice
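A free-style sketch in Python; k is 1-indexed into the infinite string:

```python
def kth_digit(k: int) -> int:
    # numbers of width w: there are 9 * 10**(w-1) of them, contributing
    # w * 9 * 10**(w-1) digits; skip whole width blocks first
    width, block = 1, 9
    while k > width * block:
        k -= width * block
        width += 1
        block *= 10
    # k now indexes into the width-digit numbers
    number = 10 ** (width - 1) + (k - 1) // width
    return int(str(number)[(k - 1) % width])
```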
Drill Type:
💻 Implement: Practice implementation, just free-style solve it
Solutions:
3811. Number of Alternating XOR Partitions
Problem Summary:
Given an array of N <= 1e5 numbers with A[i] <= 1e9, and target1 and target2, find how many partitions of the array exist where the block XORs are target1, then target2, then target1, repeating.
Notes:
This is so hard. We track endA and endB default dicts. endA[x] is the number of ways to partition a prefix such that the last block XORs to target1 and the XOR of the entire prefix is x, as we enumerate the array. Initially endB[0] = 1 (the empty prefix). As we add a new element we get a new prefix XOR, call it P. We want some suffix block to XOR to target1 or target2, which means looking back at a prior prefix: a suffix block equal to target2 must follow a partition whose last block was target1. But not just any previous ending works; the prior prefix XOR p must satisfy p ^ P = target, i.e. p = P ^ target, so that is the key we look up. The ending of the solution is tricky too, because we return addOne and addTwo computed specifically at index n-1, counting only partitions that end exactly there.
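A sketch of this DP in Python, using the endA/endB names from the notes (the modulus is my assumption; adjust to the problem statement):

```python
from collections import defaultdict

def count_partitions(a, target1, target2, mod=10**9 + 7):  # mod assumed
    # endA[x]: # of partitions of the current prefix whose last block
    # XORs to target1 and whose total prefix XOR is x; endB likewise
    # for target2. endB[0] = 1 is the empty prefix, so the first real
    # block must XOR to target1.
    endA, endB = defaultdict(int), defaultdict(int)
    endB[0] = 1
    prefix = addA = addB = 0
    for v in a:
        prefix ^= v
        # a block ending here with XOR t needs a prior prefix p with
        # p ^ prefix == t, i.e. p == prefix ^ t, ending in the other target
        addA = endB[prefix ^ target1] % mod
        addB = endA[prefix ^ target2] % mod
        endA[prefix] = (endA[prefix] + addA) % mod
        endB[prefix] = (endB[prefix] + addB) % mod
    # only partitions whose final block ends at n-1 count
    return (addA + addB) % mod
```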
Drill Type:
💻 Implement: Re-solve this hard problem
Solutions:
3812. Minimum Edge Toggles on a Tree
Problem Summary:
We have an undirected tree where every node is 0 or 1, and a desired target list of 0s and 1s for the nodes. In an operation we can toggle both ends of an edge. Return the minimum # of edge flips, or show it is impossible.
Notes:
I thought of a toposort: a leaf node must toggle its edge if it needs to be switched. We could do a postorder DFS, but toposort made sense to me in the moment (basically the same thing). This was a unique application of a topo sort because it was on an undirected tree, so the implementation was a bit different. A good lesson was to get the topo order first, then process it after.
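A sketch of that leaf-peeling order in Python (my naming; assumes nodes 0..n-1 and cur/target as lists of 0/1):

```python
from collections import deque

def min_toggles(n, edges, cur, target):
    # peel leaves inward: a mismatched leaf must toggle its only
    # remaining edge, which also flips the neighbor it hangs off of
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cur = list(cur)
    ops = 0
    queue = deque(i for i in range(n) if len(adj[i]) == 1)
    while queue:
        u = queue.popleft()
        if not adj[u]:            # the last node standing (the "root")
            break
        p = adj[u].pop()          # u's only remaining neighbor
        adj[p].discard(u)
        if cur[u] != target[u]:   # forced: only one edge still touches u
            ops += 1
            cur[u] ^= 1
            cur[p] ^= 1
        if len(adj[p]) == 1:
            queue.append(p)
    # every peeled node now matches; the root decides feasibility
    return ops if cur == list(target) else -1
```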
Drill Type:
💻 Implement: Read my notes and also re-implement my weird topo sort
Solutions:
197. Rising Temperature
Problem Summary:
We have a database with columns id, date, temperature. Find all rows with a temperature bigger than the previous day's.
Notes:
Join the table with itself shifted by 1, select when the temperature is bigger.
Drill Type:
💻 Implement: Practice a basic join
Solutions:
196. Delete Duplicate Emails
Problem Summary:
We have a table with columns id and email address. Keep only the smallest id per email address (use a DELETE).
Notes:
Mental model for deletes:

MENTAL MODEL: DELETE + USING (PostgreSQL)

DELETE removes rows from ONE target table:

    DELETE FROM Person;

Meaning: Person is the only table rows can be deleted from. SQL scans Person and deletes rows (all rows, since no condition).

DELETE ... WHERE ... = delete rows that match a condition:

    DELETE FROM Person
    WHERE email = 'john@example.com';

Meaning: iterate over each row in Person; if the WHERE condition is true for that row, delete it. Only Person rows are deleted.

DELETE ... USING ... = delete from the target table, but allow references to other tables. General form:

    DELETE FROM target_table
    USING other_tables
    WHERE condition involving target_table and other_tables;

Key idea: rows are deleted ONLY from target_table. USING tables are READ-ONLY helpers. USING allows join-like comparisons while deciding which target rows to delete. Think: "For each row in target_table, check if there exists a matching row in USING tables that satisfies the WHERE condition"
Drill Type:
💻 Implement: Practice delete and review mental model
Solutions:
619. Biggest Single Number
Problem Summary:
We have a table MyNumbers with a column num. Find the largest number appearing exactly once, or return null.
Notes:
First we can GROUP BY num on MyNumbers to collapse duplicate numbers, then filter with HAVING COUNT(num) = 1. This returns a table of all numbers with frequency 1; we subquery to get the max (MAX over an empty set yields NULL, which covers the "return null" case).
Drill Type:
💻 Implement: Review notes for how GROUP BY relates to SELECT and COUNT: rows → groups, and then aggregates like MIN are defined on groups. Think of it as two phases.

Phase A: form groups (buckets). GROUP BY email partitions the table into buckets:
- bucket "john@…" contains rows with ids {1, 3}
- bucket "bob@…" contains rows with ids {2}
- …
At this point you do not yet have output rows.

Phase B: produce one output row per bucket. Now SQL evaluates the SELECT list once per bucket:
- email = the bucket label (same for the whole bucket)
- MIN(id) = take all id values in the bucket and compute the minimum
So for john's bucket: MIN({1, 3}) = 1. That's why it's "safe": you've told SQL how to collapse many ids into one value per group.

Why "conflict" isn't quite right: it's not that SQL "tries and then conflicts and then fixes it." It's that after grouping, you're no longer allowed to ask for raw row columns (id) unless they're part of the group key, or computed from the group via an aggregate (MIN, MAX, COUNT, …).
Solutions:
C. Divine Tree
Problem Summary:
We have N <= 1e5 nodes labeled 1...N and are given M <= 1e12. The divine-ness of a node is the minimum value on the path from that node to the root of a tree. Can we form a tree whose total divine-ness (summed over all nodes) is M? If so, return the edges.
Notes:
First note a tree rooted at 1 has the minimum total divine-ness, which is N (every path passes through 1), so below that is impossible. A max divine tree is the descending chain 7->6->5->4..., which gives n * (n + 1) / 2, so we cannot make more than that. Start with a minimal arrangement: 1>2>3>4>5. To increase by 1 we can rotate: 2>1>3>4>5. Again: 2>3>1>4>5. Eventually 1 gets put at the back: 2>3>4>5>1. We can do it again with 2, eventually reaching 3>4>5>2>1. I simulated this with a deque. We can also think of a baseline where we score 1 per node, so we have P = m - n remaining to score. Any time we put a number X before 1, we gain X - 1 extra (if we put them in descending order, like 6 5 4 1, we gain 5 + 4 + 3). So we can take the biggest numbers whose gains sum to our remaining target and put everything else after the 1.
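A sketch of the baseline-plus-gains construction in Python (my naming; returns (parent, child) chain edges, or None if impossible). Because the available gains are the consecutive integers 1..n-1, greedily taking the biggest gain that still fits always lands exactly on the target:

```python
def divine_tree(n, m):
    if not (n <= m <= n * (n + 1) // 2):
        return None
    p = m - n                     # extra beyond the baseline of 1 per node
    chain, rest = [], []          # values before / after node 1
    for x in range(n, 1, -1):     # consider gains n-1, n-2, ..., 1
        if x - 1 <= p:
            chain.append(x)       # placing x before 1 gains x - 1
            p -= x - 1
        else:
            rest.append(x)
    order = chain + [1] + rest    # root -> ... -> leaf chain
    return [(order[i], order[i + 1]) for i in range(n - 1)]
```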
Drill Type:
💻 Implement: Implement wice's solution
Solutions:
E2. Erase and Extend (Hard Version)
Problem Summary:
We have a string s of size <= 5e5. We need to form a string of size k <= 5e5. We can delete the last letter of s, or do s = s + s. After any number of operations, find the lexicographically smallest string we can form.
Notes:
First read the E1 solution. Then here is the greedy idea. There may also be other solutions (like the z-function) but I'm just listing this greedy one / putting it in string algorithms; don't really consider this constructive. We enumerate prefixes greedily:
- Say we have some prefix "db" and are considering whether "dbc" repeated is better. That c compares to the start of our prefix, d; it is smaller, so it is better.
- If we were at "db" and considering "f", clearly "db" + "d" is better than "dbf", and we fail.
- If the letter is equal, like "dbca" considering "d" (in the test "dbcadabc"), we don't know which sequence is better yet. We keep "dbca" stored as the best but accept the new "d" as an option and keep going.
- If we keep going a lot, say our initial prefix is |AB| and everything in the following CDEFGHIJK is being marked as equal: we don't need to mod our indexToConsider back to the beginning, I think because the equal run guarantees the CDEFGH... portion is just ABABAB repeated.
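A sketch of that greedy in Python; cmp_i plays the role of the notes' indexToConsider and is never wrapped with mod:

```python
def smallest_string(s: str, k: int) -> str:
    best, cmp_i = 1, 0            # current best prefix length
    for i in range(1, len(s)):
        if s[i] < s[cmp_i]:       # extending strictly improves
            best, cmp_i = i + 1, 0
        elif s[i] == s[cmp_i]:    # undecided: keep scanning
            cmp_i += 1
        else:                     # extending makes it worse: stop
            break
    return (s[:best] * (k // best + 1))[:k]
```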
Drill Type:
💻 Implement: Solve this
Solutions:
Planets Queries I
Problem Summary:
You have N planets, each with a teleporter to some other planet. We have Q queries: starting from some planet, teleport K times; which planet do you end up on? N, Q <= 2e5, K <= 1e9.
Notes:
Binary lifting: precompute where each planet lands after 2^j teleports, then decompose K into powers of two.
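A sketch in Python for the CSES input format (first line n q, then the n teleport targets, then q lines of x k; planets are 1-indexed):

```python
import sys

def solve():
    data = sys.stdin.buffer.read().split()
    n, q = int(data[0]), int(data[1])
    LOG = 30                       # 2**30 > 1e9 >= K
    # up[j][i]: where planet i lands after 2**j teleports (0-indexed)
    up = [[0] * n for _ in range(LOG)]
    up[0] = [int(x) - 1 for x in data[2:2 + n]]
    for j in range(1, LOG):
        prev = up[j - 1]
        up[j] = [prev[prev[i]] for i in range(n)]
    out = []
    pos = 2 + n
    for _ in range(q):
        x, k = int(data[pos]) - 1, int(data[pos + 1])
        pos += 2
        j = 0
        while k:                   # decompose k into powers of two
            if k & 1:
                x = up[j][x]
            k >>= 1
            j += 1
        out.append(x + 1)
    sys.stdout.write('\n'.join(map(str, out)))

solve()
```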
Drill Type:
💻 Implement: Implement a solution
Solutions:
1706. Where Will the Ball Fall
Problem Summary:
We have a 2D grid and we drop balls; it's sort of like Plinko (see picture). For each starting location at the top, where does the ball fall out?
Notes:
I used dp(row, column, quadrant in square, moves at this row). Moves at this row prevents us from cycling back and forth forever.
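For reference, the standard per-ball simulation is much shorter than that DP; this sketch is the usual approach, not the notes' method:

```python
def find_ball(grid):
    rows, cols = len(grid), len(grid[0])
    res = []
    for start in range(cols):
        c = start
        for r in range(rows):
            nc = c + grid[r][c]            # 1 slides right, -1 slides left
            # stuck if it hits a wall or the two cells form a V
            if nc < 0 or nc >= cols or grid[r][nc] != grid[r][c]:
                c = -1
                break
            c = nc
        res.append(c)
    return res
```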
Drill Type:
💻 Implement: Solve it
Solutions:
Movie Festival II
Problem Summary:
We have movies with [L, R] times and K people. How many movies can we watch in total, if each person watches non-overlapping movies?
Notes:
First thought might be: store a heap of R values (ending times of people currently watching movies), sort movies by R, and when we get to one, pop all that are <= L. See this test: k=2, movies (1, 50) (1, 60) (100, 150) (40, 200). We would assign 2 people to 50 and 60, pop both for 150, but then we cannot actually attend 200. Sorting by L doesn't work either, because we cannot determine whether this L is worth watching or we should save the person. Instead imagine K lanes of people: we sort by R to greedily consume early, and assign each movie to the lane whose latest ending time is the largest one <= L. I sorted by R, and if there is a tie on R it doesn't matter; because we assign to the largest ending that is free, any order works. I don't think another rule works (like: on tied R, put the leftmost L first and assign that interval to the smallest ending instead of the biggest ending <= L), because a further movie with a bigger R might demand a small L that we filled up.
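A sketch of the K-lanes greedy, with stdlib bisect standing in for a multiset (insort is O(n) per insert, so a SortedList or C++ multiset is better at scale):

```python
import bisect

def max_movies(movies, k):
    movies.sort(key=lambda m: m[1])   # greedily consume early end times
    ends = [0] * k                    # sorted last end time per lane
    watched = 0
    for l, r in movies:
        # the lane whose last movie ended latest but still <= l
        i = bisect.bisect_right(ends, l) - 1
        if i >= 0:
            ends.pop(i)
            bisect.insort(ends, r)
            watched += 1
    return watched
```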
Drill Type:
💻 Implement: Remember the idea and read notes and implement
Solutions:
Traffic Lights
Problem Summary:
We have a road from 0...x, and every second we turn on a light at one point on the road; after each light, what is the longest unlit passage? Note the passages are measured between lit points, so if we have [0, 1, 2, 3] and we turn on 0 and 2, 0-2 is still a valid unlit passage.
Notes:
First, we can use a sparse seg tree with point lights and a range "longest contiguous range of 0s" query, by storing pref, suff, max, and width inside the lazy nodes. A lot of lessons from lazy-node practice: create both children every time (such as in pointUpdate), use pointers safely, and don't store nodes in a vector, it is so easy to make bugs. Also, in the pointUpdate function of a sparse seg tree, as we descend and create nodes we can fill them with dummy values, see this Python solution (TLE): https://cses.fi/problemset/result/15936995/ , because as we recurse down those nodes will pull anyway. But the child we don't recurse into has to be made correct, since we will pull on it. With a vector of nodes where each node has index pointers it was so hard: pushing to a vector and doing Node& node = tree[i] causes issues because the tree vector resizes and tree[i] diverges from the reference. Inside a pull of a sparse seg tree, since we already have the node, we should update its values directly to keep parent pointers that point down to us valid.

Now for the query: this is one of those weird point-vs-range index tricks. If we have 0 1 2 3 (4) 5 (6) 7, where () means lit, our longest passage is 4, from 0 to 4, even though 4 is lit. There are some edge cases with boundaries being lit and whatnot too. My point is we cannot just take the longest range of unlit points and add 1 to get the longest passage: the 5 area has an unlit passage of 2 because both its boundaries are lit, but the 7 has an unlit passage of 1 since it is at the edge. The easiest implementation is to light 0 and x at the start, and then everything works.

I also did a multiset interval-tree-style solution, but instead of storing ranges we just store the cuts, which makes the implementation easy and is good practice. Also, I think we could do a dense segment tree with compression, but it seems difficult to manage the widths of things.
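A sketch of the cuts solution (assuming the sortedcontainers library is available and light positions are distinct):

```python
from sortedcontainers import SortedList  # assumption: library available

def traffic_lights(x, positions):
    # pre-light 0 and x so the boundary cases disappear, then keep a
    # multiset of gap widths between consecutive cuts
    cuts = SortedList([0, x])
    gaps = SortedList([x])
    out = []
    for p in positions:
        i = cuts.bisect_left(p)
        a, b = cuts[i - 1], cuts[i]   # neighbors of the new cut
        gaps.remove(b - a)
        gaps.add(p - a)
        gaps.add(b - p)
        cuts.add(p)
        out.append(gaps[-1])          # current longest unlit passage
    return out
```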
Drill Type:
💻 Implement: Implement a sparse seg tree (tricky thing around the pointRecurse function), use pointers in C++ for the seg tree, and remember the point-versus-span weirdness in this question; also implement the inverted interval tree (cuts)
Solutions:
Tree Distances I
Problem Summary:
Find the max distance to another node, for all nodes.
Notes:
Tree rerooting DP. First compute, for every node, the max distance going down, rooted at 0. Then reroot: iterate over a node's children to solve for them. The max distance for a child is either its down distance (computed in DFS 1), 1 + the max distance going up from our node (we maintain an up array solved in DFS 2), or 1 + the max distance going down from our node but not through this child; to compute that we need the best 2 downward paths (computed in step 1).
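A sketch of the reroot in Python, iterative to dodge recursion limits (my naming):

```python
def tree_distances(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # BFS order so parents come before children
    parent, order, seen = [-1] * n, [0], [False] * n
    seen[0] = True
    for u in order:
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                order.append(v)
    down = [0] * n                    # dfs 1: longest path going down
    for u in reversed(order):
        for v in adj[u]:
            if parent[v] == u:
                down[u] = max(down[u], 1 + down[v])
    up = [0] * n                      # dfs 2: longest path via the parent
    for u in order:
        best1 = best2 = 0             # top two of 1 + down[child]
        for v in adj[u]:
            if parent[v] == u:
                d = 1 + down[v]
                if d > best1:
                    best1, best2 = d, best1
                elif d > best2:
                    best2 = d
        for v in adj[u]:
            if parent[v] == u:
                # go up past u, or down through a different child of u
                sibling = best2 if 1 + down[v] == best1 else best1
                up[v] = 1 + max(up[u], sibling)
    return [max(down[i], up[i]) for i in range(n)]
```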
Drill Type:
💻 Implement: Solve with reroot
Solutions:
Tree Distances II
Problem Summary:
For each node in a tree, find the sum of distances to all other nodes.
Notes:
First compute the size of each subtree (rooted at 0), and with it the sum of distances to all nodes below each node. For the root 0 that is already the distance to all other nodes. To transition we run a second DFS and iterate over the children of a node: the child's answer is the parent's answer, but all nodes in the child's subtree move 1 closer and all other nodes move 1 further.
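A sketch of this reroot in Python (same iterative BFS-order trick as Tree Distances I):

```python
def sum_distances(n, edges):
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    parent, order, seen = [-1] * n, [0], [False] * n
    seen[0] = True
    for u in order:
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                order.append(v)
    size = [1] * n
    ans = [0] * n          # ans[0] becomes the sum of depths from root 0
    for u in reversed(order):
        if u:
            size[parent[u]] += size[u]
            ans[parent[u]] += ans[u] + size[u]
    # moving the root from u to child v: size[v] nodes get 1 closer,
    # the other n - size[v] nodes get 1 further
    for u in order:
        for v in adj[u]:
            if parent[v] == u:
                ans[v] = ans[u] - size[v] + (n - size[v])
    return ans
```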
Drill Type:
💻 Implement: Implement reroot dp
Solutions:
Hotel Queries
Problem Summary:
Support point updates and queries for the leftmost index with value >= X.
Notes:
Use a max seg tree and walk it; technically sqrt decomposition also works.
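A sketch of the walk on an iterative max segment tree, in the CSES Hotel Queries setting (each group takes rooms from the leftmost hotel that fits; output 0 if none):

```python
import sys

def solve():
    data = sys.stdin.buffer.read().split()
    n, m = int(data[0]), int(data[1])
    size = 1
    while size < n:
        size *= 2
    tree = [0] * (2 * size)               # max tree over free rooms
    for i in range(n):
        tree[size + i] = int(data[2 + i])
    for i in range(size - 1, 0, -1):
        tree[i] = max(tree[2 * i], tree[2 * i + 1])
    out = []
    for j in range(m):
        need = int(data[2 + n + j])
        if tree[1] < need:                # no hotel fits this group
            out.append(0)
            continue
        i = 1
        while i < size:                   # prefer left: leftmost >= need
            i = 2 * i if tree[2 * i] >= need else 2 * i + 1
        tree[i] -= need                   # point update + path re-pull
        out.append(i - size + 1)
        i //= 2
        while i:
            tree[i] = max(tree[2 * i], tree[2 * i + 1])
            i //= 2
    sys.stdout.write(' '.join(map(str, out)))

solve()
```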
Drill Type:
💻 Implement: Implement the seg tree walk
Solutions:
Distinct Values Queries II
Problem Summary:
Support point updates and range queries [L, R]: "are all numbers distinct?"
Notes:
We can keep a nxt[i] array which points to the next occurrence of that number, or N if there are no further ones. Build a range-min seg tree on this: if the min over [L, R] is > R, all values are distinct. To support updates we need sorted containers of indices per value to do the bookkeeping. We could also compute the # distinct with a merge sort tree on the nxt pointers, finding how many values are > R in a range. I think fractional cascading could speed up the merge sort tree when using vectors, and I think order-statistic containers could allow updates, but not sure... Tin said there is a 2D structure like a Fenwick tree on a persistent segment tree, not even going to tag this. He said you can range set and count distinct too.
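A static sketch of the core check in Python; supporting point updates would additionally patch nxt through a sorted container of indices per value, as above:

```python
def build_distinct_checker(a):
    # nxt[i] = next index holding the same value, or n if none;
    # [l, r] is all-distinct iff min(nxt[l..r]) > r
    n = len(a)
    nxt, last = [n] * n, {}
    for i in range(n - 1, -1, -1):
        nxt[i] = last.get(a[i], n)
        last[a[i]] = i
    size = 1
    while size < n:
        size *= 2
    tree = [n] * (2 * size)               # range-min tree over nxt
    for i in range(n):
        tree[size + i] = nxt[i]
    for i in range(size - 1, 0, -1):
        tree[i] = min(tree[2 * i], tree[2 * i + 1])

    def all_distinct(l, r):               # inclusive 0-indexed bounds
        res, lo, hi = n, l + size, r + size + 1
        while lo < hi:
            if lo & 1:
                res = min(res, tree[lo]); lo += 1
            if hi & 1:
                hi -= 1; res = min(res, tree[hi])
            lo //= 2; hi //= 2
        return res > r

    return all_distinct
```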
Drill Type:
💻 Implement: Implement this nightmare
Solutions:
Distinct Values Queries
Problem Summary:
Support offline [L, R] queries for the # of distinct values.
Notes:
Mo's to sort the queries. A hashmap to track frequencies as we slide TLEs due to constant factor inside the hot path. Instead, we compress values to positions and store them in an array: compressed[i] is just the compressed index of arr[i]. We maintain an active count field and increment or decrement it as frequencies hit 1 or 0. This constant optimization passes. Also, when we do Mo's we should expand both boundaries first, otherwise l could pass r and we start subtracting elements from the range; this would be undone when we expand r, but bookkeeping (for instance checking that things have frequency 0 and deleting them from a hashmap) might get bugged.

We can also support online queries in log N with a persistent segment tree. We could also get the # distinct with a merge sort tree on nxt[i] pointers, searching how many values are > R; I think fractional cascading can speed this up, and I think we could even support updates using order-statistic containers.

We could also do offline scanning weirdness: make a first[i] array which is 1 if i is the first occurrence of a number, otherwise 0. Sort queries by L. As we increment L we lose numbers out the left, and we need to update, inside our window, the new first occurrence of that number (can be done with nxt-array bookkeeping). To get the # of first elements in our window, use Fenwick tree diffing to query the sum over l...r.
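A sketch of Mo's with the constant-factor tricks from the notes (compressed values in a flat list, one running active counter, boundaries expanded before shrinking; queries are 0-indexed inclusive):

```python
def distinct_queries(a, queries):
    n = len(a)
    comp = {v: i for i, v in enumerate(sorted(set(a)))}
    c = [comp[v] for v in a]              # compressed array
    block = max(1, int(n ** 0.5))
    order = sorted(range(len(queries)),
                   key=lambda q: (queries[q][0] // block, queries[q][1]))
    freq, active = [0] * len(comp), 0
    ans = [0] * len(queries)
    cur_l, cur_r = 0, -1                  # current window, inclusive

    def add(i):
        nonlocal active
        freq[c[i]] += 1
        if freq[c[i]] == 1:
            active += 1

    def remove(i):
        nonlocal active
        freq[c[i]] -= 1
        if freq[c[i]] == 0:
            active -= 1

    for qi in order:
        l, r = queries[qi]
        while cur_l > l:                  # expand first so l never passes r
            cur_l -= 1; add(cur_l)
        while cur_r < r:
            cur_r += 1; add(cur_r)
        while cur_l < l:
            remove(cur_l); cur_l += 1
        while cur_r > r:
            remove(cur_r); cur_r -= 1
        ans[qi] = active
    return ans
```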
Drill Type:
💻 Implement: Just implement Mo's smoothly along with the other needed optimizations
Solutions:
3762. Minimum Operations to Equalize Subarrays
Problem Summary:
Support range queries [L, R]. Given an operation where we can change A[i] by +-k, find the minimum # of operations over a range to make every number equal, or -1 if not possible.
Notes:
If all numbers in a range do not have the same remainder mod K, it is not doable. We could make a remainder array and compare the max and min with a sparse table, or many other solutions (Mo's + max/min using the two-stacks trick or a monodeque, Mo's + hashmap of unique values, a # of distinct elements query online or offline on the remainder array, etc.).

Now we need to find the median, the sum of elements <= the median, and the sum >= the median, and set everything to the median. First, we can use sliding-window multisets with Mo's, but it is O(n * rootN * logN) and TLEs because of overhead. We can use Mo's + fenwick/seg tree, which is also O(n * rootN * logN), but it passed with some optimizations. Here are some notes:
- A fenwick or seg tree can store values, not indices, and we add and remove values as we slide. We need to find the k-th element (store counts in nodes and do a tree walk) to find the median, and the sum of the first k elements if we store sums in nodes. We could also support queries like "sum of numbers in [lowerBound, higherBound]" by diffing prefixes or using a seg tree. Also we need coordinate compression: we can compress all array values and then binary search query values to the nearest compressed array value, or compress both. Basically a few ways to do things.
To AC I used these optimizations:
- Hilbert curve ordering
- Fenwicks instead of a seg tree
- Find the k-th element (median) and then the sum of the first k, then use O(1) to get the top-half sum and count, so we only do 2 log operations per Mo step (sketch below).

We can also answer the range median and the range "sum <= median" online with a persistent seg tree for n log n. Tin has some sqrt decomp code that seems to support point updates, range median queries, and range sum queries over a value range x <= ? <= y too, but not sure how this works: https://leetcode.com/problems/minimum-operations-to-equalize-subarrays/submissions/1844480184/ I bet we can do a merge sort tree with prefix sums too or something crazy.
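A sketch of the count+sum Fenwick with the k-th walk described above (values assumed compressed to 1-based indices; names are mine):

```python
class Fenwick:
    # one tree for counts, one for value sums, over compressed values
    def __init__(self, size):
        self.n = size
        self.cnt = [0] * (size + 1)
        self.tot = [0] * (size + 1)

    def update(self, i, dc, dv):
        # dc = count delta, dv = value-sum delta (dc * actual value)
        while i <= self.n:
            self.cnt[i] += dc
            self.tot[i] += dv
            i += i & -i

    def kth_with_sum(self, k):
        # binary-lifting walk: find the k-th smallest element plus the
        # info needed for "sum of the k smallest", in one log pass
        pos, remaining, acc = 0, k, 0
        step = 1 << self.n.bit_length()
        while step:
            nxt = pos + step
            if nxt <= self.n and self.cnt[nxt] < remaining:
                remaining -= self.cnt[nxt]
                acc += self.tot[nxt]
                pos = nxt
            step >>= 1
        # pos + 1 is the k-th value's compressed index; the k smallest
        # elements sum to acc + remaining * (that value)
        return pos + 1, acc, remaining
```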
Drill Type:
💻 Implement: Implement something and also read through notes and different ideas
Solutions:
F. Cherry Tree
Problem Summary:
The leaves of a tree each have a cherry. We can shake a subtree to remove all cherries in it, but if you shake a leaf that already dropped its cherry, the tree breaks. Can we clear all cherries with a # of shakes divisible by 3?
Notes:
Standard tree convolution where we look at children and merge them in one by one, though the implementation is a bit more unintuitive here. For instance, we normally have a beforeChild and an afterChild state and initialize beforeChild to an identity value, but the identity in this problem was confusing: [True, False, False], meaning we can clear all cherries in this tree with a remainder of 0 shakes. This is vacuously true: when we consider just the root node as the tree (before any children), there are no leaves inside yet, so we can clear this empty tree. To aggregate, we need to clear the new LHS + RHS and can update newDp[0], [1], and [2]. I ended up implementing it with no identity value, starting my beforeChild dp on the first child and looping through the second and further children. When merging in the children we do need to update newDp[1], and this doesn't consider the chance that we shake the entire subtree at the root; that is more of an edge case that can be handled after, and we always update newDp[1] at the very end. There is also a second editorial solution I didn't understand.
Drill Type:
💻 Implement: I should understand the idea of the identity value in tree convolution, what the before- and after-child states mean exactly for this problem, and how shaking the entire tree is unrelated to the convolution