πŸ’» Implements

Problems to implement and code

Counting Divisors

CSESβ€’3/10
πŸ“š Number Theory (core) 4/10β€’Combinatorics (secondary) 3/10β€’πŸ“š Harmonic Series (core) 2/10β€’πŸ“š Prime Sieve (core) 2/10

Problem Summary:

We get N <= 1e5 queries, each a number X <= 1e6; report how many divisors each number has.

Notes:

Naive solution: for a given divisor we add contributions, so 1 contributes to 1, 2, 3, ..., X and 2 contributes to 2, 4, 6, .... Then each query is answered in O(1). It takes X log X time to build the counts (harmonic series) and N time to answer queries, so N + X log X total.

Another solution: if we know the smallest prime factor "spf" of every number, we can divide a number by its spf repeatedly and use that to compute the # of divisors. If a number is 2^3 * 3^1 * 5^4 it has 4 * 2 * 5 divisor options (each exponent e contributes e + 1 choices). So we divide by the spf and track the exponent; when the spf changes we multiply these together. On average a number M has about log log M distinct prime factors (Hardy-Ramanujan), so each query takes about log log X time.

To build the spf array we can use a prime sieve that also records which prime first marked each number. Iterate over numbers and then their multiples, marking composites; we only need to mark multiples when the number itself is prime. For instance 4 is not prime, so we don't iterate over 8, 12, 16, etc., since we know those were already marked by 2. Naively the harmonic series says N log N, but since only primes run the inner loop the cost is the sum over primes p of N/p, which is N log log N (Mertens). We set the spf of a number the first time we mark it:

    for divisor in range(2, MAX_X + 1):
        if spf[divisor] == 0:  # divisor is prime (never had an spf set)
            for mult in range(2 * divisor, MAX_X + 1, divisor):
                if spf[mult] == 0:
                    spf[mult] = divisor

We can make some optimizations, see the "sieves.py" guide for notes, but a few callouts:
- mult can start at divisor**2 instead of divisor*2, because any smaller multiple has a factor below divisor and was already marked. In C++ this requires a long long cast, like long long mult = 1LL * divisor * divisor. It doesn't change the complexity or, I think, the runtime very much.
- We can loop divisor only up to sqrt(MAX_X), not MAX_X. Remember the goal of the outer loop is to mark larger composite numbers as not prime, and any composite <= MAX_X has a prime factor <= sqrt(MAX_X), so everything still gets marked.
The only downside is the outer loop won't set spf[somePrime] = somePrime for primes above the cutoff; for instance with MAX_X = 100 we never set spf[13] = 13, it stays 0. But implicitly anything with spf = 0 is prime. And if we are just keeping a bool array isPrime = [True] * n we aren't worried at all, because those isPrimes are already True, so stopping the loop at sqrt(MAX_X) is totally fine.
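A minimal Python sketch of both approaches from the notes (the harmonic precompute and the sqrt-cutoff spf sieve); function names and the exact API are mine:

```python
from math import isqrt

def divisor_counts(limit):
    # harmonic-series precompute: every d contributes 1 to each of its
    # multiples, O(X log X) total, then each query is an O(1) lookup
    cnt = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for m in range(d, limit + 1, d):
            cnt[m] += 1
    return cnt

def build_spf(limit):
    # sieve recording the smallest prime factor the first time a composite
    # is marked; the outer loop stops at sqrt(limit), so primes above the
    # cutoff keep spf == 0 (implicitly "prime")
    spf = [0] * (limit + 1)
    for d in range(2, isqrt(limit) + 1):
        if spf[d] == 0:                      # d is prime
            for m in range(d * d, limit + 1, d):
                if spf[m] == 0:
                    spf[m] = d
    return spf

def count_divisors(x, spf):
    # strip the smallest prime factor repeatedly, multiplying (exponent + 1)
    total = 1
    while x > 1:
        p = spf[x] or x                      # spf == 0 means x itself is prime
        e = 0
        while x % p == 0:
            x //= p
            e += 1
        total *= e + 1
    return total
```

Note how count_divisors leans on the spf = 0 convention from the notes: a zero entry just means the remaining number is prime.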

Drill Type:

πŸ’» Implement

Implement different versions of this:
- Harmonic series
- SPF and repeatedly factorize numbers + use combinatorics
- DP on divCount and spfPower
- Re-read notes and understand different sieves and complexities

1082. Sales Analysis I

Leetcodeβ€’Easyβ€’2/10
πŸ“š Database (core) 2/10

Problem Summary:

We have a SQL table of sales; select every seller_id whose total sale price is the maximum.

Notes:

I tried this at first:

    -- Write your PostgreSQL query statement below
    SELECT seller_id, SUM(price) AS price_sum
    FROM Sales
    GROUP BY seller_id;

The issue is this shows every seller with their total sales, but I only want to show the top amount(s). GPT showed this:

    SELECT seller_id
    FROM Sales
    GROUP BY seller_id
    HAVING SUM(price) >= ALL (
        SELECT SUM(price) FROM Sales GROUP BY seller_id
    );

"ALL lets you compare a value against every value returned by a subquery." I asked: "how can we do HAVING SUM(price) when we didn't select price in that SELECT though, you only selected the seller_id?" GPT said: because HAVING is allowed to use aggregate expressions computed from the rows in each group, even if you don't output them. In a grouped query, Postgres conceptually does:
1. FROM β†’ get rows
2. GROUP BY seller_id β†’ make buckets
3. compute aggregates per bucket (SUM(price), COUNT(*), etc.)
4. HAVING ... β†’ keep/discard buckets using those aggregates
5. SELECT ... β†’ choose what columns to output for each remaining bucket

So this is totally valid:

    SELECT seller_id
    FROM Sales
    GROUP BY seller_id
    HAVING SUM(price) >= 1000;

You're filtering groups by their sum, but only displaying the seller_id. GPT also gave another solution:

    WITH s AS (
        SELECT seller_id, SUM(price) AS total
        FROM Sales
        GROUP BY seller_id
    )
    SELECT seller_id FROM s
    WHERE total = (SELECT MAX(total) FROM s);

Drill Type:

πŸ’» Implement

Read through notes and re-solve

Solutions:

3816. Lexicographically Smallest String After Deleting Duplicate Characters

Leetcodeβ€’Hardβ€’5/10
πŸ“š Monostack (core) 6/10

Problem Summary:

We have a string of letters. We can repeatedly pick 2 occurrences of the same letter and delete one of them. Find the lexicographically smallest resulting string.

Notes:

Hard hard hard. I am stronger tomorrow.

Drill Type:

πŸ’» Implement

Solve it yourself

Digit Queries

CSESβ€’2/10
πŸ“š Math (core) 1/10

Problem Summary:

Consider the infinite string of numbers 123456789101112... Answer q <= 1000 queries: what is the k-th digit, for k <= 1e18?

Notes:

Find how many full numbers of each digit width we skip past, then index into the number we land in; basically implementation practice.
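A sketch of that walk (1-indexed k; there are 9 numbers of width 1, 90 of width 2, 900 of width 3, ...):

```python
def digit_at(k):
    # k is the 1-indexed position in "123456789101112..."
    width = 1      # digit width of the current block of numbers
    count = 9      # how many numbers have this width
    start = 1      # first number of this width
    while k > width * count:
        k -= width * count          # skip the whole block
        width += 1
        count *= 10
        start *= 10
    num = start + (k - 1) // width  # which number we land in
    return str(num)[(k - 1) % width]
```

Each query is O(number of widths) = O(18) for k <= 1e18.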

Drill Type:

πŸ’» Implement

Practice implementation, just free-style solve it

3811. Number of Alternating XOR Partitions

Leetcodeβ€’Mediumβ€’6/10
πŸ“š Lop Off Method (core) 7/10β€’πŸ“š Dynamic Programming (core) 6/10β€’Mod Math (mention) 1/10β€’Bit Operations (core) 3/10

Problem Summary:

Given an array of numbers N <= 1e5 A[i] <= 1e9, and target1 and target2, find how many partitions of the array exist where each block XOR is target1, then target2, then target1, repeating.

Notes:

This is so hard. We will track endA and endB default dicts. endA[x] tells us the # of ways to partition a prefix where the last block XORs to target1 and the XOR of the entire prefix is x, as we enumerate the array. Initially endB[0] = 1. As we add a new element we get a new prefix XOR p. We want the newest (suffix) block to equal target1 or target2, which means looking back at some prior prefix: if we want the new block to be target2, we look at the number of ways that previously ended with a target1 block. But not just any previous ending, we use the prefix XOR to establish which ones qualify (a block from j+1..i XORs to target2 exactly when the prefix XOR at j equals p ^ target2). The ending of the solution is tricky too, because we return addOne + addTwo computed specifically at index n-1, counting only partitions that actually end there.
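The transition described above can be sketched as follows (endA/endB follow the notes; the 1e9+7 modulus is my assumption about the problem's output format):

```python
from collections import defaultdict

MOD = 10**9 + 7   # assumed modulus

def count_partitions(a, t1, t2):
    # endA[x]: # of ways to split a prefix with total XOR x, last block == t1
    # endB[x]: same but last block == t2; endB[0] = 1 is the empty prefix,
    # so the first real block is forced to be t1
    endA = defaultdict(int)
    endB = defaultdict(int)
    endB[0] = 1
    p = 0
    addA = addB = 0
    for v in a:
        p ^= v
        # a block ending here with XOR t1 needs a prior prefix XOR of p ^ t1
        addA = endB[p ^ t1] % MOD
        addB = endA[p ^ t2] % MOD
        endA[p] = (endA[p] + addA) % MOD
        endB[p] = (endB[p] + addB) % MOD
    # only partitions whose last block ends at n-1 count
    return (addA + addB) % MOD
```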

Drill Type:

πŸ’» Implement

Re-solve this hard problem

3812. Minimum Edge Toggles on a Tree

Leetcodeβ€’Hardβ€’3/10
πŸ“š Kahn's Algorithm (core) 5/10β€’Tree (core) 4/10

Problem Summary:

We have an undirected tree where every node is 0 or 1, and a desired target list of 0s and 1s for each node. In an operation we can toggle both ends of an edge. Return a minimum # of edge flips or show it is impossible.

Notes:

I thought of a toposort, basically a leaf node must toggle if it needs to be switched. We could do a postorder DFS but toposort made sense to me in the moment (basically the same thing). This was a unique application of a topo sort because it was on an undirected tree so the implementation was a bit different. And a good lesson was to get the topo order first then process it after.
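A sketch of the leaves-first processing (I use a DFS-order stack and reverse it rather than an explicit Kahn queue, which has the same effect on a tree; node 0 as root and the cur/target 0-1 lists are my naming):

```python
def min_toggles(n, edges, cur, target):
    # a leaf's mismatch can only be fixed by its parent edge, so process
    # leaves inward: each non-root node forces a toggle or not
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    par = [-1] * n
    order = []
    seen = [False] * n
    seen[0] = True
    stack = [0]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if not seen[w]:
                seen[w] = True
                par[w] = v
                stack.append(w)
    cur = cur[:]
    flips = 0
    for v in reversed(order):        # children before parents, root last
        if v == 0:
            continue
        if cur[v] != target[v]:      # forced: toggle the edge to the parent
            cur[v] ^= 1
            cur[par[v]] ^= 1
            flips += 1
    return flips if cur[0] == target[0] else -1
```

Each non-root decision is forced, so the count is minimal; only the root can be left over, which is the impossibility check.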

Drill Type:

πŸ’» Implement

Read my notes and also re-implement my weird topo sort

197. Rising Temperature

Leetcodeβ€’Easyβ€’1/10
πŸ“š Database (core) 1/10

Problem Summary:

We have a database with columns ID, date, temperature. Find all rows with a temperature bigger than the previous day.

Notes:

Join the table with itself shifted by 1, select when the temperature is bigger.

Drill Type:

πŸ’» Implement

Practice a basic join

Solutions:

196. Delete Duplicate Emails

Leetcodeβ€’Easyβ€’1/10
πŸ“š Database (core) 1/10

Problem Summary:

We have a table with column id, and email address. Keep only the smallest id per email address (use a DELETE).

Notes:

Mental model for deletes (DELETE + USING in PostgreSQL):

DELETE removes rows from ONE target table:

    DELETE FROM Person;

Meaning: Person is the only table rows can be deleted from. SQL scans Person and deletes rows (all rows, since there is no condition).

DELETE ... WHERE ... = delete rows that match a condition:

    DELETE FROM Person WHERE email = 'john@example.com';

Meaning: iterate over each row in Person; if the WHERE condition is true for that row, delete it. Only Person rows are deleted.

DELETE ... USING ... = delete from the target table, but allow references to other tables. General form:

    DELETE FROM target_table
    USING other_tables
    WHERE condition involving target_table and other_tables;

Key ideas:
- Rows are deleted ONLY from target_table
- USING tables are READ-ONLY helpers
- USING allows join-like comparisons while deciding which target rows to delete

Think: "For each row in target_table, check if there exists a matching row in the USING tables that satisfies the WHERE condition."

Drill Type:

πŸ’» Implement

Practice delete and review mental model

Solutions:

619. Biggest Single Number

Leetcodeβ€’Easyβ€’1/10
πŸ“š Database (core) 1/10

Problem Summary:

We have a table MyNumbers with a column nums. Find the largest number appearing once or return null.

Notes:

First we can SELECT on MyNumbers with GROUP BY num to collapse duplicate numbers, then filter with HAVING COUNT(num) = 1. This returns a table of all numbers with frequency 1; we subquery that to get the max (MAX over an empty set is NULL, which covers the "return null" case).

Drill Type:

πŸ’» Implement

Review notes for how GROUP BY relates to SELECT and COUNT: rows β†’ groups, and then aggregates like MIN are defined on groups. Think of it as two phases.

Phase A: form groups (buckets). GROUP BY email partitions the table into buckets:
- bucket "john@…" contains rows with ids {1, 3}
- bucket "bob@…" contains rows with ids {2}
At this point you do not yet have output rows.

Phase B: produce one output row per bucket. Now SQL evaluates the SELECT list once per bucket:
- email = the bucket label (same for the whole bucket)
- MIN(id) = take all id values in the bucket and compute the minimum
So for john's bucket: MIN({1, 3}) = 1. That's why it's "safe": you've told SQL how to collapse many ids into one value per group.

Why "conflict" isn't quite right: it's not that SQL "tries and then conflicts and then fixes it." It's that after grouping, you're no longer allowed to ask for raw row columns (id) unless they're part of the group key, or computed from the group via an aggregate (MIN, MAX, COUNT, …).

Solutions:

C. Divine Tree

Codeforcesβ€’1400β€’5/10
πŸ“š Constructive Algorithms (core) 5/10β€’Queue / Deque (secondary) 4/10β€’Simulation (secondary) 4/10β€’Tree (secondary) 3/10β€’πŸ“š Greedy Summing all numbers 1 to N (core) 4/10β€’πŸ“š Reflection Trick (core) 4/10

Problem Summary:

We have N <= 1e5 nodes labeled 1...N and are given M <= 1e12. The divine-ness of a node is the minimum value on the path from that node to the root. Can we form a tree whose total divine-ness over all nodes is M? If so return the edges.

Notes:

First note a tree rooted at 1 has the minimum total divine-ness, which is N (every path to the root passes through 1), so below that is impossible. A max divine tree is a descending chain like 7->6->5->4..., which gives n * (n + 1) / 2, so we cannot make more than that.

Start with a minimal arrangement: 1>2>3>4>5. To increase by 1 we can rotate: 2>1>3>4>5. Again: 2>3>1>4>5. Eventually 1 gets put at the back: 2>3>4>5>1. We can do it again with 2, eventually reaching 3>4>5>2>1. I simulated this with a deque.

We can also think of a baseline where we score 1 per node, leaving P = m - n remaining to score. Any time we put a number X before 1 we gain X - 1 extra (if we put them in descending order, like 6 5 4 1 we gain 5 + 4 + 3). So we can greedily take the biggest numbers whose gains sum to the remaining target and put everything else after the 1.
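A sketch of the greedy construction (returns (root, edges) or None; the descending-chain-above-1 layout follows the notes, the exact return format is my own choice):

```python
def divine_tree(n, m):
    # total divine-ness is achievable only in [n, n*(n+1)/2]
    if m < n or m > n * (n + 1) // 2:
        return None
    rem = m - n                      # extra beyond the baseline 1 per node
    chain = []                       # values placed above 1, descending
    for x in range(n, 1, -1):        # putting x before 1 gains x - 1
        if x - 1 <= rem:
            chain.append(x)
            rem -= x - 1             # greedy on distinct gains 1..n-1 hits 0
    edges = []
    prev = None
    for v in chain:                  # chain[0] -> chain[1] -> ... -> 1
        if prev is not None:
            edges.append((prev, v))
        prev = v
    if prev is not None:
        edges.append((prev, 1))
    root = chain[0] if chain else 1
    used = set(chain) | {1}
    for v in range(2, n + 1):        # everything else hangs off 1, scoring 1
        if v not in used:
            edges.append((1, v))
    return root, edges
```

Chain node x contributes x (its path to the root is descending), node 1 and everything below it contribute 1, so the total is n + sum of (x - 1) over the chain.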

Drill Type:

πŸ’» Implement

Implement wice's solution

E2. Erase and Extend (Hard Version)

Codeforcesβ€’2200β€’6/10
πŸ“š String Algorithms (core) 6/10

Problem Summary:

We have a string s of size <= 5e5. We need to form a string of size k <= 5e5. We can delete the last letter of s, or do s = s + s. After any number of operations, find the lexicographically smallest string of length k we can form.

Notes:

First read the E1 solution. Then here is the greedy idea (there may also be other solutions, like the z-function, but I'm just listing this greedy one and putting it under string algorithms; don't really consider this constructive).

We enumerate prefixes greedily. Say we have some prefix "db" and are considering whether "dbc" repeated is better. That c compares against the start of our prefix, d, so it is better. If we were at "db" and considering "f", clearly "db" + "d" is better than "dbf", and we fail. If the letter is equal, like "dbca" considering "d" (in the test "dbcadabc"), we don't know which sequence is better yet. We keep "dbca" as the best but accept the new "d" as an option and keep going.

If we keep going a lot, say our initial prefix is |AB| and then CDEFGHIJK... is all being marked equal, we don't need to mod our "indexToConsider" back to the beginning, I think because the equal run guarantees the CDEFGH... portion is just ABABAB repeated.
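The greedy with a non-wrapping compare pointer might look like this (j plays the role of the notes' "indexToConsider"):

```python
def smallest_string(s, k):
    best = 1          # length of the best prefix found so far
    j = 0             # compare pointer into the current best candidate
    for i in range(1, len(s)):
        if s[i] < s[j]:      # extending through i beats repeating the prefix
            best = i + 1
            j = 0
        elif s[i] == s[j]:   # undecided: keep best, keep comparing
            j += 1
        else:
            break            # repeating the current best prefix wins forever
    rep = s[:best] * (k // best + 1)
    return rep[:k]
```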

Drill Type:

πŸ’» Implement

Solve this

Planets Queries I

CSESβ€’4/10
πŸ“š Binary Lifting On Nodes (core) 1/10β€’Graphs (secondary) 2/10

Problem Summary:

You have N planets, each with a teleporter to some other planet. We have Q queries: starting from a planet, which planet do you end up on after K teleports? N, Q <= 2e5, K <= 1e9.

Notes:

Binary lifting
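A minimal binary-lifting sketch (0-indexed planets; LOG = 30 covers k <= 1e9, since 2^30 > 1e9):

```python
LOG = 30   # 2^30 > 1e9, enough doubling levels for any query

def build_lift(nxt):
    # up[j][v] = planet reached from v after 2^j teleports
    n = len(nxt)
    up = [nxt[:]]
    for j in range(1, LOG):
        prev = up[-1]
        up.append([prev[prev[v]] for v in range(n)])
    return up

def teleport(up, v, k):
    # compose the set bits of k
    for j in range(LOG):
        if (k >> j) & 1:
            v = up[j][v]
    return v
```

Build is O(n log K), each query is O(log K).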

Drill Type:

πŸ’» Implement

Implement a solution

1706. Where Will the Ball Fall

Leetcodeβ€’Mediumβ€’4/10
πŸ“š Dynamic Programming (core) 4/10

Problem Summary:

We have a 2d grid and we drop balls, it's like plinko sort of (see picture). For each starting location at the top where does it fall.

Notes:

I used dp(row, column, quadrant in square, moves at this row). Moves at this row prevents us from cycling back and forth forever.

Drill Type:

πŸ’» Implement

solve it

Movie Festival II

CSESβ€’5/10
πŸ“š Tricky Sorting Questions (core) 5/10β€’Sorted Containers (core) 2/10

Problem Summary:

We have movies with [L, R] times and K people. What is the maximum total number of movies watched, if each person can only watch non-overlapping movies?

Notes:

First thought might be: store a heap of R values (ending times of people currently watching movies), sort movies by R, and when we get to one, pop all that are <= L. See this test with k=2: (1, 50) (1, 60) (100, 150) (40, 200). We would assign 2 people to 50 and 60, pop both for 150, but then we cannot actually attend 200. Sorting by L doesn't work either, because we cannot determine whether an L is worth watching or whether we should save the person.

Instead imagine K lanes of people. Sort by R to greedily consume early, and assign each movie to the lane with the latest ending time that is <= L. I sorted by R, and if there is a tie on R it doesn't matter: since we always assign to the largest free ending, any order works. I don't think another sort works (like, on a tie in R, always putting the leftmost L first and assigning to the smallest free ending instead of the biggest <= L, because a later movie with a slightly bigger R might demand a small L that we filled up).
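A sketch of the K-lanes greedy, assuming back-to-back watching is allowed (end time <= next start). I use bisect.insort on a plain list, which is O(n) per insert; a sorted container would make it O(log n), but the greedy itself is the point:

```python
import bisect

def max_movies(movies, k):
    # sort by end time; give each movie to the free lane whose last
    # movie ended latest (but still <= this movie's start)
    movies = sorted(movies, key=lambda m: m[1])
    ends = [float("-inf")] * k             # sorted end times of the k lanes
    count = 0
    for l, r in movies:
        i = bisect.bisect_right(ends, l)   # lanes with end <= l are free
        if i == 0:
            continue                       # nobody is free in time, skip it
        ends.pop(i - 1)                    # take the latest-ending free lane
        bisect.insort(ends, r)
        count += 1
    return count
```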

Drill Type:

πŸ’» Implement

remember the idea and read notes and implement

Traffic Lights

CSESβ€’4/10‒⭐ Great Problem
Coordinate Compression (mention)β€’πŸ“š Interval Tree (core) 2/10β€’πŸ“š Points versus ranges trickiness (core) 5/10β€’πŸ“š Sparse Segment Tree (core) 4/10

Problem Summary:

We have a road from 0...x, and every second a light turns on at one point on the road; after each light, what is the longest unlit passage? Note the lights are points while passages are ranges: if positions are [0, 1, 2, 3] and we turn on 0 and 2, 0-2 is still a valid unlit passage.

Notes:

First we can use a sparse seg tree with point lights and a range "longest contiguous run of 0s" query, by storing pref, suff, max, and width inside the nodes. A lot of lessons with lazy node practice: create both children every time (such as in pointUpdate), use pointers safely, and don't store nodes in a vector, it is so easy to make bugs. Also, in the pointUpdate function of a sparse seg tree, as we descend and create nodes we can create them with dummy values (see this Python solution (TLE): https://cses.fi/problemset/result/15936995/) because as we recurse down, those nodes will get pulled anyway. But the child we don't recurse into has to be made correct, since we will pull on it. With a vector of nodes and index "pointers" it was so hard: pushing to a vector and doing Node& node = tree[i] causes issues because the vector resizes and tree[i] diverges from the reference. Inside a pull, since we already have the node, update its values directly so parent pointers that point down to us stay valid.

Okay, now for the query: this is one of those weird point-vs-range index tricks. If we have 0 1 2 3 (4) 5 (6) 7, where () means lit, our longest passage is 4, from 0-4, even though 4 is lit. There are some edge cases with boundaries being lit and whatnot too. But my point is we cannot just take the longest range of unlit points and add 1. The 5 area has an unlit passage of 2 because both its boundaries are lit, but 7 has an unlit passage of 1 because it sits at the edge. The easiest implementation is to light 0 and x at the start, and then everything works.

I also did a multiset interval-tree-style solution, but instead of storing ranges we just store the cut positions; it makes the implementation easy and is good practice. Also, I think we could do a dense segment tree with compression, but it seems difficult to manage the widths of things.
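The cut-multiset idea can be sketched with a sorted list of cuts plus a lazy-deletion max-heap of gap widths (assuming added positions are distinct and strictly inside (0, x); the O(n) list insert is fine for a sketch):

```python
import bisect
import heapq
from collections import Counter

def traffic_lights(x, adds):
    cuts = [0, x]                # lighting 0 and x up front fixes the edges
    gaps = Counter({x: 1})       # multiset of current gap widths
    heap = [-x]                  # max-heap of widths, with lazy deletion
    out = []
    for p in adds:
        i = bisect.bisect_left(cuts, p)
        a, b = cuts[i - 1], cuts[i]
        cuts.insert(i, p)        # O(n) insert; a balanced container is O(log n)
        gaps[b - a] -= 1         # the old gap splits into two
        for g in (p - a, b - p):
            gaps[g] += 1
            heapq.heappush(heap, -g)
        while gaps[-heap[0]] <= 0:   # discard stale heap entries
            heapq.heappop(heap)
        out.append(-heap[0])
    return out
```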

Drill Type:

πŸ’» Implement

Implement a sparse seg tree (the tricky part is around the pointRecurse function), use pointers in C++ for the seg tree, remember the point-versus-span weirdness in this question, and also implement the inverted interval tree

Tree Distances I

CSESβ€’5/10‒⭐ Great Problem
πŸ“š Rerooting DP (core) 6/10

Problem Summary:

find the max distance to another node for all nodes

Notes:

Tree rerooting DP. First compute the answer for a max dist below for all nodes, when rooted at 0. Next we reroot. Iterate over adjacent children to solve for them. The max distance for a child is either its down distance (computed in dfs 1), 1 + max distance going up from our node (we will maintain an up array we solve for in dfs2), or 1 + max distance going down, from our node, but not going through this child. To compute that we need the best 2 downward paths (computed in step 1).
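The two-pass rerooting above might look like this (down1/down2 are the best and second-best downward depths, via remembers which child achieves the best; iterative DFS order avoids recursion limits — all naming mine):

```python
def tree_distances(n, edges):
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    par = [-1] * n
    order = []
    seen = [False] * n
    seen[0] = True
    stack = [0]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if not seen[w]:
                seen[w] = True
                par[w] = v
                stack.append(w)
    down1 = [0] * n   # deepest path going down from v
    down2 = [0] * n   # second deepest, through a different child
    via = [-1] * n    # child achieving down1
    for v in reversed(order):               # children before parents
        for w in adj[v]:
            if w == par[v]:
                continue
            d = down1[w] + 1
            if d > down1[v]:
                down2[v], down1[v], via[v] = down1[v], d, w
            elif d > down2[v]:
                down2[v] = d
    up = [0] * n      # deepest path leaving v through its parent
    for v in order:                         # parents before children
        for w in adj[v]:
            if w == par[v]:
                continue
            best_other = down2[v] if via[v] == w else down1[v]
            up[w] = 1 + max(up[v], best_other)
    return [max(down1[v], up[v]) for v in range(n)]
```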

Drill Type:

πŸ’» Implement

Solve with reroot

Tree Distances II

CSESβ€’5/10‒⭐ Great Problem
πŸ“š Rerooting DP (core) 2/10

Problem Summary:

for each node in a tree, find the sum of distances to all other nodes

Notes:

First compute the size of each subtree (rooted at 0), and alongside it the sum of distances from each node to everything below it. For the root 0 that sum is already the answer (distance to all other nodes). To transition we run a second dfs and iterate over the children of a node: the child's answer is the parent's answer, but all nodes in the child's subtree move 1 closer and all other nodes move 1 further.
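A sketch of that rerooting (size/down arrays and the iterative order are my naming):

```python
def tree_dist_sums(n, edges):
    adj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    par = [-1] * n
    order = []
    seen = [False] * n
    seen[0] = True
    stack = [0]
    while stack:
        v = stack.pop()
        order.append(v)
        for w in adj[v]:
            if not seen[w]:
                seen[w] = True
                par[w] = v
                stack.append(w)
    size = [1] * n
    down = [0] * n    # sum of distances from v to nodes in its subtree
    for v in reversed(order):               # children before parents
        for w in adj[v]:
            if w != par[v]:
                size[v] += size[w]
                down[v] += down[w] + size[w]
    ans = [0] * n
    ans[0] = down[0]
    for v in order:                         # parents before children
        for w in adj[v]:
            if w != par[v]:
                # rerooting v -> w: size[w] nodes get 1 closer,
                # the other n - size[w] nodes get 1 farther
                ans[w] = ans[v] - size[w] + (n - size[w])
    return ans
```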

Drill Type:

πŸ’» Implement

Implement reroot dp

Hotel Queries

CSESβ€’4/10
πŸ“š Segment Tree (core) 3/10

Problem Summary:

support point update and query leftmost >= X

Notes:

use a max seg tree and walk, could use sqrt decomp technically
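A sketch of the walk (the tree is padded to a power of two so the root-to-leaf descent is simple; 1-indexed hotel numbers in the output are my assumption about the CSES format):

```python
class MaxSeg:
    def __init__(self, a):
        n = 1
        while n < len(a):
            n *= 2                          # pad so the descent is clean
        self.n = n
        self.t = [0] * (2 * n)
        self.t[n:n + len(a)] = a
        for i in range(n - 1, 0, -1):
            self.t[i] = max(self.t[2 * i], self.t[2 * i + 1])

    def update(self, i, v):
        i += self.n
        self.t[i] = v
        while i > 1:
            i //= 2
            self.t[i] = max(self.t[2 * i], self.t[2 * i + 1])

    def leftmost_ge(self, x):
        # walk down, preferring the left child whenever its max >= x
        if self.t[1] < x:
            return -1
        i = 1
        while i < self.n:
            i = 2 * i if self.t[2 * i] >= x else 2 * i + 1
        return i - self.n

def hotel_queries(rooms, groups):
    rooms = rooms[:]
    st = MaxSeg(rooms)
    out = []
    for g in groups:
        i = st.leftmost_ge(g)
        if i == -1:
            out.append(0)                   # no hotel fits this group
        else:
            rooms[i] -= g
            st.update(i, rooms[i])
            out.append(i + 1)               # 1-indexed hotel number
    return out
```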

Drill Type:

πŸ’» Implement

Implement the seg tree walk

Distinct Values Queries II

CSESβ€’6/10‒⭐ Great Problem
πŸ“š Segment Tree (core) 7/10β€’Sorted Containers (core) 4/10β€’Merge Sort Treeβ€’Segment Tree Scanning Weirdness

Problem Summary:

Support point updates and range queries [L, R] "are all numbers distinct"

Notes:

We can keep a nxt[i] array which points to the next occurrence of that number, or N if there are no further ones. Build a range-min seg tree on this. If the min over [L, R] is > R, all values are distinct. To support updates we need sorted containers of indices per value to do the bookkeeping. We could also compute the # of distinct values with a merge sort tree on the nxt pointers, finding how many values are > R in a range. I think fractional cascading could speed up the merge sort tree when using vectors, and I think order statistic containers could allow updates, but not sure... Tin said there is a 2d structure like a Fenwick tree on a persistent segment tree, not even going to tag this. He said you can range set and count distinct too.
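The static core of this (the nxt array plus the "min over [L, R] is > R" check) can be sketched as follows, with a plain min() standing in for the range-min seg tree:

```python
def build_nxt(a):
    # nxt[i] = index of the next occurrence of a[i], or n if none
    n = len(a)
    nxt = [n] * n
    last = {}
    for i in range(n - 1, -1, -1):
        nxt[i] = last.get(a[i], n)
        last[a[i]] = i
    return nxt

def all_distinct(nxt, l, r):
    # every value in [l, r] is distinct iff no element's next occurrence
    # also lands inside [l, r]; min() here is an O(n) stand-in for the
    # range-min segment tree that makes this O(log n)
    return min(nxt[l:r + 1]) > r
```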

Drill Type:

πŸ’» Implement

Implement this nightmare

Distinct Values Queries

CSESβ€’4/10‒⭐ Great Problem
πŸ“š Coordinate Compression (core) 3/10β€’πŸ“š Mo's Algorithm (core) 3/10β€’πŸ“š Persistent Segment Tree (core)β€’Merge Sort Treeβ€’πŸ“š Segment Tree Scanning Weirdness (core) 6/10

Problem Summary:

Support offline [L, R] distinct # of values queries

Notes:

Mo's to sort queries. A hashmap tracking frequencies as we slide TLEs due to the constant factor inside the hot path. Instead, we compress values to indices and store them in an array: compressed[i] is just the compressed index of arr[i]. We maintain an active-count field and increment or decrement it as frequencies hit 1 or 0. This constant optimization passes.

Also, when we do Mo's we should expand both boundaries first, otherwise l could pass r and we start subtracting elements from the range. This would be undone when we expand r, but the bookkeeping (for instance checking things have frequency 0 and deleting them from a hashmap) might get bugged.

We can also support online queries in log N with a persistent segment tree. We could also get the # distinct with a merge sort tree on nxt[i] pointers, searching how many values are > R; I think fractional cascading can speed this up, and we could maybe even support updates with order statistic containers.

We could also do offline scanning weirdness: make a first[i] array which is 1 if i is the first occurrence of a number, otherwise 0. Sort queries by L. As we increment L we lose numbers out the left, and we need to update, inside our window, the new first occurrence of that number (can be done with nxt array bookkeeping). To get the # of first occurrences in our window, use Fenwick tree diffing to query the sum over l...r.
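A sketch of Mo's with the compressed-array and active-count optimizations (block size sqrt(n) and the even/odd snake ordering of r are conventional choices, not from the notes; queries are 0-indexed inclusive pairs):

```python
from math import isqrt

def mos_distinct(a, queries):
    n = len(a)
    # compress values so the hot path indexes a flat array, not a hashmap
    comp = {v: i for i, v in enumerate(sorted(set(a)))}
    c = [comp[v] for v in a]
    block = max(1, isqrt(n))
    order = sorted(
        range(len(queries)),
        key=lambda q: (queries[q][0] // block,
                       queries[q][1] if (queries[q][0] // block) % 2 == 0
                       else -queries[q][1]))
    freq = [0] * len(comp)
    cur = 0                      # active count of distinct values in window
    l, r = 0, -1
    ans = [0] * len(queries)
    for qi in order:
        ql, qr = queries[qi]
        while l > ql:            # expand first so l never crosses r
            l -= 1
            freq[c[l]] += 1
            if freq[c[l]] == 1:
                cur += 1
        while r < qr:
            r += 1
            freq[c[r]] += 1
            if freq[c[r]] == 1:
                cur += 1
        while l < ql:
            freq[c[l]] -= 1
            if freq[c[l]] == 0:
                cur -= 1
            l += 1
        while r > qr:
            freq[c[r]] -= 1
            if freq[c[r]] == 0:
                cur -= 1
            r -= 1
        ans[qi] = cur
    return ans
```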

Drill Type:

πŸ’» Implement

Just implement mo's smoothly along with the other needed optimizations

3762. Minimum Operations to Equalize Subarrays

Leetcodeβ€’Hardβ€’6/10‒⭐ Great Problem
πŸ“š Segment Tree on Values (secondary) 4/10β€’πŸ“š Persistent Segment Tree (core)β€’πŸ“š Median Distances (core) 3/10β€’πŸ“š Square Root Decomp (core)β€’Two Rebalanced Windows (secondary)β€’πŸ“š Mo's Algorithm (core) 7/10β€’πŸ“š Sparse Table (secondary) 6/10β€’πŸ“š Fenwick Trees (core) 3/10

Problem Summary:

Support range queries [L, R]. Given an operation where we can change A[i] by +-k, find minimum # of operations in a range to make every number equal, or -1 if not possible.

Notes:

If all numbers in a range do not have the same remainder mod K, it is not doable. We could build a remainder array and compare the max and min with a sparse table, or many other solutions (Mo's + max/min using the two-stacks trick or a monodeque, Mo's + hashmap of unique values, # of distinct elements query online or offline on the remainder array, etc.).

Otherwise we need the median, the sum of elements <= the median, and the sum >= the median, and we set everything to the median. First idea: sliding-window multisets with Mo's, but it is O(n * rootN * logN) and TLEs because of overhead. Mo's + Fenwick/seg tree is also O(n * rootN * logN) but passed with some optimizations. Some notes:
- A Fenwick or seg tree can be indexed by value, not position, and we add and remove values as we slide. We find the k-th element (store counts in nodes and do a tree walk) to get the median, and the sum of the first k elements if we also store sums in nodes. We could also support queries like "sum of numbers in [lowerBound, upperBound]" by diffing prefixes or using a seg tree. We also need coordinate compression: compress all array values and binary search query values to the nearest compressed value, or compress both. Basically a few ways to do things.
- To AC I used these optimizations: Hilbert curve ordering, Fenwicks instead of a seg tree, and finding the k-th element (median) plus the sum of the first k, then O(1) to get the top-half sum and count, so we only do 2 log operations per Mo step.

We can also use an online range query for the median and an online range sum <= median using a persistent seg tree for n log n. Tin has some sqrt decomp code that seems to support point updates, range median queries, and range sum queries over a value range x <= ? <= y too, but I'm not sure how it works: https://leetcode.com/problems/minimum-operations-to-equalize-subarrays/submissions/1844480184/ I bet we can do a merge sort tree with prefix sums too or something crazy.
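The per-range answer itself (feasibility mod k, then total distance to the median, divided by k) is easy to state in isolation. This is a brute-force per-query reference, not the Mo's + Fenwick solution from the notes:

```python
def equalize_cost(vals, k):
    # vals: the numbers inside one query range; O(m log m) per query
    if len({v % k for v in vals}) > 1:
        return -1                    # mismatched remainders: impossible
    s = sorted(vals)
    med = s[len(s) // 2]             # any value between the medians works
    # every |v - med| is a multiple of k since all remainders match
    return sum(abs(v - med) // k for v in s)
```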

Drill Type:

πŸ’» Implement

Implement something and also read through notes and different ideas

F. Cherry Tree

Codeforcesβ€’5/10‒⭐ Great Problem
πŸ“š Tree Convolution (core) 5/10

Problem Summary:

The leaves of a tree each have a cherry. We can shake a subtree to drop all cherries in it, but if a shake includes a leaf that already dropped its cherry, the tree breaks. Can we clear all cherries with a number of shakes divisible by 3?

Notes:

Standard tree convolution where we look at children and merge them together, but the implementation is a bit more unintuitive here. We normally have beforeChild and afterChild states and initialize beforeChild to an identity value, but the identity in this problem was confusing: [True, False, False], meaning we can clear all cherries in this tree with a remainder of 0 shakes. This is vacuously true: when we consider just the root node as the tree (before any children) there are no leaves inside yet, so we can clear this empty tree. To aggregate, we need to clear the new LHS + RHS and can update newDp[0], [1], and [2].

I ended up implementing it with no identity value, starting my beforeChild dp on the first child and looping over the second and further children. When merging in children we do need to update newDp[1], but this doesn't consider the option of shaking the entire subtree at the root; that is more of an edge case handled afterwards, and we always update newDp[1] at the very end. There is also a second editorial solution I didn't understand.

Drill Type:

πŸ’» Implement

I should understand the idea of the identity value in tree convolution, what the before- and after-child states mean exactly for this problem, and how shaking the entire tree is unrelated to the convolution