Modify Columns (easy-level problem)
DataFrame employees
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| name        | object |
| salary      | int    |
+-------------+--------+
A company intends to give its employees a pay rise.
Write a solution to modify the salary column by multiplying each salary by 2.
The result format is in the following example.
Example 1:
Input:
DataFrame employees
+---------+--------+
| name    | salary |
+---------+--------+
| Jack    | 19666  |
| Piper   | 74754  |
| Mia     | 62509  |
| Ulysses | 54866  |
+---------+--------+
Output:
+---------+--------+
| name    | salary |
+---------+--------+
| Jack    | 39332  |
| Piper   | 149508 |
| Mia     | 125018 |
| Ulysses | 109732 |
+---------+--------+
Explanation: Every salary has been doubled.
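Problem Overview: You are given a DataFrame employees with the columns name and salary, and you need to double every value in the salary column. The goal is to update that column efficiently while keeping the rest of the data unchanged. The approaches below are described for the general case of m target columns; in this problem m = 1.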
Approach 1: Loop and Update Method (O(n * m) time, O(1) extra space)
This approach iterates through each target column and updates its values row by row. You loop over the list of column names, access the column, and apply the transformation directly. The key idea is explicit iteration: read a value, compute the updated value, and write it back. Time complexity is O(n * m), where n is the number of rows and m is the number of columns being modified. Space complexity stays O(1) since the updates happen in place. This approach is straightforward and easy to debug, but every cell access goes through the Python interpreter, so it becomes slow on large datasets.
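The read, compute, write-back cycle described above can be sketched as follows. This is a minimal illustration, not a reference solution: the function name `modify_columns_loop` and its `columns` and `factor` parameters are hypothetical, introduced only for this example.

```python
import pandas as pd


def modify_columns_loop(df: pd.DataFrame, columns: list[str], factor: int = 2) -> pd.DataFrame:
    """Multiply every value in the given columns by `factor`, one cell at a time."""
    for col in columns:          # m target columns
        for idx in df.index:     # n rows
            # Read the cell, compute the new value, write it back in place.
            df.at[idx, col] = df.at[idx, col] * factor
    return df


# Input taken from Example 1.
employees = pd.DataFrame({
    "name": ["Jack", "Piper", "Mia", "Ulysses"],
    "salary": [19666, 74754, 62509, 54866],
})
result = modify_columns_loop(employees, ["salary"])
print(result)
```

Scalar access through `DataFrame.at` keeps each step explicit, which is helpful when tracing the logic but is also the reason this version scales poorly.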
Approach 2: Vectorized Operations Method (O(n * m) time, O(1) extra space)
This method relies on vectorized operations provided by libraries like Pandas. Instead of iterating through rows manually, you select the target columns and apply the transformation to each entire column at once. Under the hood, the operation runs in optimized C-based routines, which avoids per-element interpreter overhead and makes it significantly faster than Python loops. The time complexity remains O(n * m), but the constant factors are much smaller due to vectorization. Space complexity stays O(1) only if the update is applied in place; an expression that builds a new column before assigning it back uses a temporary of size n.
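For this problem, the vectorized idea reduces to a single column-wide multiplication. The sketch below assumes the input matches the employees schema; the function name `modify_salary_column` is a hypothetical choice for this example.

```python
import pandas as pd


def modify_salary_column(employees: pd.DataFrame) -> pd.DataFrame:
    # One column-wide multiplication replaces the per-row loop.
    # Note: the right-hand side builds a new column before it is assigned back.
    employees["salary"] = employees["salary"] * 2
    return employees


# Input taken from Example 1.
employees = pd.DataFrame({
    "name": ["Jack", "Piper", "Mia", "Ulysses"],
    "salary": [19666, 74754, 62509, 54866],
})
doubled = modify_salary_column(employees)
print(doubled)
```

The same pattern extends to several numeric columns by selecting them with a list, for example `df[cols] = df[cols] * 2`.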
Vectorized operations are the standard way to work with tabular data structures and are widely used in data processing pipelines. Knowing when to replace explicit loops with vectorized expressions is a key performance skill in Python data workflows and with large array-like datasets.
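One way to see the difference in constant factors is to time both strategies on the same input. The sketch below uses synthetic data made up for the comparison, and the helper names `double_with_loop` and `double_vectorized` are hypothetical; absolute timings depend on the machine and the Pandas version, so no figures are quoted here.

```python
import timeit

import pandas as pd

# Synthetic data, created only for this comparison.
n_rows = 10_000
base = pd.DataFrame({"salary": range(n_rows)})


def double_with_loop(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()  # work on a copy so both runs see the same input
    for idx in out.index:
        out.at[idx, "salary"] = out.at[idx, "salary"] * 2
    return out


def double_vectorized(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    out["salary"] = out["salary"] * 2
    return out


loop_seconds = timeit.timeit(lambda: double_with_loop(base), number=1)
vector_seconds = timeit.timeit(lambda: double_vectorized(base), number=1)

# Both strategies must agree on the result; only the running time differs.
same_result = double_with_loop(base).equals(double_vectorized(base))
print(f"loop: {loop_seconds:.4f}s, vectorized: {vector_seconds:.4f}s, same result: {same_result}")
```

Increasing `n_rows` should widen the gap, since the loop pays interpreter overhead on every cell while the vectorized version pays it once per column.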
Recommended for interviews: Start with the loop-based approach to demonstrate the core logic and correctness. Then move to the vectorized solution to show familiarity with optimized data operations. Interviewers generally expect the vectorized approach when working with tabular structures because it is cleaner, faster, and closer to real-world data processing patterns.
| Approach | Time | Space | When to Use |
|---|---|---|---|
| Loop and Update Method | O(n * m) | O(1) | Good for understanding the transformation logic or when working in environments without vectorized support |
| Vectorized Operations Method | O(n * m) | O(1) | Best for large datasets and production code using Pandas or similar columnar libraries |