DataFrame employees
+-------------+--------+
| Column Name | Type |
+-------------+--------+
| name | object |
| salary | int |
+-------------+--------+
A company intends to give its employees a pay rise.
Write a solution to modify the salary column by multiplying each salary by 2.
The result format is in the following example.
Example 1:
Input: DataFrame employees +---------+--------+ | name | salary | +---------+--------+ | Jack | 19666 | | Piper | 74754 | | Mia | 62509 | | Ulysses | 54866 | +---------+--------+ Output: +---------+--------+ | name | salary | +---------+--------+ | Jack | 39332 | | Piper | 149508 | | Mia | 125018 | | Ulysses | 109732 | +---------+--------+ Explanation: Every salary has been doubled.
The #2884 Modify Columns problem focuses on manipulating values column-wise in a grid or matrix. Since the operation is applied per column, the key idea is to iterate through each column independently while maintaining the constraints required by the problem.
A common approach is to traverse the matrix column by column and analyze the values present in that column. Depending on the requirement (such as making elements equal, ordered, or matching a specific condition), you can compute the minimal number of modifications needed. Using simple counting techniques or frequency tracking with structures like arrays or hash maps often helps determine the optimal value for that column.
Because each cell is typically visited once, the algorithm remains efficient. The overall complexity is generally O(m × n), where m is the number of rows and n is the number of columns, while the extra space required is minimal.
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Column-wise traversal with counting | O(m × n) | O(1) or O(n) |
Techmicrosoft
Use these hints if you're stuck. Try solving on your own first.
Considering multiplying each salary value by 2, using a simple assignment operation. The calculation of the value is done column-wise.
This approach uses a loop to iterate through each row of the DataFrame and manually updates the 'salary' column by multiplying it by 2. This method is straightforward and helps understand the manipulation of each data point.
Time Complexity: O(n), where n is the number of employees.
Space Complexity: O(1), only a constant space is used for temporary variables.
1import pandas as pd
2
3data = {'name': ['Jack', 'Piper', 'Mia', 'Ulysses'],
4 'salary': [19666, 74754, 62509, 54866]}
5employees = pd.DataFrame(data)
6
7for index, row in employees.iterrows():
8 employees.at[index, 'salary'] = row['salary'] * 2
9
10print(employees)In this Python solution, we use a for loop to iterate over the DataFrame rows using the iterrows() method. For each row, we update the 'salary' column by accessing the current index and multiplying the current value by 2. This changes the original DataFrame in place.
This approach leverages DataFrame vectorized operations, which are optimized and efficient for operations on entire columns without needing explicit loops.
Time Complexity: O(n), where n is the number of employees. This is performed efficiently due to the vectorized operation.
Space Complexity: O(1), as modifications are done in place.
1import pandas as pd
2
3data = {'name': ['Jack', 'Piper',
Watch expert explanations and walkthroughs
Jot down your thoughts, approach, and key learnings
While the exact problem may not always appear, matrix traversal and column-based manipulation are common themes in technical interviews. Practicing problems like Modify Columns helps build strong fundamentals for grid and array-based questions.
Arrays or hash maps are commonly used to count frequencies or track values within a column. These structures help quickly determine the best value to transform a column into while minimizing operations.
The optimal approach is to process the matrix column by column and determine the minimal changes needed for each column. By analyzing column values and applying counting or frequency tracking, you can compute the required modifications efficiently.
Most solutions run in O(m × n) time because each cell in the matrix is visited at least once. Since the work per cell is constant, the algorithm remains efficient even for larger grids.
In this Python solution, we utilize the DataFrame's ability to perform vectorized operations. Here, we directly multiply the entire 'salary' column by 2 using employees['salary'] *= 2. This operation is highly efficient and modifies the DataFrame in place.