DataFrame players:
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| player_id   | int    |
| name        | object |
| age         | int    |
| position    | object |
| ...         | ...    |
+-------------+--------+
Write a solution to calculate and display the number of rows and columns of players.
Return the result as an array:
[number of rows, number of columns]
The result format is in the following example.
Example 1:
Input:
+-----------+----------+-----+-------------+--------------------+
| player_id | name     | age | position    | team               |
+-----------+----------+-----+-------------+--------------------+
| 846       | Mason    | 21  | Forward     | RealMadrid         |
| 749       | Riley    | 30  | Winger      | Barcelona          |
| 155       | Bob      | 28  | Striker     | ManchesterUnited   |
| 583       | Isabella | 32  | Goalkeeper  | Liverpool          |
| 388       | Zachary  | 24  | Midfielder  | BayernMunich       |
| 883       | Ava      | 23  | Defender    | Chelsea            |
| 355       | Violet   | 18  | Striker     | Juventus           |
| 247       | Thomas   | 27  | Striker     | ParisSaint-Germain |
| 761       | Jack     | 33  | Midfielder  | ManchesterCity     |
| 642       | Charlie  | 36  | Center-back | Arsenal            |
+-----------+----------+-----+-------------+--------------------+
Output:
[10, 5]
Explanation:
This DataFrame contains 10 rows and 5 columns.
To solve #2878 Get the Size of a DataFrame, the key idea is to determine how many rows and columns exist in the given DataFrame. In data manipulation libraries like pandas, a DataFrame internally stores its dimensions, so you do not need to iterate through the data.
The most efficient approach is to access the DataFrame’s built-in property that stores its shape. This property returns a pair representing the number of rows and the number of columns. By retrieving these values directly, you can quickly determine the overall size of the dataset without scanning its contents.
This method is optimal because it simply reads metadata already maintained by the DataFrame structure. As a result, the operation runs in constant time and uses no additional memory. This makes it the preferred and most efficient approach for obtaining a DataFrame’s size.
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Access DataFrame dimension property (e.g., shape) | O(1) | O(1) |
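As a minimal sketch of reading the dimension property, the snippet below unpacks `shape` into a row count and a column count. The three-row `sample` frame is a hypothetical cut-down version of the example input, used only for illustration.

```python
import pandas as pd

# Hypothetical three-row sample, not the full example input.
sample = pd.DataFrame({
    "player_id": [846, 749, 155],
    "name": ["Mason", "Riley", "Bob"],
})

# shape is a (rows, columns) tuple; unpack it and build the required list.
n_rows, n_cols = sample.shape
size = [n_rows, n_cols]
print(size)  # [3, 2]
```

Unpacking the tuple makes it explicit which value is the row count and which is the column count before they are placed in the result list.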
Hint: Consider using a built-in attribute of the pandas library to get the size of a DataFrame.
This approach utilizes the built-in attributes of a DataFrame in Python to extract the number of rows and columns. Specifically, DataFrame.shape is used to get a tuple containing the dimensions of the DataFrame.
Time Complexity: O(1) because accessing the shape attribute is a constant-time operation.
Space Complexity: O(1), as we only return a fixed-size list.
import pandas as pd

def get_dataframe_size(df):
    return list(df.shape)

# Example usage
players = pd.DataFrame({
    "player_id": [846, 749, 155, 583, 388, 883, 355, 247, 761, 642],
    "name": ["Mason", "Riley", "Bob", "Isabella", "Zachary", "Ava", "Violet", "Thomas", "Jack", "Charlie"],
    "age": [21, 30, 28, 32, 24, 23, 18, 27, 33, 36],
    "position": ["Forward", "Winger", "Striker", "Goalkeeper", "Midfielder", "Defender", "Striker", "Striker", "Midfielder", "Center-back"],
    "team": ["RealMadrid", "Barcelona", "ManchesterUnited", "Liverpool", "BayernMunich", "Chelsea", "Juventus", "ParisSaint-Germain", "ManchesterCity", "Arsenal"]
})

print(get_dataframe_size(players))  # Output: [10, 5]

The function get_dataframe_size utilizes the shape attribute of a pandas DataFrame, which returns a tuple with the number of rows and columns.
Another simple way to determine the number of rows and columns is to count each explicitly: call len() on the DataFrame for the row count and take the length of DataFrame.columns for the column count.
Time Complexity: O(1) for both row and column size retrieval since the operations are constant time.
Space Complexity: O(1), as we only store and return a fixed-size list.
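import pandas as pd

def get_dataframe_size(df):
    num_rows = len(df)          # len() on a DataFrame gives the row count
    num_cols = len(df.columns)  # length of the column index
    return [num_rows, num_cols]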
While the exact question may not always appear, similar tasks related to data manipulation with pandas are common in data engineering and data science interviews. Interviewers often test familiarity with DataFrame operations and properties.
The optimal approach is to access the DataFrame’s built-in dimension property that returns the number of rows and columns. This avoids iterating over the data and directly retrieves the stored metadata.
The problem relies on understanding the structure of a DataFrame, which stores tabular data along with metadata such as its dimensions. Accessing this metadata allows constant-time retrieval of the DataFrame’s size.
The size is retrieved from a property that stores the number of rows and columns internally. Since the values are already maintained by the DataFrame structure, accessing them does not require traversing the dataset.
This approach calculates the number of rows by using the len() function on the DataFrame and the number of columns by counting the length of the columns property of the DataFrame.
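To check that the len()-based counting agrees with the shape-based result, here is a small sketch that runs both on the same frames. The helper names `size_from_shape` and `size_from_len` and the two sample frames are hypothetical and exist only for this comparison.

```python
import pandas as pd

def size_from_shape(df: pd.DataFrame) -> list:
    # Convert the (rows, columns) tuple to a list.
    return list(df.shape)

def size_from_len(df: pd.DataFrame) -> list:
    # Count rows and columns separately.
    return [len(df), len(df.columns)]

# Hypothetical frames: one with columns but no rows, one with two rows.
empty = pd.DataFrame(columns=["player_id", "name"])
filled = pd.DataFrame({"player_id": [846, 749], "name": ["Mason", "Riley"]})

for frame in (empty, filled):
    assert size_from_shape(frame) == size_from_len(frame)

print(size_from_len(empty))   # [0, 2]
print(size_from_len(filled))  # [2, 2]
```

Both helpers return the same list, including for a DataFrame that has columns but no rows.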