Write a solution to create a DataFrame from a 2D list called student_data. This 2D list contains the IDs and ages of some students.
The DataFrame should have two columns, student_id and age, and be in the same order as the original 2D list.
The result format is in the following example.
Example 1:
Input: student_data:[ [1, 15], [2, 11], [3, 11], [4, 20] ]Output: +------------+-----+ | student_id | age | +------------+-----+ | 1 | 15 | | 2 | 11 | | 3 | 11 | | 4 | 20 | +------------+-----+ Explanation: A DataFrame was created on top of student_data, with two columns namedstudent_idandage.
In #2877 Create a DataFrame from List, the goal is to transform a nested list of student information into a structured pandas DataFrame. Each inner list represents a row of data, while predefined column names describe the meaning of each value.
The most straightforward approach is to use the DataFrame constructor provided by the pandas library. By passing the list of lists as the data source and specifying the appropriate column labels, pandas automatically organizes the data into tabular form. This method efficiently maps each inner list to a row and assigns the corresponding column names.
This approach is ideal because pandas is designed for tabular data manipulation and performs the conversion in a single step. The operation scales linearly with the number of rows since each record must be inserted into the DataFrame structure.
Overall, using pandas.DataFrame(data, columns=[...]) provides a clean and efficient solution for constructing structured data from raw Python lists.
| Approach | Time Complexity | Space Complexity |
|---|---|---|
| Create DataFrame using pandas constructor | O(n) | O(n) |
GeeksforGeeks
Use these hints if you're stuck. Try solving on your own first.
Consider using a built-in function in pandas library and specifying the column names within it.
This approach utilizes the Python pandas library, which is specifically designed for data manipulation and analysis. We will simply pass the list to the DataFrame constructor along with column labels.
Time Complexity: O(n) where n is the number of rows in the student_data list.
Space Complexity: O(n) as the data is stored in a DataFrame.
1import pandas as pd
2
3def create_dataframe(student_data):
4 df = pd.DataFrame(student_data, columns=['student_id', 'age'])
5 return df
6
7# Example usage
8student_data = [[1, 15], [2, 11], [3, 11], [4, 20]]
9print(create_dataframe(student_data))This solution uses the pandas library to create a DataFrame. We call the DataFrame constructor with the 2D list and specify the column names. The returned DataFrame has the desired structure.
This method manually constructs a DataFrame-like structure using native programming constructs like lists, arrays, dictionaries, etc. This involves manually setting up structures to represent columns and rows and maintaining the desired format.
Time Complexity: O(n), where n is the number of rows in the student_data array.
Space Complexity: O(1) since we're modifying and printing directly without additional data structures.
1function printDataFrame
Watch expert explanations and walkthroughs
Jot down your thoughts, approach, and key learnings
Pandas provides a highly optimized DataFrame structure for working with tabular data. Using the built-in constructor simplifies the conversion process and ensures the data is organized efficiently.
Problems like this are more common in data engineering or data analysis interviews rather than traditional algorithm interviews. However, understanding pandas operations is valuable for roles involving data processing.
A list of lists works best for this problem because each inner list represents a row of data. Pandas can easily interpret this structure and convert it into a DataFrame with labeled columns.
The optimal approach is to use the pandas DataFrame constructor. By passing the list of records and specifying column names, pandas directly converts the data into a structured table in one step.
This JavaScript solution utilizes the console.log method to display the data formatted as a DataFrame, iterating through the data using forEach.