Remove Sub-Folders from the Filesystem - Solution & Explanation

Q: Is Remove Sub-Folders from the Filesystem easy or hard?

The problem is rated Medium because the key insight involves recognizing how lexicographic sorting groups parent folders with their subfolders. Once that observation is made, the implementation becomes straightforward.

Q: Remove Sub-Folders from the Filesystem Python/Java solution

Python, Java, C++, and JavaScript implementations usually follow the same strategy: sort the folder list, iterate through it, and skip paths that start with the previous kept folder plus '/'. Trie implementations are also possible in Python and C++ for demonstrating hierarchical parsing.

Q: How to solve Remove Sub-Folders from the Filesystem in O(n)?

Strict O(n) relative to the number of folders is difficult because sorting is typically required. However, a Trie-based approach processes each character of every path once, giving O(T) complexity where T is the total length of all paths. This effectively behaves like linear time with respect to input size.

Q: What is the best approach for Remove Sub-Folders from the Filesystem?

The sorting and prefix-check approach is considered the best practical solution. Sort the folder paths lexicographically and keep a folder only if it is not prefixed by the previously accepted folder plus '/'. This reduces the problem to a single pass after sorting and runs in O(n log n) time.

Q: Is Remove Sub-Folders from the Filesystem asked at Google/Amazon/Meta?

Filesystem path problems and prefix filtering questions frequently appear in interviews at large companies such as Amazon, Google, and Meta. They test string manipulation, sorting strategies, and hierarchical data structures like tries.

Q: What data structure is used in Remove Sub-Folders from the Filesystem?

The most common solution uses arrays and string prefix comparison after sorting the paths. Another approach uses a Trie to represent folder hierarchy, allowing efficient detection of parent folders during insertion.

Q: What is the time complexity of Remove Sub-Folders from the Filesystem?

The common optimal solution runs in O(n log n) time due to sorting the folder paths, followed by an O(n) scan to filter subfolders. A Trie-based approach can achieve O(T) time where T is the total number of characters across all folder paths.

MediumArray String Depth-First Search Trie22 min readAsked at: Amazon, Microsoft, Meta +4

Practice this problem

Problem Statement

Given a list of folders folder, return the folders after removing all sub-folders in those folders. You may return the answer in any order.

If a folder[i] is located within another folder[j], it is called a sub-folder of it. A sub-folder of folder[j] must start with folder[j], followed by a "/". For example, "/a/b" is a sub-folder of "/a", but "/b" is not a sub-folder of "/a/b/c".

The format of a path is one or more concatenated strings of the form: '/' followed by one or more lowercase English letters.

For example, "/leetcode" and "/leetcode/problems" are valid paths while an empty string and "/" are not.

Example 1:

Input: folder = ["/a","/a/b","/c/d","/c/d/e","/c/f"]
Output: ["/a","/c/d","/c/f"]
Explanation: Folders "/a/b" is a subfolder of "/a" and "/c/d/e" is inside of folder "/c/d" in our filesystem.

Example 2:

Input: folder = ["/a","/a/b/c","/a/b/d"]
Output: ["/a"]
Explanation: Folders "/a/b/c" and "/a/b/d" will be removed because they are subfolders of "/a".

Example 3:

Input: folder = ["/a/b/c","/a/b/ca","/a/b/d"]
Output: ["/a/b/c","/a/b/ca","/a/b/d"]

Constraints:

1 <= folder.length <= 4 * 10⁴
2 <= folder[i].length <= 100
folder[i] contains only lowercase letters and '/'.
folder[i] always starts with the character '/'.
Each folder name is unique.

Approach Overview

Problem Overview: You receive a list of folder paths in a filesystem. Some folders may be subfolders of others (for example /a/b is inside /a). The task is to return only the top-level folders and remove every path that is a subfolder of another.

Approach 1: Sorting and Checking Prefix (O(n log n) time, O(1) extra space)

Sort all folder paths lexicographically. Once sorted, subfolders naturally appear right after their parent folder because they share the same prefix. Iterate through the sorted list and keep a folder only if it is not prefixed by the last accepted folder plus /. The key insight is that sorting groups related paths together, which eliminates the need to compare every pair. This approach relies heavily on string prefix comparison and works well for interview settings because it is simple and efficient. The overall time complexity is O(n log n) for sorting plus linear scanning, and space complexity is O(1) aside from the output. This approach mainly uses concepts from Array and String processing.

Approach 2: Using Trie Data Structure (O(T) time, O(T) space)

Insert each folder path into a Trie where each node represents a directory name between slashes. While inserting, stop if you reach a node already marked as a complete folder path. That indicates a parent folder already exists, so the current path is a subfolder and should be ignored. If insertion completes, mark the final node as an end node and discard any deeper children. The key idea is that the Trie naturally models hierarchical filesystem structure. Each traversal step corresponds to moving deeper into directories, making it easy to detect when a parent folder already exists. Time complexity is O(T) where T is the total number of characters across all paths, and space complexity is also O(T) due to storing the Trie nodes. This method highlights structured path processing using Trie and recursive traversal patterns similar to Depth-First Search.

Recommended for interviews: The sorting and prefix check approach is typically what interviewers expect. It demonstrates strong reasoning about lexicographic ordering and efficient scanning. The Trie approach shows deeper understanding of hierarchical data structures and filesystem modeling. Mentioning both approaches signals strong problem-solving breadth.

Approach 1: Approach 1: Sorting and Checking Prefix

The main idea of this approach is to first sort the folders alphabetically. This way, any sub-folder will appear immediately after its parent folder. We can then iterate through the sorted list and check if the current folder is a prefix of the next folder (followed by '/'). If it is not a prefix, we add it to the final result list.

This Python solution sorts the folder paths and iterates through them. It maintains a 'prev' variable to store the last added folder to the results, ensuring we only add a folder if it is not a sub-folder of the previous folder in the result list.

Code

Python C++Java JavaScript

Python

C++

Java

JavaScript

Complexity

Time Complexity: O(n log n) due to sorting the list. Space Complexity: O(n) for storing the result list.

Try this approach in the editor →

Approach 2: Approach 2: Using Trie Data Structure

In this approach, we use a Trie (or Prefix Tree) to represent the hierarchy of folders. We insert each folder path into the Trie while ensuring to mark the end of any folder. During insertion, if we find that a folder is already marked as the end of a path in the Trie, we ignore any further paths as sub-folders of this path.

This Python solution builds a Trie for all folder paths. It only marks paths as ends when they are not sub-folders, thus ignoring deeper sub-folders in the structure.

Code

Python C++

Python

C++

Complexity

Time Complexity: O(n * l) where n is the number of folders, and l is the average length of the folder paths. Space Complexity: O(n * l) for the Trie.

Try this approach in the editor →

Approach 3: Sorting

First, we sort the array folder in lexicographical order, then traverse the array. For the current folder f we are traversing, if its length is greater than or equal to the length of the last folder in the answer array, and its prefix includes the last folder in the answer array plus a /, then f is a subfolder of the last folder in the answer array, and we don't need to add it to the answer array. Otherwise, we add f to the answer array.

After the traversal ends, the folders in the answer array are the answer required by the problem.

The time complexity is O(n times log n times m), and the space complexity is O(m). Where n and m are the length of the array folder and the maximum length of the strings in the array folder, respectively.

Code

Python Java C++Go TypeScript JavaScript

Python

Java

C++

TypeScript

JavaScript

Try this approach in the editor →

Approach 4: Trie

We can use a trie to store all the folders in the array folder. Each node of the trie contains a children field, used to store the child nodes of the current node, and a fid field, used to store the index of the folder corresponding to the current node in the array folder.

For each folder f in the array folder, we first split f into several substrings according to /, then start from the root node and add the substrings to the trie in turn. Next, we start from the root node to search the trie. If the fid field of the current node is not -1, it means that the folder corresponding to the current node is a folder in the answer array. We add it to the answer array and return. Otherwise, we recursively search all child nodes of the current node and finally return the answer array.

The time complexity is O(n times m), and the space complexity is O(n times m). Where n and m are the length of the array folder and the maximum length of the strings in the array folder, respectively.

Code

Python Java C++Go TypeScript JavaScript

Python

Java

C++

TypeScript

JavaScript

Try this approach in the editor →

Complexity Comparison

Approach	Complexity
Approach 1: Sorting and Checking Prefix	Time Complexity: O(n log n) due to sorting the list. Space Complexity: O(n) for storing the result list.
Approach 2: Using Trie Data Structure	Time Complexity: O(n * l) where n is the number of folders, and l is the average length of the folder paths. Space Complexity: O(n * l) for the Trie.
Sorting	—
Trie	—

Detailed Complexity Analysis

Approach	Time	Space	When to Use
Sorting + Prefix Check	O(n log n)	O(1)	Best general solution. Simple implementation and commonly expected in interviews.
Trie Data Structure	O(T)	O(T)	Useful when modeling hierarchical paths or when many shared prefixes exist.

Video Solution

Remove Sub-Folders from the Filesystem - Leetcode 1233 - Python • NeetCodeIO • 12,861 views views

Watch 9 more video solutions →

Frequently Asked Questions

Is Remove Sub-Folders from the Filesystem easy or hard?

The problem is rated Medium because the key insight involves recognizing how lexicographic sorting groups parent folders with their subfolders. Once that observation is made, the implementation becomes straightforward.

Remove Sub-Folders from the Filesystem Python/Java solution

Python, Java, C++, and JavaScript implementations usually follow the same strategy: sort the folder list, iterate through it, and skip paths that start with the previous kept folder plus '/'. Trie implementations are also possible in Python and C++ for demonstrating hierarchical parsing.

How to solve Remove Sub-Folders from the Filesystem in O(n)?

Strict O(n) relative to the number of folders is difficult because sorting is typically required. However, a Trie-based approach processes each character of every path once, giving O(T) complexity where T is the total length of all paths. This effectively behaves like linear time with respect to input size.

What is the best approach for Remove Sub-Folders from the Filesystem?

The sorting and prefix-check approach is considered the best practical solution. Sort the folder paths lexicographically and keep a folder only if it is not prefixed by the previously accepted folder plus '/'. This reduces the problem to a single pass after sorting and runs in O(n log n) time.

Is Remove Sub-Folders from the Filesystem asked at Google/Amazon/Meta?

Filesystem path problems and prefix filtering questions frequently appear in interviews at large companies such as Amazon, Google, and Meta. They test string manipulation, sorting strategies, and hierarchical data structures like tries.

What data structure is used in Remove Sub-Folders from the Filesystem?

The most common solution uses arrays and string prefix comparison after sorting the paths. Another approach uses a Trie to represent folder hierarchy, allowing efficient detection of parent folders during insertion.

What is the time complexity of Remove Sub-Folders from the Filesystem?

The common optimal solution runs in O(n log n) time due to sorting the folder paths, followed by an O(n) scan to filter subfolders. A Trie-based approach can achieve O(T) time where T is the total number of characters across all folder paths.

Ready to solve this problem?

Practice Remove Sub-Folders from the Filesystem with our built-in code editor and test cases.

Practice on FleetCode

Two Sum

Median of Two Sorted Arrays

Problem Info

DifficultyMedium

Acceptance78.6%

Approaches4

Reading time22 min

Asked at

Amazon Microsoft Meta Snowflake Google

Practice this problem

Open in Editor

Remove Sub-Folders from the Filesystem - Solution & Explanation

Problem Statement

Approach Overview

Approach 1: Approach 1: Sorting and Checking Prefix

Code

Complexity

Approach 2: Approach 2: Using Trie Data Structure

Code

Complexity

Approach 3: Sorting

Code

Approach 4: Trie

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Remove Sub-Folders from the Filesystem - Solution & Explanation

Problem Statement

Approach Overview

Approach 1: Approach 1: Sorting and Checking Prefix

Code

Complexity

Approach 2: Approach 2: Using Trie Data Structure

Code

Complexity

Approach 3: Sorting

Code

Approach 4: Trie

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Approach 1: Approach 1: Sorting and Checking Prefix

Code

Complexity

Approach 2: Approach 2: Using Trie Data Structure

Code

Complexity

Approach 3: Sorting

Code

Approach 4: Trie

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents

Problem Statement

Approach Overview

Approach 1: Approach 1: Sorting and Checking Prefix

Code

Complexity

Approach 2: Approach 2: Using Trie Data Structure

Code

Complexity

Approach 3: Sorting

Code

Approach 4: Trie

Code

Complexity Comparison

Detailed Complexity Analysis

Video Solution

Frequently Asked Questions

Related Problems

Ready to solve this problem?

Problem Info

Table of Contents