Not calling os.MkdirAll all the time

Inspired by this post:

I wonder whether maybe it would also make sense to cache information about existing directories, so that a node wouldn’t do the equivalent of mkdir -p for each single upload, trash piece deletion or GC file move operation. With Python-like pseudocode:

directories_we_know_exist = set()

# replace each existing call to os.MkdirAll with:
if directory not in directories_we_know_exist:
    os.MkdirAll(directory, …)
    directories_we_know_exist.add(directory)

As storage nodes never removes directories which would ever be used again, we wouldn’t have to worry about cache invalidation. And given the number of directories used by nodes is very small, it could be a simple RAM-based cache, no need to persist it.

If not calling stat() in the linked thread helped despite no clear explanation why system cache doesn’t help, not calling mkdir() might help as well… I can’t test this hypothesis, as on my machine there’s no difference in either stat() or mkdir(). But it seems thepaul has access to an environment that might have the necessary qualities to test it.

2 Likes

I’ve been thinking, instead, that we should just omit the mkdir -p call entirely. Only when the open/rename fails with ENOENT would we do the mkdir and then try again. The directory is already present the vast majority of the time, I would think.

Something like this:

f, err := os.OpenFile(destPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
if err != nil {
    if errors.Is(err, fs.ErrNotExist) {
        err = os.MkdirAll(filepath.Dir(destPath), 0o700)
        if err != nil {
            return err
        }
        f, err = os.OpenFile(destPath, os.O_WRONLY|os.O_CREATE|os.O_EXCL, 0o644)
        if err != nil {
            return err
        }
    } else {
        return err
    }
}

Does that sound like a good alternative?

Although, if the mkdir doesn’t cost anything on your machine, and your machine is more representative of nodes in the community, maybe it’s not worth doing.

3 Likes

@thepaul, I see what you’re aiming for, and it makes sense since the directory is usually there. However, using exceptions for regular control flow—like relying on ENOENT to create directories—is known as the Exception as Control Flow anti-pattern. While it can work, it can also make the code harder to maintain over time.

That said, if directory creation is rare, this approach might save some time! Just something to keep in mind. :blush:

Sure!

Yet we should not pass up our opportunities in that critical 3%

(Knuth)

2 Likes
$ storagenode setup
Getting things ready...
Creating satellite directories...
US1 satellite: Creating gf/77 directory...
2 Likes

Folder not existing is an exception. On the mature node all these folders are expected to exist.

Another approach is to pre-create all those folders.

On the third hand I don’t believe mkdir is a bottleneck worth focusing on. Did anyone actually measured time mkdir takes? I would expect it to check existence of the directly path and bail out.

1 Like

Thanks for your input! While it’s true that it’s an exception when the folder does not exist, the point I’m making is more about good coding practices. It’s generally better to avoid using exception handling to control regular logic flow, such as creating directories in this case. Exceptions should ideally handle logging for example, rather than being part of the normal flow of logic.

At the time of setup, I don’t think the setup command knows for sure which satellites the node operator plans on using now or later on. There could be new satellites in the future. Might want to create all directories for a satellite the first time a user connects to that satellite.

2 Likes

I’m not sure I understand what are you describing here.

Why?

If an expected, uncexceptional, part-of-normal-operation event A happens 0.001% of the time, but the check for it will be executed 100% of the time, it’s perfectly fine to use exception mechanics to handle case A. Slowing down the 100% of cases just so that the GOF is happy is highly counterproductive. These patterns are nice to ponder at in the academic setting, but the reality is very different.

It can also make the code easier to maintain: the code will reflect only business logic, and the setup (e.g. folder creation) will be handled elsewhere.

In fact, it’s a very useful approach – write simple code assuming all prerequisites are satisfied, and handle missing dependencies in the exception handler. Makes the code very readable and simple. Simple → good.

Again, the fact that some books don’t like it is a problem entirely in the heads of the authors and does not require solutions, let alone anyone changing their behavior, outside of it.

(disclaimer, I’ve read that book at the early stage of my career, and 25 years later I strongly consider it 100% waste of time. Real world problems defined by real world constraints rarely never fit into pure academic constructs.)

1 Like

Satellites are supposed to come and go in future.

That was exactly my thought about the stat call in the original thread, yet it seems for some environments it does matter.

1 Like

Maybe then folder creation can be tied to trust/untrust.

I have measured, not for the Go code and not for the node, but well… I use mkdir -p for years to do not break my scripts just because someone (for example - me) deleted a required folder.
From my tests those days if [ ! -d "dir" ]; then mkdir -p "dir"; fi was a much slower on amounts after 1,000. And I needed a very similar directory structure.

1 Like

I’ve also heard rumors of the heat death of the universe at some point.

I’m 100% sure a simple pull of the trusted satellite list on node restart and a forked process to pre-create the directories is easy to implement.