And a few of my observations as an SNO rather than as a user.

from what copilot told me, it turns out that my node's egress per tb stored is around 2.5-3x the overall average.
and i have only a 100 mbps link for the node, and the data sits on a not-very-fast nas-oriented mechanical hdd, not an ssd!
i was a bit impressed, so i tried to recall what i did to get this.

so the biggest contribution to this comes from using hashstore with its quick in-ram lookups instead of the old piecestore, or at least that is what i think.
of course there are also a lot of small optimisations, like a 10g quick in-ram hdd cache with primocache, and many storage tweaks, like using ReFS on a storage spaces volume rather than on a raw partition, plus some sata-related windows tweaks that i no longer remember exactly.
but i still think the biggest contribution is from hashstore.
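
For readers who have not looked at the idea behind this: a toy sketch of why an in-RAM index over a few large append-only log files needs fewer disk seeks than one file per piece. This is only an illustration of the general technique and is not Storj's actual hashstore format or API; `ToyLogStore`, `Location` and `max_log_size` are made-up names.

```python
# Toy illustration only: NOT Storj's actual hashstore format or API.
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Location:
    log_id: int   # which append-only log holds the piece
    offset: int   # byte offset of the piece inside that log
    length: int   # piece size in bytes


class ToyLogStore:
    """Pieces are appended to a few large logs; a dict in RAM locates them."""

    def __init__(self, max_log_size: int = 1024) -> None:
        self.max_log_size = max_log_size
        self.logs: List[bytearray] = [bytearray()]   # stand-ins for log files
        self.index: Dict[bytes, Location] = {}       # the in-RAM lookup table

    def put(self, piece_id: bytes, data: bytes) -> None:
        log = self.logs[-1]
        if len(log) > 0 and len(log) + len(data) > self.max_log_size:
            # The current log is full, so start a new one.
            self.logs.append(bytearray())
            log = self.logs[-1]
        self.index[piece_id] = Location(len(self.logs) - 1, len(log), len(data))
        log.extend(data)

    def get(self, piece_id: bytes) -> bytes:
        # One dictionary lookup in RAM, then a single ranged read.
        loc = self.index[piece_id]
        return bytes(self.logs[loc.log_id][loc.offset:loc.offset + loc.length])

    def delete(self, piece_id: bytes) -> None:
        # Only the index entry is removed; the bytes remain in the log as a
        # gap until some later compaction rewrites that log.
        del self.index[piece_id]
```

In this toy model a read never has to walk a directory tree, and a delete leaves a gap in the log until compaction reclaims it.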

storj node is in a vm with dynamic memory that currently uses 4784mb ram for the whole vm, with around 430mb going to storagenode.exe
all this for 1.85tb currently stored.

from other observations i can tell that storagenode can take a very long time to start if the computer was abruptly powered off after a boo-boo like a thermal shutdown, which i sometimes get when experimenting with intense tasks in other virtual machines.
this is probably because it checks the integrity of the store, correct?

there were no other hiccups with hashstore, at least none that i noticed.

so hashstore doesn't eat very much ram for its lookup tables, at least not as much as i expected it would.
but the performance gain is really noticeable.

ReFS can be a problem.
Why are you using a virtual machine, especially one running Windows? It’s a waste of resources, and optimizing the file system on the host won’t help the OS in the virtual machine, because the virtual machine doesn’t have access to the host. The host’s filesystem manager also doesn’t have access to services in the VM. They are independent.

first of all, because i prefer to keep things separate for each isp link, or else i would need very complicated routing rules.
second, it makes it easier to inspect things when there are surprise spikes in activity.

and refs does not seem to be a problem anymore.
although it was a very big problem when i enabled defer-writes in primocache and lost the volume data several times after sudden power-offs.
not using defer-writes anymore and all is okay.
a bit of a pity, but c’est la vie.

to sum up, i am already quite satisfied with the 2.5-3x egress compared to the overall average node, despite my not-very-fancy hardware and this configuration.
of course if the datastore were local, it might be even faster.

But why Windows? If you want to use a VM, it’s more logical to use Linux; at least it uses fewer resources. However, I still do not understand why you don't run Docker directly on the host; the networking can be configured for it. Or does your host run Windows too?

host is on windows because i use GPU-P in its hyper-v.
and the host's default link is my newer and faster isp, which does not offer a static ipv4, so the old 100 mbps link is kept only for this vm.
and linux uses fewer resources only if it runs without a desktop environment, but i have enough resources and this is not critical for me personally.

the surprise was the hashstore performance compared to the default piecestore, and this is what i advise other snos to consider first.

Then it’s still a waste of resources to solve such a simple problem, especially when it comes to resources required by Windows in this virtual machine.
Since your Windows host is Hyper-V, you can install Docker; it still uses a Linux virtual machine—either Hyper-V or WSL2. Or simply install a Linux virtual machine and install the Docker version of storagenode or a binary version with two services—one for storagenode and one for storagenode-updater. This will help free up at least 50% of the virtual machine’s resources currently being used by Windows.

yes, so you can imagine how much faster it would be if you removed the additional useless layer :slight_smile:

The solution for the networking is very simple: you just need to provide the IP of your second network adapter in this config option:

server.address: 192.168.1.11:28967

So you can install storagenode directly on your host, no VMs, no waste of resources.

not so simple, because on a windows host there is no way to make the reply packets leave through the same interface the requests came in on.
solvable on linux but not on a windows host.
this is so far my only problem with the windows host, all the rest doesn't matter much and is solvable.

and if the incoming traffic arrives via the old isp link with the static ipv4 but the replies leave via the new isp, the new isp drops the reply packets because their source address does not belong to its network, so they are considered spoofing/abuse.
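
A minimal sketch of the source-address check described above, assuming the ISP only forwards packets whose source address falls inside the prefix it assigned to that link. The addresses come from the reserved documentation ranges and are placeholders, and `isp_accepts` is a hypothetical helper, not any real ISP's filter.

```python
import ipaddress


def isp_accepts(source_ip: str, customer_prefix: str) -> bool:
    """Return True if the packet's source address belongs to the prefix
    that the ISP assigned to the link the packet is leaving through."""
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(customer_prefix)


# Placeholder values (documentation ranges), not real addresses.
STATIC_IP_OLD_ISP = "198.51.100.10"   # address the node is reachable on
OLD_ISP_PREFIX = "198.51.100.0/24"    # prefix routed via the old link
NEW_ISP_PREFIX = "203.0.113.0/24"     # prefix routed via the new link

# The reply carries the static address as its source in both cases.
reply_via_old_link = isp_accepts(STATIC_IP_OLD_ISP, OLD_ISP_PREFIX)  # forwarded
reply_via_new_link = isp_accepts(STATIC_IP_OLD_ISP, NEW_ISP_PREFIX)  # dropped
```

So the reply only survives if it leaves through the link that owns its source address, which is why the return path matters here.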

so i cannot just put storagenode directly on the host, unfortunately.

the only way to put storagenode directly on the host would probably be to proxy the tcp and udp traffic through the vm that has that static ipv4.
i know how to do it for tcp but not for udp.
any insight on this? is it possible at all, given that udp has no notion of a connection?
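
On the udp question: relays and NAT devices commonly work around the missing connection by treating each remote (ip, port) pair as a pseudo-session with an idle timeout. Below is a minimal sketch of only that bookkeeping, with no real sockets; `UdpSessionTable`, its methods and the port numbers are hypothetical names chosen for illustration, not part of any existing proxy.

```python
from typing import Dict, Optional, Tuple

Addr = Tuple[str, int]  # (ip, port) of a remote peer


class UdpSessionTable:
    """Maps each remote peer to a dedicated relay port and back again."""

    def __init__(self, idle_timeout: float = 60.0, first_port: int = 40000) -> None:
        self.idle_timeout = idle_timeout
        self.next_port = first_port
        self.by_client: Dict[Addr, int] = {}
        self.by_port: Dict[int, Addr] = {}
        self.last_seen: Dict[Addr, float] = {}

    def outbound(self, client: Addr, now: float) -> int:
        """A datagram arrived from a peer: reuse or allocate its relay port."""
        self.expire(now)
        port = self.by_client.get(client)
        if port is None:
            port = self.next_port
            self.next_port += 1
            self.by_client[client] = port
            self.by_port[port] = client
        self.last_seen[client] = now
        return port

    def inbound(self, relay_port: int, now: float) -> Optional[Addr]:
        """A reply arrived on relay_port: which peer should receive it?"""
        self.expire(now)
        client = self.by_port.get(relay_port)
        if client is not None:
            self.last_seen[client] = now
        return client

    def expire(self, now: float) -> None:
        """Forget peers that have been silent longer than idle_timeout."""
        stale = [c for c, t in self.last_seen.items() if now - t > self.idle_timeout]
        for client in stale:
            port = self.by_client.pop(client)
            del self.by_port[port]
            del self.last_seen[client]
```

A real relay would pair this table with one udp socket per relay port; the timeout stands in for the "connection close" that udp does not have.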

is this issue even worth the hassle, given that even in this configuration the node already performs well above average? :slight_smile:

and looking at it even more coldly, what would i actually cut?
just bringing filesystem operations closer to storagenode.exe and eliminating the networked layer between vm and host?
that is microseconds, not even milliseconds!
do i even have enough load on storagenode for that to be worth it?
does it even make a dent, given that the hardware is an hdd and not an ssd?
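
A back-of-the-envelope version of that argument. Both numbers are illustrative assumptions, not measurements from this node: roughly 10 ms for a random access on a mechanical hdd and roughly 100 µs of extra latency for the vm-to-host layer.

```python
def overhead_share(layer_overhead_us: float, disk_access_ms: float) -> float:
    """Fraction of one request's total latency that is spent in the extra layer."""
    total_us = layer_overhead_us + disk_access_ms * 1000.0
    return layer_overhead_us / total_us


# Assumed values, for illustration only.
share = overhead_share(layer_overhead_us=100.0, disk_access_ms=10.0)
print(f"extra layer: {share:.1%} of request latency")
```

Under these assumptions the extra layer accounts for about 1% of the latency of a request that has to touch the disk, so the hdd dominates.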

but switching from piecestore to hashstore is considerable and i advocate for it.
the only question that remains for me is whether rpi/nas nodes could afford it…

I understand hashstore benefits… in theory. But I’ve also read enough posts about the hashtable/datalog corruptions and rebuilds that I’ll wait for Storj to decide when to migrate. Asia is half-done and I guess Europe will be next?

Every node release seems to have some hashstore tweaks: so I’ll wait for as many of them as possible :slight_smile:

that probably is also influenced by the underlying fs.

and i have also read enough about reported data and actually used data, which sometimes don't match.
but the last time i checked, it all seemed close to okay.
well, except maybe for some gaps in the data files left by deleted data, but that already seems within reasonable limits.

well, that depends on whether you have lots of data or not so much (like me) and can afford to risk losing it and then waiting for it to replenish.

i have experimented with hashstore since shortly after its launch and have seen many bumps along the road.
but now it all seems peaceful.

lost my node many times, but mostly not because of hashstore but because of my attempts to force-cache writes to the refs volume.

a supposedly resilient filesystem, but when some data never actually gets written because of a hardware glitch (or a VRM overheat and thermal shutdown, as in my case), poof, the entire volume is gone!

Not exactly resilient, right?

ReFS is terrible, at least that was my experience.

Since then I can live without it.

the most resilient fs in my experience was HPFS on os/2.
btw, it was also made by a microsoft employee (named Gordon Letwin) back in times of os/2 1.x when msft and ibm still worked together on that.
(wink to Alexey: correct?) :slight_smile:

of course there may be life without it. :slight_smile:
in my case, all seems normal since i gave up defer-write in primocache.

btw, what i personally gain from refs is deduplication.
ntfs dedupe works miserably and very, very slowly, so i decided to try refs dedupe.
that's not for storj, but for my vm vhdx files, given that i have many similar vms on the host.
it successfully deduped over a terabyte.

of course storjdata folder is excluded from dedupe and also several other ones.

but the vms live on an ssd (also refs-formatted); the hdd is used only for temporary vm data exports prior to archiving and uploading to storj.
i decided that this job goes to the hdd because i want to avoid ssd write wear; i back up vms quite often.

will observe for some time how this config behaves and will draw conclusions after that.

most likely i will migrate all dedupable stuff to the ssd, and if refs goes belly-up on the hdd again, i will reformat the hdd to ntfs.

of course life would be easy if there were only storj on the hdd, but here we are: there are some archives on it too, including some system recovery things that i might need even when i don't have access to cloud storage, so i must keep them locally accessible at all times.

With the improvements in hashstore… most of the network will probably just slap their nodes on ext4… and then ignore it and go live their lives. The slowest part is normally a person's Internet connection, so the storage setup doesn’t need to be fancy.

i dunno about others, but in my particular case the lazy-crazy piecestore walker kept the hdd heads moving almost all the time.
of course no such artifact with hashstore.

A great feature. Normally. But if ReFS leaves nothing to deduplicate then it does not help much.

maybe you wanted to say storj, not refs?
storj has nothing to dedupe, but my puter’s life ain’t only storj. :slight_smile: