I have a question. When I ran the script I received the output below. The Total for disk usage seems to use only the "Disk Average Month" line; let me know if I am wrong. I expected it to show 310.35 GBh.
Note: I started my node just 5 days ago. My output is below:
That’s exactly right. GBh and MBm express the same storage, but in a different unit. MBm can be seen as the average amount stored for the entire calendar month. And since the payouts are expressed as $1.50 per TB per month, I figured for payout calculation purposes this number is more meaningful than GBh, which can go into massive numbers that don’t mean much to the average person.
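To make the relationship between the two units concrete, here is a minimal sketch of the conversion. It assumes a 30-day month (November 2019 is used only as an example), decimal units (1 TB = 1000 GB), and the $1.50 per TB per month rate mentioned above. The function names are made up for illustration and are not taken from the script.

```python
import calendar

def hours_in_month(year, month):
    """Number of hours in the given calendar month."""
    days = calendar.monthrange(year, month)[1]
    return days * 24

def gbh_to_average_gb(gbh, year, month):
    """Convert GB-hours into the average GB stored across the whole month."""
    return gbh / hours_in_month(year, month)

def storage_payout_usd(gbh, year, month, usd_per_tb_month=1.50):
    """Storage payout, assuming decimal units (1 TB = 1000 GB)."""
    average_tb = gbh_to_average_gb(gbh, year, month) / 1000
    return average_tb * usd_per_tb_month

average_gb = gbh_to_average_gb(310.35, 2019, 11)
print(f"{average_gb * 1000:.2f} MBm")
print(f"${storage_payout_usd(310.35, 2019, 11):.4f}")
```

Under these assumptions 310.35 GBh works out to roughly 431 MBm, which is why the monthly average looks so small for a node that has only been online a few days.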
Since I like being able to run stuff through Docker and avoid downloading scripts to the host machine, I’ve published a Docker image (source) that will run this script against /data. You can invoke it like so:
docker run --rm -v /path/to/storjdata/storage:/data cdhowie/storj_earnings:latest
@BrightSilence, I’ve subscribed to releases on your repo. If you make a GitHub release when you publish updates, I will be notified to update the Docker image, or you can feel free to merge my docker branch (linked above).
This script does not write to the database, so there is no possibility of corruption. I would be very interested to see evidence to the contrary.
And nothing says you can’t make a copy before invoking docker run. (Side note: making a copy can cause the copy to be corrupt since the directory listing and file copies are not a single atomic operation, so you may have to perform the copy multiple times before you get a non-corrupt database.)
Also note that nothing prevents creation of a Docker image that copies the databases before running the script. It’s just unnecessary FUD to claim that it’s safer to do this.
It is definitely not. Nodes have been lost on both Windows and macOS as a result of accessing the databases of live nodes.
I’m now fairly certain this isn’t an issue on Linux. But on Windows and macOS, the only way this script should be used is to stop the node, copy the database files, start the node again, and run the script against the copy.
I’ve considered including the copy step in the script, but then you get into different setups with different commands to stop and start the node. And since the copy isn’t really needed on Linux, it would be unnecessarily disruptive there. I don’t think the Docker container adds enough to merge it into my repo, and I’m not sure all features, like getting historic months, work as well. Nonetheless, it’s a nice zero-setup way to use the script, especially on Linux.
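For Docker setups, the stop, copy, and restart procedure described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the container name `storagenode`, the `*.db` file pattern, and the helper name `snapshot_databases` are all placeholders, and the `run` parameter exists only so the Docker calls can be swapped out when trying the function without a node.

```python
import shutil
import subprocess
from pathlib import Path

def snapshot_databases(storage_dir, dest_dir, container="storagenode",
                       run=subprocess.run):
    """Stop the node, copy its database files, then start the node again.

    The container name and the '*.db' pattern are assumptions; adjust
    them to match your own setup.
    """
    src, dest = Path(storage_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    # Give the node up to 300 seconds to shut down cleanly.
    run(["docker", "stop", "-t", "300", container], check=True)
    try:
        for db_file in sorted(src.glob("*.db")):
            target = dest / db_file.name
            shutil.copy2(db_file, target)
            copied.append(target)
    finally:
        # Restart the node even if the copy failed part-way through.
        run(["docker", "start", container], check=True)
    return copied
```

Calling `snapshot_databases("/path/to/storjdata/storage", "/path/to/copy")` and then pointing the script at the copy keeps the script away from the live databases entirely, at the cost of a short period of downtime.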
If you think that it can cause corruption, then the docker container can be run with the volume mounted read-only:
docker run --rm -v /path/to/storjdata/storage:/data:ro cdhowie/storj_earnings:latest
If a journal is present when the script goes to read, it will likely fail because it won’t be able to recover the journal. In that case it can simply be run again until it works.
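The "run it again until it works" approach can also be expressed in code. The sketch below is not how the earnings script itself opens the database; it only shows one way, using Python's standard `sqlite3` module, to open a database file read-only and retry a few times if the read fails. The function name and retry settings are made up for illustration.

```python
import sqlite3
import time
from pathlib import Path

def query_read_only(db_path, sql, attempts=5, delay=1.0):
    """Run a query against a database opened read-only, retrying on errors.

    Assumes attempts >= 1.
    """
    # mode=ro asks SQLite to open the file without write access.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    last_error = None
    for _ in range(attempts):
        try:
            conn = sqlite3.connect(uri, uri=True)
            try:
                return conn.execute(sql).fetchall()
            finally:
                conn.close()
        except sqlite3.OperationalError as error:
            # For example: the file is locked or could not be opened right now.
            last_error = error
            time.sleep(delay)
    raise last_error
```

A statement that tries to modify the database through such a connection fails with `sqlite3.OperationalError`, which matches the intent of mounting the volume with `:ro`.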
It still surprises me that Storj claims on one hand that SQLite is fine and there’s no need for a Postgres/MySQL implementation, and then on the other hand claims that running a script against the database can kill it. Either SQLite is reliable enough to be used in production or it’s not.
FYI @BrightSilence is not a Storj Employee. And it is up to him to decide if he wants to add stuff to his scripts or not. I for one am happy that we have volunteers that share their scripts with the other users.
Let me try to shortcut this discussion by saying both can actually be kind of true. I believe that SQLite is perfectly reliable as long as it’s used on direct attached storage. The SQLite documentation specifically states that it should not be used over SMB- or NFS-like protocols. Unfortunately, I believe the way Docker mounts work on Windows and macOS causes problems similar to those of these network protocols.
Storj is trying to move away from Docker towards platform specific binary releases. I have personally not tried using this script on a live node with the Windows GUI version of a storagenode, but I would not be surprised if the problem no longer exists for those nodes. Try this at your own risk though.
So on the one hand, we have an employee stating that it’s safer to make a copy of the database before running a read-only script against it because that could corrupt the database. This seems to indicate that SQLite is pretty fragile, as this shouldn’t break a database. (I suspect the base cause of this would be journal recovery during writes by another process as I don’t think flock() advisory locks on aufs propagate to the host or other aufs mounts that use the same directory.)
On the other hand, we have another employee claiming that any database can get corrupted (true but we were/are talking about likelihood and not possibility) as an argument that SQLite is fine and Postgres/etc would not solve the problem.
Only one of those can really be true in this specific situation – database servers are specifically designed to deal with multiple connections and concurrent readers/writers. They would increase reliability in exactly these kinds of situations.
SQLite particularly in combination with Docker’s lack of flock() propagation with shared volumes (due to the aufs limitation) makes it a ticking time bomb for anything that tries to use the database concurrently, even though a reasonable person without knowledge of aufs (as I was before researching this specifically because of this issue) would assume that SQLite’s locking mechanism should handle the concurrent access.
As a side note, this is the same argument I have in regard to the advice to use -t 300 when stopping the container to avoid database corruption: a database should not be corrupted by an unclean shutdown. That’s what journals are for. So either we don’t have to use -t 300 or SQLite is too fragile to be used in production like this, but the advice from Storj is conflicting on this point. (“SQLite is fine” on the one hand but “use -t 300” on the other: pick one, it can’t be both.)
It seems rather clear at least to me that more robust options should be offered in addition to SQLite – or the flock() problem should be solved by abandoning Docker.
At any rate, all of this discussion aside, it should 100% be safe to run the script against the database concurrently within the storagenode container since everything would be using the same aufs mount and hence be able to see each other’s locks. Perhaps the advice for using these scripts should be to docker cp scripts into the storagenode container and docker exec them there.
First statement you quoted you made yourself, not a Storj employee. Second statement is not stating anything other than asking you to prove your statement, third quote is from someone who is not a Storj employee.
This was setting up the context for the second quote.
Hmm? Then what is the forum designation for an employee? Not “Storjling?”
At any rate, even if the third quote is not a Storj employee:
This discussion appears to once again show that SQLite is not reliable enough to be used in this configuration which is (to me) proof enough that other options should be considered.
I have still never heard any reply to my comment about the -t 300 recommendation. Is it the position of Storj that using -t 300 is required not to corrupt the database?
I’m still baffled by the assertion that SQLite is good enough for production and then we have threads like this that demonstrate quite clearly that it’s way too easy to break.
(Perhaps this needs to be split into another topic.)
Edit: Note that my intention is not to be argumentative or hostile. But I feel like I’m being told two different things: “SQLite is fine” – “tiptoe around SQLite because you might break it.” As an SNO, this does not give me confidence.
The second quote is not claiming any of the things you stated, nor validating your quote above it. We currently do not have special labels to identify the actual Storj employees vs external contractors/forum mods. In any case, the statement “I think executing the script inside different folder is safer” is not making the claims you stated before in your earlier post, besides which this user specifically qualified this as being their own opinion. I think it is not constructive to continue this particular conversation at this point.
Besides, I would like to add that I appreciate the usually very open communication by Storj employees. With this comes that different people can have different opinions, and we shouldn’t treat Storj as a giant monolith. I think there is a lot of value in having these differing opinions out in the open, so I for one applaud any apparent contradictions rather than throwing them back in their face.
For me personally, I warn people out of an overabundance of caution. I felt really bad when other SNOs got their nodes in trouble as a result of using my scripts. So I take a better-safe-than-sorry approach in my instructions. This should not be taken as a sign that SQLite is not reliable. I think it definitely is when used in the correct context.
Can someone help me interpret how to read the output of the earnings calculator? I don’t understand it. The totals don’t make a lot of sense to me. I really only made $2.84 for the entire month of November? And why does it say -not paid- for upload and disk usage? Sorry if this sounds like a dumb question. Is there a key to these metrics somewhere?
Ingress traffic (uploads to your node) isn’t paid because you will already be paid for storing that data. Hence ingress is already good for you. Paying for both would be like paying you double for the same thing.
The bottom two items are basically the same thing displayed in different units. They both express how much data you’ve stored throughout the month. I find the GBm unit a bit clearer, as that is just how much data was on your node on average throughout the month, but I included the TBh unit too because that is what the web dashboard shows. I could list it as 1.11 USD as well, but then the total wouldn’t match anymore.
That said, if this is your first month, keep in mind that new nodes are in the vetting phase on each satellite for about a month (until they have at least 100 successful audits on that satellite). During that time a node gets about 5% of the traffic it normally would. This is done to gauge whether a node is reliably holding the data it received before the network makes full use of it. This way bad nodes can be rejected before they would lead to massive loss of pieces and lots of costly repair.
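As a rough illustration of the vetting rule described above, the sketch below takes a successful-audit count per satellite and reports whether the node is vetted there, how many audits remain, and the approximate share of normal traffic to expect. The 100-audit threshold and the 5% figure come from the explanation above; the satellite names and audit counts are placeholders, and where to read the real counts from is left open.

```python
VETTING_AUDITS = 100    # successful audits needed per satellite (from the post)
VETTING_TRAFFIC = 0.05  # rough share of normal traffic while vetting (from the post)

def vetting_report(audits_by_satellite):
    """Map each satellite to (vetted, audits_remaining, traffic_share)."""
    report = {}
    for satellite, audits in audits_by_satellite.items():
        vetted = audits >= VETTING_AUDITS
        remaining = max(VETTING_AUDITS - audits, 0)
        share = 1.0 if vetted else VETTING_TRAFFIC
        report[satellite] = (vetted, remaining, share)
    return report

# Hypothetical audit counts; the satellite names are placeholders.
for name, row in vetting_report({"satellite-a": 100, "satellite-b": 37}).items():
    print(name, row)
```

Because vetting is tracked per satellite, a node can be fully vetted on one satellite while still receiving reduced traffic from another.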
I’m still confused though. For October, I supposedly made only $2.00. However, I received quite a bit more than that in Storj tokens. In September I only made $1.45 according to the earnings calculator, but also received a lot more than that in tokens. Could the disparity be explained by the “surge” pricing?
(Aside: for November I’ve only received 0.13359354 Storj so far. Weird.)
The other thing I’m wondering about is how this squares with the earnings estimator located here: https://storj.io/storage-node-estimator/ ? I have a dedicated Storj node, modeled after the nodes that the Storj team used to bootstrap the network, wired directly into my router, with symmetric gigabit fiber, that has been continuously online since August (or maybe June 2019). Is there a way to make my node more profitable? Are other people making more than me, or is this simply par for the course at this point?