Earnings calculator (Update 2019-10-13: v7.0.1)

Since I like being able to run stuff through Docker and avoid downloading scripts to the host machine, I’ve published a Docker image (source) that will run this script against /data. You can invoke it like so:

docker run --rm -v /path/to/storjdata/storage:/data cdhowie/storj_earnings:latest

@BrightSilence, I’ve subscribed to releases on your repo. If you make a Github release when you publish updates, I will be notified to update the Docker image – or you can feel free to merge my docker branch (linked above).

@cdhowie I think executing the script inside a different folder is safer, and Storj will incorporate an earnings feature eventually. The time to download scripts < the time to fix a corrupted database.

2 Likes

This script does not write to the database, so there is no possibility of corruption. I would be very interested to see evidence to the contrary.

And nothing says you can’t make a copy before invoking docker run. (Side note: making a copy can cause the copy to be corrupt since the directory listing and file copies are not a single atomic operation, so you may have to perform the copy multiple times before you get a non-corrupt database.)
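For illustration, a copy-and-verify loop along those lines could look like this in Python (a hypothetical helper, not part of the earnings script; stopping the node before copying remains the safer approach):

```python
import shutil
import sqlite3

def copy_and_verify(src, dst, attempts=5):
    """Copy a SQLite database file and retry until the copy passes an
    integrity check, since copying a live database is not atomic."""
    for _ in range(attempts):
        shutil.copy2(src, dst)
        try:
            con = sqlite3.connect(dst)
            ok = con.execute("PRAGMA integrity_check").fetchone()[0] == "ok"
            con.close()
            if ok:
                return dst
        except sqlite3.DatabaseError:
            pass
    raise RuntimeError("could not obtain a non-corrupt copy of " + src)
```

You would then point the earnings script at the returned copy instead of the live database.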

Also note that nothing prevents creation of a Docker image that copies the databases before running the script. It’s just unnecessary FUD to claim that it’s safer to do this.

It is definitely not. Nodes have been lost on both Windows and macOS as a result of running scripts against the databases of live nodes.

I’m now fairly certain this isn’t an issue on Linux. But the only way this script should be used on Windows and macOS is by stopping the node, copying the database files, starting the node again, and running the script against the copy.

I’ve considered including the copy step in the script, but then you get into different setups and different commands to stop and start the node – and since this isn’t really needed on Linux, it’s unnecessarily disruptive there. I don’t think the Docker container adds enough to merge it into my repo, and I’m not sure all features (like retrieving historic months) work as well, but it’s nonetheless a nice zero-setup way to use the script, especially on Linux.

If you think that it can cause corruption, then the docker container can be run with the volume mounted read-only:

docker run --rm -v /path/to/storjdata/storage:/data:ro cdhowie/storj_earnings:latest

If a journal is present when the script goes to read, it will likely fail because it won’t be able to recover the journal. In that case it can simply be run again until it works.

It still surprises me that Storj claims on one hand that SQLite is fine and there’s no need for a Postgres/MySQL implementation, and then on the other hand claims that running a script against the database can kill it. Either SQLite is reliable enough to be used in production or it’s not.

FYI @BrightSilence is not a Storj employee, and it is up to him to decide whether he wants to add stuff to his scripts. I for one am happy that we have volunteers who share their scripts with other users.

2 Likes

I was not directing my comment about Storj employees’ statements at him.

I agree.

In that case please show the specific comments where “Storj” has made those statements so we can follow up.

Let me try to shortcut this discussion by saying both can actually be kind of true. I believe that SQLite is perfectly reliable as long as it’s used on direct-attached storage. The SQLite documentation specifically states it should not be used over SMB- or NFS-like protocols. Unfortunately, I believe the way Docker volume mounts work on Windows and macOS causes problems similar to those network protocols.

Storj is trying to move away from Docker towards platform specific binary releases. I have personally not tried using this script on a live node with the Windows GUI version of a storagenode, but I would not be surprised if the problem no longer exists for those nodes. Try this at your own risk though.

So on the one hand, we have an employee stating that it’s safer to make a copy of the database before running a read-only script against it, because doing otherwise could corrupt the database. This seems to indicate that SQLite is pretty fragile, as this shouldn’t break a database. (I suspect the root cause would be journal recovery during writes by another process, as I don’t think flock() advisory locks on aufs propagate to the host or to other aufs mounts of the same directory.)

On the other hand, we have another employee claiming that any database can get corrupted (true but we were/are talking about likelihood and not possibility) as an argument that SQLite is fine and Postgres/etc would not solve the problem.

Only one of those can really be true in this specific situation – database servers are specifically designed to deal with multiple connections and concurrent readers/writers. They would increase reliability in exactly these kinds of situations.

SQLite, particularly in combination with Docker’s lack of flock() propagation across shared volumes (due to the aufs limitation), is a ticking time bomb for anything that tries to use the database concurrently – even though a reasonable person without knowledge of aufs (as I was before researching this specific issue) would assume that SQLite’s locking mechanism handles concurrent access.

As a side note, this is the same argument I have regarding the advice to use -t 300 when stopping the container to avoid database corruption: a database should not be corrupted by an unclean shutdown. That’s what journals are for. So either we don’t have to use -t 300, or SQLite is too fragile to be used in production like this; the advice from Storj is conflicting on this point. (“SQLite is fine” on the one hand but “use -t 300” on the other – pick one; it can’t be both.)

It seems rather clear at least to me that more robust options should be offered in addition to SQLite – or the flock() problem should be solved by abandoning Docker.

At any rate, all of this discussion aside, it should 100% be safe to run the script against the database concurrently within the storagenode container since everything would be using the same aufs mount and hence be able to see each other’s locks. Perhaps the advice for using these scripts should be to docker cp scripts into the storagenode container and docker exec them there.

1 Like

The first statement you quoted you made yourself, not a Storj employee. The second statement does nothing other than ask you to prove your claim, and the third quote is from someone who is not a Storj employee.

This was setting up the context for the second quote.

Hmm? Then what is the forum designation for an employee? Not “Storjling?”

At any rate, even if the third quote is not a Storj employee:

  • This discussion appears to once again show that SQLite is not reliable enough to be used in this configuration which is (to me) proof enough that other options should be considered.
  • I have still never heard any reply to my comment about the -t 300 recommendation. Is it Storj’s position that using -t 300 is required to avoid corrupting the database?

I’m still baffled by the assertion that SQLite is good enough for production and then we have threads like this that demonstrate quite clearly that it’s way too easy to break.

(Perhaps this needs to be split into another topic.)

Edit: Note that my intention is not to be argumentative or hostile. But I feel like I’m being told two different things: “SQLite is fine” – “tiptoe around SQLite because you might break it.” As an SNO, this does not give me confidence.

The second quote is not claiming any of the things you stated, nor validating the quote above it. We currently do not have special labels to distinguish actual Storj employees from external contractors and forum mods. In any case, the statement “I think executing the script inside a different folder is safer” does not make the claims from your earlier post; besides which, this user specifically qualified it as their own opinion. I think it is not constructive to continue this particular conversation at this point.

Besides, I would like to add that I appreciate the usually very open communication by Storj employees. With this comes the fact that different people can have different opinions, and we shouldn’t treat Storj as a giant monolith. I think there is a lot of value in having these differing opinions out in the open, so I for one applaud any apparent contradictions rather than throwing them back in their face.

For me personally, I warn people out of an overabundance of caution. I felt really bad when other SNOs got their nodes in trouble as a result of using my scripts, so I take a better-safe-than-sorry approach in my instructions. This should not be taken as a sign that SQLite is unreliable. I think it definitely is reliable when used in the correct context.

3 Likes

Can someone help me interpret the output of the earnings calculator? I don’t understand it. The totals don’t make a lot of sense to me. I really only make $2.84 for the entire month of November? And why does it say -not paid- for upload and disk usage? Sorry if this sounds like a dumb question. Is there a key to these metrics somewhere?

(screenshot: november earnings output)

Yes, you will be paid around $2.84 for November.

Ingress traffic (uploads to your node) isn’t paid because you will be paid for storing that data afterwards. Hence ingress is already good for you; paying for both would be like paying you double for the same thing.

The bottom 2 items are basically the same thing displayed in different units. They both express how much data you’ve stored throughout the month. I find the GBm unit a bit clearer, as it is simply how much data was on your node on average throughout the month, but I included the TBh unit as well because that is what the web dashboard shows. I could list it as 1.11 USD too, but then the total wouldn’t match anymore.
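For reference, the conversion between the two units can be sketched like this (my own illustration, assuming a 720-hour, 30-day month; the real calculation would use the actual length of the month):

```python
def tbh_to_gbm(tbh, hours_in_month=720):
    """Convert terabyte-hours to gigabyte-months, i.e. the average number
    of gigabytes stored over the month (1 TB = 1000 GB)."""
    return tbh * 1000 / hours_in_month

# 720 TBh over a 30-day month means 1 TB (1000 GB) stored on average
print(tbh_to_gbm(720))  # 1000.0
```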

That said, if this is your first month, you should keep in mind that new nodes are in a vetting phase on each satellite for about a month (until they have at least 100 successful audits on that satellite). During that time a node gets about 5% of the traffic it normally would. This is done to gauge whether a node is reliably holding the data it received before the network makes full use of it. This way bad nodes can be rejected before they would cause massive loss of pieces and lots of costly repair.

I hope that helps.

2 Likes

Hi, thanks for helping me understand this.

I’m still confused though. For October, I supposedly made only $2.00. However, I received quite a bit more than that in Storj tokens. In September I only made $1.45 according to the earnings calculator, but also received a lot more than that in tokens. Could the disparity be explained by the “surge” pricing?

(Aside: for November I’ve only received 0.13359354 Storj so far. Weird.)

The other thing I’m wondering about is how this squares with the earnings estimator at https://storj.io/storage-node-estimator/. I have a dedicated Storj node, modeled after the nodes the Storj team used to bootstrap the network, wired directly into my router with symmetric gigabit fiber, and it has been continuously online since August (or maybe June) 2019. Is there a way to make my node more profitable? Are other people making more than me, or is this simply par for the course at this point?

Thanks!

November payouts are not done yet, so please hold off on making comparisons with last month’s payouts until all satellites have finished their payouts. Thank you. Also, if you are modeling your node hardware after the Raspberry Pi 3 nodes run by Storj team members, I am sure there is other hardware that would give you better performance. However, it is not recommended to run out and buy new hardware just to build another Storj node.

1 Like

The months you mentioned did indeed see surge payouts. Also keep in mind the escrow being held back in the first 9 months.

As for performance, you might want to look at this thread.

The script used in that thread can be found here.

@BrightSilence I think a simple change to your script would make it safe to use against a database that is open in another container: open the databases using a file: URI with the ?immutable=1 query string (requires Python 3.4+). From the SQLite open docs:

immutable: The immutable parameter is a boolean query parameter that indicates that the database file is stored on read-only media. When immutable is set, SQLite assumes that the database file cannot be changed, even by a process with higher privilege, and so the database is opened read-only and all locking and change detection is disabled. Caution: Setting the immutable property on a database file that does in fact change can result in incorrect query results and/or SQLITE_CORRUPT errors. See also: SQLITE_IOCAP_IMMUTABLE.

This guarantees that SQLite will not perform any writes to the database (not even journal recovery), but it does mean an error could be returned to your script if storagenode makes a concurrent write to the database. The script could catch these errors and retry the query until it succeeds.
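A minimal sketch of that idea (the helper name and retry parameters are my own invention, not anything from the script):

```python
import sqlite3
import time

def open_immutable(path, retries=3, delay=1.0):
    """Open a SQLite database via a file: URI with immutable=1, retrying
    if a concurrent write by storagenode causes a read error."""
    uri = "file:" + path + "?immutable=1"
    for attempt in range(retries):
        try:
            con = sqlite3.connect(uri, uri=True)
            con.execute("SELECT 1").fetchone()  # probe that reads work
            return con
        except sqlite3.DatabaseError:
            if attempt == retries - 1:
                raise
            time.sleep(delay)
```

With immutable=1, SQLite disables all locking and change detection, so the connection can never write to the file – the trade-off being the possible spurious errors described above.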

For an even higher degree of safety, the earnings script could be run from Docker with the storagenode databases provided as a read-only volume (:ro option to -v). This way there’s two layers of protection: sqlite in immutable mode, and if there is a sqlite bug even in this mode that tries to write to the database, aufs will reject the write to an overlay filesystem it has configured as read-only.

Obviously erroneous results could still be returned. However, this may be a happy medium between “you have to shut down your storagenode to run this script” and “you don’t shut it down but you could corrupt your database.” No shutdown and no possibility of corruption, but the very slim chance that the results are inaccurate.

1 Like