Another AI-generated web dashboard

Sneak peek: Latency analysis

2025-10-04 21:58:07 [WARNING] [StorjMonitor.LogProcessor] [Node 1] High calculated duration: 34704ms for PUT, piece=YUCUZCIA5QETVWGC...
storj# grep YUCUZCIA5QETVW  /var/log/storj.log
2025-10-04T21:57:32-07:00	DEBUG	piecestore	upload started	{"Process": "storagenode", "Piece ID": "YUCUZCIA5QETVWGCFJPDZONQBZSWUUB6CEKDNIKBTZVTUD66VYEQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "Remote Address": "79.127.219.42:35642", "Available Space": 14543107381248}
2025-10-04T21:58:07-07:00	INFO	piecestore	uploaded	{"Process": "storagenode", "Piece ID": "YUCUZCIA5QETVWGCFJPDZONQBZSWUUB6CEKDNIKBTZVTUD66VYEQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "Remote Address": "79.127.219.42:35642", "Size": 656640}

See what happened here? The upload of a 640 KiB piece took 35 seconds!? Holy moly!

Or look at this crap:

storj# grep PU5JU5J47IS53KSVQM  /var/log/storj.log
2025-10-04T22:33:14-07:00	DEBUG	piecestore	upload started	{"Process": "storagenode", "Piece ID": "PU5JU5J47IS53KSVQMA6WYD75YLASWR4IZUSYPV3OB3B7NJDG72Q", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "PUT", "Remote Address": "157.48.251.24:51198", "Available Space": 14541764176384}
2025-10-04T22:34:18-07:00	INFO	piecestore	uploaded	{"Process": "storagenode", "Piece ID": "PU5JU5J47IS53KSVQMA6WYD75YLASWR4IZUSYPV3OB3B7NJDG72Q", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "PUT", "Remote Address": "157.48.251.24:51198", "Size": 2319360}

Are you saying the client was uploading 2 MB of data for 64 seconds (about 36 KB/s)!?

(There was a discrepancy where DEBUG and INFO messages would arrive delayed by different amounts, so time-of-arrival could not be used, and the timestamp in the log is rounded to 1 second. Now it uses the time of arrival if it is within 4 seconds of the logged timestamp, otherwise the timestamp. Hopefully the timestamp in the log is reliable.)
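
Roughly, the new timestamp selection looks like this (a minimal sketch of the heuristic just described; names and structure are illustrative, not the actual code):

    from datetime import datetime, timedelta

    # Log timestamps are rounded to 1 second; arrival times can be delayed.
    # Prefer the arrival time when it plausibly agrees with the logged
    # timestamp, otherwise fall back to the logged timestamp.
    ARRIVAL_TOLERANCE = timedelta(seconds=4)

    def effective_time(log_ts: datetime, arrival_ts: datetime) -> datetime:
        if abs(arrival_ts - log_ts) <= ARRIVAL_TOLERANCE:
            return arrival_ts  # arrival time is plausible: keep its precision
        return log_ts          # delivery was delayed: trust the log instead

    def duration_ms(start: datetime, end: datetime) -> int:
        # e.g. "upload started" -> "uploaded" for the same piece ID
        return int((end - start).total_seconds() * 1000)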

1 Like

Hello,

I pulled the newest update from GitHub (the modular one). And while the Hashstore Compaction list works perfectly fine now, the problem with the calculated time still persists.
If I select one node specifically, the time gets updated 2-3 seconds later and then the stats are fine, but if I select all nodes or specific ones, the stats and times are old and random. And I can assure you that the forwarder has not crashed, because the dashboard still receives the data.
Could it maybe be because the events are in memory, and if you have the dashboard open for too long, the memory won't get refreshed at some point?
[three dashboard screenshots, all taken at 16:25]

1 Like

I’m not sure if you’ve seen this yourself, but this seems like it might be screwing up the overall stats:

I think the top one should be handled as a “Pure reclaim” if it’s less than some size x?

1 Like

I haven’t seen this. Can you find the corresponding log message? Did the node actually report doing a compaction to save 7 bytes?! Or did the tool fail at parsing?

Modular branch is merged into main and no longer exists.

This whackery with data and ranges is now fixed in the monstrocity branch.

I also cherry-picked it into main.

So if you check out main, you’ll essentially get only that fix on top of modular.

If you want to try the monstrocity branch (not really production quality) you’ll need to delete the database.

1 Like

I cloned main now to get the new updates. I will report back as soon as I find another bug :slight_smile:

Is the monstrocity branch your dev branch?

No :wink: The monstrocity branch is the result of me asking Sonnet: “This is a Storj storagenode log analyzer and realtime dashboard. Think about what other information the user could benefit from being surfaced and visualized here, brainstorm any possible improvements, plan, and implement them.” $50 later it’s quite monstrous, but buggy… It now tries to detect and talk to the API endpoint, tracks a bunch of stuff, supports alerts and notifications, and a bunch more that I’m yet to discover.

Alert example:

you could git checkout main followed by git pull

2 Likes

I switched as well to the ‘new main’ and went public, if anyone is interested:
hwm.land StoragePro Monitor
3 wishes:

  1. On the https://nodes.arrogantrabbit.com/ you have repair traffic separated out on the by-satellite chart. Will it be available?
  2. Would be cool to have upload and download side-by-side there, not stacked
  3. On the Data Transfer Size Distribution it would be cool to see total size per bucket as well, not just counts (not sure how much work it is to sum up real piece sizes instead of just counting them; see the sketch below)
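
For wish 3, a minimal sketch of what the summing could look like (the bucket edges here are invented for illustration; the dashboard’s real buckets may differ):

    from bisect import bisect_right

    # Hypothetical bucket edges in bytes.
    EDGES = [4096, 16384, 65536, 262144, 1048576, 4194304]
    LABELS = ["<4K", "4-16K", "16-64K", "64-256K", "256K-1M", "1-4M", ">4M"]

    def size_distribution(sizes):
        """Sum real piece sizes per bucket instead of just counting pieces."""
        counts = {label: 0 for label in LABELS}
        totals = {label: 0 for label in LABELS}
        for size in sizes:  # one entry per transferred piece, in bytes
            label = LABELS[bisect_right(EDGES, size)]
            counts[label] += 1
            totals[label] += size
        return counts, totals
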
2 Likes

Hello,
I went ahead and opened a pull request for task.py. It fixes the problem where aggregated results get stuck at a specific time. I hope that is OK with you and you want to use it :slight_smile:

2 Likes

Thank you, merged.

I’m having an LLM iterate on the monstrocity branch; does it also have the same issue? I think it may be fixed there.

It’s in the monstrocity branch

Agreed

That one seems to be broken… There was an option to see sizes, but I guess it sneakily removed it.

1 Like

It looks like a parsing issue:
Screenshot 2025-10-07 at 08.12.29

Logs | 2025-10-05T14:46:17Z

2025-10-05T14:46:17Z INFO hashstore finished compaction {"Process": "storagenode", "satellite": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "store": "s1", "duration": "1m10.922556539s", "stats": {"NumLogs":132,"LenLogs":"118.0 GiB","NumLogsTTL":14,"LenLogsTTL":"0.8 GiB","SetPercent":0.8106540900433342,"TrashPercent":0.05277408316993322,"TTLPercent":0.009498393560939502,"Compacting":false,"Compactions":3,"Today":20366,"LastCompact":20366,"LogsRewritten":15,"DataRewritten":"7.05 GB","DataReclaimed":"11.6 GiB","DataReclaimable":"22.3 GiB","Table":{"NumSet":580347,"LenSet":"95.6 GiB","AvgSet":176927.21983227276,"NumTrash":31625,"LenTrash":"6.2 GiB","AvgTrash":211366.90188142293,"NumTTL":3108,"LenTTL":"1.1 GiB","AvgTTL":387093.5804375804,"NumSlots":2097152,"TableSize":"128.0 MiB","Load":0.276731014251709,"Created":20366,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}

Here are two more I found:
Screenshot 2025-10-07 at 08.13.37

Logs | 2025-10-07T05:02:10Z

2025-10-07T05:02:10Z INFO hashstore finished compaction {"Process": "storagenode", "satellite": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "store": "s1", "duration": "15m5.961276795s", "stats": {"NumLogs":3311,"LenLogs":"3.2 TiB","NumLogsTTL":31,"LenLogsTTL":"10.0 GiB","SetPercent":0.8468547844512657,"TrashPercent":0.013782983214654772,"TTLPercent":0.002962680243777753,"Compacting":false,"Compactions":3,"Today":20368,"LastCompact":20368,"LogsRewritten":272,"DataRewritten":"181.90 GB","DataReclaimed":"265.5 GiB","DataReclaimable":"504.2 GiB","Table":{"NumSet":12537497,"LenSet":"2.7 TiB","AvgSet":238783.80933467022,"NumTrash":142706,"LenTrash":"45.4 GiB","AvgTrash":341434.78693257464,"NumTTL":121792,"LenTTL":"9.8 GiB","AvgTTL":85994.91905872307,"NumSlots":33554432,"TableSize":"2.0 GiB","Load":0.3736465275287628,"Created":20368,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}

Screenshot 2025-10-07 at 08.20.33

Logs | 2025-10-02T17:35:52Z

2025-10-02T17:35:52Z INFO hashstore finished compaction {"Process": "storagenode", "satellite": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "store": "s0", "duration": "4m39.570984638s", "stats": {"NumLogs":940,"LenLogs":"0.9 TiB","NumLogsTTL":39,"LenLogsTTL":"1.0 GiB","SetPercent":0.9311088597720516,"TrashPercent":0.04804234429559273,"TTLPercent":0.008117497721108171,"Compacting":false,"Compactions":0,"Today":20363,"LastCompact":20363,"LogsRewritten":9,"DataRewritten":"4.63 GB","DataReclaimed":"8.0 GiB","DataReclaimable":"62.1 GiB","Table":{"NumSet":3558934,"LenSet":"0.8 TiB","AvgSet":253384.1922238513,"NumTrash":108909,"LenTrash":"43.3 GiB","AvgTrash":427227.7098862353,"NumTTL":7183,"LenTTL":"7.3 GiB","AvgTTL":1094498.828901573,"NumSlots":8388608,"TableSize":"512.0 MiB","Load":0.4242579936981201,"Created":20363,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}

They are on 3 different nodes.

1 Like

It was parsing GiB, but not GB :smiley: (I peeked at a diff, as an exception to my rule of treating it as a black box.) Fixed.
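
For reference, a parser that accepts both decimal and binary units could look something like this (a sketch, not the actual fix that landed):

    import re

    # The node emits both decimal and binary units in the same stats blob,
    # e.g. "DataRewritten": "7.05 GB" next to "DataReclaimed": "11.6 GiB".
    UNITS = {
        "B": 1,
        "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12,
        "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40,
    }
    SIZE_RE = re.compile(r"^\s*([\d.]+)\s*([KMGT]i?B|B)\s*$")

    def parse_size(text: str) -> float:
        """Parse strings like '7.05 GB' or '11.6 GiB' into bytes."""
        m = SIZE_RE.match(text)
        if not m:
            raise ValueError(f"unparseable size: {text!r}")
        return float(m.group(1)) * UNITS[m.group(2)]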

Done.

Done.

Also fixed the “Live Performance” graph that was completely broken.

3 Likes

Aaaaa. Perfect, I switched there and now it’s up and running. Thank you for the other fixes/changes you made. However, I found a bug there preventing startup when you have no ‘previous’ DB.
In the file database.py you need to move this section

    # PERFORMANCE OPTIMIZATION: Add index for storage snapshot queries
    cursor.execute("SELECT 1 FROM sqlite_master WHERE type='index' AND name='idx_storage_earnings'")
    if not cursor.fetchone():
        log.info("Creating storage earnings index...")
        cursor.execute('CREATE INDEX idx_storage_earnings ON storage_snapshots (node_name, timestamp DESC, used_bytes);')
        log.info("Storage earnings index created.")

down below the table creation:

    # --- Storage Snapshots Table (Phase 2.2) ---
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS storage_snapshots (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            timestamp DATETIME NOT NULL,
            node_name TEXT NOT NULL,
            total_bytes INTEGER,
            used_bytes INTEGER,
            available_bytes INTEGER,
            trash_bytes INTEGER,
            used_percent REAL,
            trash_percent REAL,
            available_percent REAL
        )
    ''')

With this change it started like a charm.

Lol. I’ll ask it to write comprehensive tests. Issues like these keep happening.

(Browser page crashes.) “You are absolutely correct, I’m calling a function that does not exist yet. Here is the final version.”

(Now the app does not start, missing import.) “You are absolutely correct, while fixing the page I accidentally removed an import that should not be removed. Fixed.”

(…)

Not me. I did not make anything. Can’t stress this enough. The AI did. I commissioned the work, if you will, by paying money for API calls, and for small changes, with electricity, running the GPU on my Mac mini fully loaded at 25 watts for hours.

I don’t own the code, I don’t read this code, and I don’t edit this code. This is not a reflection of how I would design, implement, or test anything.

At this point I’m pretty much just passing along forum comments and doing git commit/git push on the result.

2 Likes

Even still, I’m sure you’re spending, or have spent, well over 8 hours on this, so you’ve put in time and effort, whatever form it may take.

I will add: I think saving some logs, having it write automated tests, and having those tests auto-run each time you start the server and throw big red errors if anything fails would help catch these issues.
That was going to be my next suggestion anyway.
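
For example, the server could run something like this at startup (a sketch; it assumes a pytest suite under tests/, which may not exist yet):

    import subprocess
    import sys

    def run_self_tests() -> None:
        """Run the test suite on startup; refuse to serve if anything fails."""
        result = subprocess.run(
            [sys.executable, "-m", "pytest", "tests/", "-q"],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            print("\033[91mSELF-TESTS FAILED - refusing to start\033[0m")  # big red error
            print(result.stdout)
            sys.exit(1)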

Yeah, good idea. I should have asked it to write tests upfront, before implementing anything. I’ll add that. Tests, and linters. I’m not sure if any good linters exist for JavaScript code, but I’ll ask it to find out.

It definitely works better with a more systematic approach, compared to one-shot prompting; see the files in the docs folder: roadmaps it generated in Architect mode. I’m using them to guide further interactions: “Implement stage 6.1 from the document xxx”.

I’m yet to find a good open-source model that works well with agentic coding tools and does not lose its mind. No amount of “you MUST not repeat yourself” prevents some of them from repeating themselves and spiraling into madness…

Any recommendations? I have 64 GB of unified memory; I think 48 of it can be wired. At least a 64k context window is needed, and everything has to fit into 48 GB… Nothing I’ve tried so far comes even remotely close to Sonnet.

Hello,
I am checking out the monster branch and have one question:
Why is Trash not counted as used space?
And I am wondering why the stats are so different:
Local dashboard: [screenshot]
Multinode Dashboard: [screenshot]
Your Dashboard: [screenshot]

Now the question is: who is right? :joy:

Oh yeah, and how can the dashboard send an alarm:


when it doesn’t even know how long it takes itself?

Lol :slight_smile:

That data comes from the node API. As for the alert: it tries to predict based on the growth rate, but if you have very few data points the variance will be massive. We shall fix it to avoid making wild claims until a sufficient amount of data has been accumulated. I had to tell it to do that for the dashboard data (that’s why it’s showing N/A), but I guess the alerts did not get the memo.
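
The guard could be as simple as this (a sketch; the thresholds are invented for illustration):

    from datetime import timedelta

    # Refuse to extrapolate a disk-full ETA until enough history exists.
    MIN_SAMPLES = 20
    MIN_SPAN = timedelta(hours=6)

    def disk_full_eta(samples):
        """samples: list of (timestamp, used_bytes, total_bytes), oldest first."""
        if len(samples) < MIN_SAMPLES:
            return None  # not enough data: show N/A and stay quiet
        t0, used0, _ = samples[0]
        t1, used1, total = samples[-1]
        if t1 - t0 < MIN_SPAN:
            return None
        rate = (used1 - used0) / (t1 - t0).total_seconds()  # bytes per second
        if rate <= 0:
            return None  # not growing: nothing to alert about
        return t1 + timedelta(seconds=(total - used1) / rate)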

If you’re willing to go down the rabbit hole… building some tooling around the execution of LLMs enhances them immensely. The caveat is, it’s hard to make this kind of tooling generic enough for all types of coding. Some things I’ve built myself and/or seen folks doing:

  • Force red-green-refactor cycle as much as possible. In my case, I explicitly request separate git commits with red, green, and refactor steps.
  • Use git commit hooks for verifying correctness. E.g., the main agent should not be able to commit a green step unless the commit passes all unit tests (see the sketch after this list).
  • Use subagents as reviewers in commit hooks, and make it so that if the subagent sees something wrong, the commit is stopped. E.g., request a review of red commits that checks that only the added tests fail. Fun stuff: I initially saw the main agent sometimes ignore reviews. Then I asked for the review to be cheerful and encouraging. Suddenly, the main agent was no longer questioning any reviews :rofl:
  • Use a parallel agent that is fed all partial responses of the main agent and is asked to review whether the main agent exhibits hallucinations, says anything about doing ugly workarounds, or goes in cycles.
  • Run 2-3 attempts for each task, each time asking the model to summarize what went well and what didn’t. Then start with a fresh context, adding this summary to the prompt.
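
For the commit-hook item above, a minimal .git/hooks/pre-commit can be as small as this (a sketch; it assumes the repo has a test suite pytest can find):

    #!/usr/bin/env python3
    # .git/hooks/pre-commit: block any commit that does not pass the unit tests.
    import subprocess
    import sys

    result = subprocess.run([sys.executable, "-m", "pytest", "-q"])
    if result.returncode != 0:
        print("pre-commit: tests failing, commit rejected")
        sys.exit(1)  # non-zero exit aborts the commit

Mark the file executable and git runs it before every commit.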

I don’t have much coding experience with open-weight models, but these items have made Sonnet 3.7/4 quite a bit smarter. And if you want to save on tokens, at least some of these non-coding subagents should probably work well on, say, gpt-oss:20b.

2 Likes