Not all images are loading on Mastodon instance, and some new images cannot be uploaded

Some images fail to load locally on our Mastodon instance, which uses Storj as its object storage provider.

Members are now also sometimes unable to upload new images. In this thread, a member and an admin discuss a 503 error: renkotsuban: "I can't upload any pictures to tech.lgbt at the m…" - LGBTQIA+ Tech Mastodon

It looks like some requests are timing out because the responses don't arrive in time (the limit seems to be about 10s). That could have any of three causes:

  • The network is too slow. This seems unlikely, as we would then see far more errors across the board rather than only occasional ones.
  • Requests are being rate-limited internally when they reach Storj.
    • That makes some requests wait longer than others. Ideally, all requests would wait a little longer so that only a few are processed internally at a time.
    • However, this doesn't work for our application because we have no backpressure: if a request fails, we either end up in a permanent error state or simply retry, so the number of requests we send does not drop when the service is overloaded.
  • Storj's internal services are somewhat slow, and since we use a tighter timeout (10s, instead of HTTP's default 30s), we abandon requests prematurely.
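To illustrate what I mean by missing backpressure, here is a minimal Python sketch of retrying with capped exponential backoff and jitter, so that each failure makes the client wait longer before sending again. This is not how Mastodon or Paperclip actually behave; `TransientError`, `call_with_backoff`, and all parameter values are hypothetical names and numbers chosen for illustration.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a timeout or 503 from the storage backend."""


def call_with_backoff(request, max_attempts=5, base_delay=0.5, max_delay=10.0,
                      sleep=time.sleep, rng=random.random):
    """Call request() until it succeeds, waiting longer after each failure."""
    for attempt in range(max_attempts):
        try:
            return request()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # give up and surface the error instead of retrying forever
            # Exponential backoff capped at max_delay, scaled by a random
            # factor in [0, 1) so many clients do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay * rng())
```

With the defaults above, the waits before jitter would be 0.5s, 1s, 2s, and 4s. The point is only that the request rate falls while the backend is struggling, instead of staying constant as it does with an immediate retry.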

Here are some hopefully relevant logs:

# journalctl --since "13:15" | grep d5a705d1-b8fe-4e38-bf33-612941d2a401
[paperclip] Trying to link /tmp/05767ab1ac6bba1b7ee2fa24eb89b71e20230301-1823534-vxkbdt.png to /tmp/ab24d7d88374162b623d756e4800aae820230301-1823534-or12sg.png
Command :: file -b --mime '/tmp/ab24d7d88374162b623d756e4800aae820230301-1823534-or12sg.png'
[paperclip] Trying to link /tmp/05767ab1ac6bba1b7ee2fa24eb89b71e20230301-1823534-vxkbdt.png to /tmp/7cb9672d60d2bcfc93d958819be31d9620230301-1823534-36pffs.png
Command :: identify -format %m '/tmp/7cb9672d60d2bcfc93d958819be31d9620230301-1823534-36pffs.png[0]'
Command :: convert '/tmp/7cb9672d60d2bcfc93d958819be31d9620230301-1823534-36pffs.png[0]' -auto-orient -resize "640x360>" -quality 90 +profile "!icc,*" +set modify-date +set create-date '/tmp/52ce507fa89e0750d6a0e36201a6027520230301-1823534-58eilm'
Command :: convert '/tmp/52ce507fa89e0750d6a0e36201a6027520230301-1823534-58eilm' -depth 8 RGB:-
[paperclip] Trying to link /tmp/52ce507fa89e0750d6a0e36201a6027520230301-1823534-58eilm to /tmp/1b6395a57d039525cce70fe21d86e85420230301-1823534-u14deq
[paperclip] saving cache/media_attachments/files/109/947/681/633/477/739/original/8c8b72bfd3fb777e.png

Storage server error: Net::ReadTimeout with #<TCPSocket:(closed)>
method=GET path=/media_proxy/109947681633477739/original format=*/* controller=MediaProxyController action=show status=503 duration=6967.98 view=8.30 db=4.49
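In case it helps anyone compare request durations against the configured timeouts, here is a rough Python sketch that pulls the `key=value` fields out of request log lines like the one above and lists the requests that ended in a 503. The function names are my own, and I am assuming `duration` is reported in milliseconds.

```python
import re

# Matches key=value pairs such as status=503 or duration=6967.98
LOG_FIELD = re.compile(r"(\w+)=(\S+)")


def parse_request_line(line):
    """Turn a 'key=value key=value' request log line into a dict of strings."""
    return dict(LOG_FIELD.findall(line))


def slow_failures(lines, status="503"):
    """Yield (path, duration) for each request line that ended with `status`."""
    for line in lines:
        fields = parse_request_line(line)
        if fields.get("status") == status and "duration" in fields:
            yield fields["path"], float(fields["duration"])


if __name__ == "__main__":
    sample = ("method=GET path=/media_proxy/109947681633477739/original "
              "format=*/* controller=MediaProxyController action=show "
              "status=503 duration=6967.98 view=8.30 db=4.49")
    for path, duration in slow_failures([sample]):
        print(path, duration)
```

Lines without any `key=value` fields, such as the `Net::ReadTimeout` message, are simply skipped.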

We’ve tried raising S3_READ_TIMEOUT and S3_OPEN_TIMEOUT from their default of 5 to 15 and restarting the mastodon-web and sidekiq services, but this does not appear to solve the problem.

Thank you for pointing out these issues. We have escalated this problem to our dev team and they are currently working on how to best address this. Thank you for your patience.

I’m glad someone is looking into this, as I also run both our Mastodon and Pixelfed instances with Storj integration, and I’ve noticed this issue as well. Just the day before yesterday, the uplink crashed out of the blue and I had to reboot it. I still haven’t found out why this happened, and I hope it doesn’t happen again.