Logs: liberachat/#haskell
| 2025-10-11 09:54:10 | → | flukiluke joins (~m-7humut@2603:c023:c000:6c7e:8945:ad24:9113:a962) |
| 2025-10-11 09:54:21 | → | tromp joins (~textual@2001:1c00:3487:1b00:409c:634b:fec4:4fe) |
| 2025-10-11 09:56:05 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-11 10:01:07 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 255 seconds) |
| 2025-10-11 10:11:52 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-11 10:16:33 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 256 seconds) |
| 2025-10-11 10:26:34 | × | flukiluke quits (~m-7humut@2603:c023:c000:6c7e:8945:ad24:9113:a962) (Remote host closed the connection) |
| 2025-10-11 10:26:54 | → | flukiluke joins (~m-7humut@2603:c023:c000:6c7e:8945:ad24:9113:a962) |
| 2025-10-11 10:27:24 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-11 10:32:25 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 264 seconds) |
| 2025-10-11 10:37:41 | → | jespada joins (~jespada@2800:a4:235c:ac00:7055:4dec:d47b:1a6e) |
| 2025-10-11 10:41:18 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-11 10:44:36 | × | nckx quits (~nckx@libera/staff/owl/nckx) (Ping timeout: 256 seconds) |
| 2025-10-11 10:46:51 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 252 seconds) |
| 2025-10-11 10:50:42 | <[exa]> | tomsmeding: just curious, how much memory does the paste.tomsmeding.com thingy occupy normally? I wanted to deploy it on a tiny VM so kinda wondering if it's gonna get it OOM'd |
| 2025-10-11 10:51:43 | <tomsmeding> | [exa]: uh, it's currently at 1.8M RES |
| 2025-10-11 10:51:48 | <tomsmeding> | I can monitor it for a while if you want |
| 2025-10-11 10:51:55 | <tomsmeding> | no |
| 2025-10-11 10:51:58 | <tomsmeding> | I lie |
| 2025-10-11 10:52:02 | <tomsmeding> | 42.8M RES |
| 2025-10-11 10:52:11 | <tomsmeding> | the 1.8M is the bash script that launches it. lol |
| 2025-10-11 10:52:14 | <[exa]> | sounds okay |
| 2025-10-11 10:52:36 | <[exa]> | like, I expect it's gonna spike randomly at some point but occasional OOM kill is m'kay |
| 2025-10-11 10:52:37 | <tomsmeding> | it doesn't do very much |
| 2025-10-11 10:52:48 | <tomsmeding> | why would it spike? |
| 2025-10-11 10:52:57 | <tomsmeding> | if you DOS it then it'll spike I guess, yeah |
| 2025-10-11 10:53:13 | <[exa]> | no idea... I got forgejo on the same server and that one spikes :D |
| 2025-10-11 10:53:23 | <tomsmeding> | this thing does almost nothing |
| 2025-10-11 10:53:31 | <[exa]> | but mainly because of being DoSed by llm folk |
| 2025-10-11 10:53:36 | <tomsmeding> | yeah |
| 2025-10-11 10:53:43 | <[exa]> | are there any safeguards (ratelimit etc?) |
| 2025-10-11 10:53:49 | <tomsmeding> | no |
| 2025-10-11 10:54:15 | <tomsmeding> | if you have any sensible ideas for how to add those safeguards, please tell me |
| 2025-10-11 10:54:16 | <[exa]> | I might send a PR if I decide |
| 2025-10-11 10:54:42 | <tomsmeding> | IP-based is pointless in a world with cgNAT and IPv6 |
| 2025-10-11 10:54:53 | <tomsmeding> | and absent accounts there's little else I can do |
| 2025-10-11 10:55:04 | <tomsmeding> | but read access is stupid fast, it's just reads from a sqlite db |
| 2025-10-11 10:55:13 | → | chromoblob joins (~chromoblo@user/chromob1ot1c) |
| 2025-10-11 10:55:23 | <[exa]> | yeah the main point is to keep people from overfilling it with nonsense |
| 2025-10-11 10:55:34 | <tomsmeding> | I know |
| 2025-10-11 10:55:43 | <tomsmeding> | as I said, please do enlighten me if you have good ideas |
| 2025-10-11 10:55:59 | → | gustrb joins (~gustrb@191.243.134.87) |
| 2025-10-11 10:56:02 | <[exa]> | if you did traffic scheduling, there's SFQ and TBF algorithms and these pretty much do it |
| 2025-10-11 10:56:19 | <tomsmeding> | is that network-level? |
| 2025-10-11 10:56:23 | <[exa]> | (man 8 tc-sfq tc-tbf) |
| 2025-10-11 10:56:29 | <tomsmeding> | right |
| 2025-10-11 10:56:31 | <[exa]> | that's for packets but generally applicable to anything |
| 2025-10-11 10:56:43 | <[exa]> | I'd just reimplement on the incoming pastes |
| 2025-10-11 10:57:01 | → | nckx joins (~nckx@libera/staff/owl/nckx) |
| 2025-10-11 10:57:09 | <tomsmeding> | but how does that prevent someone from spawning 1e9 POSTs from a VPS with >1e9 IPv6 addresses? |
| 2025-10-11 10:57:23 | <tomsmeding> | I guess IP-based rate limiting raises the effort floor a little bit |
| 2025-10-11 10:57:38 | <[exa]> | you cut the v6 addresses on prefix usually |
| 2025-10-11 10:58:32 | <tomsmeding> | if you know something about this and have time to implement it, feel free :p |
| 2025-10-11 10:58:44 | <tomsmeding> | I'm more concerned about the playground, honestly |
| 2025-10-11 10:59:16 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-11 10:59:17 | <tomsmeding> | there's little I can do there with run requests, because I don't want to make, say, a school class behind a NAT ratelimit each other |
| 2025-10-11 10:59:34 | <tomsmeding> | but if you hack something for pastes then the same could apply for saves perhaps |
| 2025-10-11 11:00:04 | × | caconym747879 quits (~caconym@user/caconym) (Quit: bye) |
| 2025-10-11 11:00:32 | × | gustrb quits (~gustrb@191.243.134.87) (Ping timeout: 240 seconds) |
| 2025-10-11 11:00:38 | <tomsmeding> | [exa]: what do you want to use the thing for? |
| 2025-10-11 11:02:07 | → | caconym747879 joins (~caconym@user/caconym) |
| 2025-10-11 11:02:13 | <tomsmeding> | [exa]: I may be misunderstanding, but isn't stuff like SFQ for DOS protection on the network level? |
| 2025-10-11 11:02:29 | <tomsmeding> | if you saturate the network with paste store POST requests, disk will be full within, like, 10 seconds |
| 2025-10-11 11:03:33 | <tomsmeding> | for something like this to be useful as rate limiter on writes, you'd need to set a limit on bandwidth that is RIDICULOUSLY low in the context of normal network traffic management |
| 2025-10-11 11:04:09 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 256 seconds) |
| 2025-10-11 11:04:11 | <tomsmeding> | and that doesn't sound like a good idea |
| 2025-10-11 11:04:43 | × | lxsameer quits (~lxsameer@Serene/lxsameer) (Ping timeout: 256 seconds) |
| 2025-10-11 11:06:34 | × | chromoblob quits (~chromoblo@user/chromob1ot1c) (Read error: Connection reset by peer) |
| 2025-10-11 11:06:54 | → | chromoblob joins (~chromoblo@user/chromob1ot1c) |
| 2025-10-11 11:08:14 | <tomsmeding> | [exa]: I wrote this at some point but then as I said, I took it out because it's not helpful as-is https://github.com/haskell/play-haskell/blob/master/snap-server-utils/src/Snap/Server/Utils/SpamDetect.hs |
| 2025-10-11 11:12:41 | → | gmg joins (~user@user/gehmehgeh) |
| 2025-10-11 11:15:01 | → | merijn joins (~merijn@host-vr.cgnat-g.v4.dfn.nl) |
| 2025-10-11 11:19:19 | × | merijn quits (~merijn@host-vr.cgnat-g.v4.dfn.nl) (Ping timeout: 240 seconds) |
| 2025-10-11 11:21:49 | → | fp joins (~Thunderbi@37-33-224-33.bb.dnainternet.fi) |
| 2025-10-11 11:22:56 | <[exa]> | tomsmeding: yeah SFQ makes a rate limiter which doesn't cut off everyone in case the rate is hit |
| 2025-10-11 11:23:36 | <[exa]> | TBF alg is a rate limiter which does the cutoff above certain rate, but there's a "burst" allowed which helps to prevent stuff like "you sent 1 api call, now wait 10 seconds!" |
| 2025-10-11 11:23:51 | <tomsmeding> | right |
| 2025-10-11 11:23:55 | <[exa]> | anyway in this case you might like tarpitting, that's from e-mail and works wonders :D |
| 2025-10-11 11:24:51 | <tomsmeding> | the thing I linked just now is a strictly per-IP rate limiter that works by accumulating a "spam score" per IP address that decreases exponentially and has a certain limit; if it comes above that limit, the request is rejected |
| 2025-10-11 11:24:57 | <[exa]> | you KINDA have a tarpit there already ("but the account is still incremented"), the idea of tarpitting is that the "still incremented" increments more if people ignore the fact they've been ratelimited |
| 2025-10-11 11:24:59 | × | fp quits (~Thunderbi@37-33-224-33.bb.dnainternet.fi) (Client Quit) |
| 2025-10-11 11:25:07 | <tomsmeding> | that results in a burst allowance, but bursting makes you need to wait longer |
| 2025-10-11 11:25:18 | → | fp joins (~Thunderbi@37-33-224-33.bb.dnainternet.fi) |
| 2025-10-11 11:25:26 | <tomsmeding> | works nicely but the per-IP nature makes it suck |
| 2025-10-11 11:25:42 | <tomsmeding> | ah |
| 2025-10-11 11:26:02 | <[exa]> | yap that's essentially TBF; you can hash the source IPs to say 1024 buckets and you have SFQ on top of that, and it's hard to do anything much better |
| 2025-10-11 11:26:40 | <[exa]> | re ipv6 problem -- just take the part of the address that people can't fake easily (like the first third or so) |
| 2025-10-11 11:26:50 | <tomsmeding> | what about false positives? |
| 2025-10-11 11:26:52 | → | CiaoSen joins (~Jura@2a02:8071:64e1:da0:5a47:caff:fe78:33db) |
| 2025-10-11 11:27:10 | <tomsmeding> | I guess for a pastebin that's not so deadly |
| 2025-10-11 11:27:31 | <[exa]> | show nice error message, you can't do anything. SFQ is there exactly to limit the impact of the false positives |
| 2025-10-11 11:28:00 | <tomsmeding> | SFQ gives you a probability to still get through even if you're rate-limited? |
| 2025-10-11 11:28:12 | <[exa]> | (maybe I shouldn't call this SFQ in this context since you essentially don't queue stuff, this is just buckets) |
| 2025-10-11 11:28:36 | <[exa]> | yeah with SFQ if there's a ratelimited IP (or bucket), the other buckets are unaffected |
| 2025-10-11 11:28:58 | <tomsmeding> | no what I mean with false positives is that with IP-based bucketing, you're going to bucket bucketloads (heh) of people in the same bucket |
| 2025-10-11 11:29:13 | <[exa]> | ah yes that's unavoidable |
| 2025-10-11 11:29:20 | <tomsmeding> | schools with NAT, ISPs with cgNAT or handing out small IPv6 prefixes |
| 2025-10-11 11:29:23 | <[exa]> | but everyone does that (even google) and it's OK |
| 2025-10-11 11:29:42 | <tomsmeding> | s/small/long/ |
| 2025-10-11 11:30:06 | <[exa]> | the usual solution is to require a bit more auth at that point (captcha?) |
All times are in UTC.
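
A minimal Python sketch of the scheme discussed in the log: a spam score that decays exponentially and rejects requests above a limit (as tomsmeding describes), hashed into a fixed number of buckets keyed on a truncated IPv6 prefix (as [exa] suggests), with a tarpit surcharge for requests that arrive while already rate-limited. This is not the code from SpamDetect.hs. The names `bucket_for` and `DecayingLimiter`, the /48 prefix cut, the CRC32 hash, and all numeric defaults are assumptions chosen for illustration.

```python
import ipaddress
import zlib

NUM_BUCKETS = 1024  # "say 1024 buckets" from the log


def bucket_for(addr: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a client address to a bucket index.

    IPv6 addresses are cut to their first 48 bits (assumed prefix length),
    so one client cannot dodge the limiter by rotating the low bits.
    IPv4 addresses are used whole.
    """
    ip = ipaddress.ip_address(addr)
    if ip.version == 6:
        prefix = int(ip) >> (128 - 48)
        key = b"6" + prefix.to_bytes(6, "big")
    else:
        key = b"4" + ip.packed
    return zlib.crc32(key) % num_buckets


class DecayingLimiter:
    """Per-bucket spam score that halves every `half_life` seconds."""

    def __init__(self, num_buckets=NUM_BUCKETS, half_life=60.0,
                 limit=5.0, cost=1.0, tarpit_cost=2.0):
        self.num_buckets = num_buckets
        self.half_life = half_life      # seconds for the score to halve
        self.limit = limit              # reject when score would exceed this
        self.cost = cost                # added for each accepted request
        self.tarpit_cost = tarpit_cost  # added for each rejected request
        self.scores = [0.0] * num_buckets
        self.stamps = [0.0] * num_buckets

    def allow(self, addr: str, now: float) -> bool:
        """Return True if a request from `addr` at time `now` is accepted."""
        i = bucket_for(addr, self.num_buckets)
        elapsed = max(0.0, now - self.stamps[i])
        # Decay the stored score to the current time.
        score = self.scores[i] * 0.5 ** (elapsed / self.half_life)
        self.stamps[i] = now
        if score + self.cost > self.limit:
            # Tarpit: hammering while limited pushes the score up faster,
            # so the client has to wait longer before getting through.
            self.scores[i] = score + self.tarpit_cost
            return False
        self.scores[i] = score + self.cost
        return True
```

A caller would hold one `DecayingLimiter` and check `limiter.allow(client_ip, time.monotonic())` before storing a paste. The limit acts as a burst allowance: a quiet client can send several requests in a row, but each one raises the score and lengthens the wait. Clients that share a bucket (same NAT, same prefix, or a hash collision) share a score, which is the false-positive cost raised in the log; other buckets are unaffected.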